mSINGS is a software tool for detecting MSI (microsatellite instability). Its main advantage seems to be that it works on tumor-only samples.
The mSINGS project is hosted on Bitbucket rather than GitHub, and it is still being updated. However, the author's installation guide is fairly technical, and there is no all-in-one installer, so following the documentation can be a bit confusing.
The core of the project is written in Python, and its main job is to read and analyze mpileup files generated by samtools. The essential dependencies are therefore `Python3.6` and `samtools` (`git` is presumably only needed to clone the project). Because the author pins the Python version, he recommends a Python virtual environment (to keep things from blowing up later). Since I am not familiar with `virtualenv`, I will use `miniconda` instead, which is easier to get started with. In summary: create an environment and install the dependencies -> install the module -> test.
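Before going further, it can help to confirm that the activated environment really provides what the paragraph above lists. The following is only a minimal sketch: the helper names `missing_tools` and `python_matches` are made up for this post and are not part of mSINGS.

```python
import shutil
import sys


def missing_tools(tools=("git", "samtools")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def python_matches(required=(3, 6), version=None):
    """True if the major.minor version equals `required`.

    `version` defaults to the running interpreter; passing it explicitly
    is only there to make the check easy to try out.
    """
    version = sys.version_info if version is None else version
    return tuple(version[:2]) == tuple(required)


if __name__ == "__main__":
    print("Python 3.6:", python_matches())
    print("missing tools:", missing_tools() or "none")
```

Run it inside the conda environment; if `samtools` shows up as missing, the environment was probably not activated.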
- Creating an Environment/Installing Necessary Dependencies
    conda create -p /path/to/soft/mSINGS/conda python=3.6 git samtools
- Installing the Module
    conda activate /path/to/soft/mSINGS/conda # Enter the environment to install the module itself
- Testing
The author explains how to create a baseline (which presumably determines, from existing data, which MSI loci need to be scanned). Since I am only testing, I will use the files shipped with the project (in the `doc/` directory). Note also that the author states clearly that the input BAM file must be aligned to a reference genome whose contig names have no `chr` prefix (the one provided by GATK), so I took some FASTQ data and aligned it myself. If your BAM file already meets this requirement, you can skip this step.
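One way to check the `chr` requirement on an existing BAM is to look at the `@SQ` lines of its header, which `samtools view -H` prints. The sketch below only parses header text that you pass in as a string; the function name `chr_prefixed_contigs` and the example header are my own illustration, not something from mSINGS.

```python
def chr_prefixed_contigs(header_text):
    """Return the @SQ sequence names (SN: field) that start with 'chr'."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue  # only @SQ lines describe reference sequences
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                name = field[3:]
                if name.startswith("chr"):
                    names.append(name)
    return names


# Made-up header for illustration: one contig without and one with the prefix.
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:1\tLN:249250621\n"
    "@SQ\tSN:chr2\tLN:243199373\n"
)
print(chr_prefixed_contigs(example_header))  # → ['chr2']
```

An empty list means no contig carries the prefix, so the BAM should be usable as is.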
Also note that the author's script `run_msings.sh` activates the virtual environment. Since I did not follow his setup, the line `source msings-env/bin/activate` in the script needs to be commented out, i.e. changed to `# source msings-env/bin/activate`.
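Editing the script by hand is perfectly fine. If you want to do it programmatically, a minimal sketch could look like this; `comment_out` is a hypothetical helper of mine, and it assumes the activation line appears in the script exactly as `source msings-env/bin/activate`.

```python
def comment_out(script_text, target="source msings-env/bin/activate"):
    """Prefix every line whose stripped content equals `target` with '# '.

    Lines that are already commented out are left untouched, because
    their stripped content no longer equals `target`.
    """
    out = []
    for line in script_text.splitlines(keepends=True):
        if line.strip() == target:
            body = line.lstrip()
            indent = line[: len(line) - len(body)]
            line = indent + "# " + body
        out.append(line)
    return "".join(out)


# Made-up three-line script, only to show the effect.
print(comment_out("set -e\nsource msings-env/bin/activate\necho done\n"))
```

Read the script into a string, pass it through the function, and write the result back once you are happy with it.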
    set -e
The final result is written to `Combined_MSI.txt`, which should contain the results of all BAM files combined. I only tested with one BAM file, so I am not sure what the merged output looks like…
I tested this sample against 1,166 target MSI loci; 27 were called unstable, which is about 2.3% (27/1,166) instability and gives an MSS (microsatellite stable) status.
    Position test
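The arithmetic behind that call is simple enough to write down. In the sketch below, the function names, the `"MSS"`/`"MSI"` labels, and above all the cutoff of 0.2 are assumptions for illustration only; the cutoff that actually applies is whatever your copy of `run_msings.sh` is configured with.

```python
def msi_fraction(unstable, total):
    """Fraction of scanned loci that were called unstable."""
    if total <= 0:
        raise ValueError("total must be positive")
    return unstable / total


def msi_status(fraction, threshold=0.2):
    """Label a sample by comparing its unstable fraction to a cutoff.

    The default threshold of 0.2 is an assumed value for illustration.
    """
    return "MSI" if fraction >= threshold else "MSS"


fraction = msi_fraction(27, 1166)
print(f"{fraction:.4f}", msi_status(fraction))  # → 0.0232 MSS
```

With 27 of 1,166 loci unstable, the fraction is far below any plausible cutoff, which matches the MSS result above.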
Finally, let’s take a look at the author’s script, `run_msings.sh`.
It looks a bit long, but the logic is actually quite simple, and the author has added some basic comments. Essentially, it sorts each input BAM file, generates an mpileup file from it, and then runs the `msi` program (installed by `python setup.py install`) to analyze and aggregate the data.
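To make the first two steps concrete, here is a rough sketch that builds the `samtools` command lines for one BAM file. It is not the author's script: the function name `samtools_steps`, the output naming, and the choice of options (`-o` for `sort`; `-f`, `-l` and `-o` for `mpileup`) are my assumptions, and the real script may pass different ones. The `msi` steps are left out because their arguments should be taken from `run_msings.sh` itself.

```python
from pathlib import Path


def samtools_steps(bam, reference, targets, outdir):
    """Build (but do not run) the sort and mpileup commands for one BAM.

    bam       -- input BAM file
    reference -- reference FASTA (contig names without 'chr')
    targets   -- file of target positions or regions passed to `-l`
    outdir    -- directory for the intermediate files
    """
    prefix = Path(outdir) / Path(bam).stem
    sorted_bam = f"{prefix}.sorted.bam"
    mpileup = f"{prefix}.mpileup"
    return [
        ["samtools", "sort", "-o", sorted_bam, str(bam)],
        ["samtools", "mpileup", "-f", str(reference), "-l", str(targets),
         "-o", mpileup, sorted_bam],
    ]


# All file names here are placeholders.
for cmd in samtools_steps("data/test.bam", "ref.fa", "targets.bed", "out"):
    print(" ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` to execute it. Building the commands first makes it easy to print and inspect them before anything is run.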