Exome-Seq Data Visualization using IGV

Steps:

1. Installing Integrative Genomics Viewer
2. Select a reference genome
3. Importing files
4. Load annotation tracks
5. Inspect individual variants

1. Installing Integrative Genomics Viewer

     Launch IGV from website: IGV Download Site

          Latest version (at the time of writing): 2.3.4

     screenshot1



     screenshot2

2. Select a reference genome

     screenshot3


     screenshot4

3. Importing files



     We will look at VCF files for tumor and normal samples, produced by several variant callers, and inspect the aligned reads in the BAM files at each variant.
     

Table: Description of sample files to be viewed in IGV (size limited to chr10):

File             Type  Description
chr10g.vcf       vcf   Normal (germline) sample vcf produced by germline variant caller (HaplotypeCaller, GATK)
chr10t.vcf       vcf   Tumor sample vcf produced by germline variant caller (HaplotypeCaller, GATK)
chr10tg.vcf      vcf   Tumor-normal comparison vcf produced by somatic variant caller (MuTect)
chr10.germ.bam   bam   Alignment file from germline (normal) sample
chr10.tumor.bam  bam   Alignment file from tumor sample

     There are a few ways to load files into IGV (for this tutorial, you can use either method 2 or method 3):


     1. You can click on "Load from File" to directly load files into IGV from your local directory

     screenshot5


     To load vcf and bam files, the matching index files (".idx" for vcf, ".bai" for bam) must be in the same directory as the data files


     screenshot6


     screenshot6a
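
     Before loading local files, it can help to confirm that each index sits next to its data file. The following is a minimal sketch in Python, assuming the index is named by appending ".idx" or ".bai" to the full file name (for example chr10t.vcf.idx); some tools name BAM indexes differently, so treat the naming rule as an assumption. The function name missing_indexes is hypothetical.

```python
from pathlib import Path

# Assumed naming rule: index = data file name + suffix (e.g. "chr10t.vcf.idx").
INDEX_SUFFIX = {".vcf": ".idx", ".bam": ".bai"}


def missing_indexes(paths):
    """Return the names of VCF/BAM files whose index is not in the same directory."""
    missing = []
    for path in map(Path, paths):
        suffix = INDEX_SUFFIX.get(path.suffix)
        if suffix is None:
            continue  # not a VCF or BAM file, nothing to check
        index = path.with_name(path.name + suffix)
        if not index.exists():
            missing.append(path.name)
    return missing


if __name__ == "__main__":
    files = ["chr10g.vcf", "chr10t.vcf", "chr10tg.vcf",
             "chr10.germ.bam", "chr10.tumor.bam"]
    # Lists every file in the current directory that lacks an index.
    print(missing_indexes(files))
```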

     2. You can use a URL: click on "Load from URL". The files are in a publicly accessible folder on a remote server (helix)


     screenshot7


     Copy and paste the following URLs one at a time into the box (ctrl-c, ctrl-v):

      http://helix.nih.gov/~CCBR/exome/chr10g.vcf

      http://helix.nih.gov/~CCBR/exome/chr10t.vcf

      http://helix.nih.gov/~CCBR/exome/chr10tg.vcf

      http://helix.nih.gov/~CCBR/exome/chr10.germ.bam

      http://helix.nih.gov/~CCBR/exome/chr10.tumor.bam



     screenshot8
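
     As an alternative to pasting each URL by hand, the same files could be loaded with an IGV batch script. The sketch below, in Python, only builds the script text from the URLs listed above; it uses the batch commands "new", "load" and "goto". Running the script (for example through IGV's batch-script menu entry) and the choice of starting locus are assumptions, and the helper name igv_batch_script is hypothetical.

```python
BASE_URL = "http://helix.nih.gov/~CCBR/exome/"
FILES = [
    "chr10g.vcf",
    "chr10t.vcf",
    "chr10tg.vcf",
    "chr10.germ.bam",
    "chr10.tumor.bam",
]


def igv_batch_script(base_url, files, locus=None):
    """Build IGV batch commands that start a new session and load each file."""
    lines = ["new"]  # clear any tracks that are already loaded
    lines += [f"load {base_url}{name}" for name in files]
    if locus:
        lines.append(f"goto {locus}")  # jump to a locus once the tracks are loaded
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(igv_batch_script(BASE_URL, FILES, locus="chr10:93694"))
```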


     3. You can load an entire saved session by opening a local xml file (using Open Session), or in this case, using a URL as before

     Copy and paste the following into the URL box :

      http://helix.nih.gov/~CCBR/exome/exome_igv_session.xml



     screenshot8a
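
     A session file is an XML document that records which tracks to load. The contents of exome_igv_session.xml are not shown in this tutorial, so the following Python sketch only illustrates the general shape of such a file: a Session element with a Resources list. The genome identifier "hg19" and the version attribute are assumptions, and build_session is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

BASE_URL = "http://helix.nih.gov/~CCBR/exome/"
FILES = ["chr10g.vcf", "chr10t.vcf", "chr10tg.vcf",
         "chr10.germ.bam", "chr10.tumor.bam"]


def build_session(genome, urls, locus=None):
    """Return a minimal IGV-style session document as an XML string."""
    attrs = {"genome": genome, "version": "8"}  # version value is an assumption
    if locus:
        attrs["locus"] = locus
    session = ET.Element("Session", attrs)
    resources = ET.SubElement(session, "Resources")
    for url in urls:
        ET.SubElement(resources, "Resource", {"path": url})
    return ET.tostring(session, encoding="unicode")


if __name__ == "__main__":
    urls = [BASE_URL + name for name in FILES]
    # "hg19" is an assumed genome id; use the genome selected in step 2.
    print(build_session("hg19", urls, locus="chr10:93694"))
```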



4. Load annotation tracks

     To see more gene or SNP annotations than the default track (RefSeq) provides, load additional annotation tracks


     Select Load from Server:


     screenshot10


     Select latest Gencode Genes track:

     screenshot11


     Select latest dbSNP track:

     screenshot11a

5. Inspect individual variants

      Navigate to chromosome: Select Chromosome 10


     screenshot9


     Then paste in the coordinate beside chr10: "93694"


     screenshot12

     Alternatively, one can also directly enter into the box: "chr10:93694"

     Right click on the bam tracks and select "color alignments by -> read strand"


     screenshot12a


     Right click on the vcf tracks and select "collapse"


     screenshot13


     Right click on the bam tracks and select "Sort Alignments by -> base"


     screenshot12


     The tracks should now show the highlighted alternative base on top


     screenshot16


     Adjust the height of each of the tracks to fit all of the annotations from the bottom track on the screen - you will see this is an annotated SNP in dbSNP


     screenshot16a


     The vcf entry for this variant from the germline variant caller (HaplotypeCaller) is:

     #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT chr10.tum.bam
     10 93694 rs144539776 T C 2090.77 . AC=11;AF=0.367;AN=30;BaseQRankSum=1.50;ClippingRankSum=-9.220e-01;DB;DP=133;FS=18.219;InbreedingCoeff=-0.5928;MLEAC=11;MLEAF=0.367;MQ=45.10;MQ0=0;MQRankSum=-5.900e-02;QD=17.72;ReadPosRankSum=0.061 GT:AD:DP:GQ:PL 0/1:4,9:13:99:333,0,118
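
     The INFO column of this record is a semicolon-separated list of KEY=VALUE pairs, plus flags such as DB that carry no value. A minimal Python sketch of how it can be split into a dictionary is shown here; parse_info is a hypothetical helper, not part of IGV or GATK, and values are kept as strings.

```python
INFO = (
    "AC=11;AF=0.367;AN=30;BaseQRankSum=1.50;ClippingRankSum=-9.220e-01;"
    "DB;DP=133;FS=18.219;InbreedingCoeff=-0.5928;MLEAC=11;MLEAF=0.367;"
    "MQ=45.10;MQ0=0;MQRankSum=-5.900e-02;QD=17.72;ReadPosRankSum=0.061"
)


def parse_info(info):
    """Split a VCF INFO string into a dict; flags without '=' map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue  # tolerate a trailing semicolon
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


if __name__ == "__main__":
    info = parse_info(INFO)
    print(info["AF"], info["DP"], info["DB"])  # prints: 0.367 133 True
```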

     The following are descriptions of the fields used in the VCF file:

     NON_REF=Represents any possible alternative allele at this location
     GT=Genotype
     AD=Allelic depths for the ref and alt alleles in the order listed
     DP=Approximate read depth (reads with MQ=255 or with bad mates are filtered)
     GQ=Genotype Quality
     PL=Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification
     SB=Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias
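
     The FORMAT column names the per-sample fields, and the sample column gives their values in the same order. As a sketch in Python (parse_sample is a hypothetical helper), the genotype string for this variant can be unpacked and the alternate allele fraction computed from AD. The PL step assumes a biallelic site, where the likelihoods are ordered 0/0, 0/1, 1/1.

```python
def parse_sample(format_str, sample_str):
    """Pair each FORMAT key with the matching value in the sample column."""
    return dict(zip(format_str.split(":"), sample_str.split(":")))


call = parse_sample("GT:AD:DP:GQ:PL", "0/1:4,9:13:99:333,0,118")

# AD lists read depths for the ref and alt alleles, in that order.
ref_depth, alt_depth = (int(x) for x in call["AD"].split(","))
alt_fraction = alt_depth / (ref_depth + alt_depth)

# PL holds Phred-scaled likelihoods; the most likely genotype has PL = 0.
pl = [int(x) for x in call["PL"].split(",")]
best_genotype = ["0/0", "0/1", "1/1"][pl.index(min(pl))]

if __name__ == "__main__":
    # prints: 0/1 4 9 0.692 0/1
    print(call["GT"], ref_depth, alt_depth, round(alt_fraction, 3), best_genotype)
```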




     Click on the yellow balloon on the menu bar and select "Show Details on Click"


     screenshot12


     Click on the vcf track for the tumor sample (chr10t.vcf)


     screenshot17


     

Table: Variant Attributes in the pop-up as annotated in vcf

ID               Value       Description
AF               0.367       Allele Frequency, for each ALT allele, in the same order as listed
AC               11          Allele count in genotypes, for each ALT allele, in the same order as listed
MQRankSum        -5.900e-02  Z-score from Wilcoxon rank sum test of Alt vs. Ref read mapping qualities
MQ               45.10       RMS Mapping Quality
MLEAC            11          Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed
BaseQRankSum     1.50        Z-score from Wilcoxon rank sum test of Alt vs. Ref base qualities
MLEAF            0.367       Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed
DP               133         Approximate read depth; some reads may have been filtered
ReadPosRankSum   0.061       Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias
AN               30          Total number of alleles in called genotypes
FS               18.219      Phred-scaled p-value using Fisher's exact test to detect strand bias
MQ0              0           Total Mapping Quality Zero Reads
QD               17.72       Variant Confidence/Quality by Depth
ClippingRankSum  -9.220e-01  Z-score from Wilcoxon rank sum test of Alt vs. Ref number of hard clipped bases
DB               Flag        dbSNP Membership
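
     Two of the values in this table can be checked with simple arithmetic. The Python sketch below assumes that AF is the allele count divided by the number of called alleles (AC / AN), and that FS, being Phred-scaled, converts back to a p-value as 10^(-FS/10).

```python
# Values copied from the INFO field of the variant at chr10:93694.
ac = 11      # alt allele count in called genotypes
an = 30      # total number of alleles in called genotypes
fs = 18.219  # Phred-scaled p-value for strand bias

allele_frequency = ac / an              # 11 / 30 = 0.3666...
strand_bias_p_value = 10 ** (-fs / 10)  # undo the Phred scaling

if __name__ == "__main__":
    # prints: 0.367 0.0151
    print(round(allele_frequency, 3), round(strand_bias_p_value, 4))
```

     The first number matches the AF value reported in the table; the second suggests a strand-bias p-value of roughly 0.015 under the assumed conversion.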


     Click on the bam coverage track for the tumor (chr10.tumor.bam.coverage)


     screenshot19


     Click on the colored variant on the vcf track for the tumor-normal comparison (chr10tg.vcf)


     The faded color indicates a poor-quality somatic call (the FILTER value for this variant is REJECT)


     screenshot18


     Right-click on the left column for the vcf track for the tumor-normal comparison (chr10tg.vcf) to expand the track and look at individual sample genotypes


     screenshot18b


     Click on the lower vcf track for the tumor-normal comparison (chr10tg.vcf) to get sample specific genotype information


     screenshot18a


     The variant in the vcf file reads:

     #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT chr10.germ.bam chr10.tum.bam
     10 93694 rs144539776 T C . REJECT DB GT:AD:BQ:DP:FA 0:7,1:.:8:0.125 0/1:4,1:41:5:0.200

     These are the descriptions of tags in the VCF format:

    PASS,Description="Accept as a confident somatic mutation"
    REJECT,Description="Rejected as a confident somatic mutation"
    AD,Number=.,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed"
    BQ,Number=A,Type=Float,Description="Average base quality for reads supporting alleles"
    DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)"
    FA,Number=A,Type=Float,Description="Allele fraction of the alternate allele with regard to reference"
    GQ,Number=1,Type=Integer,Description="Genotype Quality"
    GT,Number=1,Type=String,Description="Genotype"
    PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification"
    SS,Number=1,Type=Integer,Description="Variant status relative to non-adjacent Normal,0=wildtype,1=germline,2=somatic,3=LOH,4=post-transcriptional modification,5=unknown"
    DB,Number=0,Type=Flag,Description="dbSNP Membership"
    MQ0,Number=1,Type=Integer,Description="Total Mapping Quality Zero Reads"
    SOMATIC,Number=0,Type=Flag,Description="Somatic event"
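
     The FA tag can be cross-checked against AD for both samples in the tumor-normal record. This Python sketch assumes FA equals the alt depth divided by the sum of ref and alt depths; alt_fraction is a hypothetical helper.

```python
FORMAT = "GT:AD:BQ:DP:FA"
SAMPLES = {
    "chr10.germ.bam": "0:7,1:.:8:0.125",
    "chr10.tum.bam": "0/1:4,1:41:5:0.200",
}


def alt_fraction(format_str, sample_str):
    """Return (fraction computed from AD, fraction reported in FA)."""
    call = dict(zip(format_str.split(":"), sample_str.split(":")))
    ref_depth, alt_depth = (int(x) for x in call["AD"].split(","))
    return alt_depth / (ref_depth + alt_depth), float(call["FA"])


if __name__ == "__main__":
    for name, sample in SAMPLES.items():
        computed, reported = alt_fraction(FORMAT, sample)
        # prints: chr10.germ.bam 0.125 0.125
        #         chr10.tum.bam 0.2 0.2
        print(name, computed, reported)
```

     With a single supporting read in each sample, the alternate allele is backed by very little evidence, which is consistent with the REJECT filter on this call.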

       You can select a vcf track and navigate to adjacent variants (backwards using ctrl-b, forwards using ctrl-f)

     

       Other coordinates to inspect (limited to chr10):

          chr10:91505872
          chr10:34779500
          chr10:49206547
          chr10:81513939


          You can also enter a gene name: e.g. TUBB8

     screenshot20
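
     The coordinates listed above could also be visited in one pass with an IGV batch script. The Python sketch below builds such a script using the batch commands "snapshotDirectory", "goto", "sort base" and "snapshot", and assumes the tracks are already loaded. The output directory name igv_snapshots and the helper snapshot_commands are hypothetical.

```python
LOCI = [
    "chr10:91505872",
    "chr10:34779500",
    "chr10:49206547",
    "chr10:81513939",
]


def snapshot_commands(loci, out_dir):
    """Build IGV batch commands that visit each locus and save a PNG snapshot."""
    lines = [f"snapshotDirectory {out_dir}"]
    for locus in loci:
        lines.append(f"goto {locus}")
        lines.append("sort base")  # bring reads with the alternate base to the top
        lines.append(f"snapshot {locus.replace(':', '_')}.png")
    return lines


if __name__ == "__main__":
    print("\n".join(snapshot_commands(LOCI, "igv_snapshots")))
```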