TCGAbiolinks provides functions to download mutation data from GDC. There are two options to download the data:
- GDCquery_Maf, which downloads MAF files aligned against hg38
- GDCquery, GDCdownload and GDCprepare, which download MAF files aligned against hg19

This example downloads MAF (Mutation Annotation Format) files for the variant calling pipeline muse. Pipeline options are: muse, varscan2, somaticsniper, mutect. For more information, please see the GDC docs.
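The table code that follows uses an object named maf, but the call that creates it does not appear in this text. A minimal sketch of that step might look like the block below. It assumes the CHOL tumor type (the project used in the hg19 example later on) and the pipeline names exactly as listed above; download_maf_hg38 is a hypothetical wrapper, not part of TCGAbiolinks, and the accepted pipeline names may differ in your installed version.

```r
# Pipeline names as listed in the text above (assumption: your installed
# TCGAbiolinks version accepts these exact names).
pipelines <- c("muse", "varscan2", "somaticsniper", "mutect")

# Hypothetical helper (not part of TCGAbiolinks): validate the pipeline name
# locally, then ask GDCquery_Maf for the hg38 MAF of one tumor type.
download_maf_hg38 <- function(tumor, pipeline = "muse") {
  # Fails early with a clear message if the pipeline name is not in the list
  pipeline <- match.arg(pipeline, choices = pipelines)
  TCGAbiolinks::GDCquery_Maf(tumor, pipelines = pipeline)
}

# Usage (needs network access and TCGAbiolinks installed):
# maf <- download_maf_hg38("CHOL", "muse")
```

Checking the pipeline name before the call means a typo is reported immediately instead of after a request to GDC.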
# Show only the first 20 rows so the page renders faster
datatable(maf[1:20,],
          filter = 'top',
          options = list(scrollX = TRUE, keys = TRUE, pageLength = 5),
          rownames = FALSE)
Hugo_Symbol | Entrez_Gene_Id | Center | NCBI_Build | Chromosome | Start_Position | End_Position | Strand | Variant_Classification | Variant_Type | Reference_Allele | Tumor_Seq_Allele1 | Tumor_Seq_Allele2 | dbSNP_RS | dbSNP_Val_Status | Tumor_Sample_Barcode | Matched_Norm_Sample_Barcode | Match_Norm_Seq_Allele1 | Match_Norm_Seq_Allele2 | Tumor_Validation_Allele1 | Tumor_Validation_Allele2 | Match_Norm_Validation_Allele1 | Match_Norm_Validation_Allele2 | Verification_Status | Validation_Status | Mutation_Status | Sequencing_Phase | Sequence_Source | Validation_Method | Score | BAM_File | Sequencer | Tumor_Sample_UUID | Matched_Norm_Sample_UUID | HGVSc | HGVSp | HGVSp_Short | Transcript_ID | Exon_Number | t_depth | t_ref_count | t_alt_count | n_depth | n_ref_count | n_alt_count | all_effects | Allele | Gene | Feature | Feature_type | One_Consequence | Consequence | cDNA_position | CDS_position | Protein_position | Amino_acids | Codons | Existing_variation | ALLELE_NUM | DISTANCE | TRANSCRIPT_STRAND | SYMBOL | SYMBOL_SOURCE | HGNC_ID | BIOTYPE | CANONICAL | CCDS | ENSP | SWISSPROT | TREMBL | UNIPARC | RefSeq | SIFT | PolyPhen | EXON | INTRON | DOMAINS | GMAF | AFR_MAF | AMR_MAF | ASN_MAF | EAS_MAF | EUR_MAF | SAS_MAF | AA_MAF | EA_MAF | CLIN_SIG | SOMATIC | PUBMED | MOTIF_NAME | MOTIF_POS | HIGH_INF_POS | MOTIF_SCORE_CHANGE | IMPACT | PICK | VARIANT_CLASS | TSL | HGVS_OFFSET | PHENO | MINIMISED | ExAC_AF | ExAC_AF_Adj | ExAC_AF_AFR | ExAC_AF_AMR | ExAC_AF_EAS | ExAC_AF_FIN | ExAC_AF_NFE | ExAC_AF_OTH | ExAC_AF_SAS | GENE_PHENO | FILTER | CONTEXT | src_vcf_id | tumor_bam_uuid | normal_bam_uuid | case_id | GDC_FILTER | COSMIC | MC3_Overlap | GDC_Validation_Status |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
FMN2 | 56776 | WUGSC | GRCh38 | chr1 | 240211162 | 240211162 | + | Nonsense_Mutation | SNP | T | T | A | TCGA-4G-AAZT-01A-11D-A417-09 | TCGA-4G-AAZT-10A-01D-A41A-09 | Somatic | Illumina HiSeq 2000 | 24c3dc90-d1f2-4256-9909-0d0c939c178f | e75e6102-170d-49e4-89e6-7687cad1f6b6 | c.3992T>A | p.Leu1331Ter | p.L1331* | ENST00000319653 | 6/18 | 58 | 34 | 24 | 26 | FMN2,stop_gained,p.L1331*,ENST00000319653,NM_020066.4&NM_001305424.1;FMN2,downstream_gene_variant,,ENST00000447095, | A | ENSG00000155816 | ENST00000319653 | Transcript | stop_gained | stop_gained | 4222/6434 | 3992/5169 | 1331/1722 | L/* | tTa/tAa | 1 | 1 | FMN2 | HGNC | HGNC:14074 | protein_coding | YES | CCDS31069.2 | ENSP00000318884 | Q9NZ56 | NM_020066.4;NM_001305424.1 | 6/18 | Pfam_domain:PF02181;SMART_domains:SM00498;Superfamily_domains:SSF101447 | HIGH | 1 | SNV | 5 | 1 | PASS | GGAATTATTTT | 263c128d-cf0a-4a8b-bafa-fa84d9baeb2c | 5a30d2bd-9cab-44ef-9071-0b34d386a9c0 | 2c662e9d-0c78-4ec4-bc57-3f9573ffc678 | b10c64c2-7fd2-4210-b975-034affb14b57 | COSM4571189 | True | Unknown | |||||||||||||||||||||||||||||||||||||||||||||||||||||
PAX3 | 5077 | WUGSC | GRCh38 | chr2 | 222297158 | 222297158 | + | Missense_Mutation | SNP | G | G | T | novel | TCGA-4G-AAZT-01A-11D-A417-09 | TCGA-4G-AAZT-10A-01D-A41A-09 | Somatic | Illumina HiSeq 2000 | 24c3dc90-d1f2-4256-9909-0d0c939c178f | e75e6102-170d-49e4-89e6-7687cad1f6b6 | c.141C>A | p.Asn47Lys | p.N47K | ENST00000350526 | 2/8 | 52 | 19 | 32 | 29 | PAX3,missense_variant,p.N47K,ENST00000350526,NM_181457.3;PAX3,missense_variant,p.N47K,ENST00000392069,NM_181459.3;PAX3,missense_variant,p.N47K,ENST00000344493,NM_181461.3;PAX3,missense_variant,p.N47K,ENST00000392070,NM_181458.3;PAX3,missense_variant,p.N47K,ENST00000336840,NM_181460.3;PAX3,missense_variant,p.N47K,ENST00000409551,NM_001127366.2;PAX3,missense_variant,p.N47K,ENST00000409828,NM_000438.5;PAX3,missense_variant,p.N47K,ENST00000258387,NM_013942.4;CCDC140,upstream_gene_variant,,ENST00000295226,NM_153038.1 | T | ENSG00000135903 | ENST00000350526 | Transcript | missense_variant | missense_variant | 278/3610 | 141/1440 | 47/479 | N/K | aaC/aaA | 1 | -1 | PAX3 | HGNC | HGNC:8617 | protein_coding | CCDS42826.1 | ENSP00000343052 | P23760 | A0A024R470 | UPI0000131369 | NM_181457.3 | deleterious(0) | possibly_damaging(0.813) | 2/8 | Pfam_domain:PF00292;Prints_domain:PR00027;SMART_domains:SM00351;PROSITE_profiles:PS51057;Superfamily_domains:SSF46689 | MODERATE | SNV | 5 | 1 | PASS | CTGCCGTTGAT | 263c128d-cf0a-4a8b-bafa-fa84d9baeb2c | 5a30d2bd-9cab-44ef-9071-0b34d386a9c0 | 2c662e9d-0c78-4ec4-bc57-3f9573ffc678 | b10c64c2-7fd2-4210-b975-034affb14b57 | True | Unknown | |||||||||||||||||||||||||||||||||||||||||||||||||||
NYAP2 | 57624 | WUGSC | GRCh38 | chr2 | 225582898 | 225582898 | + | Missense_Mutation | SNP | C | C | T | novel | TCGA-4G-AAZT-01A-11D-A417-09 | TCGA-4G-AAZT-10A-01D-A41A-09 | Somatic | Illumina HiSeq 2000 | 24c3dc90-d1f2-4256-9909-0d0c939c178f | e75e6102-170d-49e4-89e6-7687cad1f6b6 | c.1481C>T | p.Thr494Ile | p.T494I | ENST00000272907 | 4/6 | 47 | 30 | 17 | 25 | NYAP2,missense_variant,p.T494I,ENST00000272907,NM_020864.1 | T | ENSG00000144460 | ENST00000272907 | Transcript | missense_variant | missense_variant | 1894/4828 | 1481/1962 | 494/653 | T/I | aCc/aTc | 1 | 1 | NYAP2 | HGNC | HGNC:29291 | protein_coding | YES | CCDS46529.1 | ENSP00000272907 | Q9P242 | UPI00001C1DB6 | NM_020864.1 | deleterious(0.05) | benign(0.407) | 4/6 | MODERATE | 1 | SNV | 1 | 1 | PASS | GAACACCTACG | 263c128d-cf0a-4a8b-bafa-fa84d9baeb2c | 5a30d2bd-9cab-44ef-9071-0b34d386a9c0 | 2c662e9d-0c78-4ec4-bc57-3f9573ffc678 | b10c64c2-7fd2-4210-b975-034affb14b57 | True | Unknown | |||||||||||||||||||||||||||||||||||||||||||||||||||
CHL1 | 10752 | WUGSC | GRCh38 | chr3 | 398326 | 398326 | + | Missense_Mutation | SNP | C | C | G | novel | TCGA-4G-AAZT-01A-11D-A417-09 | TCGA-4G-AAZT-10A-01D-A41A-09 | Somatic | Illumina HiSeq 2000 | 24c3dc90-d1f2-4256-9909-0d0c939c178f | e75e6102-170d-49e4-89e6-7687cad1f6b6 | c.3146C>G | p.Thr1049Ser | p.T1049S | ENST00000397491 | 24/27 | 57 | 17 | 40 | 39 | CHL1,missense_variant,p.T1065S,ENST00000256509,NM_006614.3;CHL1,missense_variant,p.T1049S,ENST00000397491,NM_001253387.1;CHL1,intron_variant,,ENST00000620033,NM_001253388.1;CHL1,intron_variant,,ENST00000445697,;CHL1,3_prime_UTR_variant,,ENST00000453040, | G | ENSG00000134121 | ENST00000397491 | Transcript | missense_variant | missense_variant | 3613/5235 | 3146/3627 | 1049/1208 | T/S | aCt/aGt | 1 | 1 | CHL1 | HGNC | HGNC:1939 | protein_coding | CCDS58812.1 | ENSP00000380628 | O00533 | UPI0000E08093 | NM_001253387.1 | tolerated(0.34) | benign(0.271) | 24/27 | MODERATE | SNV | 1 | 1 | PASS | AATGACTAAGA | 263c128d-cf0a-4a8b-bafa-fa84d9baeb2c | 5a30d2bd-9cab-44ef-9071-0b34d386a9c0 | 2c662e9d-0c78-4ec4-bc57-3f9573ffc678 | b10c64c2-7fd2-4210-b975-034affb14b57 | True | Unknown | |||||||||||||||||||||||||||||||||||||||||||||||||||||
NUP210 | 23225 | WUGSC | GRCh38 | chr3 | 13341792 | 13341792 | + | Missense_Mutation | SNP | T | T | G | novel | TCGA-4G-AAZT-01A-11D-A417-09 | TCGA-4G-AAZT-10A-01D-A41A-09 | Somatic | Illumina HiSeq 2000 | 24c3dc90-d1f2-4256-9909-0d0c939c178f | e75e6102-170d-49e4-89e6-7687cad1f6b6 | c.3184A>C | p.Asn1062His | p.N1062H | ENST00000254508 | 23/40 | 33 | 13 | 20 | 25 | NUP210,missense_variant,p.N1062H,ENST00000254508,NM_024923.3;NUP210,upstream_gene_variant,,ENST00000485755, | G | ENSG00000132182 | ENST00000254508 | Transcript | missense_variant | missense_variant | 3267/7193 | 3184/5664 | 1062/1887 | N/H | Aat/Cat | 1 | -1 | NUP210 | HGNC | HGNC:30052 | protein_coding | YES | CCDS33704.1 | ENSP00000254508 | Q8TEM1 | UPI00001600AF | NM_024923.3 | deleterious(0) | benign(0) | 23/40 | MODERATE | 1 | SNV | 2 | 1 | PASS | TTTATTGGTCA | 263c128d-cf0a-4a8b-bafa-fa84d9baeb2c | 5a30d2bd-9cab-44ef-9071-0b34d386a9c0 | 2c662e9d-0c78-4ec4-bc57-3f9573ffc678 | b10c64c2-7fd2-4210-b975-034affb14b57 | True | Unknown |
This example downloads MAF (Mutation Annotation Format) files aligned against hg19 (the older TCGA MAF files).
query.maf.hg19 <- GDCquery(project = "TCGA-CHOL",
data.category = "Simple nucleotide variation",
data.type = "Simple somatic mutation",
access = "open",
legacy = TRUE)
# Check which MAF files are available
datatable(dplyr::select(getResults(query.maf.hg19),-contains("cases")),
filter = 'top',
options = list(scrollX = TRUE, keys = TRUE, pageLength = 10),
rownames = FALSE)
data_release | data_type | tags | file_name | submitter_id | file_id | file_size | state_comment | id | md5sum | updated_datetime | data_format | access | platform | state | version | data_category | type | experimental_strategy | created_datetime | project | code | center_name | center_short_name | center_center_id | center_namespace | center_center_type | tissue.definition |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Simple somatic mutation | snv,somatic | hgsc.bcm.edu_CHOL.IlluminaGA_DNASeq.1.somatic.maf | a8532d87-1eae-4289-8aea-3255d7b313cf | 2482745 | a8532d87-1eae-4289-8aea-3255d7b313cf | 8db4269d8aba6d8d397e2761e24e8e6e | 2017-03-05T09:28:11.866514-06:00 | MAF | open | Mixed platforms | live | Simple nucleotide variation | file | DNA-Seq | TCGA-CHOL | 10 | Baylor College of Medicine | BCM | d3b8c887-498b-5490-903e-760403c68307 | hgsc.bcm.edu | GSC | Primary solid Tumor | |||||
Simple somatic mutation | snv,somatic | bcgsc.ca_CHOL.IlluminaHiSeq_DNASeq.1.somatic.maf | 0d2e60c5-dd32-4a19-b600-7e76496f4f94 | 1012118 | 0d2e60c5-dd32-4a19-b600-7e76496f4f94 | 47268aa46006c53013466f740a3e1462 | 2017-03-05T10:25:29.699247-06:00 | MAF | open | Illumina HiSeq | live | Simple nucleotide variation | file | DNA-Seq | TCGA-CHOL | 34 | Canada's Michael Smith Genome Sciences Centre | BCGSC | 380301b3-6f8d-581d-a81f-f4dd462df12b | bcgsc.ca | GSC | Primary solid Tumor | |||||
Simple somatic mutation | somatic,snv | ucsc.edu_CHOL.IlluminaGA_DNASeq_automated.Level_2.1.0.0.somatic.maf | e45ec3d9-adcc-43db-a71c-9edaf7d11c86 | 785408 | e45ec3d9-adcc-43db-a71c-9edaf7d11c86 | b44be3f2e6a994be766cc881a3143b2b | 2017-03-05T18:49:13.665667-06:00 | MAF | open | Illumina GA | live | Simple nucleotide variation | file | DNA-Seq | TCGA-CHOL | 25 | University of California, Santa Cruz | UCSC | 79cc1498-5d7f-5eae-b631-e74b78c13581 | ucsc.edu | GSC | Primary solid Tumor | |||||
Simple somatic mutation | snv,somatic | hgsc.bcm.edu_CHOL.IlluminaGA_DNASeq.1.somatic.maf | 448661d5-a89a-480e-adfd-1cce8eb74e70 | 2052077 | 448661d5-a89a-480e-adfd-1cce8eb74e70 | ee6d4a3810593268b8038dfb13999ddd | 2017-03-05T00:22:24.678827-06:00 | MAF | open | Illumina GA | live | Simple nucleotide variation | file | DNA-Seq | TCGA-CHOL | 10 | Baylor College of Medicine | BCM | d3b8c887-498b-5490-903e-760403c68307 | hgsc.bcm.edu | GSC | Primary solid Tumor | |||||
Simple somatic mutation | snv,somatic | gsc_CHOL_pairs.aggregated.capture.tcga.uuid.automated.somatic.maf | 2d9ed46f-36a5-4f87-9304-74ce626ae96d | 4274149 | 2d9ed46f-36a5-4f87-9304-74ce626ae96d | 9288f4c155d47f4cc090eee3312e09c2 | 2017-03-05T12:45:35.461959-06:00 | MAF | open | Illumina GA | live | Simple nucleotide variation | file | DNA-Seq | 2016-06-13T17:02:09.527369-05:00 | TCGA-CHOL | 08 | Broad Institute of MIT and Harvard | BI | 61d634b8-e8dd-58bf-9a65-1233dc7c8c6a | broad.mit.edu | GSC | Primary solid Tumor |
query.maf.hg19 <- GDCquery(project = "TCGA-CHOL",
data.category = "Simple nucleotide variation",
data.type = "Simple somatic mutation",
access = "open",
file.type = "bcgsc.ca_CHOL.IlluminaHiSeq_DNASeq.1.somatic.maf",
legacy = TRUE)
GDCdownload(query.maf.hg19)
maf <- GDCprepare(query.maf.hg19)
# Show only the first 20 rows so the page renders faster
datatable(maf[1:20,],
          filter = 'top',
          options = list(scrollX = TRUE, keys = TRUE, pageLength = 5),
          rownames = FALSE)
Hugo_Symbol | Entrez_Gene_Id | Center | NCBI_Build | Chromosome | Start_Position | End_Position | Strand | Variant_Classification | Variant_Type | Reference_Allele | Tumor_Seq_Allele1 | Tumor_Seq_Allele2 | dbSNP_RS | dbSNP_Val_Status | Tumor_Sample_Barcode | Matched_Norm_Sample_Barcode | Match_Norm_Seq_Allele1 | Match_Norm_Seq_Allele2 | Tumor_Validation_Allele1 | Tumor_Validation_Allele2 | Match_Norm_Validation_Allele1 | Match_Norm_Validation_Allele2 | Verification_Status | Validation_Status | Mutation_Status | Sequencing_Phase | Sequence_Source | Validation_Method | Score | BAM_File | Sequencer | Tumor_Sample_UUID | Matched_Norm_Sample_UUID | HGVSc | HGVSp | HGVSp_Short | Transcript_ID | Exon_Number | t_depth | t_ref_count | t_alt_count | n_depth | n_ref_count | n_alt_count | all_effects | Allele | Gene | Feature | Feature_type | One_Consequence | Consequence | cDNA_position | CDS_position | Protein_position | Amino_acids | Codons | Existing_variation | ALLELE_NUM | DISTANCE | TRANSCRIPT_STRAND | SYMBOL | SYMBOL_SOURCE | HGNC_ID | BIOTYPE | CANONICAL | CCDS | ENSP | SWISSPROT | TREMBL | UNIPARC | RefSeq | SIFT | PolyPhen | EXON | INTRON | DOMAINS | GMAF | AFR_MAF | AMR_MAF | ASN_MAF | EAS_MAF | EUR_MAF | SAS_MAF | AA_MAF | EA_MAF | CLIN_SIG | SOMATIC | PUBMED | MOTIF_NAME | MOTIF_POS | HIGH_INF_POS | MOTIF_SCORE_CHANGE | IMPACT | PICK | VARIANT_CLASS | TSL | HGVS_OFFSET | PHENO | MINIMISED | ExAC_AF | ExAC_AF_Adj | ExAC_AF_AFR | ExAC_AF_AMR | ExAC_AF_EAS | ExAC_AF_FIN | ExAC_AF_NFE | ExAC_AF_OTH | ExAC_AF_SAS | GENE_PHENO | FILTER | CONTEXT | src_vcf_id | tumor_bam_uuid | normal_bam_uuid | case_id | GDC_FILTER | COSMIC | MC3_Overlap | GDC_Validation_Status |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
FMN2 | 56776 | WUGSC | GRCh38 | chr1 | 240211162 | 240211162 | + | Nonsense_Mutation | SNP | T | T | A | TCGA-4G-AAZT-01A-11D-A417-09 | TCGA-4G-AAZT-10A-01D-A41A-09 | Somatic | Illumina HiSeq 2000 | 24c3dc90-d1f2-4256-9909-0d0c939c178f | e75e6102-170d-49e4-89e6-7687cad1f6b6 | c.3992T>A | p.Leu1331Ter | p.L1331* | ENST00000319653 | 6/18 | 58 | 34 | 24 | 26 | FMN2,stop_gained,p.L1331*,ENST00000319653,NM_020066.4&NM_001305424.1;FMN2,downstream_gene_variant,,ENST00000447095, | A | ENSG00000155816 | ENST00000319653 | Transcript | stop_gained | stop_gained | 4222/6434 | 3992/5169 | 1331/1722 | L/* | tTa/tAa | 1 | 1 | FMN2 | HGNC | HGNC:14074 | protein_coding | YES | CCDS31069.2 | ENSP00000318884 | Q9NZ56 | NM_020066.4;NM_001305424.1 | 6/18 | Pfam_domain:PF02181;SMART_domains:SM00498;Superfamily_domains:SSF101447 | HIGH | 1 | SNV | 5 | 1 | PASS | GGAATTATTTT | 263c128d-cf0a-4a8b-bafa-fa84d9baeb2c | 5a30d2bd-9cab-44ef-9071-0b34d386a9c0 | 2c662e9d-0c78-4ec4-bc57-3f9573ffc678 | b10c64c2-7fd2-4210-b975-034affb14b57 | COSM4571189 | True | Unknown | |||||||||||||||||||||||||||||||||||||||||||||||||||||
PAX3 | 5077 | WUGSC | GRCh38 | chr2 | 222297158 | 222297158 | + | Missense_Mutation | SNP | G | G | T | novel | TCGA-4G-AAZT-01A-11D-A417-09 | TCGA-4G-AAZT-10A-01D-A41A-09 | Somatic | Illumina HiSeq 2000 | 24c3dc90-d1f2-4256-9909-0d0c939c178f | e75e6102-170d-49e4-89e6-7687cad1f6b6 | c.141C>A | p.Asn47Lys | p.N47K | ENST00000350526 | 2/8 | 52 | 19 | 32 | 29 | PAX3,missense_variant,p.N47K,ENST00000350526,NM_181457.3;PAX3,missense_variant,p.N47K,ENST00000392069,NM_181459.3;PAX3,missense_variant,p.N47K,ENST00000344493,NM_181461.3;PAX3,missense_variant,p.N47K,ENST00000392070,NM_181458.3;PAX3,missense_variant,p.N47K,ENST00000336840,NM_181460.3;PAX3,missense_variant,p.N47K,ENST00000409551,NM_001127366.2;PAX3,missense_variant,p.N47K,ENST00000409828,NM_000438.5;PAX3,missense_variant,p.N47K,ENST00000258387,NM_013942.4;CCDC140,upstream_gene_variant,,ENST00000295226,NM_153038.1 | T | ENSG00000135903 | ENST00000350526 | Transcript | missense_variant | missense_variant | 278/3610 | 141/1440 | 47/479 | N/K | aaC/aaA | 1 | -1 | PAX3 | HGNC | HGNC:8617 | protein_coding | CCDS42826.1 | ENSP00000343052 | P23760 | A0A024R470 | UPI0000131369 | NM_181457.3 | deleterious(0) | possibly_damaging(0.813) | 2/8 | Pfam_domain:PF00292;Prints_domain:PR00027;SMART_domains:SM00351;PROSITE_profiles:PS51057;Superfamily_domains:SSF46689 | MODERATE | SNV | 5 | 1 | PASS | CTGCCGTTGAT | 263c128d-cf0a-4a8b-bafa-fa84d9baeb2c | 5a30d2bd-9cab-44ef-9071-0b34d386a9c0 | 2c662e9d-0c78-4ec4-bc57-3f9573ffc678 | b10c64c2-7fd2-4210-b975-034affb14b57 | True | Unknown | |||||||||||||||||||||||||||||||||||||||||||||||||||
NYAP2 | 57624 | WUGSC | GRCh38 | chr2 | 225582898 | 225582898 | + | Missense_Mutation | SNP | C | C | T | novel | TCGA-4G-AAZT-01A-11D-A417-09 | TCGA-4G-AAZT-10A-01D-A41A-09 | Somatic | Illumina HiSeq 2000 | 24c3dc90-d1f2-4256-9909-0d0c939c178f | e75e6102-170d-49e4-89e6-7687cad1f6b6 | c.1481C>T | p.Thr494Ile | p.T494I | ENST00000272907 | 4/6 | 47 | 30 | 17 | 25 | NYAP2,missense_variant,p.T494I,ENST00000272907,NM_020864.1 | T | ENSG00000144460 | ENST00000272907 | Transcript | missense_variant | missense_variant | 1894/4828 | 1481/1962 | 494/653 | T/I | aCc/aTc | 1 | 1 | NYAP2 | HGNC | HGNC:29291 | protein_coding | YES | CCDS46529.1 | ENSP00000272907 | Q9P242 | UPI00001C1DB6 | NM_020864.1 | deleterious(0.05) | benign(0.407) | 4/6 | MODERATE | 1 | SNV | 1 | 1 | PASS | GAACACCTACG | 263c128d-cf0a-4a8b-bafa-fa84d9baeb2c | 5a30d2bd-9cab-44ef-9071-0b34d386a9c0 | 2c662e9d-0c78-4ec4-bc57-3f9573ffc678 | b10c64c2-7fd2-4210-b975-034affb14b57 | True | Unknown | |||||||||||||||||||||||||||||||||||||||||||||||||||
CHL1 | 10752 | WUGSC | GRCh38 | chr3 | 398326 | 398326 | + | Missense_Mutation | SNP | C | C | G | novel | TCGA-4G-AAZT-01A-11D-A417-09 | TCGA-4G-AAZT-10A-01D-A41A-09 | Somatic | Illumina HiSeq 2000 | 24c3dc90-d1f2-4256-9909-0d0c939c178f | e75e6102-170d-49e4-89e6-7687cad1f6b6 | c.3146C>G | p.Thr1049Ser | p.T1049S | ENST00000397491 | 24/27 | 57 | 17 | 40 | 39 | CHL1,missense_variant,p.T1065S,ENST00000256509,NM_006614.3;CHL1,missense_variant,p.T1049S,ENST00000397491,NM_001253387.1;CHL1,intron_variant,,ENST00000620033,NM_001253388.1;CHL1,intron_variant,,ENST00000445697,;CHL1,3_prime_UTR_variant,,ENST00000453040, | G | ENSG00000134121 | ENST00000397491 | Transcript | missense_variant | missense_variant | 3613/5235 | 3146/3627 | 1049/1208 | T/S | aCt/aGt | 1 | 1 | CHL1 | HGNC | HGNC:1939 | protein_coding | CCDS58812.1 | ENSP00000380628 | O00533 | UPI0000E08093 | NM_001253387.1 | tolerated(0.34) | benign(0.271) | 24/27 | MODERATE | SNV | 1 | 1 | PASS | AATGACTAAGA | 263c128d-cf0a-4a8b-bafa-fa84d9baeb2c | 5a30d2bd-9cab-44ef-9071-0b34d386a9c0 | 2c662e9d-0c78-4ec4-bc57-3f9573ffc678 | b10c64c2-7fd2-4210-b975-034affb14b57 | True | Unknown | |||||||||||||||||||||||||||||||||||||||||||||||||||||
NUP210 | 23225 | WUGSC | GRCh38 | chr3 | 13341792 | 13341792 | + | Missense_Mutation | SNP | T | T | G | novel | TCGA-4G-AAZT-01A-11D-A417-09 | TCGA-4G-AAZT-10A-01D-A41A-09 | Somatic | Illumina HiSeq 2000 | 24c3dc90-d1f2-4256-9909-0d0c939c178f | e75e6102-170d-49e4-89e6-7687cad1f6b6 | c.3184A>C | p.Asn1062His | p.N1062H | ENST00000254508 | 23/40 | 33 | 13 | 20 | 25 | NUP210,missense_variant,p.N1062H,ENST00000254508,NM_024923.3;NUP210,upstream_gene_variant,,ENST00000485755, | G | ENSG00000132182 | ENST00000254508 | Transcript | missense_variant | missense_variant | 3267/7193 | 3184/5664 | 1062/1887 | N/H | Aat/Cat | 1 | -1 | NUP210 | HGNC | HGNC:30052 | protein_coding | YES | CCDS33704.1 | ENSP00000254508 | Q8TEM1 | UPI00001600AF | NM_024923.3 | deleterious(0) | benign(0) | 23/40 | MODERATE | 1 | SNV | 2 | 1 | PASS | TTTATTGGTCA | 263c128d-cf0a-4a8b-bafa-fa84d9baeb2c | 5a30d2bd-9cab-44ef-9071-0b34d386a9c0 | 2c662e9d-0c78-4ec4-bc57-3f9573ffc678 | b10c64c2-7fd2-4210-b975-034affb14b57 | True | Unknown |
To visualize the data you can use the Bioconductor package maftools. For more information, please check its vignette.
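library(maftools)
# maftools functions expect a MAF object; this assumes maf is the data frame prepared above
maf.obj <- read.maf(maf = maf)
datatable(getSampleSummary(maf.obj),
          filter = 'top',
          options = list(scrollX = TRUE, keys = TRUE, pageLength = 5),
          rownames = FALSE)
plotmafSummary(maf = maf.obj, rmOutlier = TRUE, addStat = 'median', dashboard = TRUE)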