Description
This track shows human clinically relevant variants from the
ClinVar database,
mapped from hg38 to the mm10 genome. The mapping uses UCSC's whole-genome alignments and the
tool LiftOver.
The annotations are somewhat speculative,
as LiftOver is not meant to be used for cross-organism mapping. Among other limitations,
LiftOver has no notion of phylogenetic trees or protein orthology, so the
protein to which a variant is mapped may not be the annotated ortholog.
In areas with protein repeats, a variant may have been mapped to the wrong exon. When the
genome nucleotide in mm10 is different from hg38, the corresponding position
could be several basepairs away. Generally, the more different the gene, the harder the
mapping. Before planning assays on these data, a manual alignment and annotation
of the human and mm10 nucleotide or amino acid sequences is recommended.
Display Conventions and Configuration
Genomic locations of ClinVar variants are labeled with the human ClinVar variant
descriptions. For example, the label "C>G" usually means that in human, the cDNA
nucleotide change is from C to G. On a transcript on the reverse strand, the human
genome nucleotide on the forward strand would be G. In mm10, the genome may not
be G at this position. Zoom in to see the nucleotide in mm10, or click the
variant to show the human position and nucleotide and the mm10 nucleotide.
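The strand relationship described above can be illustrated with a small shell helper. This is only a minimal sketch: the function name complement_base is hypothetical and not part of any UCSC tool, and it handles plain A/C/G/T characters only.

```shell
# Hypothetical helper (not a UCSC tool): complement a single nucleotide.
# For a transcript on the reverse strand, the cDNA base corresponds to
# the complementary base on the forward genomic strand.
complement_base() {
    printf '%s\n' "$1" | tr 'ACGTacgt' 'TGCAtgca'
}

complement_base C   # forward-strand reference base for a cDNA "C"
complement_base G   # forward-strand alternate base for a cDNA "G"
```

For the label "C>G" on a reverse-strand transcript, this gives G as the forward-strand reference base in human, which is the base to compare against the mm10 nucleotide.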
All ClinVar information related to each variant is shown on that
variant's details page. Hold the mouse over a feature
to show the clinical significance of a variant in humans.
Only short variants with a length < 10 bp on the human genome were
lifted. A few variants that after lifting result in mm10 annotations longer than
30bp were filtered out, too. This can happen in repetitive regions that are
hard to align.
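A length filter of this kind could be reproduced on BED output roughly as follows. This is a sketch under assumptions, not the pipeline's actual code: the function name filter_by_length is hypothetical, the input is assumed to be tab-separated BED with start in column 2 and end in column 3, and the example rows are made up.

```shell
# Hypothetical helper: keep BED features whose span (end - start)
# is at most max_len bp; longer features are dropped.
# Assumes tab-separated BED on stdin: column 2 = start, column 3 = end.
filter_by_length() {
    awk -F'\t' -v max_len="${1:-30}" '($3 - $2) <= max_len'
}

# Made-up example rows, for illustration only.
printf 'chr1\t100\t101\tC>G\t2\nchr1\t500\t545\tA>T\t1\n' | filter_by_length 30
```

With these example rows, only the 1 bp feature passes; the 45 bp feature is removed.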
Annotations are shaded by clinical annotation:
red for pathogenic,
dark grey for uncertain significance or not provided and
green for benign.
The score of the variants is the number of "stars" in ClinVar. On the track configuration
page (above), you can filter the track to show only variants with more than a certain
number of stars. For more information on the star rating, see the
ClinVar documentation.
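The same kind of star filter can be approximated on downloaded data from the command line. The sketch below assumes that the star count is stored in the standard BED score column (column 5) and that the input is tab-separated; the function name filter_by_stars and the example rows are hypothetical.

```shell
# Hypothetical helper: keep BED features with at least min_stars stars.
# Assumption: the ClinVar star count is in the BED score column (column 5).
filter_by_stars() {
    awk -F'\t' -v min_stars="${1:-1}" '$5 >= min_stars'
}

# Made-up example rows, for illustration only.
printf 'chr1\t100\t101\tC>G\t2\nchr1\t500\t501\tA>T\t0\n' | filter_by_stars 1
```

BED output produced by bigBedToBed for this track could be piped into the function in the same way as the example rows.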
Data updates
ClinVar is updated every month, but these mappings are not yet updated on a regular schedule.
Please contact us if you are interested in regular updates.
Data access
The raw data can be explored interactively with the
Table Browser
or the Data Integrator.
For automated download and analysis, the genome annotation is stored in a bigBed file that
can be downloaded from
our download server.
The file for this track is called clinvarLift.bb. Individual
regions or the whole genome annotation can be obtained using our tool bigBedToBed,
which can be compiled from the source code or downloaded as a precompiled
binary for your system. Instructions for downloading source code and binaries can be found
here.
The tool can also be used to obtain only features within a given range, e.g.
bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/mm10/bbi/clinvarLift.bb -chrom=chr1 -start=0 -end=100000000 stdout
Methods
The hg38 ClinvarMain track was annotated with nucleotides and positions, lifted to mm10,
filtered again for variants < 30bp
and annotated with nucleotides again. The output was converted to the
bigBed format.
The program that performs the mapping is available on
GitHub.
Credits
Thanks to NCBI for making the ClinVar data available on their FTP site as a tab-separated file.
References
Landrum MJ, Lee JM, Benson M, Brown G, Chao C, Chitipiralla S, Gu B, Hart J, Hoffman D, Hoover J
et al.
ClinVar: public archive of interpretations of clinically relevant variants.
Nucleic Acids Res. 2016 Jan 4;44(D1):D862-8.
PMID: 26582918; PMC: PMC4702865