PeptideAtlas Track Settings

Description

PeptideAtlas collects raw mass spectrometry proteomics datasets from laboratories around the world and reprocesses them in a uniform bioinformatics workflow using the Trans-Proteomic Pipeline. This track displays peptide identifications from the PeptideAtlas August 2014 (Build 433) Human build. This build, based on 971 samples containing 420,607,360 spectra, identified 1,021,823 distinct peptides covering 15,136 canonical proteins.

Each PeptideAtlas build comprises a set of reprocessed experiments from a single species, or from a subset of samples (such as human plasma) from a species. Processed results are filtered to a quality level that gives a 1% false discovery rate at the protein level. All peptide identifications of sufficient quality to enter a build are mapped to the Ensembl genome (v75) using the Ensembl toolkit. For every identified peptide, the Ensembl toolkit calculates genomic coordinates for each of its Ensembl protein, transcript, and gene mappings, including intron spans, and these coordinates are stored in the PeptideAtlas database.
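The mapping from a peptide's position within a protein to genomic coordinates, including any intron it spans, can be illustrated with a minimal sketch. This is not the Ensembl toolkit's implementation; the function name peptide_to_genomic_blocks, the half-open interval convention, and the example coordinates are assumptions made for illustration only.

```python
from typing import List, Tuple

# Half-open genomic interval [start, end) covering coding sequence only.
Interval = Tuple[int, int]


def peptide_to_genomic_blocks(
    cds_exons: List[Interval],
    strand: str,
    pep_start_aa: int,
    pep_len_aa: int,
) -> List[Interval]:
    """Map a peptide to genomic blocks (hypothetical helper, not an Ensembl API).

    pep_start_aa is the 0-based residue offset of the peptide in the protein.
    Returns the genomic blocks in ascending order; gaps between consecutive
    blocks are the introns that the peptide spans.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")

    # Each residue corresponds to three coding nucleotides.
    nt_start = pep_start_aa * 3
    nt_end = nt_start + pep_len_aa * 3

    # Walk the exons in transcript order (reversed on the minus strand).
    ordered = sorted(cds_exons, reverse=(strand == "-"))

    blocks: List[Interval] = []
    consumed = 0  # coding nucleotides in the exons already walked
    for ex_start, ex_end in ordered:
        ex_len = ex_end - ex_start
        lo = max(nt_start, consumed)
        hi = min(nt_end, consumed + ex_len)
        if lo < hi:
            off_lo, off_hi = lo - consumed, hi - consumed
            if strand == "+":
                blocks.append((ex_start + off_lo, ex_start + off_hi))
            else:
                blocks.append((ex_end - off_hi, ex_end - off_lo))
        consumed += ex_len

    if consumed < nt_end:
        raise ValueError("peptide extends beyond the coding sequence")
    return sorted(blocks)


# Made-up coordinates: a 3-residue peptide starting at residue 2 crosses
# the intron between two coding exons on the plus strand.
example = peptide_to_genomic_blocks([(100, 110), (200, 220)], "+", 2, 3)
```

In this made-up example the peptide covers two blocks separated by one intron, which is how a single peptide can appear as a multi-block item in a genome browser track.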

All peptide sequences in the August 2014 human build (including unmapped sequences) are available for download in FASTA format.
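Once downloaded, a FASTA file of peptide sequences can be read with a few lines of standard-library Python. The sketch below assumes only the generic FASTA layout (a header line starting with ">" followed by one or more sequence lines); the exact header format of the PeptideAtlas file is not described here, and the record shown is a made-up placeholder.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {identifier: sequence}.

    The identifier is taken as the first whitespace-delimited token after
    '>'. This is an assumption about the header layout, not a documented
    PeptideAtlas convention.
    """
    records: Dict[str, str] = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Placeholder record for illustration; not real PeptideAtlas data.
sample = [">PAp00000001 example", "ACDEFG", "HIK", ""]
peptides = read_fasta(sample)
```

For a real file, the same function can be called on an open file handle, for example read_fasta(open(path)), where path is wherever the download was saved.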

Methods

Mass spectrometer spectra are compared to theoretical spectra (SEQUEST, X!Tandem) or actual spectra (SpectraST) to identify possible peptides. These peptide identifications are scored and filtered (using PeptideProphet) to retain only the highest-scoring identifications. The filtered sequences are compared to protein sequence databases (for human: Ensembl, IPI, and Swiss-Prot). For each matched sequence, its CDS coordinates relative to the protein start are then used to calculate genomic coordinates. The protein identifications are then clustered and annotated using ProteinProphet and stored in the SBEAMS database, where they are assigned a unique identifier of the form PAp[8 digit number], e.g. PAp00000001. The processing pipeline is summarized in the graphic below.
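The identifier format described above (the prefix PAp followed by an 8-digit number) is easy to validate or generate in scripts that process track items. The helper names below are hypothetical and are not part of SBEAMS or any PeptideAtlas software; this is only a sketch of the stated format.

```python
import re

# "PAp" followed by exactly eight digits, as described in the text.
PAP_ID = re.compile(r"PAp(\d{8})")


def format_pap_id(number: int) -> str:
    """Build an identifier such as PAp00000001 from an integer."""
    if not 0 <= number <= 99_999_999:
        raise ValueError("number must fit in 8 digits")
    return f"PAp{number:08d}"


def parse_pap_id(identifier: str) -> int:
    """Return the numeric part of an identifier, or raise ValueError."""
    match = PAP_ID.fullmatch(identifier)
    if match is None:
        raise ValueError(f"not a PAp identifier: {identifier!r}")
    return int(match.group(1))


first_id = format_pap_id(1)
first_number = parse_pap_id("PAp00000001")
```

Using fullmatch rather than search rejects strings that merely contain a valid identifier, such as one with trailing characters.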
