dc.contributor.author: Weisser, H
dc.contributor.author: Wright, JC
dc.contributor.author: Mudge, JM
dc.contributor.author: Gutenbrunner, P
dc.contributor.author: Choudhary, JS
dc.date.accessioned: 2017-11-28T10:01:37Z
dc.date.issued: 2016-12-02
dc.identifier.citation: Journal of proteome research, 2016, 15 (12), pp. 4686 - 4695
dc.identifier.issn: 1535-3893
dc.identifier.uri: https://repository.icr.ac.uk/handle/internal/956
dc.identifier.eissn: 1535-3907
dc.identifier.doi: 10.1021/acs.jproteome.6b00765
dc.description.abstract: Proteogenomics leverages information derived from proteomic data to improve genome annotations. Of particular interest are "novel" peptides that provide direct evidence of protein expression for genomic regions not previously annotated as protein-coding. We present a modular, automated data analysis pipeline aimed at detecting such "novel" peptides in proteomic data sets. This pipeline implements criteria developed by proteomics and genome annotation experts for high-stringency peptide identification and filtering. Our pipeline is based on the OpenMS computational framework; it incorporates multiple database search engines for peptide identification and applies a machine-learning approach (Percolator) to post-process search results. We describe several new and improved software tools that we developed to facilitate proteogenomic analyses that enhance the wealth of tools provided by OpenMS. We demonstrate the application of our pipeline to a human testis tissue data set previously acquired for the Chromosome-Centric Human Proteome Project, which led to the addition of five new gene annotations on the human reference genome.
dc.format: Print-Electronic
dc.format.extent: 4686 - 4695
dc.language: eng
dc.language.iso: eng
dc.publisher: AMER CHEMICAL SOC
dc.rights.uri: https://creativecommons.org/licenses/by/4.0
dc.subject: Testis
dc.subject: Humans
dc.subject: Proteomics
dc.subject: Genome, Human
dc.subject: Software
dc.subject: Male
dc.subject: Data Mining
dc.subject: Search Engine
dc.subject: Molecular Sequence Annotation
dc.subject: Machine Learning
dc.subject: Proteogenomics
dc.title: Flexible Data Analysis Pipeline for High-Confidence Proteogenomics.
dc.type: Journal Article
dcterms.dateAccepted: 2016-10-27
rioxxterms.versionofrecord: 10.1021/acs.jproteome.6b00765
rioxxterms.licenseref.uri: https://creativecommons.org/licenses/by/4.0
rioxxterms.licenseref.startdate: 2016-12
rioxxterms.type: Journal Article/Review
dc.relation.isPartOf: Journal of proteome research
pubs.issue: 12
pubs.notes: No embargo
pubs.organisational-group: /ICR
pubs.organisational-group: /ICR/Primary Group
pubs.organisational-group: /ICR/Primary Group/ICR Divisions
pubs.organisational-group: /ICR/Primary Group/ICR Divisions/Cancer Biology
pubs.organisational-group: /ICR/Primary Group/ICR Divisions/Cancer Biology/Functional Proteomics Group
pubs.publication-status: Published
pubs.volume: 15
pubs.embargo.terms: No embargo
icr.researchteam: Functional Proteomics Group
dc.contributor.icrauthor: Wright, James
dc.contributor.icrauthor: Choudhary, Jyoti

Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by/4.0