dc.contributor.author: Nyamundanda, G [en_US]
dc.contributor.author: Poudel, P [en_US]
dc.contributor.author: Patil, Y [en_US]
dc.contributor.author: Sadanandam, A [en_US]
dc.date.accessioned: 2020-06-15T10:51:38Z
dc.date.issued: 2017-09-07
dc.identifier.citation: Scientific reports, 2017, 7 (1), pp. 10849 - ?
dc.identifier.issn: 2045-2322
dc.identifier.uri: https://repository.icr.ac.uk/handle/internal/3743
dc.identifier.eissn: 2045-2322
dc.identifier.doi: 10.1038/s41598-017-11110-6
dc.description.abstract: Genome projects now generate large-scale data often produced at various time points by different laboratories using multiple platforms. This increases the potential for batch effects. Currently there are several batch evaluation methods like principal component analysis (PCA; mostly based on visual inspection), and sometimes they fail to reveal all of the underlying batch effects. These methods can also lead to the risk of unintentionally correcting biologically interesting factors attributed to batch effects. Here we propose a novel statistical method, finding batch effect (findBATCH), to evaluate batch effect based on probabilistic principal component and covariates analysis (PPCCA). The same framework also provides a new approach to batch correction, correcting batch effect (correctBATCH), which we have shown to be a better approach than traditional PCA-based correction. We demonstrate the utility of these methods using two different examples (breast and colorectal cancers) by merging gene expression data from different studies after diagnosing and correcting for batch effects and retaining the biological effects. These methods, along with conventional visual inspection-based PCA, are available as a part of an R package exploring batch effect (exploBATCH; https://github.com/syspremed/exploBATCH).
dc.format: Electronic
dc.format.extent: 10849 - ?
dc.language: eng
dc.language.iso: eng
dc.rights.uri: https://creativecommons.org/licenses/by/4.0
dc.subject: Humans
dc.subject: Breast Neoplasms
dc.subject: Colorectal Neoplasms
dc.subject: Models, Statistical
dc.subject: Genomics
dc.subject: Databases, Genetic
dc.subject: Female
dc.subject: Genetic Association Studies
dc.title: A Novel Statistical Method to Diagnose, Quantify and Correct Batch Effects in Genomic Studies.
dc.type: Journal Article
dcterms.dateAccepted: 2017-08-18
rioxxterms.versionofrecord: 10.1038/s41598-017-11110-6
rioxxterms.licenseref.uri: https://creativecommons.org/licenses/by/4.0
rioxxterms.licenseref.startdate: 2017-09-07
rioxxterms.type: Journal Article/Review
dc.relation.isPartOf: Scientific reports
pubs.issue: 1
pubs.notes: Not known
pubs.organisational-group: /ICR
pubs.organisational-group: /ICR/Primary Group
pubs.organisational-group: /ICR/Primary Group/ICR Divisions
pubs.organisational-group: /ICR/Primary Group/ICR Divisions/Molecular Pathology
pubs.organisational-group: /ICR/Primary Group/ICR Divisions/Molecular Pathology/Systems and Precision Cancer Medicine
pubs.publication-status: Published
pubs.volume: 7
pubs.embargo.terms: Not known
icr.researchteam: Systems and Precision Cancer Medicine [en_US]
dc.contributor.icrauthor: Sadanandam, Anguraj [en]


Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by/4.0