The sequence kernel association test for multicategorical outcomes.
Authors
Jiang, Z
Zhang, H
Ahearn, TU
Garcia-Closas, M
Chatterjee, N
Zhu, H
Zhan, X
Zhao, N
Document Type
Journal Article
Date
2023-04-19
Date Accepted
2023-03-30
Abstract
Disease heterogeneity is ubiquitous in biomedical and clinical studies. In genetic studies, researchers are increasingly interested in understanding the distinct genetic underpinnings of disease subtypes. However, existing set-based analysis methods for genome-wide association studies are either inadequate or inefficient for handling such multicategorical outcomes. In this paper, we proposed a novel set-based association analysis method, SKAT-MC, the sequence kernel association test (SKAT) for multicategorical outcomes (nominal or ordinal), which jointly evaluates the relationship between a set of variants (common and rare) and disease subtypes. Through comprehensive simulation studies, we showed that SKAT-MC effectively preserves the nominal type I error rate while substantially increasing statistical power compared to existing methods under various scenarios. We applied SKAT-MC to the Polish breast cancer study (PBCS) and found that the gene FGFR2 was significantly associated with estrogen receptor (ER)+ and ER- breast cancer subtypes. We also investigated educational attainment using UK Biobank data (N = 127,127) with SKAT-MC and identified 21 significant genes in the genome. Consequently, SKAT-MC is a powerful and efficient analysis tool for genetic association studies with multicategorical outcomes. A freely distributed R package SKAT-MC can be accessed at https://github.com/Zhiwen-Owen-Jiang/SKATMC.
Citation
Genetic Epidemiology, 2023,
Source Title
Genetic Epidemiology
Publisher
WILEY
ISSN
0741-0395
eISSN
1098-2272
Collections
Research Team
Integrative Cancer Epidem