The sequence kernel association test for multicategorical outcomes.

Authors

Jiang, Z
Zhang, H
Ahearn, TU
Garcia-Closas, M
Chatterjee, N
Zhu, H
Zhan, X
Zhao, N

Document Type

Journal Article

Date

2023-04-19

Date Accepted

2023-03-30

Abstract

Disease heterogeneity is ubiquitous in biomedical and clinical studies. In genetic studies, researchers are increasingly interested in understanding the distinct genetic underpinnings of disease subtypes. However, existing set-based analysis methods for genome-wide association studies are either inadequate or inefficient for handling such multicategorical outcomes. In this paper, we proposed a novel set-based association analysis method, SKAT-MC, the sequence kernel association test (SKAT) for multicategorical outcomes (nominal or ordinal), which jointly evaluates the relationship between a set of variants (common and rare) and disease subtypes. Through comprehensive simulation studies, we showed that SKAT-MC effectively preserves the nominal type I error rate while substantially increasing statistical power compared to existing methods under various scenarios. We applied SKAT-MC to the Polish breast cancer study (PBCS) and identified that the gene FGFR2 was significantly associated with estrogen receptor (ER)+ and ER- breast cancer subtypes. We also investigated educational attainment using UK Biobank data ($N=127,127$) with SKAT-MC and identified 21 significant genes in the genome. Consequently, SKAT-MC is a powerful and efficient analysis tool for genetic association studies with multicategorical outcomes. A freely distributed R package SKAT-MC can be accessed at https://github.com/Zhiwen-Owen-Jiang/SKATMC.

Citation

Genetic Epidemiology, 2023,

Source Title

Genetic Epidemiology

Publisher

WILEY

ISSN

0741-0395

eISSN

1098-2272

Research Team

Integrative Cancer Epidem
