A Network-Based Kernel Machine Test for the Identification of Risk Pathways in Genome-Wide Association Studies
Freytag S.a · Manitz J.b · Schlather M.d · Kneib T.b, c · Amos C.I.e · Risch A.f, g · Chang-Claude J.h · Heinrich J.i · Bickeböller H.a, c
aInstitute of Genetic Epidemiology, Medical School, bDepartment of Statistics and Econometrics, and cCenter for Statistics, Georg-August University Göttingen, Göttingen, and dInstitute for Mathematics, University of Mannheim, Mannheim, Germany; eDepartment of Community and Family Medicine, Geisel School of Medicine, Dartmouth College, Lebanon, N.H., USA; fDivision of Epigenomics and Cancer Risk Factors, Translational Lung Research Center Heidelberg, German Cancer Research Center, gTranslational Lung Research Center Heidelberg, Member of the German Center for Lung Research, and hDivision of Cancer Epidemiology, German Cancer Research Center, Heidelberg, and iInstitute of Epidemiology, Helmholtz Center Munich, German Research Center for Environmental Health, Neuherberg, Germany
Corresponding Author
Institut für Genetische Epidemiologie, UMG
DE-37073 Göttingen (Germany)
Biological pathways provide rich information and biological context on the genetic causes of complex diseases. The logistic kernel machine test integrates prior knowledge on pathways in order to analyze data from genome-wide association studies (GWAS). In this test, the kernel converts the genomic information of two individuals into a quantitative value reflecting their genetic similarity. With the selection of the kernel, one implicitly chooses a genetic effect model. As with many other pathway methods, none of the available kernels accounts for the topological structure of the pathway or for gene-gene interaction types. However, evidence indicates that the connectivity and neighborhood of genes are crucial in the context of GWAS, because genes associated with a disease often interact. Thus, we propose a novel kernel that incorporates the topology of pathways and information on interactions. Using simulation studies, we demonstrate that the proposed method correctly maintains the type I error rate and can be more effective in identifying pathways associated with a disease than non-network-based methods. We apply our approach to genome-wide association case-control data on lung cancer and rheumatoid arthritis. We identify some promising new pathways associated with these diseases, which may improve our current understanding of the underlying genetic mechanisms.
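To make the notion of a kernel as a pairwise genetic similarity more concrete, the following Python sketch contrasts a plain linear kernel with one possible network-weighted kernel. It is only an illustration under stated assumptions: the abstract does not give the kernel's formula, so the form `(Z A) N (Z A)^T`, the function names, the SNP-to-gene matrix `A`, the gene network matrix `N`, and the toy data are all hypothetical and should not be read as the authors' definition.

```python
import numpy as np

def linear_kernel(Z):
    """Similarity of individuals as the inner product of their genotype vectors."""
    return Z @ Z.T

def network_kernel(Z, A, N):
    """Illustrative network-weighted kernel (assumed form, not taken from the paper).

    Z: n x p genotype matrix (minor allele counts 0, 1, 2)
    A: p x g matrix assigning each SNP to a gene
    N: g x g symmetric, positive semidefinite matrix encoding the gene network
    """
    G = Z @ A            # aggregate SNP genotypes to gene level
    return G @ N @ G.T   # weight gene-level similarity by network structure

# Toy data (hypothetical): 5 individuals, 6 SNPs, 3 genes in a chain network.
rng = np.random.default_rng(0)
Z = rng.integers(0, 3, size=(5, 6)).astype(float)
A = np.array([[1, 0, 0],
              [1, 0, 0],
              [0, 1, 0],
              [0, 1, 0],
              [0, 0, 1],
              [0, 0, 1]], dtype=float)
N = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.5],
              [0.0, 0.5, 1.0]])

K_lin = linear_kernel(Z)
K_net = network_kernel(Z, A, N)
```

In this sketch, setting `N` to the identity matrix reduces the network-weighted kernel to a gene-level linear kernel, so any difference between the two comes solely from the off-diagonal entries that represent gene-gene interactions. Choosing `N` to be positive semidefinite keeps the resulting similarity matrix a valid kernel.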
© 2014 S. Karger AG, Basel