Federated Principal Component Analysis for Genome-Wide Association Studies

Anne Hartebrodt, Reza Nasirigerdeh, David B. Blumenthal, Richard Rottger

Publication: Chapter in book/report/conference proceeding; Conference contribution in proceedings; Research; Peer-reviewed

Abstract

Federated learning (FL) has emerged as a privacy-aware alternative to centralized data analysis, especially for biomedical analyses such as genome-wide association studies (GWAS). The data remains with the owner, which enables studies previously impossible due to privacy protection regulations. Principal component analysis (PCA) is a frequent preprocessing step in GWAS, where the eigenvectors of the sample-by-sample covariance matrix are used as covariates in the statistical tests. Therefore, a federated version of PCA suitable for vertical data partitioning is required for federated GWAS. Existing federated PCA algorithms exchange the complete sample eigenvectors, a potential privacy breach. In this paper, we present a federated PCA algorithm for vertically partitioned data which does not exchange the sample eigenvectors and is hence suitable for federated GWAS. We show that it outperforms existing federated solutions in terms of convergence behavior and scalability. Additionally, we provide a user-friendly privacy-aware web tool to promote acceptance of federated PCA among GWAS researchers.
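The idea of computing sample eigenvectors without exchanging them can be illustrated with a generic power (subspace) iteration. The sketch below is not the authors' algorithm. It assumes a data matrix with features (SNPs) as rows and samples as columns, already centered, where each site holds a block of columns. The function name `federated_subspace_iteration`, the synthetic data, and all parameters are hypothetical. Only feature-side matrices and a few squared norms leave a site, while each site's part of the sample eigenvectors stays local.

```python
import numpy as np


def federated_subspace_iteration(site_blocks, k, n_iter=100, seed=0):
    """Illustrative sketch only; site_blocks[i] has shape (n_features, n_samples_i)."""
    rng = np.random.default_rng(seed)
    n_features = site_blocks[0].shape[0]
    # Shared feature-side basis H (n_features x k), known to all sites.
    H, _ = np.linalg.qr(rng.standard_normal((n_features, k)))
    for _ in range(n_iter):
        # Each site computes its slice of the sample-side vectors; this stays local.
        local_G = [A.T @ H for A in site_blocks]
        # Each site shares only A_i @ G_i (n_features x k); the aggregator sums them.
        H_new = sum(A @ G for A, G in zip(site_blocks, local_G))
        # Orthonormalise the shared feature-side basis.
        H, _ = np.linalg.qr(H_new)
    local_G = [A.T @ H for A in site_blocks]
    # Global column norms need only the sum of local squared norms (k numbers per site).
    sq_norms = sum((G ** 2).sum(axis=0) for G in local_G)
    local_G = [G / np.sqrt(sq_norms) for G in local_G]
    return H, local_G


# Synthetic data with a well-separated spectrum (an assumption for fast convergence).
rng = np.random.default_rng(42)
U, _ = np.linalg.qr(rng.standard_normal((50, 12)))
V, _ = np.linalg.qr(rng.standard_normal((12, 12)))
singular_values = 10.0 / 2 ** np.arange(12)
A = U @ np.diag(singular_values) @ V.T

# Three hypothetical sites, each holding four samples (columns).
site_blocks = np.array_split(A, 3, axis=1)
H, local_G = federated_subspace_iteration(site_blocks, k=2)
```

Stacking the entries of `local_G` (which a real deployment would not do) reproduces the top right singular vectors of `A` up to sign, which makes the sketch easy to check against a centralized SVD.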

Original language: English
Title: Proceedings - 21st IEEE International Conference on Data Mining, ICDM 2021
Editors: James Bailey, Pauli Miettinen, Yun Sing Koh, Dacheng Tao, Xindong Wu
Publisher: IEEE
Publication date: 2021
Pages: 1090-1095
ISBN (Print): 978-1-6654-2399-1
ISBN (Electronic): 978-1-6654-2398-4
DOI
Status: Published - 2021
Event: 21st IEEE International Conference on Data Mining, ICDM 2021 - Virtual, Online, New Zealand
Duration: 7 Dec 2021 to 10 Dec 2021

Conference

Conference: 21st IEEE International Conference on Data Mining, ICDM 2021
Country/Territory: New Zealand
City: Virtual, Online
Period: 07/12/2021 to 10/12/2021
Sponsor: Google Inc., IEEE Technical Committee on Intelligent Informatics, School of Computer Science - The University of Auckland, Two Sigma, US National Science Foundation (NSF)
Name: Proceedings - IEEE International Conference on Data Mining, ICDM
Volume: 2021-December
ISSN: 1550-4786

Bibliographical note

Funding Information:
The FeatureCloud project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 826078. This publication reflects only the authors’ view and the European Commission is not responsible for any use that may be made of the information it contains.

Publisher Copyright:
© 2021 IEEE.
