Abstract
Federated learning (FL) has emerged as a privacy-aware alternative to centralized data analysis, especially for biomedical analyses such as genome-wide association studies (GWAS). The data remains with the owner, which enables studies previously impossible due to privacy protection regulations. Principal component analysis (PCA) is a frequent preprocessing step in GWAS, where the eigenvectors of the sample-by-sample covariance matrix are used as covariates in the statistical tests. Therefore, a federated version of PCA suitable for vertical data partitioning is required for federated GWAS. Existing federated PCA algorithms exchange the complete sample eigenvectors, a potential privacy breach. In this paper, we present a federated PCA algorithm for vertically partitioned data which does not exchange the sample eigenvectors and is hence suitable for federated GWAS. We show that it outperforms existing federated solutions in terms of convergence behavior and scalability. Additionally, we provide a user-friendly privacy-aware web tool to promote acceptance of federated PCA among GWAS researchers.
Original language | English |
---|---|
Title | Proceedings - 21st IEEE International Conference on Data Mining, ICDM 2021 |
Editors | James Bailey, Pauli Miettinen, Yun Sing Koh, Dacheng Tao, Xindong Wu |
Publisher | IEEE |
Publication date | 2021 |
Pages | 1090-1095 |
ISBN (Print) | 978-1-6654-2399-1 |
ISBN (Electronic) | 978-1-6654-2398-4 |
DOI | |
Status | Published - 2021 |
Event | 21st IEEE International Conference on Data Mining, ICDM 2021 - Virtual, Online, New Zealand. Duration: 7 Dec 2021 → 10 Dec 2021 |
Conference
Conference | 21st IEEE International Conference on Data Mining, ICDM 2021 |
---|---|
Country/Territory | New Zealand |
City | Virtual, Online |
Period | 07/12/2021 → 10/12/2021 |
Sponsor | Google Inc., IEEE Technical Committee on Intelligent Informatics, School of Computer Science - The University of Auckland, Two Sigma, US National Science Foundation (NSF) |
Name | Proceedings - IEEE International Conference on Data Mining, ICDM |
---|---|
Volume | 2021-December |
ISSN | 1550-4786 |
Bibliographical note
Funding Information: The FeatureCloud project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 826078. This publication reflects only the authors’ view and the European Commission is not responsible for any use that may be made of the information it contains.
Publisher Copyright:
© 2021 IEEE.