Abstract
Federated learning has gained great popularity in the last decade for its ability to build models collaboratively on data from multiple datasets. However, in real-world biomedical settings, practical challenges remain, including the need to protect patient privacy, the ability to account for between-site heterogeneity in patient characteristics, and, from an operational point of view, the number of communications required across data partners. In this chapter, we describe and provide examples of multi-database data-sharing mechanisms in the healthcare data context and highlight the primary methods available for performing statistical regression analysis in each setting. For each method, we discuss the advantages and disadvantages in terms of data privacy, communication efficiency, heterogeneity awareness, and statistical accuracy. Our goal is to provide researchers with the insight necessary to choose among the available algorithms when conducting regression analysis on multi-site data in a given setting.
| Field | Value |
|---|---|
| Original language | English |
| Title of host publication | Clinical Applications of Artificial Intelligence in Real-World Data |
| Publisher | Springer International Publishing |
| Pages | 125-139 |
| Number of pages | 15 |
| ISBN (Electronic) | 9783031366789 |
| ISBN (Print) | 9783031366772 |
| State | Published - Nov 4 2023 |