FedKD-CPI: Combining the federated knowledge distillation technique to accomplish synergistic compound-protein interaction prediction

Posted on 17/01/2025
by Xuetao Wang

Methods. 2025 Jan 15;234:275-283. doi: 10.1016/j.ymeth.2024.12.014. Online ahead of print.

ABSTRACT

Compound-protein interaction (CPI) prediction is critical in the early stages of drug discovery: it narrows the search space for CPIs and reduces the cost and time required by traditional high-throughput screening. However, CPI-related data are usually distributed across different institutions, and their sharing is restricted by data privacy and intellectual property rights. A scheme that enhances multi-institutional collaboration to improve prediction accuracy while protecting data privacy is therefore essential. To this end, we propose FedKD-CPI, the first framework based on federated knowledge distillation, to facilitate multi-party collaborative CPI prediction while ensuring data privacy and security. FedKD-CPI uses knowledge distillation to extract the updated knowledge of all client models and trains a model on the server to aggregate that knowledge, which lets it exploit the knowledge contained in both public and private data. We evaluate FedKD-CPI on three benchmark datasets and compare it with four baselines. The results show that FedKD-CPI performs very close to centralized learning and significantly better than localized learning. Furthermore, FedKD-CPI outperforms federated learning-based baselines on both independent and identically distributed (IID) data and non-IID data. Overall, FedKD-CPI improves CPI prediction while ensuring data security and promoting collaboration among institutions to accelerate drug discovery.
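
The general idea of federated knowledge distillation described in the abstract can be illustrated with a minimal sketch. It is not the FedKD-CPI implementation: the abstract does not specify the model architecture, the aggregation rule, or the loss, so everything below is an assumption made for illustration. The sketch assumes that each client holds a small linear model (hypothetical weights in `client_weights`), that clients share only their predicted interaction probabilities on a shared public set (`public_features`, made-up values), that the server averages those predictions into soft labels, and that the server model is then fitted to the soft labels by gradient descent on cross-entropy.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features):
    """Interaction probability from a linear model (an assumed stand-in
    for a real CPI network)."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

def aggregate_soft_labels(client_weights, public_features):
    """Average the clients' predicted probabilities on each public sample.
    Plain averaging is an assumption, not the paper's stated rule."""
    soft = []
    for x in public_features:
        preds = [predict(w, x) for w in client_weights]
        soft.append(sum(preds) / len(preds))
    return soft

def soft_cross_entropy(weights, features, soft_labels):
    """Mean cross-entropy between server predictions and soft labels."""
    eps = 1e-12
    total = 0.0
    for x, t in zip(features, soft_labels):
        p = predict(weights, x)
        total -= t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
    return total / len(features)

def distill_server(init_weights, features, soft_labels, lr=0.5, epochs=200):
    """Fit the server model to the soft labels with batch gradient descent."""
    w = list(init_weights)
    n = len(features)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, t in zip(features, soft_labels):
            p = predict(w, x)
            for j, xj in enumerate(x):
                # Gradient of cross-entropy w.r.t. w_j for a sigmoid output
                grad[j] += (p - t) * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Hypothetical client models, each trained privately on its own data
client_weights = [[1.0, -0.5], [0.8, -0.2], [1.2, -0.7]]
# Hypothetical public compound-protein feature vectors
public_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]

soft_labels = aggregate_soft_labels(client_weights, public_features)
initial_weights = [0.0, 0.0]
loss_before = soft_cross_entropy(initial_weights, public_features, soft_labels)
server_weights = distill_server(initial_weights, public_features, soft_labels)
loss_after = soft_cross_entropy(server_weights, public_features, soft_labels)
```

In this sketch only predictions on public samples leave the clients, while their private training data and model weights stay local, which is the privacy argument usually made for distillation-based federated schemes.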

PMID:39824374 | DOI:10.1016/j.ymeth.2024.12.014