Emerging Standards and Trends in Drug Repurposing Bioinformatics: Explainable AI, Data Privacy, and Cloud-Based Advancements in 2024
Traditional drug discovery is a costly and time-consuming process. Drug repurposing offers a promising alternative by seeking new therapeutic applications for existing drugs. In 2024, bioinformatics has become essential to drug repurposing, integrating multi-omics data, computational models, and knowledge bases to accelerate this process. This report explores the current bioinformatics standards in drug repurposing, highlighting the growing importance of explainable AI, data privacy, cloud computing, and standardized ontologies. We examine key players in this field, including government agencies, philanthropic organizations, and industry stakeholders, and detail the vital tools propelling data integration and workflow efficiency.
Drug repurposing represents a shift in how new therapies are discovered: instead of developing a compound from scratch, it looks for new therapeutic uses for existing drugs. Because these drugs already have documented safety and pharmacokinetic profiles, the approach can cut development time and cost relative to de novo discovery. Bioinformatics is now central to drug repurposing, drawing on multi-omics data, computational models, and biological knowledge bases. As machine learning and network-based pharmacology advance, explainable AI, robust data privacy measures, and scalable cloud infrastructure have become crucial for handling large datasets and for building trust in models that predict drug-target interactions.
This report delves into the latest standards, tools, and prominent organizations driving progress in drug repurposing bioinformatics throughout 2024.
Modern drug repurposing relies heavily on integrating diverse datasets, including genomics, proteomics, transcriptomics, and drug-target interaction databases. Knowledge graphs (KGs), such as Neo4COVID19 and the Drug Repurposing Knowledge Graph (DRKG), provide a powerful framework for organizing this complex data. KGs link biological entities such as drugs, proteins, and diseases, so researchers can traverse relationships (for example, from a drug to its targets to the diseases those targets are implicated in) and identify repurposing candidates from the way a drug perturbs the network [1, 15]. Challenges remain in standardizing data from disparate sources to ensure compatibility and meaningful analysis.
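The traversal idea described above can be sketched in a few lines of Python. This is a minimal illustration, not how Neo4COVID19 or DRKG are implemented: the triples, the relation names (`targets`, `associated_with`, `treats`), and the helper `repurposing_candidates` are all assumptions made for the example, and the entity names are placeholders rather than real drugs or proteins.

```python
from collections import defaultdict

# Toy triples (subject, relation, object). All entity names are placeholders,
# not real drugs, proteins, or diseases.
TRIPLES = [
    ("drug_A", "targets", "protein_X"),
    ("drug_A", "targets", "protein_Y"),
    ("drug_B", "targets", "protein_Y"),
    ("drug_C", "targets", "protein_Z"),
    ("protein_X", "associated_with", "disease_1"),
    ("protein_Y", "associated_with", "disease_1"),
    ("protein_Y", "associated_with", "disease_2"),
    ("drug_B", "treats", "disease_2"),
]


def build_index(triples):
    """Index triples as relation -> subject -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subj, rel, obj in triples:
        index[rel][subj].add(obj)
    return index


def repurposing_candidates(triples, disease):
    """Rank drugs by how many of their targets are linked to `disease`,
    skipping drugs already recorded as treating it."""
    index = build_index(triples)
    scores = {}
    for drug, targets in index["targets"].items():
        if disease in index["treats"].get(drug, set()):
            continue  # already an indication, not a repurposing candidate
        shared = {
            t for t in targets
            if disease in index["associated_with"].get(t, set())
        }
        if shared:
            scores[drug] = sorted(shared)
    # Most supporting targets first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-len(kv[1]), kv[0]))


candidates = repurposing_candidates(TRIPLES, "disease_1")
```

Here `candidates` lists `drug_A` first because two of its targets are linked to `disease_1`, followed by `drug_B` with one. A production graph would typically live in a graph database and use richer scoring, but the drug-to-target-to-disease path is the same basic pattern.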
Machine learning models have shown promise in predicting drug-target interactions but are often opaque "black boxes." In 2024, there is a significant push for explainable AI in drug repurposing. SHAP (SHapley Additive exPlanations) attributes a prediction to individual input features using Shapley values from cooperative game theory, while LIME (Local Interpretable Model-agnostic Explanations) fits a simple, interpretable surrogate model around a single prediction. Both expose which inputs drove a given output, enhancing transparency and building trust among researchers, clinicians, and regulatory bodies [7, 8]. Understanding the rationale behind model predictions is crucial for adoption and regulatory approval in healthcare settings.
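To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a single prediction by enumerating every feature coalition. It is an illustration of the underlying quantity, not the SHAP library's API: `shapley_values`, `toy_model`, and the three feature names are assumptions for this example, and brute-force enumeration is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition are replaced by their baseline value,
    which is one common (but not the only) way to model a "missing" feature.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Probability weight of a coalition of this size preceding feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical scoring function over three made-up drug-target features.
def toy_model(features):
    binding, expression, similarity = features
    return 2.0 * binding + 1.0 * expression + 0.5 * binding * similarity


x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term between `binding` and `similarity` is split evenly between those two features. Practical SHAP implementations approximate these values rather than enumerating all coalitions.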
The use of sensitive patient data in drug repurposing necessitates robust data security and privacy measures. Federated learning has emerged as a promising solution: each institution trains the model on its own data and shares only model updates, so raw patient records never leave the site [2, 9]. This approach allows multiple institutions to collaborate on drug repurposing research while adhering to privacy regulations. Differential privacy and homomorphic encryption add further protection, the former by injecting calibrated noise into data or model updates, the latter by allowing computation directly on encrypted data without decrypting it.
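A minimal sketch of federated averaging shows the core loop: sites train locally and a server combines their weights. It assumes a one-parameter linear model and synthetic data; the site names, learning rate, and function names are illustrative choices, and the sketch deliberately omits differential privacy, encryption, and secure aggregation.

```python
def local_update(w, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one site's private data for y ~ w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(client_datasets, rounds=20, w0=0.0):
    """Each round, clients train locally and the server averages the
    returned weights, weighted by each client's sample count."""
    w = w0
    total = sum(len(d) for d in client_datasets)
    for _ in range(rounds):
        # Only the scalar weight leaves each site, never the (x, y) records.
        local_weights = [local_update(w, d) for d in client_datasets]
        w = sum(
            len(d) * lw for d, lw in zip(client_datasets, local_weights)
        ) / total
    return w


# Synthetic data drawn from y = 3x, split across three hypothetical sites.
site_a = [(0.5, 1.5), (1.0, 3.0)]
site_b = [(0.2, 0.6), (0.8, 2.4), (1.2, 3.6)]
site_c = [(1.5, 4.5)]
w_global = federated_average([site_a, site_b, site_c])
```

On this noise-free data `w_global` converges close to the true slope of 3. Shared model updates can still leak information about the underlying records, which is why federated setups are often combined with the noise-based and encryption-based techniques mentioned above.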
Although in silico methods have significantly accelerated the identification of potential drug candidates, experimental validation remains the gold standard in drug repurposing. Clinical trials are essential for assessing drug safety and efficacy in humans [3, 9]. Advances in virtual clinical trials, leveraging simulations and modeling, are streamlining the validation process. Additionally, integrating real-world evidence (RWE) from electronic health records and observational studies improves the efficiency and generalizability of findings.
Organizations such as the European Bioinformatics Institute (EMBL-EBI) and the National Institutes of Health (NIH) play a pivotal role in advancing drug repurposing bioinformatics [4, 5]. They spearhead initiatives focusing on network-based pharmacology, translational bioinformatics, and the development of standardized data-sharing platforms. Their efforts are critical in creating guidelines and best practices for drug discovery pipelines.
Private foundations, including the Bill & Melinda Gates Foundation and the Chan Zuckerberg Initiative, contribute significantly to drug repurposing, especially for neglected tropical diseases and global health priorities [6, 10]. Their funding supports the development of open-access bioinformatics resources, enabling researchers worldwide to access critical data and tools, fostering a more inclusive research landscape.
- Cytoscape: A widely used open-source software platform for visualizing and analyzing biological networks, particularly useful for mapping protein-protein interactions (PPIs) and drug-target interactions (DTIs) [11].
- AutoDock Vina: A popular open-source tool for molecular docking that predicts the binding affinity and binding pose of a drug against its target protein. These predictions help researchers reason about likely efficacy and potential off-target effects [12].
- Neo4j: A graph database that is increasingly important in drug repurposing. It handles highly connected biological data well and lets researchers construct and query interconnected biological networks such as Neo4COVID19 [13].
Other relevant tools include STRING for network analysis, DAVID and Metascape for pathway enrichment analysis, and SwissTargetPrediction for predicting drug targets based on chemical similarity.
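The similarity principle behind ligand-based target prediction can be illustrated with the Tanimoto coefficient on binary fingerprints. This is a sketch of the general idea only, not SwissTargetPrediction's actual algorithm: the fingerprints are hand-written sets of bit indices, and `predict_targets`, the threshold, and the target labels are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) coefficient between two fingerprints,
    each given as a set of 'on' bit indices."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def predict_targets(query_fp, known_ligands, threshold=0.5):
    """Score each target by its most similar known ligand and keep
    targets whose best similarity reaches the threshold."""
    best = {}
    for fp, target in known_ligands:
        sim = tanimoto(query_fp, fp)
        if sim >= threshold and sim > best.get(target, 0.0):
            best[target] = sim
    # Highest similarity first; ties broken alphabetically.
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical fingerprints (sets of bit indices) and target labels.
KNOWN = [
    ({1, 2, 3, 4}, "target_P"),
    ({1, 2, 5, 6, 7}, "target_Q"),
    ({7, 8, 9}, "target_R"),
]
query = {1, 2, 3, 5}
hits = predict_targets(query, KNOWN)
```

With these toy inputs, `target_P` scores 0.6, `target_Q` scores 0.5, and `target_R` is filtered out because it shares no bits with the query. In practice the fingerprints would be generated by a cheminformatics toolkit rather than written by hand.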
Cloud platforms like Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure have become indispensable for managing the massive datasets involved in drug repurposing [14]. Their scalable infrastructure and on-demand computational power support resource-intensive machine learning models and large-scale bioinformatics analyses, and their shared environments facilitate collaborative research. Cloud computing also lowers the barrier to entry for researchers who lack local high-performance computing, since capacity can be rented as needed rather than purchased upfront.
The use of standardized data formats such as BioPAX and SBML, alongside controlled vocabularies like Gene Ontology (GO) and Disease Ontology (DO), ensures interoperability between diverse biological datasets [16]. This standardization facilitates seamless data integration, a cornerstone of successful bioinformatics workflows. Other ontologies relevant to drug repurposing include the Human Phenotype Ontology (HPO) and the Chemical Entities of Biological Interest (ChEBI) ontology.
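One reason shared ontologies help integration is that annotations made at different levels of detail can be reconciled through the term hierarchy. The sketch below assumes a toy `is_a` hierarchy with made-up `EX:` identifiers (not real GO or DO terms) and shows how two datasets annotated with different specific terms can be matched on a common ancestor.

```python
# Toy is_a hierarchy: child term -> parent terms. The "EX:" identifiers are
# invented for illustration and are not real GO or DO identifiers.
IS_A = {
    "EX:0003": ["EX:0002"],
    "EX:0004": ["EX:0002"],
    "EX:0002": ["EX:0001"],
    "EX:0001": [],
}


def ancestors(term, is_a):
    """Return the term plus every term reachable by following is_a links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(is_a.get(current, []))
    return seen


def shared_terms(term_a, term_b, is_a):
    """Terms that subsume both annotations, i.e. their common ancestors."""
    return ancestors(term_a, is_a) & ancestors(term_b, is_a)


# Dataset 1 annotates a gene with EX:0003, dataset 2 with EX:0004.
common = shared_terms("EX:0003", "EX:0004", IS_A)
```

Here `common` contains `EX:0002` and `EX:0001`, so the two records can be grouped under the shared parent even though their original labels differ. Real ontologies carry additional relation types and are usually loaded from OBO or OWL files rather than defined inline.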
The field of drug repurposing bioinformatics in 2024 is defined by its emphasis on data integration, interpretability in machine learning models, and stringent data privacy practices. Government and philanthropic organizations play a vital role in promoting open data access and fostering collaborative research. Cloud computing has democratized access to high-performance computing resources, and standardized data formats are essential for efficiently analyzing diverse datasets. The convergence of AI, cloud infrastructure, and rigorous experimental validation promises to further accelerate drug repurposing efforts, bringing new therapies to patients faster and more cost-effectively.
- Zahoránszky-Kőhalmi, G., Siramshetty, V. B., Kumar, P., et al. "A Workflow of Integrated Resources to Catalyze Network Pharmacology Driven COVID-19 Research." Journal of Chemical Information and Modeling. 2022, 62, 718−729.
- Konečný, J., et al. "Federated learning: Strategies for improving communication efficiency." arXiv preprint arXiv:1610.05492 (2016).
- Patel, B., Gelat, B., Soni, M., Rathaur, P. "Bioinformatics Perspective of Drug Repurposing." Current Bioinformatics, Vol 19, 2024.
- National Institute on Aging. "Drug Repurposing: Translational Bioinformatics and Network Pharmacology." 2023.
- EMBL-EBI. "Drug Repurposing in Bioinformatics." 2024.
- Open Targets. "Drug Repurposing for Target Identification." 2023.
- Lundberg, S., and Lee, S.-I. "A unified approach to interpreting model predictions." In Advances in Neural Information Processing Systems (2017).
- Ribeiro, M., et al. "Why should I trust you?: Explaining the predictions of any classifier." In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016).
- FDA. "Real-World Evidence." 2023.
- Bill & Melinda Gates Foundation. "Drug Repurposing." 2024.
- Cytoscape Consortium. "Cytoscape: An open-source platform for network analysis and visualization."
- Trott, O., and Olson, A. J. "AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading." Journal of computational chemistry 31.2 (2010): 455-461.
- Neo4j. "Graph Data Platform." 2024.
- National Center for Biotechnology Information. "Cloud Computing." 2024.
- Wishart, D. S., et al. "DrugBank 5.0: a major update to the DrugBank database for 2018." Nucleic Acids Res. 2018;46(D1):D1074-D1082.
- The Gene Ontology Consortium. "The Gene Ontology Resource: 20 years and still GOing strong." Nucleic Acids Res. 2021 Jan 8;49(D1):D330-D338.
- NIH. "Drug Repurposing: Translational bioinformatics and network pharmacology." 2022.