Introduction

An Expandable End-to-end Antiviral Drug Repurposing Framework by Multi-Modal Deep Embeddings

Our manuscript presents expandable, ensemble-based, end-to-end (directly from sequence to drug) software for antiviral drug repurposing using multi-modal embeddings and transfer learning. Given a viral sequence FASTA file (DNA sequences) and/or a text description of the virus, the software generates the corresponding features and embeddings and then predicts potential drug candidates with an ensemble machine learning algorithm that spans multiple feature-view domains. In the first stage, we focus on three kinds of drug features/embeddings: molecular-graph features generated from SMILES, semantic context-based features, and network-based features. Pre-trained deep-learning models, such as ResNet-50 and ALBERT, are leveraged for image-based and corpus-based embedding generation. Two mechanisms support the training of the ensemble models. Double Anchor fixes the random seeds of the two random processes, negative sampling and the cross-validation dataset split, to reduce the influence of the random seed on the selected negative samples. Feature Pool selects the best feature/embedding extractors: it picks the best combination across the feature-view domains and keeps only one for each virus-drug feature-view-domain combination pair (vdkey). The software, features, and code are available.
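
The Double Anchor idea described above, fixing one seed for negative sampling and another for the cross-validation split, can be illustrated with a minimal sketch. This is not the manuscript's implementation: the function names (`sample_negatives`, `kfold_split`), the seed values, and the toy virus and drug identifiers are all hypothetical, and only the Python standard library is used.

```python
import random
from itertools import product

# Hypothetical anchor values; the manuscript does not state the seeds it uses.
NEG_SEED = 0  # anchor 1: negative sampling
CV_SEED = 1   # anchor 2: cross-validation split


def sample_negatives(viruses, drugs, positives, n_neg, seed=NEG_SEED):
    """Draw n_neg virus-drug pairs that are not known positives, reproducibly."""
    positive_set = set(positives)
    # Sort the candidates so the draw does not depend on input ordering.
    candidates = sorted(
        pair for pair in product(viruses, drugs) if pair not in positive_set
    )
    return random.Random(seed).sample(candidates, n_neg)


def kfold_split(samples, k, seed=CV_SEED):
    """Shuffle once with a fixed seed, then deal the samples into k folds."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


# Toy data (hypothetical identifiers, for illustration only).
viruses = ["virus_A", "virus_B", "virus_C"]
drugs = ["drug_1", "drug_2", "drug_3", "drug_4"]
positives = [("virus_A", "drug_1"), ("virus_B", "drug_3"), ("virus_C", "drug_2")]

negatives = sample_negatives(viruses, drugs, positives, n_neg=len(positives))
labeled = [(pair, 1) for pair in positives] + [(pair, 0) for pair in negatives]
folds = kfold_split(labeled, k=3)
```

Because each random process uses its own seeded generator, rerunning the script yields the same negative samples and the same folds, so differences between runs can be attributed to the feature/embedding extractors rather than to the sampling.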