An Expandable End-to-end Antiviral Drug Repurposing Framework by Multi-Modal Deep Embeddings
In our manuscript, we provide end-to-end software (directly from sequence to drug) for antiviral drug repurposing.
Given a viral sequence FASTA file (DNA sequences) and/or a description of the virus, our software first generates the corresponding features and embeddings and then predicts potential drug candidates across multiple feature-view domains. (In the first stage, we focus on SMILES-generated drug molecular graph features, drug semantic context-based features, and drug network-based features/embeddings.)
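As an illustration of the input step only, the following is a minimal sketch of reading a FASTA file into memory using the Python standard library. The function name `read_fasta` is hypothetical and is not part of the software; the actual feature and embedding generation is not shown.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a {header: sequence} dict.

    Hypothetical helper for illustration; `lines` is any iterable of
    strings, such as an open file handle.
    """
    records = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]        # drop the leading '>'
            records[header] = []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            records[header].append(line.upper())
    return {h: "".join(parts) for h, parts in records.items()}
```

With a file on disk, this could be called as `with open(path) as fh: records = read_fasta(fh)`, where `path` points to the viral sequence file.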
Pre-trained deep-learning models (such as ResNet-50 and ALBERT) are leveraged for image-based and corpus-based embedding generation.
We use two techniques when training the ensemble models. Double Anchor fixes the random seeds in both negative sampling and the cross-validation split, which reduces the influence of random seeds on the selection of negative samples. Feature Pool selects the best combination of feature/embedding extractors across the different feature-view domains, keeping only one for each virus-drug feature-view-domain combination pair (vdkey).
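The two ideas can be sketched as follows, using only the Python standard library. This is a minimal sketch under stated assumptions, not the software's implementation: the function names, the seed values, the use of unlabeled virus-drug pairs as negatives, and the "higher score is better" selection rule are all hypothetical.

```python
import random
from itertools import product

NEG_SEED = 0  # assumed value: anchor 1, seed for negative sampling
CV_SEED = 1   # assumed value: anchor 2, seed for the cross-validation split


def sample_negatives(positives, viruses, drugs, n, seed=NEG_SEED):
    """Draw n virus-drug pairs that are not known positives, reproducibly."""
    known = set(positives)
    # Sort so the candidate order never depends on hash ordering.
    candidates = sorted(p for p in product(viruses, drugs) if p not in known)
    return random.Random(seed).sample(candidates, n)


def kfold_indices(n_samples, k, seed=CV_SEED):
    """Split range(n_samples) into k folds using a fixed shuffle."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def select_feature_pool(scores):
    """Keep only the best-scoring extractor for each vdkey.

    `scores` maps (vdkey, extractor_name) to a validation score,
    where a higher score is assumed to be better.
    """
    best = {}
    for (vdkey, extractor), score in scores.items():
        if vdkey not in best or score > best[vdkey][1]:
            best[vdkey] = (extractor, score)
    return {vdkey: extractor for vdkey, (extractor, _) in best.items()}
```

Using separate `random.Random` instances, rather than the global generator, keeps the two seeds independent of each other and of any other random calls in the training code.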