Time complexity & Time

This page provides a table summarizing the time complexity of the algorithm, together with measured running times for the main parts of DeepSeq2drug.

Table 1. Time complexity

Code                                  Time complexity
Main program: AUC/AUPR generation     O(nmr)
Feature pool for vdkey selection      O(nm)

The main program, which generates AUC/AUPR for all vdkey pairs, traverses all viral embeddings and all drug embeddings. The time complexity of this step is O(nm), where n is the number of viral feature/embedding files and m is the number of drug embedding files. Because each experiment is repeated r times, the overall time complexity is O(nmr). This bound treats training and testing on one pair as a single unit of work. In practice that cost depends on the embedding length, the dataset size, and the model, so the actual running time can grow well beyond what O(nmr) alone suggests. For example, on an extremely unbalanced dataset the running time can become very long.
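The O(nmr) count can be illustrated with a minimal sketch. It is not the project's actual code: the function names, the file labels, and the `dummy_evaluate` stand-in for training and scoring one pair are all hypothetical.

```python
from itertools import product

def evaluate_all_pairs(viral_files, drug_files, repeats, evaluate_pair):
    """Call evaluate_pair once per (viral file, drug file, repetition)."""
    results = {}
    calls = 0
    for viral, drug in product(viral_files, drug_files):  # n * m pairs
        scores = []
        for run in range(repeats):                         # r repetitions
            scores.append(evaluate_pair(viral, drug, run))
            calls += 1
        results[(viral, drug)] = sum(scores) / len(scores)
    return results, calls

# Hypothetical stand-in for training a model and computing AUC on one pair.
def dummy_evaluate(viral, drug, run):
    return 0.5

viral_files = ["5mer", "4mer", "PseKNC"]   # n = 3 (illustrative labels)
drug_files = ["bioGPT", "GPT2"]            # m = 2 (illustrative labels)
results, calls = evaluate_all_pairs(viral_files, drug_files, 5, dummy_evaluate)
print(calls)  # n * m * r = 3 * 2 * 5 = 30
```

The number of calls to `evaluate_pair` is exactly n·m·r. The cost of each call (training and testing) is the part that the bound does not capture.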

The purpose of the feature pool is to select the best vdkey pairs from each modality (feature-view domain). Its cost is dominated by the loop that traverses all vdkeys and their p-values, so the time complexity of this step is O(nm), where n here is the number of vdkeys and m is the number of p-values.
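A minimal sketch of such a loop is shown below. The selection rule (count the p-values below a threshold `alpha` and keep the best vdkey per modality), the modality labels, and the p-values themselves are assumptions made for illustration; the actual ranking criterion used by the feature pool is not described on this page.

```python
def select_best_vdkeys(p_values_by_vdkey, modality_of, alpha=0.05):
    """Pick, per modality, the vdkey with the most p-values below alpha."""
    best = {}
    for vdkey, p_values in p_values_by_vdkey.items():        # n vdkeys
        significant = sum(1 for p in p_values if p < alpha)   # m p-values each
        modality = modality_of[vdkey]
        if modality not in best or significant > best[modality][1]:
            best[modality] = (vdkey, significant)
    return {modality: vdkey for modality, (vdkey, _) in best.items()}

# Made-up p-values and modality labels, for illustration only.
p_values_by_vdkey = {
    "5mer+bioGPT": [0.01, 0.20, 0.03],
    "4mer+bioGPT": [0.30, 0.04, 0.50],
    "PseKNC+role2vec": [0.02, 0.01, 0.60],
}
modality_of = {
    "5mer+bioGPT": "text",
    "4mer+bioGPT": "text",
    "PseKNC+role2vec": "graph",
}
selected = select_best_vdkeys(p_values_by_vdkey, modality_of)
print(selected)
```

Each p-value is visited once, so the work is proportional to n·m regardless of the scoring rule chosen.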

Besides these main parts, the project contains other steps. We therefore also recorded the actual running time of each process.
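How the averages were measured is not stated here. One common approach, shown as a sketch under that caveat, is to time a step over a batch of samples with a wall-clock timer and divide by the number of samples. The helper name `average_running_time` and the toy workload are hypothetical.

```python
import time

def average_running_time(func, samples):
    """Return the mean wall-clock seconds that func takes per sample."""
    samples = list(samples)
    start = time.perf_counter()
    for sample in samples:
        func(sample)
    elapsed = time.perf_counter() - start
    return elapsed / len(samples)

# Toy workload standing in for one preprocessing step.
processed = []
avg_seconds = average_running_time(processed.append, range(1000))
print(f"{avg_seconds:.6f} s per sample")
```

`time.perf_counter` is preferred over `time.time` for this purpose because it is monotonic and has the highest available resolution.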

The platforms used to measure running time are listed below:

  • Platform 1:
      • GTX2080Ti GPU
      • Intel i7-9700
      • 64GB DDR4 memory
  • Platform 2:
      • GTX4080 Laptop GPU
      • Intel i9-13900HX
      • 64GB DDR5 memory

Running time table for different parts of DeepSeq2drug

Parts             Description                   Samples   Platform   Average running time (s)
Preprocess        Extraction of bio-GPT-Drugs   1000      2          1.3087
Preprocess        Extraction of 3D-resnet50               2          0.0402
Preprocess        Extraction of GPT2-Drugs      1000      1          0.0825
Preprocess        Extraction of Albert-virus    89        1          0.0865
Preprocess        Doc2vec                       1000      1          0.0441
Preprocess        5mer                          1000      1          0.0099
Preprocess        4mer                          1000      1          0.0065
Preprocess        PseKNC                        1000      1          0.7216
Feature pool      P-values calculation          N/A       1          2.1084
Feature pool      Vdkey Rank                    N/A       1          0.0210
Train-Validation  5mer+bioGPT (nr=0.5)          N/A       2          267.3627
Train-Validation  5mer+bioGPT (nr=1)            N/A       2          410.1965
Train-Validation  5mer+bioGPT (nr=2)            N/A       2          733.2629
Train-Validation  5mer+bioGPT (nr=4)            N/A       2          1424.035
Train-Validation  5mer+bioGPT (nr=8)            N/A       2          2967.333
Train-Validation  5mer+bioGPT (nr=16)           N/A       2          4508.527
Train-Validation  PseKNC+role2vec (nr=0.5)      N/A       2          536.3279
Train-Validation  PseKNC+role2vec (nr=1)        N/A       2          812.8335
Train-Validation  PseKNC+role2vec (nr=2)        N/A       2          1408.047
Train-Validation  PseKNC+role2vec (nr=4)        N/A       2          2758.145
Train-Validation  PseKNC+role2vec (nr=8)        N/A       2          5502.763
Train-Validation  PseKNC+role2vec (nr=18)       N/A       2          6688.011

As the table above shows, the running time varies with the value of nr, which controls the size of the unbalanced dataset, and with the feature combination. Training and validating the models is by far the most time-consuming part of DeepSeq2drug.
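To make the trend concrete, the short script below computes the ratio between consecutive Train-Validation times, using the values and nr labels exactly as they appear in the table. The variable and function names are illustrative only.

```python
# Train-Validation times in seconds, copied from the table (Platform 2).
times = {
    "5mer+bioGPT": {0.5: 267.3627, 1: 410.1965, 2: 733.2629,
                    4: 1424.035, 8: 2967.333, 16: 4508.527},
    "PseKNC+role2vec": {0.5: 536.3279, 1: 812.8335, 2: 1408.047,
                        4: 2758.145, 8: 5502.763, 18: 6688.011},
}

def step_ratios(series):
    """Ratio of each running time to the one at the previous nr value."""
    ordered = [series[nr] for nr in sorted(series)]
    return [later / earlier for earlier, later in zip(ordered, ordered[1:])]

ratios = {name: step_ratios(series) for name, series in times.items()}
for name, values in ratios.items():
    print(name, [round(v, 2) for v in values])
```

Every step up in nr increases the running time, by a factor between about 1.2 and 2.1 in these measurements.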