Performance of DGL-LifeSci on MoleculeNet

Hi

Thank you for DGL-LifeSci; it is a great tool. When I run the MoleculeNet demo with the default configuration, my results do not reach the reported performance of the pre-trained models. However, the results improve when I use a larger “patience”. Were the provided pre-trained models trained with the default configuration? If not, could you please share the configuration that was used? Thanks!

Best,

Hi, which experiments did you try? How did you run the experiments? How large was the performance gap?

Hi @mufeili

Thank you for the quick response! I appreciate it. I used the MoleculeNet example code directly, without any modifications. I tried the Tox21 dataset with the canonical featurization under both GCN and GAT. The results are as follows:

Tox21

| Method | Val ROC-AUC | Test ROC-AUC |
| --- | --- | --- |
| GCN + canonical (Pre-trained) | 0.82 | 0.77 |
| GCN + canonical (Mine) | 0.78 | 0.73 |
| GAT + canonical (Pre-trained) | 0.73 | 0.71 |
| GAT + canonical (Mine) | 0.69 | 0.68 |

My dgl version is 0.7.0 and my dgllife version is 0.2.8.
The detailed commands and the corresponding log files are as follows:

python classification.py -d Tox21 -mo GCN -f canonical
Created directory classification_results
Processing dgl graphs from scratch...
Processing molecule 1000/7831
Processing molecule 2000/7831
Processing molecule 3000/7831
Processing molecule 4000/7831
Processing molecule 5000/7831
Processing molecule 6000/7831
Processing molecule 7000/7831
Start initializing RDKit molecule instances...
Creating RDKit molecule instance 1000/7831
Creating RDKit molecule instance 2000/7831
Creating RDKit molecule instance 3000/7831
Creating RDKit molecule instance 4000/7831
Creating RDKit molecule instance 5000/7831
Creating RDKit molecule instance 6000/7831
Creating RDKit molecule instance 7000/7831
Start computing Bemis-Murcko scaffolds.
Computing Bemis-Murcko for compound 1000/7831
Computing Bemis-Murcko for compound 2000/7831
Computing Bemis-Murcko for compound 3000/7831
Computing Bemis-Murcko for compound 4000/7831
Computing Bemis-Murcko for compound 5000/7831
Computing Bemis-Murcko for compound 6000/7831
Computing Bemis-Murcko for compound 7000/7831
For metric roc_auc_score, the higher the better
epoch 1/1000, batch 1/196, loss 0.6382
epoch 1/1000, batch 21/196, loss 0.6117
epoch 1/1000, batch 41/196, loss 0.6025
epoch 1/1000, batch 61/196, loss 0.5658
epoch 1/1000, batch 81/196, loss 0.5144
epoch 1/1000, batch 101/196, loss 0.4612
epoch 1/1000, batch 121/196, loss 0.4830
epoch 1/1000, batch 141/196, loss 0.4488
epoch 1/1000, batch 161/196, loss 0.4569
epoch 1/1000, batch 181/196, loss 0.4074
epoch 1/1000, training roc_auc_score 0.6167
epoch 1/1000, validation roc_auc_score 0.6648, best validation roc_auc_score 0.6648
epoch 2/1000, batch 1/196, loss 0.3720
epoch 2/1000, batch 21/196, loss 0.3843
epoch 2/1000, batch 41/196, loss 0.3409
epoch 2/1000, batch 61/196, loss 0.3003
epoch 2/1000, batch 81/196, loss 0.2522
epoch 2/1000, batch 101/196, loss 0.2622
epoch 2/1000, batch 121/196, loss 0.2475
epoch 2/1000, batch 141/196, loss 0.2534
epoch 2/1000, batch 161/196, loss 0.2167
epoch 2/1000, batch 181/196, loss 0.1956
epoch 2/1000, training roc_auc_score 0.6617
epoch 2/1000, validation roc_auc_score 0.6892, best validation roc_auc_score 0.6892
epoch 3/1000, batch 1/196, loss 0.2145
epoch 3/1000, batch 21/196, loss 0.1822
epoch 3/1000, batch 41/196, loss 0.2230
epoch 3/1000, batch 61/196, loss 0.1656
epoch 3/1000, batch 81/196, loss 0.1629
epoch 3/1000, batch 101/196, loss 0.1677
epoch 3/1000, batch 121/196, loss 0.2200
epoch 3/1000, batch 141/196, loss 0.1876
epoch 3/1000, batch 161/196, loss 0.1689
epoch 3/1000, batch 181/196, loss 0.1974
epoch 3/1000, training roc_auc_score 0.7116
epoch 3/1000, validation roc_auc_score 0.7121, best validation roc_auc_score 0.7121
epoch 4/1000, batch 1/196, loss 0.2451
epoch 4/1000, batch 21/196, loss 0.1814
epoch 4/1000, batch 41/196, loss 0.2250
epoch 4/1000, batch 61/196, loss 0.2378
epoch 4/1000, batch 81/196, loss 0.1883
epoch 4/1000, batch 101/196, loss 0.1334
epoch 4/1000, batch 121/196, loss 0.2164
epoch 4/1000, batch 141/196, loss 0.1954
epoch 4/1000, batch 161/196, loss 0.1641
epoch 4/1000, batch 181/196, loss 0.1148
epoch 4/1000, training roc_auc_score 0.7442
epoch 4/1000, validation roc_auc_score 0.7206, best validation roc_auc_score 0.7206
epoch 5/1000, batch 1/196, loss 0.1345
epoch 5/1000, batch 21/196, loss 0.1747
epoch 5/1000, batch 41/196, loss 0.2068
epoch 5/1000, batch 61/196, loss 0.1349
epoch 5/1000, batch 81/196, loss 0.1496
epoch 5/1000, batch 101/196, loss 0.2401
epoch 5/1000, batch 121/196, loss 0.1379
epoch 5/1000, batch 141/196, loss 0.1648
epoch 5/1000, batch 161/196, loss 0.1633
epoch 5/1000, batch 181/196, loss 0.1580
epoch 5/1000, training roc_auc_score 0.7578
epoch 5/1000, validation roc_auc_score 0.7343, best validation roc_auc_score 0.7343
epoch 6/1000, batch 1/196, loss 0.1426
epoch 6/1000, batch 21/196, loss 0.1754
epoch 6/1000, batch 41/196, loss 0.3298
epoch 6/1000, batch 61/196, loss 0.1468
epoch 6/1000, batch 81/196, loss 0.2014
epoch 6/1000, batch 101/196, loss 0.1639
epoch 6/1000, batch 121/196, loss 0.1422
epoch 6/1000, batch 141/196, loss 0.1809
epoch 6/1000, batch 161/196, loss 0.1434
epoch 6/1000, batch 181/196, loss 0.2651
epoch 6/1000, training roc_auc_score 0.7693
epoch 6/1000, validation roc_auc_score 0.7437, best validation roc_auc_score 0.7437
epoch 7/1000, batch 1/196, loss 0.1492
epoch 7/1000, batch 21/196, loss 0.1709
epoch 7/1000, batch 41/196, loss 0.2666
epoch 7/1000, batch 61/196, loss 0.1958
epoch 7/1000, batch 81/196, loss 0.2279
epoch 7/1000, batch 101/196, loss 0.1528
epoch 7/1000, batch 121/196, loss 0.1783
epoch 7/1000, batch 141/196, loss 0.1934
epoch 7/1000, batch 161/196, loss 0.1751
epoch 7/1000, batch 181/196, loss 0.1737
epoch 7/1000, training roc_auc_score 0.7803
epoch 7/1000, validation roc_auc_score 0.7506, best validation roc_auc_score 0.7506
epoch 8/1000, batch 1/196, loss 0.1122
epoch 8/1000, batch 21/196, loss 0.1608
epoch 8/1000, batch 41/196, loss 0.1586
epoch 8/1000, batch 61/196, loss 0.1744
epoch 8/1000, batch 81/196, loss 0.1498
epoch 8/1000, batch 101/196, loss 0.2017
epoch 8/1000, batch 121/196, loss 0.1067
epoch 8/1000, batch 141/196, loss 0.1554
epoch 8/1000, batch 161/196, loss 0.1456
epoch 8/1000, batch 181/196, loss 0.1593
epoch 8/1000, training roc_auc_score 0.7930
epoch 8/1000, validation roc_auc_score 0.7548, best validation roc_auc_score 0.7548
epoch 9/1000, batch 1/196, loss 0.1108
epoch 9/1000, batch 21/196, loss 0.1214
epoch 9/1000, batch 41/196, loss 0.1960
epoch 9/1000, batch 61/196, loss 0.2054
epoch 9/1000, batch 81/196, loss 0.1627
epoch 9/1000, batch 101/196, loss 0.1846
epoch 9/1000, batch 121/196, loss 0.2577
epoch 9/1000, batch 141/196, loss 0.1412
epoch 9/1000, batch 161/196, loss 0.1724
epoch 9/1000, batch 181/196, loss 0.0984
epoch 9/1000, training roc_auc_score 0.8038
epoch 9/1000, validation roc_auc_score 0.7573, best validation roc_auc_score 0.7573
epoch 10/1000, batch 1/196, loss 0.1590
epoch 10/1000, batch 21/196, loss 0.0961
epoch 10/1000, batch 41/196, loss 0.2118
epoch 10/1000, batch 61/196, loss 0.1646
epoch 10/1000, batch 81/196, loss 0.1410
epoch 10/1000, batch 101/196, loss 0.1734
epoch 10/1000, batch 121/196, loss 0.1965
epoch 10/1000, batch 141/196, loss 0.1771
epoch 10/1000, batch 161/196, loss 0.2088
epoch 10/1000, batch 181/196, loss 0.1569
epoch 10/1000, training roc_auc_score 0.8042
epoch 10/1000, validation roc_auc_score 0.7581, best validation roc_auc_score 0.7581
epoch 11/1000, batch 1/196, loss 0.1736
epoch 11/1000, batch 21/196, loss 0.1556
epoch 11/1000, batch 41/196, loss 0.1700
epoch 11/1000, batch 61/196, loss 0.2398
epoch 11/1000, batch 81/196, loss 0.2494
epoch 11/1000, batch 101/196, loss 0.1455
epoch 11/1000, batch 121/196, loss 0.1865
epoch 11/1000, batch 141/196, loss 0.1688
epoch 11/1000, batch 161/196, loss 0.1909
epoch 11/1000, batch 181/196, loss 0.2225
epoch 11/1000, training roc_auc_score 0.8123
EarlyStopping counter: 1 out of 3
epoch 11/1000, validation roc_auc_score 0.7580, best validation roc_auc_score 0.7581
epoch 12/1000, batch 1/196, loss 0.2313
epoch 12/1000, batch 21/196, loss 0.2119
epoch 12/1000, batch 41/196, loss 0.2390
epoch 12/1000, batch 61/196, loss 0.1778
epoch 12/1000, batch 81/196, loss 0.1494
epoch 12/1000, batch 101/196, loss 0.1972
epoch 12/1000, batch 121/196, loss 0.1907
epoch 12/1000, batch 141/196, loss 0.1611
epoch 12/1000, batch 161/196, loss 0.1482
epoch 12/1000, batch 181/196, loss 0.2057
epoch 12/1000, training roc_auc_score 0.8134
EarlyStopping counter: 2 out of 3
epoch 12/1000, validation roc_auc_score 0.7526, best validation roc_auc_score 0.7581
epoch 13/1000, batch 1/196, loss 0.1985
epoch 13/1000, batch 21/196, loss 0.2182
epoch 13/1000, batch 41/196, loss 0.1506
epoch 13/1000, batch 61/196, loss 0.1421
epoch 13/1000, batch 81/196, loss 0.1017
epoch 13/1000, batch 101/196, loss 0.1897
epoch 13/1000, batch 121/196, loss 0.0831
epoch 13/1000, batch 141/196, loss 0.2379
epoch 13/1000, batch 161/196, loss 0.1697
epoch 13/1000, batch 181/196, loss 0.1412
epoch 13/1000, training roc_auc_score 0.8172
epoch 13/1000, validation roc_auc_score 0.7622, best validation roc_auc_score 0.7622
epoch 14/1000, batch 1/196, loss 0.2081
epoch 14/1000, batch 21/196, loss 0.1862
epoch 14/1000, batch 41/196, loss 0.2269
epoch 14/1000, batch 61/196, loss 0.1661
epoch 14/1000, batch 81/196, loss 0.1771
epoch 14/1000, batch 101/196, loss 0.0947
epoch 14/1000, batch 121/196, loss 0.1385
epoch 14/1000, batch 141/196, loss 0.1462
epoch 14/1000, batch 161/196, loss 0.1927
epoch 14/1000, batch 181/196, loss 0.1138
epoch 14/1000, training roc_auc_score 0.8237
epoch 14/1000, validation roc_auc_score 0.7665, best validation roc_auc_score 0.7665
epoch 15/1000, batch 1/196, loss 0.1145
epoch 15/1000, batch 21/196, loss 0.1235
epoch 15/1000, batch 41/196, loss 0.2017
epoch 15/1000, batch 61/196, loss 0.1515
epoch 15/1000, batch 81/196, loss 0.1152
epoch 15/1000, batch 101/196, loss 0.1635
epoch 15/1000, batch 121/196, loss 0.1515
epoch 15/1000, batch 141/196, loss 0.1778
epoch 15/1000, batch 161/196, loss 0.1838
epoch 15/1000, batch 181/196, loss 0.1651
epoch 15/1000, training roc_auc_score 0.8261
epoch 15/1000, validation roc_auc_score 0.7687, best validation roc_auc_score 0.7687
epoch 16/1000, batch 1/196, loss 0.1141
epoch 16/1000, batch 21/196, loss 0.1586
epoch 16/1000, batch 41/196, loss 0.1458
epoch 16/1000, batch 61/196, loss 0.1697
epoch 16/1000, batch 81/196, loss 0.1583
epoch 16/1000, batch 101/196, loss 0.1930
epoch 16/1000, batch 121/196, loss 0.2262
epoch 16/1000, batch 141/196, loss 0.1845
epoch 16/1000, batch 161/196, loss 0.1720
epoch 16/1000, batch 181/196, loss 0.1158
epoch 16/1000, training roc_auc_score 0.8307
EarlyStopping counter: 1 out of 3
epoch 16/1000, validation roc_auc_score 0.7667, best validation roc_auc_score 0.7687
epoch 17/1000, batch 1/196, loss 0.1685
epoch 17/1000, batch 21/196, loss 0.1571
epoch 17/1000, batch 41/196, loss 0.1028
epoch 17/1000, batch 61/196, loss 0.2175
epoch 17/1000, batch 81/196, loss 0.1187
epoch 17/1000, batch 101/196, loss 0.1680
epoch 17/1000, batch 121/196, loss 0.2077
epoch 17/1000, batch 141/196, loss 0.1119
epoch 17/1000, batch 161/196, loss 0.1334
epoch 17/1000, batch 181/196, loss 0.1876
epoch 17/1000, training roc_auc_score 0.8315
EarlyStopping counter: 2 out of 3
epoch 17/1000, validation roc_auc_score 0.7666, best validation roc_auc_score 0.7687
epoch 18/1000, batch 1/196, loss 0.1409
epoch 18/1000, batch 21/196, loss 0.1627
epoch 18/1000, batch 41/196, loss 0.1537
epoch 18/1000, batch 61/196, loss 0.1483
epoch 18/1000, batch 81/196, loss 0.1313
epoch 18/1000, batch 101/196, loss 0.1679
epoch 18/1000, batch 121/196, loss 0.1666
epoch 18/1000, batch 141/196, loss 0.1267
epoch 18/1000, batch 161/196, loss 0.2171
epoch 18/1000, batch 181/196, loss 0.1233
epoch 18/1000, training roc_auc_score 0.8327
epoch 18/1000, validation roc_auc_score 0.7773, best validation roc_auc_score 0.7773
epoch 19/1000, batch 1/196, loss 0.1092
epoch 19/1000, batch 21/196, loss 0.1653
epoch 19/1000, batch 41/196, loss 0.1772
epoch 19/1000, batch 61/196, loss 0.1367
epoch 19/1000, batch 81/196, loss 0.1016
epoch 19/1000, batch 101/196, loss 0.1350
epoch 19/1000, batch 121/196, loss 0.1811
epoch 19/1000, batch 141/196, loss 0.1159
epoch 19/1000, batch 161/196, loss 0.1307
epoch 19/1000, batch 181/196, loss 0.1494
epoch 19/1000, training roc_auc_score 0.8389
epoch 19/1000, validation roc_auc_score 0.7794, best validation roc_auc_score 0.7794
epoch 20/1000, batch 1/196, loss 0.1355
epoch 20/1000, batch 21/196, loss 0.1613
epoch 20/1000, batch 41/196, loss 0.1423
epoch 20/1000, batch 61/196, loss 0.1873
epoch 20/1000, batch 81/196, loss 0.1142
epoch 20/1000, batch 101/196, loss 0.1633
epoch 20/1000, batch 121/196, loss 0.1498
epoch 20/1000, batch 141/196, loss 0.1484
epoch 20/1000, batch 161/196, loss 0.1398
epoch 20/1000, batch 181/196, loss 0.1724
epoch 20/1000, training roc_auc_score 0.8371
EarlyStopping counter: 1 out of 3
epoch 20/1000, validation roc_auc_score 0.7768, best validation roc_auc_score 0.7794
epoch 21/1000, batch 1/196, loss 0.1243
epoch 21/1000, batch 21/196, loss 0.2853
epoch 21/1000, batch 41/196, loss 0.1680
epoch 21/1000, batch 61/196, loss 0.1096
epoch 21/1000, batch 81/196, loss 0.1415
epoch 21/1000, batch 101/196, loss 0.1503
epoch 21/1000, batch 121/196, loss 0.1356
epoch 21/1000, batch 141/196, loss 0.1473
epoch 21/1000, batch 161/196, loss 0.0976
epoch 21/1000, batch 181/196, loss 0.2510
epoch 21/1000, training roc_auc_score 0.8419
EarlyStopping counter: 2 out of 3
epoch 21/1000, validation roc_auc_score 0.7683, best validation roc_auc_score 0.7794
epoch 22/1000, batch 1/196, loss 0.1656
epoch 22/1000, batch 21/196, loss 0.1412
epoch 22/1000, batch 41/196, loss 0.1780
epoch 22/1000, batch 61/196, loss 0.1762
epoch 22/1000, batch 81/196, loss 0.1716
epoch 22/1000, batch 101/196, loss 0.1298
epoch 22/1000, batch 121/196, loss 0.0930
epoch 22/1000, batch 141/196, loss 0.0839
epoch 22/1000, batch 161/196, loss 0.1505
epoch 22/1000, batch 181/196, loss 0.1314
epoch 22/1000, training roc_auc_score 0.8334
EarlyStopping counter: 3 out of 3
epoch 22/1000, validation roc_auc_score 0.7697, best validation roc_auc_score 0.7794
val roc_auc_score 0.7794
test roc_auc_score 0.7338

python classification.py -d Tox21 -mo GAT -f canonical
Directory classification_results already exists.
Processing dgl graphs from scratch...
Processing molecule 1000/7831
Processing molecule 2000/7831
Processing molecule 3000/7831
Processing molecule 4000/7831
Processing molecule 5000/7831
Processing molecule 6000/7831
Processing molecule 7000/7831
Start initializing RDKit molecule instances...
Creating RDKit molecule instance 1000/7831
Creating RDKit molecule instance 2000/7831
Creating RDKit molecule instance 3000/7831
Creating RDKit molecule instance 4000/7831
Creating RDKit molecule instance 5000/7831
Creating RDKit molecule instance 6000/7831
Creating RDKit molecule instance 7000/7831
Start computing Bemis-Murcko scaffolds.
Computing Bemis-Murcko for compound 1000/7831
Computing Bemis-Murcko for compound 2000/7831
Computing Bemis-Murcko for compound 3000/7831
Computing Bemis-Murcko for compound 4000/7831
Computing Bemis-Murcko for compound 5000/7831
Computing Bemis-Murcko for compound 6000/7831
Computing Bemis-Murcko for compound 7000/7831
For metric roc_auc_score, the higher the better
epoch 1/1000, batch 1/25, loss 0.6502
epoch 1/1000, batch 21/25, loss 0.1865
epoch 1/1000, training roc_auc_score 0.5713
epoch 1/1000, validation roc_auc_score 0.6038, best validation roc_auc_score 0.6038
epoch 2/1000, batch 1/25, loss 0.2083
epoch 2/1000, batch 21/25, loss 0.1765
epoch 2/1000, training roc_auc_score 0.6807
epoch 2/1000, validation roc_auc_score 0.6329, best validation roc_auc_score 0.6329
epoch 3/1000, batch 1/25, loss 0.1814
epoch 3/1000, batch 21/25, loss 0.1713
epoch 3/1000, training roc_auc_score 0.7176
epoch 3/1000, validation roc_auc_score 0.6792, best validation roc_auc_score 0.6792
epoch 4/1000, batch 1/25, loss 0.1799
epoch 4/1000, batch 21/25, loss 0.1819
epoch 4/1000, training roc_auc_score 0.7188
epoch 4/1000, validation roc_auc_score 0.6864, best validation roc_auc_score 0.6864
epoch 5/1000, batch 1/25, loss 0.1545
epoch 5/1000, batch 21/25, loss 0.1837
epoch 5/1000, training roc_auc_score 0.7215
EarlyStopping counter: 1 out of 3
epoch 5/1000, validation roc_auc_score 0.6856, best validation roc_auc_score 0.6864
epoch 6/1000, batch 1/25, loss 0.1822
epoch 6/1000, batch 21/25, loss 0.1722
epoch 6/1000, training roc_auc_score 0.7269
EarlyStopping counter: 2 out of 3
epoch 6/1000, validation roc_auc_score 0.5677, best validation roc_auc_score 0.6864
epoch 7/1000, batch 1/25, loss 0.1683
epoch 7/1000, batch 21/25, loss 0.1918
epoch 7/1000, training roc_auc_score 0.7145
EarlyStopping counter: 3 out of 3
epoch 7/1000, validation roc_auc_score 0.6716, best validation roc_auc_score 0.6864
val roc_auc_score 0.6864
test roc_auc_score 0.6822
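
To make the effect of "patience" concrete, the stopping rule visible in the logs above (`EarlyStopping counter: k out of 3`) can be replayed on the GCN validation scores. This is only a simplified sketch: `replay_early_stopping` is a hypothetical helper written for illustration, not the dgllife implementation, and it assumes the counter resets whenever the validation score strictly improves. It can show where a smaller patience would have stopped this run, but not what a larger patience would have produced, since the run ended at epoch 22.

```python
# Validation ROC-AUC per epoch, copied from the GCN log above (epochs 1-22).
GCN_VAL_SCORES = [
    0.6648, 0.6892, 0.7121, 0.7206, 0.7343, 0.7437, 0.7506, 0.7548,
    0.7573, 0.7581, 0.7580, 0.7526, 0.7622, 0.7665, 0.7687, 0.7667,
    0.7666, 0.7773, 0.7794, 0.7768, 0.7683, 0.7697,
]


def replay_early_stopping(val_scores, patience):
    """Replay a higher-is-better early-stopping rule over recorded scores.

    Returns (stopped, last_epoch, best_score). `stopped` is False when the
    recorded scores run out before the patience budget is exhausted.
    """
    best = float("-inf")
    counter = 0
    for epoch, score in enumerate(val_scores, start=1):
        if score > best:
            # Strict improvement: record the new best and reset the counter.
            best = score
            counter = 0
        else:
            counter += 1
            if counter >= patience:
                return True, epoch, best
    return False, len(val_scores), best


print(replay_early_stopping(GCN_VAL_SCORES, patience=3))  # (True, 22, 0.7794)
print(replay_early_stopping(GCN_VAL_SCORES, patience=2))  # (True, 12, 0.7581)
```

With a patience of 3 the replay stops at epoch 22 with a best validation score of 0.7794, matching the log. With a patience of 2 it would already have stopped at epoch 12 with 0.7581, which is consistent with my observation that a larger patience gives better results.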

I think this might be due to different initial model parameters, as we did not fix the random seed.
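
If you want to reduce run-to-run variation, one option is to seed the random number generators before building the model and data loaders. The following is a minimal sketch under stated assumptions: `set_seed` is a hypothetical helper that is not part of the MoleculeNet example scripts, and it only seeds the libraries it can import. It will not by itself reproduce the pre-trained numbers, and GPU runs may still show some nondeterminism.

```python
import random


def set_seed(seed: int) -> None:
    """Seed Python's RNG and, if installed, NumPy, PyTorch and DGL."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.
    try:
        import dgl
        dgl.seed(seed)
    except ImportError:
        pass  # DGL not installed; nothing to seed.


# Call once at the start of a run, before creating the model.
set_seed(0)
```

Running the same configuration with a few different seeds would also give a sense of how much of the gap is just variance between runs.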
