Hello, I’m trying to train a custom MPNN to predict aqueous solubility on the ESOL/Delaney dataset. Existing benchmarks on this dataset (https://github.com/swansonk14/chemprop, search for “results”) suggest that a test set RMSE of 0.6 should be achievable. When I train DGL’s MPNN.py, the results are significantly inferior: depending on how I set the hyperparameters, I typically get a test set RMSE of 0.9 to 1.2, which is worse than a regression forest trained on RDKit features.
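To be clear about the metric being compared, here is a minimal sketch of a test set RMSE calculation. It assumes the error is computed on the original log-solubility scale rather than on normalized targets (that is an assumption on my part, and it matters when comparing against published numbers). The function name `rmse` and the example values are placeholders for illustration, not ESOL data or results.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences of floats."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must be non-empty")
    # Square each residual, average them, then take the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Placeholder values for illustration only; these are NOT ESOL measurements.
example_true = [-2.0, -3.5, -1.0, -4.0]
example_pred = [-2.5, -3.0, -1.5, -4.5]
example_rmse = rmse(example_true, example_pred)
print(f"RMSE: {example_rmse:.3f}")
```

If the model is trained on standardized targets, the predictions would need to be mapped back to the original scale before calling a function like this, otherwise the number is not comparable to the benchmark.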
Has anyone here achieved results close to that benchmark with MPNN.py?