Hi,
This is my first time using DGL. I notice that many people use BERT embeddings as the node features for a GNN. My question is: can I first fine-tune BERT on the same dataset, and then use the fine-tuned BERT to produce the node features for the GNN? I haven’t seen anyone do this before, so I’m a little confused.
For example, for a classification task, I would first train a model consisting of BERT followed by an FNN. I would then take the fine-tuned BERT checkpoint with the best dev set performance and use its outputs as the node features of my GCN model, and finally do the same task with the GCN. Does this make sense, or is it some kind of cheating?
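To make the setup concrete, here is a rough sketch of the data flow I have in mind. It is only a toy in plain Python, not real DGL or BERT code: `encode` is a hypothetical stand-in for the frozen fine-tuned BERT, `mean_aggregate` is a simplified stand-in for one GCN layer, and the texts, labels, edges, and split are made up for illustration.

```python
import random

# Toy data (made up): six documents, binary labels, a small undirected graph.
texts = ["doc a", "doc b", "doc c", "doc d", "doc e", "doc f"]
labels = [0, 1, 0, 1, 0, 1]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

# One fixed split, reused by both stages.
train_idx, dev_idx, test_idx = [0, 1, 2], [3], [4, 5]


def encode(text, dim=4):
    """Hypothetical stand-in for the frozen, fine-tuned BERT encoder.

    In the real setup this would return the embedding produced by the
    checkpoint that scored best on the dev set.
    """
    rng = random.Random(text)  # deterministic vector per text
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


# Stage 1 (not shown): fine-tune BERT + FNN using labels of train_idx only,
# and keep the checkpoint with the best accuracy on dev_idx.

# Stage 2: compute features for every node from its text alone (no labels).
features = [encode(t) for t in texts]


def mean_aggregate(features, edges):
    """Simplified stand-in for one GCN layer: average each node's features
    with those of its neighbours (self-loop included, no learned weights)."""
    n, dim = len(features), len(features[0])
    neighbors = [[i] for i in range(n)]  # start with a self-loop
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    return [
        [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        for nbrs in neighbors
    ]


hidden = mean_aggregate(features, edges)

# The GCN classifier would again be trained on the training labels only.
gcn_train_labels = [labels[i] for i in train_idx]
```

So both stages would see the text of all nodes, but labels only from the training nodes, with the dev nodes used for model selection.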