Hello! I am a computer science student working on an explanation for a GNN, and I came across the DGL implementation of SubgraphX, which is of great interest to me. Unfortunately, I have some trouble understanding the code. I would be very grateful if someone could explain it to me, especially the following points:
Line 139 in the subgraphx.py file reads:
split_point = num_nodes
and I don’t quite grasp why split_point is assigned like this. Isn’t this just the “last node”?
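To make this question concrete, here is a minimal sketch of my current guess, which I have not been able to confirm from the paper or the code. Since node IDs in a DGL graph run from 0 to num_nodes - 1, num_nodes itself would not be a valid node ID, so perhaps it serves as a placeholder for the whole subgraph inside a random ordering. All values below (the toy graph size, subgraph_nodes, other_nodes, the seed) are hypothetical, and the sketch uses only the standard library rather than the actual DGL code:

```python
import random

num_nodes = 5             # hypothetical toy graph; valid node IDs are 0..4
subgraph_nodes = [1, 2]   # hypothetical subgraph whose contribution is scored
other_nodes = [0, 3, 4]   # hypothetical remaining nodes

split_point = num_nodes   # 5, one past the largest valid node ID
assert split_point not in range(num_nodes)

# Assumption: split_point stands in for the subgraph as a single entry
# in a random ordering; the nodes that land before it form a coalition.
coalition_space = other_nodes + [split_point]
rng = random.Random(0)    # fixed seed only to make the sketch repeatable
permuted_space = coalition_space[:]
rng.shuffle(permuted_space)

split_idx = permuted_space.index(split_point)
selected_nodes = permuted_space[:split_idx]

print("ordering:", permuted_space)
print("coalition before the placeholder:", selected_nodes)
```

If this reading is right, the value would never be used to index a real node, only to mark a position in the ordering. Is that the intended meaning?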
In lines 204-208:
# Get the largest weakly connected component in the subgraph.
nx_graph = to_networkx(new_subg.cpu())
largest_cc_nids = list(
    max(nx.weakly_connected_components(nx_graph), key=len)
)
only the largest weakly connected component of the subgraph that remains after removing a particular node is kept. Why shouldn’t one look at all components?
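To illustrate what I mean, here is a minimal sketch of how I understand this step, on a toy example. It is a pure-Python stand-in that I wrote so it runs without DGL or networkx; the helper function, the edge list, and the node count are all hypothetical and not taken from subgraphx.py:

```python
from collections import defaultdict


def weakly_connected_components(num_nodes, edges):
    """Group nodes into components, ignoring edge direction."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    seen = set()
    components = []
    for start in range(num_nodes):
        if start in seen:
            continue
        # Depth-first traversal from a node that has not been visited yet.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical subgraph that has fallen apart into {0, 1, 2} and {3, 4}.
edges = [(0, 1), (1, 2), (3, 4)]
components = weakly_connected_components(5, edges)

# Same selection rule as in the quoted snippet: keep the largest component.
largest_cc_nids = sorted(max(components, key=len))
dropped_nids = sorted(set(range(5)) - set(largest_cc_nids))

print("kept:", largest_cc_nids)
print("dropped:", dropped_nids)
```

In this toy case nodes 3 and 4 are discarded entirely, which is exactly the behavior I am asking about.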
Furthermore, during my tests on many different graphs, a solution involving node 0 appears disproportionately often, even though the node labels are randomly assigned (numbered somehow), and looking at the graphical representation there doesn’t seem to be anything special about node 0 either. Am I maybe overlooking something obvious here?
Any help would be really appreciated. I don’t know why this is so confusing to me, but despite looking at both the original paper and the code, I just don’t get how exactly one translates to the other or why node 0 seems to be so special in my experiments.