How use DGL to solve ATSP graph? as it is directed graph

Let’s say I have a directed ATSP graph. The best approach to solving such a problem is to transform it into a string graph, where the edges become nodes and weights are assigned at the node level. Then, we learn the probability of the edge belonging to the optimal solution given the labels. But what if we don’t have labels? What do you think is the best loss function to encode properties about the ATSP problem? For example, the cycle, ensuring all nodes are visited once, and selecting the most important edges with the least global effect in the loss. On the other hand, do you think transforming ATSP to string graph in the undirected way is the best approach? Or should we encode edge types such as source-source, parallel, source-target, target-source, and target-target? Alternatively, should we apply a heterogeneous graph with 5 types of edges and 1 type of node? Keeping in mind the structure of the string graph with the 5 subgraph edge types, which may exhibit similarities and sometimes contain many connected components. What is the most efficient way of solving this problem? I hope you share your ideas so I can learn, as I’m new to using GNN and DGL.

Or perhaps it’s better not to use GNNs for ATSP, as the graph is complete and lacks any unique relations except for the weights