In my situation, I have encoded my graph data as DGLGraph objects, but I cannot pack my data into the DGL data loader. Can somebody help me?
Sure, here is my code. I don't know how to process the graphs, since I have already represented them as DGLGraphs.
Here is what my data looks like:
Graph(num_nodes=188, num_edges=450,
ndata_schemes={'glycine': Scheme(shape=(4,), dtype=torch.int32)}
edata_schemes={})
tensor(0., device='cuda:0')
This is how I pack the data:
data_set = []
for i in range(len(rna_data)):
    cur_data = rna_data.iloc[i]
    seq = cur_data['seq']
    matching = cur_data['matching']
    label = cur_data['label']
    u = []
    v = []
    for idx in range(len(seq) - 1):
        u.append(idx)
        v.append(idx+1)
    par_dict = find_parentheses(matching)
    matching_list = (collections.OrderedDict(sorted(par_dict.items())))
    matching_list = list(matching_list.items())
    skip_u = []
    skip_v = []
    for item in matching_list:
        skip_u.append(item[0])
        skip_v.append(item[1])
    try:
        g = dgl.graph((u, v))
        g.edata['bonds'] = torch.tensor([[1, 0]] * len(u))
        g.add_edges(skip_u, skip_v, {'bonds': torch.tensor([[0, 1]] * len(skip_u))})
        g = dgl.to_bidirected(g)
    except Exception as e:
        continue
    g = g.to('cuda')
    glycine_one_hot_list = []
    for glycine in seq:
        one_hot = glycine_one_hot_dict[glycine]
        glycine_one_hot_list.append(one_hot)
    g.ndata['glycine'] = torch.tensor(glycine_one_hot_list).cuda()
    label = torch.tensor(label, dtype=torch.float32).cuda()
    data = (g, label)
    data_set.append(data)
Thanks a lot again.
Plus, another question I want to ask: in graph classification models, why is there no softmax or sigmoid function at the output layer?
You can follow the custom dataset interface for PyTorch. Basically, you just need to define a class as follows:
class GraphData:
    def __init__(self):
        # A list of preprocessed DGLGraphs
        self.graphs = ...
        # Labels corresponding to the DGLGraphs
        # self.labels[i] is the labels corresponding to self.graphs[i]
        self.labels = ...

    def __getitem__(self, i):
        return self.graphs[i], self.labels[i]

    def __len__(self):
        return len(self.graphs)
Once you have defined such a class, you can then use it as a normal PyTorch dataset.
Typically we call the values before sigmoid/softmax "logits" and the values after that "probabilities". Some loss functions take logits and others take probabilities, so a model can output logits directly and leave the sigmoid/softmax to the loss function. In some cases, using logits in the loss computation is more numerically stable, because it allows multiple operations to be merged into one.
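The stability point can be seen with a small pure-Python sketch of binary cross-entropy. The function names here are made up for the illustration and are not library APIs. The first version applies the sigmoid and then the log as separate steps. The second uses the standard algebraically equivalent form that works on the logit directly, which is the kind of merged computation that losses taking logits (for example `BCEWithLogitsLoss` in PyTorch) rely on.

```python
import math


def bce_from_probability(logit, label):
    # Two separate steps: sigmoid first, then the log inside the loss
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


def bce_from_logit(logit, label):
    # Merged form: max(x, 0) - x * y + log(1 + exp(-|x|))
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


# For a moderate logit the two versions agree
moderate_naive = bce_from_probability(0.5, 1.0)
moderate_stable = bce_from_logit(0.5, 1.0)

# For a large logit the sigmoid rounds to exactly 1.0 in float64,
# so the separate-step version ends up evaluating log(0)
try:
    bce_from_probability(40.0, 0.0)
    naive_failed = False
except ValueError:
    naive_failed = True

large_stable = bce_from_logit(40.0, 0.0)
```

With a logit of 40 and a label of 0, the separate-step version fails because the probability has already been rounded to 1, while the merged form still returns a finite loss of about 40.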