import dgl
import torch

# Toy heterograph: 3 drug nodes, 3 disease nodes, two edge types.
g = dgl.heterograph({
    ('drug', 'interacts', 'drug'): (torch.tensor([0, 1]), torch.tensor([1, 2])),
    ('drug', 'treats', 'disease'): (torch.tensor([1]), torch.tensor([2]))})
g.nodes['drug'].data['hv'] = torch.rand(3, 10)
g.nodes['disease'].data['hv'] = torch.rand(3, 10)
g
# prints out
Graph(num_nodes={'disease': 3, 'drug': 3},
      num_edges={('drug', 'interacts', 'drug'): 2, ('drug', 'treats', 'disease'): 1},
      metagraph=[('drug', 'drug', 'interacts'), ('drug', 'disease', 'treats')])
dgl.to_block(g)
# prints out
Block(num_src_nodes={'disease': 1, 'drug': 3},
      num_dst_nodes={'disease': 1, 'drug': 2},
      num_edges={('drug', 'interacts', 'drug'): 2, ('drug', 'treats', 'disease'): 1},
      metagraph=[('drug', 'drug', 'interacts'), ('drug', 'disease', 'treats')])
Here disease is never a source (only drug is), yet a disease node is still included as a source node. I observe similarly odd behavior if I sample a subgraph from this heterogeneous graph.
sg = g.sample_neighbors(
    {"drug": torch.tensor([1], dtype=torch.long),
     "disease": torch.tensor([], dtype=torch.long)},
    {"interacts": 1, "treats": 0})
sg
# prints out
Graph(num_nodes={'disease': 3, 'drug': 3},
      num_edges={('drug', 'interacts', 'drug'): 1, ('drug', 'treats', 'disease'): 0},
      metagraph=[('drug', 'drug', 'interacts'), ('drug', 'disease', 'treats')])
dgl.to_block(sg)
# prints out
Block(num_src_nodes={'disease': 0, 'drug': 2},
      num_dst_nodes={'disease': 0, 'drug': 1},
      num_edges={('drug', 'interacts', 'drug'): 1, ('drug', 'treats', 'disease'): 0},
      metagraph=[('drug', 'drug', 'interacts'), ('drug', 'disease', 'treats')])
Here, in the block built from the subgraph, two source drug nodes are included even though there is only one edge. What am I missing here?
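One possible explanation, assuming dgl.to_block is running with its default include_dst_in_src=True: every destination node is also listed as a source node, so the source counts are the union of the edge sources and the destination nodes. The sketch below is plain Python, not DGL code. The helper block_node_counts is hypothetical, and the sampled edge (0, 1) is an assumption based on drug 1 having a single incoming interacts edge. It only reproduces the node counts printed above.

```python
def block_node_counts(edges):
    """Count per-type src/dst nodes, assuming dst nodes are also kept as src nodes.

    `edges` maps (src_type, edge_type, dst_type) to a list of (u, v) pairs.
    Hypothetical helper for illustration; it does not call DGL.
    """
    ntypes = {t for (stype, _etype, dtype) in edges for t in (stype, dtype)}
    src = {t: set() for t in ntypes}
    dst = {t: set() for t in ntypes}
    for (stype, _etype, dtype), pairs in edges.items():
        for u, v in pairs:
            src[stype].add(u)
            dst[dtype].add(v)
    # Assumption: destination nodes are always included among the source nodes.
    for t in ntypes:
        src[t] |= dst[t]
    num_src = {t: len(src[t]) for t in sorted(ntypes)}
    num_dst = {t: len(dst[t]) for t in sorted(ntypes)}
    return num_src, num_dst


# Edges of the full graph g from the post.
full = {
    ('drug', 'interacts', 'drug'): [(0, 1), (1, 2)],
    ('drug', 'treats', 'disease'): [(1, 2)],
}
# Edges of the sampled subgraph sg (assumed: the one in-edge of drug 1).
sampled = {
    ('drug', 'interacts', 'drug'): [(0, 1)],
    ('drug', 'treats', 'disease'): [],
}

full_counts = block_node_counts(full)
sampled_counts = block_node_counts(sampled)
print(full_counts)
print(sampled_counts)
```

If this reading is right, disease node 2 shows up as a source only because it is a destination of treats, and the second source drug node in the sampled block is the seed node drug 1 itself. Passing include_dst_in_src=False to dgl.to_block should then leave only the nodes that actually have outgoing edges on the source side, though I have not verified that on this graph.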