Confused about Doc 6.1

I have a very large graph in my code and I am trying to break it into sub-graphs.

Unfortunately, when I try this on my graph:

my_graph = dgl.graph(edge_list, idtype=dtype)
item_set = gb.ItemSet(torch.arange(my_graph.num_nodes(), dtype=torch.int32), names="seed_nodes")

it does not work.

I checked the docs further and found:

OK, so the graph type may be wrong…

Here is my question about the doc:

I'm confused and need more explanation and help converting my graph into this CSC graph.

Grateful for any help.

FusedCSCSamplingGraph

Please try this API to convert a DGLGraph to a FusedCSCSamplingGraph.
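For intuition about what the conversion produces: a CSC sampling graph stores the edges in compressed sparse column form, i.e. an `indptr` array and an `indices` array, which is what makes fast in-neighbor lookups possible. Here is a small sketch in plain Python (a hypothetical `coo_to_csc` helper, not a real DGL function) of how a COO edge list maps to that layout:

```python
# Hypothetical helper: build CSC arrays (indptr, indices) from a COO edge
# list -- roughly the layout a CSC sampling graph stores internally.
def coo_to_csc(num_nodes, edges):
    """edges: list of (src, dst) pairs. Returns (indptr, indices), where
    indices[indptr[v]:indptr[v + 1]] are the source nodes of v's in-edges."""
    # Count the in-degree of every destination node.
    counts = [0] * num_nodes
    for _, dst in edges:
        counts[dst] += 1
    # Prefix-sum the counts to get the column-pointer array.
    indptr = [0] * (num_nodes + 1)
    for v in range(num_nodes):
        indptr[v + 1] = indptr[v] + counts[v]
    # Fill the row-index array bucket by bucket.
    indices = [0] * len(edges)
    fill = list(indptr[:-1])  # next free slot per destination node
    for src, dst in edges:
        indices[fill[dst]] = src
        fill[dst] += 1
    return indptr, indices

# 3 nodes with edges 0->1, 0->2, 1->2:
indptr, indices = coo_to_csc(3, [(0, 1), (0, 2), (1, 2)])
# node 0 has no in-edges, node 1 has one (from 0), node 2 has two (from 0, 1)
```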


Thanks a lot, that works for me:

gb.from_dglgraph(g, True)

The `True` option is necessary.


Now a new question comes :sweat_smile:

My previous model for the small graph was:

output = self.model(graph.ndata["x"], graph.edata["x"], graph)

As far as I can tell from Doc 6.1, my `.ndata["x"]` should be replaced by a `feature` store:

datapipe = datapipe.fetch_feature(feature, node_feature_keys=["x"], edge_feature_keys=["x", "y"])

Here is the question: How to construct this feature from my ndata and edata?

Thx

dataset.feature is an instance of TorchBasedFeatureStore, see details here.
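To get from in-memory `ndata`/`edata` to such a store, one route (a sketch, with hypothetical file names and paths) is to dump each feature tensor to a `.npy` file and point `gb.OnDiskFeatureData` at it. The runnable part below only does the numpy round trip; the `gb.*` wiring is shown in comments because it requires DGL:

```python
import os
import tempfile
import numpy as np

# Sketch: dump an in-memory feature to the on-disk numpy format that
# gb.OnDiskFeatureData / gb.TorchBasedFeatureStore read. `node_x` stands
# in for something like g.ndata["x"].numpy(); names and paths are examples.
node_x = np.arange(12, dtype=np.float32).reshape(6, 2)  # 6 nodes, 2-dim feature
path = os.path.join(tempfile.gettempdir(), "node_x.npy")
np.save(path, node_x)

# The feature store later reads it back from disk; mmap_mode="r" mirrors
# what in_memory=False does (data stays on disk until accessed).
loaded = np.load(path, mmap_mode="r")

# With DGL installed, the wiring would look like:
# feature = [gb.OnDiskFeatureData(domain="node", name="x",
#                                 format="numpy", path=path,
#                                 in_memory=False)]
# feature_store = gb.TorchBasedFeatureStore(feature)
```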

Hi, after some practice with this DataLoader, everything seems to be okay with my dataloader,
except that the edge features fail to be fetched (an empty list [] is returned).

Another question comes up when I try to substitute the DGL graph with blocks in my network.

with graph.local_scope():

An unexpected error occurs: AttributeError: 'list' object has no attribute 'local_scope'

As far as I know, the DGL GraphConv modules can accept a single element of the blocks generated by the data loader as an argument.

I am worried about whether most APIs can accept blocks. Would it be possible and correct to turn blocks into a DGL graph (a subgraph) and then feed that into my network?

If so, a possible solution would be : # dgl.to_homogeneous does not support DGLBlock MFGs · Issue #6296 · dmlc/dgl · GitHub

In order to fetch edge features, you need to call `gb.from_dglgraph(…, include_original_edge_id=True)` first.

Are you replacing graph with data.blocks? Please elaborate on your question.
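Note that blocks is a list of message-flow graphs, one per layer, so calling graph-level methods such as local_scope on the list itself fails. Rather than merging the blocks back into one graph, the usual pattern is to consume one block per layer in the forward pass. A schematic with plain-Python stand-ins for the layer and block objects (not real DGL calls):

```python
# Schematic of the standard multi-layer MFG pattern: `blocks` has one
# message-flow graph per GNN layer, so the forward pass pairs each layer
# with its block instead of treating the list as a single graph.
def forward(layers, blocks, h):
    assert len(layers) == len(blocks), "expect one block per layer"
    for layer, block in zip(layers, blocks):
        h = layer(block, h)  # in DGL this would be e.g. h = self.conv(block, h)
    return h

# Toy stand-ins: each "layer" records which block it saw and doubles h.
seen = []
layers = [lambda b, x, i=i: (seen.append((i, b)), x * 2)[1] for i in range(2)]
out = forward(layers, ["block0", "block1"], 1)
```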

For the first question:

# define dataloader
feature = [
    gb.OnDiskFeatureData(
        domain="node", name="x",
        format="numpy", path="./tmp/node_x.npy", in_memory=False),
    gb.OnDiskFeatureData(
        domain="node", name="y",
        format="numpy", path="./tmp/node_y.npy", in_memory=False),
    gb.OnDiskFeatureData(
        domain="edge", name="x",
        format="numpy", path="./tmp/edge.npy", in_memory=False),
]
feature_store = gb.TorchBasedFeatureStore(feature)
sub_graph_sampler = sub_graph_sampler.fetch_feature(feature_store, node_feature_keys=["x", "y"], edge_feature_keys=["x"])
subgraph_dataloder = gb.DataLoader(sub_graph_sampler)

# fetch blocks and their node and edge features
for j, subgraph in enumerate(subgraph_dataloder):
    node_x = subgraph.node_features['x']  # works well
    node_y = subgraph.node_features['y']  # works well
    edge_x = subgraph.node_features['y']  # returns an empty list

edge_x = subgraph.edge_features['x']?

Yes, it should be edge 'x', not edge 'y'.

Here is the second question:

  • At first, it works fine with the DGL graph:
for graph in dataloader:
    node_x = graph.ndata["x"]
    node_y = graph.ndata["y"]
    edge_x = graph.edata["x"]
    loss = model(graph, node_x, node_y, edge_x)
  • Now I refactor my train loop:
class GraphNet(nn.Module):
    def forward(self, graph_or_blocks, node_x, node_y, edge_x):
        with graph_or_blocks.local_scope():  # Error here: blocks is a list
            ...
...
for i, subdataloader in enumerate(dataloader):
    for j, subgraph in enumerate(subgraph_dataloder):
        node_x = subgraph.node_features['x']
        node_y = subgraph.node_features['y']
        edge_x = subgraph.edge_features['x']
        output = model(subgraph.blocks, node_x, node_y, edge_x)
...

What was dataloader previously? Is it GraphDataLoader or dgl.dataloading.DataLoader? How many layers does your model have?