How can I compute the Assortativity or Disassortativity of a node classification dataset?

jiaruHithub · January 11, 2022, 2:56am

How can I compute the assortativity or disassortativity of a node classification dataset?
I can’t find a function for it in the library.
Thank you for your help.

minjie · January 12, 2022, 3:48am

Hi, we currently don’t have such functionality, but you could write a simple function to calculate it. Here is a sketch (it may not be accurate; it is just an idea according to this):

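g = ...  # some graph
# Cast the integer degrees to float so that mean() works on them
f1 = g.in_degrees().float()
f2 = g.out_degrees().float()
f1_bar = f1.mean()
f2_bar = f2.mean()
# Pearson correlation between per-node in-degree and out-degree
r = ((f1 - f1_bar) * (f2 - f2_bar)).mean() / ((f1 - f1_bar) ** 2).mean().sqrt() / ((f2 - f2_bar) ** 2).mean().sqrt()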

Another option is to convert your graph to a NetworkX graph and use the API that NetworkX provides.
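
For reference, the degree-based sketch amounts to a Pearson correlation between each node's in-degree and out-degree. Below is a minimal, dependency-free version of that idea. The function name `degree_correlation` and the `(src, dst)` edge-list input are assumptions made for illustration and are not part of DGL. Returning NaN when either degree sequence has zero variance is also an assumption.

```python
from math import sqrt


def degree_correlation(num_nodes, edges):
    """Pearson correlation between per-node in-degree and out-degree.

    `edges` is an iterable of (src, dst) pairs with node IDs in [0, num_nodes).
    Hypothetical helper for illustration only; not a DGL function.
    """
    in_deg = [0] * num_nodes
    out_deg = [0] * num_nodes
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1

    in_mean = sum(in_deg) / num_nodes
    out_mean = sum(out_deg) / num_nodes

    # Population covariance and variances (divide by N, not N - 1)
    cov = sum((i - in_mean) * (o - out_mean)
              for i, o in zip(in_deg, out_deg)) / num_nodes
    in_var = sum((i - in_mean) ** 2 for i in in_deg) / num_nodes
    out_var = sum((o - out_mean) ** 2 for o in out_deg) / num_nodes

    # Assumption: the correlation is undefined when a degree sequence is constant
    if in_var == 0 or out_var == 0:
        return float("nan")
    return cov / sqrt(in_var * out_var)


# A directed star: node 0 points to nodes 1, 2 and 3
star_edges = [(0, 1), (0, 2), (0, 3)]
r_star = degree_correlation(4, star_edges)
```

In the star example, the only node with outgoing edges has no incoming edges, so in-degree and out-degree are perfectly anti-correlated.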

jiaruHithub · January 12, 2022, 4:24am

Sorry, I don’t really understand this algorithm. The assortativity I mean depends on the labels of each node’s neighbors, and I did not find a step that reads the neighbors’ labels. I am looking forward to your answer. The computation is described in the Geom-GCN paper:

R = \frac{1}{N} \sum_{v \in V} \frac{\text{Number of } v\text{'s neighbors who have the same label as } v}{\text{Number of } v\text{'s neighbors}}

minjie · January 12, 2022, 4:50am

I see. That’s a different type of assortativity. For your case, you could write a message passing function to achieve that.

g = ...  # some g
g.ndata['label'] = ...  # some node label
def mfunc(edges):
    return {'mask' : edges.src['label'] == edges.dst['label']}
g.update_all(mfunc, dgl.function.sum('mask', 'N'))
R = (g.ndata['N'] / g.in_degrees()).mean()

Looks like a good module to have in DGL too. Would you like to contribute it? cc @mufeili
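
To sanity-check the message-passing result on a small graph, the same ratio can be computed in plain Python. This is only a sketch under stated assumptions: the name `neighbor_label_agreement` and the `(src, dst)` edge-list input are hypothetical and not part of DGL, and "neighbors" here means in-neighbors, matching what `update_all` aggregates. Nodes without any in-neighbor are skipped, which is an assumption, because the formula leaves their ratio undefined (the division by `g.in_degrees()` would divide by zero for them).

```python
from collections import defaultdict


def neighbor_label_agreement(edges, labels):
    """Mean fraction of a node's in-neighbors that share the node's label.

    `edges` is an iterable of (src, dst) pairs; `labels[v]` is the label of node v.
    Hypothetical helper for illustration only; not a DGL function.
    """
    in_neighbors = defaultdict(list)
    for src, dst in edges:
        in_neighbors[dst].append(src)

    ratios = []
    for node, nbrs in in_neighbors.items():
        same = sum(1 for n in nbrs if labels[n] == labels[node])
        ratios.append(same / len(nbrs))

    # Assumption: nodes with no in-neighbors are left out of the average
    return sum(ratios) / len(ratios) if ratios else 0.0


# Undirected path 0 - 1 - 2 - 3, stored as edges in both directions
undirected = [(0, 1), (1, 2), (2, 3)]
edges = undirected + [(v, u) for u, v in undirected]
labels = [0, 0, 1, 1]
R = neighbor_label_agreement(edges, labels)
```

On this path graph the two end nodes agree with their single neighbor and the two middle nodes agree with one of their two neighbors, so the four ratios are 1, 0.5, 0.5 and 1.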

jiaruHithub · January 12, 2022, 5:23am

Thanks, I will try the method soon and keep in touch.

jiaruHithub · January 13, 2022, 1:35am

Hello, the code reports an error:

Traceback (most recent call last):
  File "Assortivity.py", line 36, in <module>
    g.update_all(mfunc, dgl.function.sum('mask', 'N'))
  File "/opt/conda/lib/python3.7/site-packages/dgl/heterograph.py", line 4501, in update_all
    ndata = core.message_passing(g, message_func, reduce_func, apply_node_func)
  File "/opt/conda/lib/python3.7/site-packages/dgl/core.py", line 295, in message_passing
    ndata = invoke_gspmm(g, fn.copy_e(msg, msg), rfunc, edata=msgdata)
  File "/opt/conda/lib/python3.7/site-packages/dgl/core.py", line 255, in invoke_gspmm
    z = op(graph, x)
  File "/opt/conda/lib/python3.7/site-packages/dgl/ops/spmm.py", line 172, in func
    return gspmm(g, 'copy_rhs', reduce_op, None, x)
  File "/opt/conda/lib/python3.7/site-packages/dgl/ops/spmm.py", line 64, in gspmm
    lhs_data, rhs_data)
  File "/opt/conda/lib/python3.7/site-packages/dgl/backend/pytorch/sparse.py", line 260, in gspmm
    return GSpMM.apply(gidx, op, reduce_op, lhs_data, rhs_data)
  File "/opt/conda/lib/python3.7/site-packages/dgl/backend/pytorch/sparse.py", line 64, in forward
    out, (argX, argY) = _gspmm(gidx, op, reduce_op, X, Y)
  File "/opt/conda/lib/python3.7/site-packages/dgl/sparse.py", line 157, in _gspmm
    arg_e_nd)
  File "dgl/_ffi/_cython/./function.pxi", line 287, in dgl._ffi._cy3.core.FunctionBase.__call__
  File "dgl/_ffi/_cython/./function.pxi", line 232, in dgl._ffi._cy3.core.FuncCall
  File "dgl/_ffi/_cython/./base.pxi", line 155, in dgl._ffi._cy3.core.CALL
dgl._ffi.base.DGLError: [01:31:24] /opt/dgl/src/array/kernel.cc:94: Check failed: (out->dtype).code == kDLFloat ( vs. 2) : Feature data must be float type
Stack trace:
  [bt] (0) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4f) [0x7f6e31334c4f]
  [bt] (1) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(dgl::aten::SpMM(std::string const&, std::string const&, std::shared_ptr<dgl::BaseHeteroGraph>, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, std::vector<dgl::runtime::NDArray, std::allocator<dgl::runtime::NDArray> >)+0x3f8) [0x7f6e31460218]
  [bt] (2) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(+0x6610d4) [0x7f6e314690d4]
  [bt] (3) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(+0x6617d1) [0x7f6e314697d1]
  [bt] (4) /opt/conda/lib/python3.7/site-packages/dgl/libdgl.so(DGLFuncCall+0x48) [0x7f6e319f3e28]
  [bt] (5) /opt/conda/lib/python3.7/site-packages/dgl/_ffi/_cy3/core.cpython-37m-x86_64-linux-gnu.so(+0x163aa) [0x7f6e30bff3aa]
  [bt] (6) /opt/conda/lib/python3.7/site-packages/dgl/_ffi/_cy3/core.cpython-37m-x86_64-linux-gnu.so(+0x1691b) [0x7f6e30bff91b]
  [bt] (7) python(_PyObject_FastCallKeywords+0x48b) [0x563bc8ad000b]
  [bt] (8) python(_PyEval_EvalFrameDefault+0x49b6) [0x563bc8b34186]

I hope you can tell me how to solve it, thank you.

minjie · January 17, 2022, 4:24am

Try:

def mfunc(edges):
    return {'mask' : (edges.src['label'] == edges.dst['label']).float()}

The cast is needed because DGL currently supports message passing only for float values, which we should definitely improve. Could you help by creating a feature request issue for this case? We’d like to track this.

jiaruHithub · January 18, 2022, 9:55am

I’m happy to help with this. Could you tell me what to do next?

mufeili · January 20, 2022, 4:34pm

You can open a feature request on GitHub here using our template. Thanks!
