Problem Setting Up IP Config

All right. I suppose it’s a network configuration issue.

It seems the reason I could make it run before was that I launched the servers on the same machine with different ports, i.e. I put one IP address (xxx.xxx.10.17) with two different ports into ip_config.txt.

Then I ran into this error today, so I modified ip_config.txt and put two different IP addresses into it (xxx.xxx.10.17 and xxx.xxx.9.50).

Then I got this error:

[08:38:41] /opt/dgl/src/rpc/network/tcp_socket.cc:86: Failed bind on xxx.xxx.9.50:30051 , error: Cannot assign requested address

I’m not sure if it is a network configuration issue on my side. :frowning:
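For what it's worth, "Cannot assign requested address" on bind usually means the process tried to bind to an IP that is not configured on any network interface of the machine it is running on. A minimal sketch for checking this on each machine is below; the helper name `can_bind_locally` is made up for illustration and is not part of DGL.

```python
import errno
import socket


def can_bind_locally(ip: str, port: int = 0) -> bool:
    """Return True if this machine can bind a TCP socket to ip:port.

    Hypothetical helper, not a DGL API. Port 0 asks the OS for any
    free port, so only the address itself is being tested.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((ip, port))
        except OSError as exc:
            # EADDRNOTAVAIL is the errno behind
            # "Cannot assign requested address".
            if exc.errno == errno.EADDRNOTAVAIL:
                return False
            raise  # any other failure is a different problem
    return True
```

Running `can_bind_locally("xxx.xxx.9.50")` with the real address on the machine that printed the error should return `False` if that address belongs to a different host, which would suggest the server for that entry was started on the wrong machine.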

I’d recommend using the DGL launch script with torchrun, and not specifying any ports or duplicate IPs in ip_config.txt. Were you able to make the launch script work with torchrun? If not, I can send you the launch script that I got working with torchrun.
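As a rough illustration of that advice, here is a small sketch that flags explicit ports and repeated IPs in the contents of an ip_config.txt. It assumes each non-empty line is an IP optionally followed by a port, separated by whitespace; the function name `check_ip_config` and the sample addresses are hypothetical.

```python
from collections import Counter


def check_ip_config(text: str) -> list[str]:
    """Return warnings for lines with an explicit port or a repeated IP.

    Assumes whitespace-separated "IP [port]" lines; hypothetical helper,
    not part of DGL.
    """
    warnings = []
    # Split each non-empty line into its fields: [ip] or [ip, port].
    entries = [line.split() for line in text.splitlines() if line.strip()]
    for fields in entries:
        if len(fields) > 1:
            warnings.append(f"{fields[0]}: explicit port {fields[1]}")
    # Count how often each IP appears across all lines.
    ip_counts = Counter(fields[0] for fields in entries)
    for ip, count in ip_counts.items():
        if count > 1:
            warnings.append(f"{ip}: listed {count} times")
    return warnings
```

A file with one distinct IP per line and no ports yields an empty list, while the earlier single-machine setup (one IP with two ports) would produce three warnings.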


It would be great if you could send me a copy of your launch script! I’ll message you my email address. Thanks.
