DGL 0.5.2 core dump when running dist_server

Hi, when I run the demo dist_graphsage, a core dump occurs at the "run dist server" step while loading the dataset.

I don't know what a bus error is. Can anybody help me? The log is as follows:

Using backend: pytorch
Namespace(batch_size=1000, batch_size_eval=100000, close_profiler=False, dataset=None, dropout=0.5, eval_every=5, fan_out='10,25', graph_name='reddit', id=None, ip_config='data/ip_config.txt', local_rank=None, log_every=20, lr=0.003, n_classes=None, num_clients=None, num_epochs=20, num_gpus=-1, num_hidden=16, num_layers=2, num_servers=2, num_workers=1, part_config=None, standalone=False)
torch.distributed.is_available(): True
/usr/local/lib64/python3.6/site-packages/dgl/backend/pytorch/tensor.py:253: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
mask = th.tensor(mask, dtype=th.bool)
load reddit
Bus error (core dumped)

When I start the server loading 'ogb-product' instead, I get "Bus error (core dumped)" too.

Program terminated with signal 7, Bus error.
#0 0x00007f588169938d in __memcpy_ssse3_back () from /usr/lib64/libc.so.6


Environment
DGL Version (e.g., 1.0): 0.5.2
Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3): PyTorch 1.6
OS (e.g., Linux): Linux
How you installed DGL (conda, pip, source): pip
Build command you used (if compiling from source):
Python version: 3.6
CUDA/cuDNN version (if applicable): 10.1
GPU models and configuration (e.g. V100): V100
Any other relevant information:

Hi,

A bus error is probably due to a shared memory issue. What machine are you using?
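
If I'm not mistaken, the DGL server keeps the loaded graph in shared memory, so /dev/shm inside the container has to be large enough to hold the dataset. A quick way to check (a sketch, assuming a standard Linux/Docker setup):

df -h /dev/shm   # size and current usage of the shared-memory tmpfs
ipcs -m          # shared memory segments currently allocated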

Hi, I am using a CentOS 7 Docker container. Do I need to compile DGL separately?

No, you need to raise the shared memory limit of the Docker container.
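
For example (a sketch; the image name, size, and the "..." for your other run options are placeholders to adapt), recreate the container with a larger /dev/shm:

docker run --shm-size=8g ... your-image   # --shm-size controls the size of /dev/shm in the container

In docker-compose, the equivalent is the shm_size option on the service. Docker's default /dev/shm is only 64 MB, which is far too small for a full graph.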

This is my config inside the Docker container:

ipcs -al

------ Messages Limits --------
max queues system wide = 8192
max size of message (bytes) = 8192
default max size of queue (bytes) = 16384

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 18014398509465599
max total shared memory (kbytes) = 18014398442373116
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767

Total memory available is 20 GB:

free -m
total used free shared buff/cache available
Mem: 20480 290 13754 2340 6435 20125
Swap: 0 0 0

df -h | grep shm
shm 64M 64M 0 100% /dev/shm

Thank you. I raised my Docker shared memory config, and the problem looks resolved.