GCMC multi-process, multi-GPU training

In the code snippet between lines 189 and 214 of /examples/pytorch/gcmc/train_sampling.py:

Can anyone help explain what the following code snippet is doing? Why do we need this decorator?

# Imports needed to run this snippet standalone:
import traceback
from functools import wraps
from _thread import start_new_thread
from torch.multiprocessing import Queue

# According to https://github.com/pytorch/pytorch/issues/17199, this decorator
# is necessary to make fork() and openmp work together.
def thread_wrapped_func(func):
    """
    Wraps a process entry point to make it work with OpenMP.
    """
    @wraps(func)
    def decorated_function(*args, **kwargs):
        queue = Queue()
        def _queue_result():
            # Run the wrapped function and capture either its result or
            # the exception (plus formatted traceback) it raised.
            exception, trace, res = None, None, None
            try:
                res = func(*args, **kwargs)
            except Exception as e:
                exception = e
                trace = traceback.format_exc()
            queue.put((res, exception, trace))

        # Execute the real entry point in a freshly created thread and wait
        # for it to report back through the queue.
        start_new_thread(_queue_result, ())
        result, exception, trace = queue.get()
        if exception is None:
            return result
        else:
            # Re-raise in the calling thread, carrying the original traceback text.
            assert isinstance(exception, Exception)
            raise exception.__class__(trace)
    return decorated_function

This is because we use multiprocessing when training on multiple GPUs, and multiprocessing in fork mode has a strange deadlock issue with OpenMP (which is widely used inside PyTorch). The decorator solves this problem. See https://github.com/pytorch/pytorch/issues/17199 for more details.
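
Concretely, the decorator is applied to the per-GPU training entry point before that function is handed to torch.multiprocessing.Process, so the actual work runs in a thread created after the fork, which is the workaround described in the linked issue. Below is a minimal sketch of the pattern; the run signature, its arguments, and the worker body are hypothetical placeholders rather than the real GCMC training loop, and it assumes the thread_wrapped_func definition quoted above is in scope:

import torch
import torch.multiprocessing as mp

@thread_wrapped_func  # decorator quoted above (assumed to be defined or imported here)
def run(rank, n_gpus):
    # Hypothetical per-GPU worker: the real example would pin the device,
    # set up distributed training, and run the training loop here.
    torch.cuda.set_device(rank)
    print('worker %d of %d running on cuda:%d' % (rank, n_gpus, rank))

if __name__ == '__main__':
    n_gpus = torch.cuda.device_count()
    procs = []
    for rank in range(n_gpus):
        # fork is the default start method on Linux, which is where the
        # OpenMP deadlock shows up; the decorator makes this combination work.
        p = mp.Process(target=run, args=(rank, n_gpus))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()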