--------------------------------------------------------------------------
By default, for Open MPI 4.0 and later, infiniband ports on a device
are not used by default. The intent is to use UCX for these devices.
You can override this policy by setting the btl_openib_allow_ib MCA parameter
to true.

  Local host: c012
  Local adapter: mlx5_0
  Local port: 1

--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

  Local host: c012
  Local device: mlx5_0
--------------------------------------------------------------------------
corrupted size vs. prev_size
[c012:1652706] *** Process received signal ***
[c012:1652706] Signal: Aborted (6)
[c012:1652706] Signal code: (-6)
[c012:1652706] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x7f657ae53090]
[c012:1652706] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f657ae5300b]
[c012:1652706] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f657ae32859]
[c012:1652706] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x8d26e)[0x7f657ae9d26e]
[c012:1652706] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x952fc)[0x7f657aea52fc]
[c012:1652706] [ 5] /lib/x86_64-linux-gnu/libc.so.6(+0x9596b)[0x7f657aea596b]
[c012:1652706] [ 6] /lib/x86_64-linux-gnu/libc.so.6(+0x96e8b)[0x7f657aea6e8b]
[c012:1652706] [ 7] /lib/x86_64-linux-gnu/libc.so.6(+0x98f22)[0x7f657aea8f22]
[c012:1652706] [ 8] /lib/x86_64-linux-gnu/libc.so.6(realloc+0x2d6)[0x7f657aeab156]
[c012:1652706] [ 9] python3[0x517956]
[c012:1652706] [10] python3(PyTokenizer_FindEncodingFilename+0x94)[0x61af64]
[c012:1652706] [11] python3(_Py_DisplaySourceLine+0x91)[0x67a9e1]
[c012:1652706] [12] python3(PyTraceBack_Print+0x186)[0x67af46]
[c012:1652706] [13] python3[0x4a3304]
[c012:1652706] [14] python3(_PyErr_Display+0x53)[0x67e4a3]
[c012:1652706] [15] python3(PyErr_Display+0x45)[0x67e525]
[c012:1652706] [16] python3[0x67bb4e]
[c012:1652706] [17] python3[0x5c4ef0]
[c012:1652706] [18] python3[0x4f2ffe]
[c012:1652706] [19] python3[0x67fb1a]
[c012:1652706] [20] python3(PyRun_SimpleFileExFlags+0x1c5)[0x67fe65]
[c012:1652706] [21] python3(Py_RunMain+0x212)[0x6b7c82]
[c012:1652706] [22] python3(Py_BytesMain+0x2d)[0x6b800d]
[c012:1652706] [23] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f657ae34083]
[c012:1652706] [24] python3(_start+0x2e)[0x5fb85e]
[c012:1652706] *** End of error message ***
Aborted (core dumped)
Command exited with non-zero status 134
{"realtime":4.66,"usertime":5.71,"systime":20.26,"memmax":143004,"memavg":0}