We have the problem that after about 2.147 billion messages (2^31 - 1, the maximum value of an int32_t), we cannot receive any more messages.
Building OpenMPI with the configure flag "--with-mxm=/path/to/mxm" causes this problem, while without this flag everything is fine. The problem is reproducible with the attached example code by compiling and running it with the following commands:
$ /path/to/openmpi/bin/mpic++
Maybe the issue is connected with the following lines from "mxm_def.h":
typedef uint32_t mxm_tag_t; /* MXM tag type */
typedef uint32_t mxm_imm_t; /* MXM immediate data type */
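
For reference, the arithmetic behind these numbers can be checked with a small sketch. The helper names max_count and as_signed are made up for illustration and are not part of MXM or OpenMPI. A uint32_t on its own would only wrap at about 4.29 billion, so a limit at 2.147 billion might mean that a 32-bit value is interpreted as signed somewhere. That is only a guess on our side.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not from MXM or OpenMPI): the largest value a
// counter of integer type T can hold, widened to 64 bits for comparison.
template <typename T>
constexpr std::uint64_t max_count() {
    return static_cast<std::uint64_t>(std::numeric_limits<T>::max());
}

// Hypothetical helper: what an unsigned 32-bit value looks like if some
// code stores it in an int32_t. Before C++20 the result for values above
// INT32_MAX is implementation-defined; common compilers wrap to negative.
inline std::int32_t as_signed(std::uint32_t v) {
    return static_cast<std::int32_t>(v);
}

// The limit we observe matches the signed 32-bit maximum ...
static_assert(max_count<std::int32_t>() == 2147483647ULL,
              "int32_t max is 2^31 - 1");
// ... while the unsigned typedefs above have twice that range.
static_assert(max_count<std::uint32_t>() == 4294967295ULL,
              "uint32_t max is 2^32 - 1");
```

With these helpers, as_signed(2147483647u) is still positive, while as_signed(2147483648u) becomes negative on the usual compilers, which would fit a failure right after 2.147 billion messages.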
The problem occurs with the newest Mellanox firmware, OFED package and OpenMPI version.