I experimented with ib_send_bw using different message sizes and noticed that the best performance is reached when the size is a power of two. When I tried sizes 1 byte smaller or 1 byte larger, the results were surprisingly low. For example, with a message size of 65536 bytes, ib_send_bw reaches ~97 Gbps. With 65535 or 65537 bytes, the throughput drops to between 20 and 25 Gbps. Please see the attached file.
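A minimal sketch of how a comparison like this could be scripted, for anyone who wants to reproduce it. It only builds the client command lines for the sizes one byte below, at, and one byte above a given value. The host name `server-host` and the helper names are hypothetical. `-s` (message size in bytes) and `--report_gbits` are perftest options, and the server side would typically be started with the same options minus the host name.

```python
def sizes_around(n):
    """Return the message sizes one byte below, at, and one byte above n."""
    return [n - 1, n, n + 1]


def client_command(server, size):
    """Build an ib_send_bw client command line (as an argv list) for one size."""
    # -s sets the message size in bytes; --report_gbits reports bandwidth in Gb/s.
    return ["ib_send_bw", "-s", str(size), "--report_gbits", server]


if __name__ == "__main__":
    server = "server-host"  # hypothetical host name, replace with the real server
    for size in sizes_around(65536):
        # Print the command instead of running it, so the sketch works anywhere.
        print(" ".join(client_command(server, size)))
```

Printing the commands rather than executing them keeps the sketch independent of the hardware. Each printed line can be run by hand on the client once the matching server instance is listening.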
My server setup:
OS: Ubuntu Linux 12.04
OFED: MLNX_OFED_LINUX-3.2-2.0.0.0-ubuntu12.04-x86_64
NIC: Mellanox ConnectX-4 VPI dual-port NIC MCX456A-ECAT
Connection: 100GbE through a Mellanox SN2700 100GbE switch