Mellanox Interconnect Community: Message List

I want to know the compatibility of Cisco SW and Mellanox SW.


Customer requirements include LRM compatibility verification between Cisco switches and Mellanox switches.

 

Cisco switch: Cisco Catalyst 6509-E, 10GBASE-LRM type

Mellanox switch: SN2010, 10GBASE-LRM type

 

I have confirmed that the 10GBASE-LRM transceiver is recognized by the Mellanox switch.

However, I do not have a Cisco device available, so I could not confirm whether the two devices are compatible with each other.

 

If anyone has tried connecting these two devices using 10GBASE-LRM, please let me know.


Re: How to configure host chaining for ConnectX-5 VPI


Putting this out there since we ran into so many complications getting host chaining to work, and something Google will pick up is infinitely better than nothing.

The idea was that we wanted something with redundancy. With a switched configuration we would have needed two switches and a lot more cables; very expensive.

HOST_CHAINING_MODE was a great fit: switchless, fewer cables, and less expense.

 

You do NOT need a subnet manager for this to work!

 

In order to get it working:

Aside: There is no solid documentation on this process as of this writing.

     1. What Marc said was accurate: set HOST_CHAINING_MODE=1 via the mlxconfig utility (see the command sketch after these steps).

Aside: Both the VPI and EN type cards will work with host chaining. The VPI type does require you to put it into Ethernet mode.

     2. Restart the servers to apply the mode.

     3. Put all of the ports on the same subnet, e.g. 172.19.50.0/24. Restart the networking stack as required.

     4. From there, all ports should be pingable from all other ports.

     5. Set the MTU up to 9000 (see the caveats below for a firmware bug; lower it to 8000 if 9000 doesn't work).

Aside: The MTU could be higher; I have been unable to test higher due to a bug in the firmware. Around these forums I've seen 9000 floated about, and it seems like a good standard number.
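As a concrete example, the whole sequence on each node looked roughly like this. Treat it as a sketch only: the mlxconfig device name, interface name, and addresses below are placeholders for your own hardware, and the LINK_TYPE lines apply only to VPI cards.

# 1. Enable host chaining in firmware (device name is a placeholder; a PCI address such as 84:00.0 also works)
sudo mlxconfig -d /dev/mst/mt4121_pciconf0 set HOST_CHAINING_MODE=1
# VPI cards only: force both ports into Ethernet mode (2 = ETH)
sudo mlxconfig -d /dev/mst/mt4121_pciconf0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2

# 2. Reboot so the new firmware configuration takes effect
sudo reboot

# 3. Same subnet on every port, jumbo MTU (drop to 8000 on the buggy firmware)
sudo ip addr add 172.19.50.11/24 dev enp1s0f0
sudo ip link set enp1s0f0 mtu 9000 up

# 4. Every node should now answer pings from every other node
ping -c 3 172.19.50.12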

 

If you aren't getting the throughput you're expecting, do ALL of the tuning from BIOS (Performance Tuning for Mellanox Adapters, BIOS Performance Tuning Example) and software (Understanding PCIe Configuration for Maximum Performance, Linux sysctl Tuning) on all servers. It does make a difference. On our small (under-powered) test boxes, we gained 20 Gbit/s over our starting benchmark.

Another thing to make sure of is that you have enough PCIe bandwidth to support line rate; get the Socket Direct cards if you do not.
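To give a rough idea of what those checks look like (the PCI address, interface, and sysctl values below are only illustrative examples, not authoritative settings):

# Confirm the slot negotiated the width/speed the card needs (e.g. x16 Gen3 for a 100 Gb/s card)
sudo lspci -s 84:00.0 -vvv | grep -E 'LnkCap|LnkSta'

# Typical socket-buffer increases of the kind the Linux sysctl tuning guide covers
sudo sysctl -w net.core.rmem_max=4194304
sudo sysctl -w net.core.wmem_max=4194304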

 

There are a lot of caveats.

  • The bandwidth that is possible IS link speed, but only between two directly connected nodes. In our tests there is a small dip in performance on each hop, and each hop also lowers your maximum theoretical throughput.
  • FW version 16.22.1002 had a few bugs related to host chaining; one of them was that the maximum supported MTU was 8150. Higher MTU, less IP overhead.
  • The 'ring' topology is a little funny: it only runs in one direction. In a cable-cut scenario, it will NOT route around the failure properly for certain hosts.

Aside: A cable cut is different from a cable disconnect. The transceiver itself registers whether a cable is attached or not. When there is no cable present on one side, but there is on the other, the above scenario applies (no proper re-routing). When both ends of the cable are removed, the ring outright stops and does not work at all. I don't have any data on an actual cable cut.

 

The ring works as described in the (scant) documentation, which the firmware release notes summarize as follows:

  • Received packets from the wire with DMAC equal to the host MAC are forwarded to the local host
  • Received traffic from the physical port with DMAC different than the current MAC are forwarded to the other port:
  • Traffic can be transmitted by the other physical port
  • Traffic can reach functions on the port's Physical Function
  • Device allows hosts to transmit traffic only with its permanent MAC
  • To prevent loops, the received traffic from the wire with SMAC equal to the port permanent MAC is dropped (the packet cannot start a new loop)

 

 

If you run into problems, tcpdump is your friend, and ping is a great little tool to check your sanity.

 

Hope some of this helps someone in the future,

Daniel

Assign a MAC to a VLAN


Hi all.

 

Sorry for my English.

 

I'm using an SX1024 with software version SX_PPC_M460EX SX_3.3.5006.

 

Can I assign a MAC address to a VLAN? I need to create a VLAN and assign two MAC addresses to it.

 

It sounds pretty simple, but I did not find it in the manual.

 

Thank you all.

 

rx_fifo_errors and rx_dropped errors using VMA while user CPU is below 40%


Hi,

 

I'm getting rx_fifo errors and rx_dropped_errors receiving UDP packets. I have 8 applications each receiving ~8000 byte UDP packets from 7 different pieces of hardware with different IP addresses. The packet and data rate is identical for each application - totalling 440k packets/sec and 29 Gbit/sec respectively. The packets are all transmitted synchronously, at a rate of 2x8000 byte packets every 1.5 ms for each of 56 different hardware cards.

 

In this mode, rx_dropped and rx_fifo_errors increase at a few tens of packets per second. Attached is a dump of what ethtool shows. vma_stats shows no dropped packets. Each application is bound with numactl to NUMA node 1 (which is where the NIC is attached). top shows each core on that node running at < 40% CPU. The switch shows no dropped packets.
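For reference, the counters and ring settings I'm watching come from commands along these lines (eth2 here is a stand-in for the actual interface name):

ethtool -S eth2 | grep -E 'rx_fifo_errors|rx_dropped'   # per-NIC drop counters
ethtool -g eth2                                         # current vs. maximum RX ring size
# ethtool -G eth2 rx 8192                               # one thing I may try: grow the RX ring toward its maximum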

 

The libvma configuration is shown below. I had the same problem when not using libvma (i.e. vanilla Linux kernel packet processing).

 

Can anyone give me some hints on where to look to reduce the number of lost packets?

 

Many thanks in advance,

 

Keith

 

 

export VMA_MTU=9000 #don't need to set - should be intelligent but we'll set it anyway for now

export VMA_RX_BUFS=32768 # number of buffers -each of 1xMTU. Default is 200000 = 1 GB!

export VMA_RX_WRE=4096 # number of work requests

export VMA_RX_POLL=0 # Don't waste CPU time polling. We don't need to

export VMA_TX_BUFS=256 # Don't need many of these, so make it smaller

export VMA_TX_WRE=32 # Don't need to tx so make this small to save memory

export VMA_INTERNAL_THREAD_AFFINITY=15

export VMA_MEM_ALLOC_TYPE=0

export VMA_THREAD_MODE=0 # all socket processing is single threaded

export VMA_CQ_AIM_INTERRUPTS_RATE_PER_SEC=200

export VMA_CQ_KEEP_QP_FULL=0 # this causes packet drops according to the docs??

export VMA_SPEC=throughput

ban115@tethys:~$ lspci -v  | grep  -A 10 ellanox

84:00.0 Ethernet controller: Mellanox Technologies MT27500 Family [ConnectX-3]

    Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3]

    Flags: bus master, fast devsel, latency 0, IRQ 74, NUMA node 1

    Memory at c9800000 (64-bit, non-prefetchable) [size=1M]

    Memory at c9000000 (64-bit, prefetchable) [size=8M]

    Expansion ROM at <ignored> [disabled]

    Capabilities: <access denied>

    Kernel driver in use: mlx4_core

    Kernel modules: mlx4_core

 

 

ban115@tethys:~$ numactl --hardware

available: 2 nodes (0-1)

node 0 cpus: 0 2 4 6 8 10 12 14

node 0 size: 15968 MB

node 0 free: 129 MB

node 1 cpus: 1 3 5 7 9 11 13 15

node 1 size: 16114 MB

node 1 free: 2106 MB

node distances:

node   0   1

  0:  10  21

  1:  21  10

Re: mlx5_core - Cable error / Power budget exceeded


Found the solution:

 

sudo mlxconfig -e -d 04:00.0 set ADVANCED_POWER_SETTINGS=True

sudo mlxconfig -e -d 04:00.0 set DISABLE_SLOT_POWER_LIMITER=True
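In case it helps anyone else: as far as I know, mlxconfig changes only take effect after a reboot (or power cycle), and you can confirm the new values with a query along these lines (04:00.0 is just my card's PCI address):

sudo mlxconfig -e -d 04:00.0 query | grep -i -E 'POWER|SLOT'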

Re: How to enable VF multi-queue for SR-IOV on KVM?


Hi,

 

Please open a support ticket.

 

Best Regards

Marc

Re: mlnxofedinstall of 4.3-3.0.2.1-rhel7.5alternate-aarch64 has some checking bug need to be fixed


Hi,

 

 

I have a machine here with CentOS 7.5 on ARM and cannot reproduce the same output.

I would like to investigate it even if you already have a workaround.

For this purpose, I need you to open a case at support@mellanox.com.

 

Thanks in advance

Marc

Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0] on Ubuntu 16.04


Hi, I have a Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0] installed in a server running Ubuntu 16.04 and was able to install the drivers etc.

 

But the card doesn't show up in ifconfig -a.

 

Any ideas? Is this combination of OS and kernel supported for the ConnectX VPI PCIe 2.0?

 

Here is more info:

root@ubuntu16-sdc:~# uname -a

Linux ubuntu16-sdc 4.8.0-44-generic #47~16.04.1-Ubuntu SMP Wed Mar 22 18:51:56 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

 

root@ubuntu16-sdc:~# lspci | grep Mell

03:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)

 

root@ubuntu16-sdc:~# /etc/init.d/openibd restart

Unloading HCA driver:                                      [  OK  ]

Loading HCA driver and Access Layer:                       [  OK  ]

 

root@ubuntu16-sdc:~# hca_self_test.ofed

 

---- Performing Adapter Device Self Test ----

Number of CAs Detected ................. 1

PCI Device Check ....................... PASS

Kernel Arch ............................ x86_64

Host Driver Version .................... MLNX_OFED_LINUX-4.4-1.0.0.0 (OFED-4.4-1.0.0): 4.8.0-44-generic

Host Driver RPM Check .................. PASS

Firmware on CA #0 HCA .................. v2.10.0720

Host Driver Initialization ............. PASS

Number of CA Ports Active .............. 0

Kernel Syslog Check .................... PASS

Node GUID on CA #0 (HCA) ............... NA

------------------ DONE ---------------------

 

root@ubuntu16-sdc:~#  mlxfwmanager --online -u -d 0000:03:00.0

Querying Mellanox devices firmware ...

Device #1:

----------

  Device Type:      ConnectX2

  Part Number:      MHQH19B-XTR_A1-A3

  Description:      ConnectX-2 VPI adapter card; single-port 40Gb/s QSFP; PCIe2.0 x8 5.0GT/s; tall bracket; RoHS R6

  PSID:             MT_0D90110009

  PCI Device Name:  0000:03:00.0

  Port1 MAC:        0002c94f2ec0

  Port2 MAC:        0002c94f2ec1

  Versions:         Current        Available

     FW             2.10.0720      N/A

 

  Status:           No matching image found

 

 

root@ubuntu16-sdc:~# lsmod | grep ib

ib_ucm                 20480  0

ib_ipoib              172032  0

ib_cm                  53248  3 rdma_cm,ib_ipoib,ib_ucm

ib_uverbs             106496  2 ib_ucm,rdma_ucm

ib_umad                24576  0

mlx5_ib               270336  0

mlx5_core             806912  2 mlx5_fpga_tools,mlx5_ib

mlx4_ib               212992  0

ib_core               286720  10 ib_cm,rdma_cm,ib_umad,ib_uverbs,ib_ipoib,iw_cm,mlx5_ib,ib_ucm,rdma_ucm,mlx4_ib

mlx4_core             348160  2 mlx4_en,mlx4_ib

mlx_compat             20480  15 ib_cm,rdma_cm,ib_umad,ib_core,mlx5_fpga_tools,ib_uverbs,mlx4_en,ib_ipoib,mlx5_core,iw_cm,mlx5_ib,mlx4_core,ib_ucm,rdma_ucm,mlx4_ib

devlink                28672  4 mlx4_en,mlx5_core,mlx4_core,mlx4_ib

libfc                 114688  1 tcm_fc

libcomposite           65536  2 usb_f_tcm,tcm_usb_gadget

udc_core               53248  2 usb_f_tcm,libcomposite

scsi_transport_fc      61440  3 qla2xxx,tcm_qla2xxx,libfc

target_core_iblock     20480  0

target_core_mod       356352  9 iscsi_target_mod,usb_f_tcm,vhost_scsi,target_core_iblock,tcm_loop,tcm_qla2xxx,target_core_file,target_core_pscsi,tcm_fc

configfs               40960  6 rdma_cm,iscsi_target_mod,usb_f_tcm,target_core_mod,libcomposite

libiscsi_tcp           24576  1 iscsi_tcp

libiscsi               53248  2 libiscsi_tcp,iscsi_tcp

scsi_transport_iscsi    98304  3 libiscsi,iscsi_tcp

 

Please let me know if any other info is needed.


Small redundant MLAG setup


Hi there,

 

first of all thanks for all the great information that can be found here.

 

I'm trying to build a fully redundant setup for two racks in different locations with the smallest number of switches. From my understanding, MLAG will help me keep redundancy towards the servers. That means for each rack:

  • use two switches to create an MLAG domain
  • attach all servers to both switches.

So I need at least 4 switches. Now to my problem: I want to avoid another two spine switches. For my setup (two racks) this seems to be a little overpowered, and the spines would only use 4-6 ports each. The question mark in the picture shows where the magic must happen.

mlag.png

Looking at other posts I see the following option.

  • With MLNX-OS 3.6.6102 STP and MLAG can coexist.
  • I implement a fully redundant interconnect of all 4 switches. 
  • I activate MSTP (as I have multiple VLANs)
  • MSTP will allow the interconnect to be utilized as well as possible

Is this ok or am I missing something?

 

Thanks in advance.

DPDK with MLX4 VF on Hyper-v VM


Hello,

 

(Not sure if this is the right place to ask, if not , please kindly point out the right place or person)

 

I have a ConnectX-3 CX354A installed in my Windows Server 2016 host, and I enabled SR-IOV on the card and the server, following the WinOF user guide.

 

On the Ubuntu 18.04 VM created using MS Hyper-V, I can see the Mellanox VF working properly. But when I try to run testpmd on the VM using the following command:

./testpmd -l 0-1 -n 4 --vdev=net_vdev_netvsc0,iface=eth1,force=1 -w 0002:00:02.0  --vdev=net_vdev_netvsc0,iface=eth2,force=1 -w 0003:00:02.0 -- --rxq=2 --txq=2 -i

 

I ran into an error:

 

PMD: mlx4.c:138: mlx4_dev_start(): 0x562bff05e040: cannot attach flow rules (code 12, "Cannot allocate memory"), flow error type 2, cause 0x7f39ef408780, message: flow rule rejected by device

 

The command format is the one MS Azure suggests for running DPDK on their AN-enabled VMs, and it does work on Azure VMs.

 

Here is ibv_devinfo output from my vm:

 

root@myVM:~/MLNX_OFED_SRC-4.4-1.0.0.0# ibv_devinfo

hca_id:    mlx4_0

    transport:            InfiniBand (0)

    fw_ver:                2.42.5000

    node_guid:            0014:0500:691f:c3fa

    sys_image_guid:            ec0d:9a03:001c:92e3

    vendor_id:            0x02c9

    vendor_part_id:            4100

    hw_ver:                0x0

    board_id:            MT_1090120019

    phys_port_cnt:            1

        port:    1

            state:            PORT_DOWN (1)

            max_mtu:        4096 (5)

            active_mtu:        1024 (3)

            sm_lid:            0

            port_lid:        0

            port_lmc:        0x00

            link_layer:        Ethernet

 

hca_id:    mlx4_1

    transport:            InfiniBand (0)

    fw_ver:                2.42.5000

    node_guid:            0014:0500:76e0:b9d1

    sys_image_guid:            ec0d:9a03:001c:92e3

    vendor_id:            0x02c9

    vendor_part_id:            4100

    hw_ver:                0x0

    board_id:            MT_1090120019

    phys_port_cnt:            1

        port:    1

            state:            PORT_DOWN (1)

            max_mtu:        4096 (5)

            active_mtu:        1024 (3)

            sm_lid:            0

            port_lid:        0

            port_lmc:        0x00

            link_layer:        Ethernet

 

and here is the kernel module info of mlx4_en:

 

filename:   /lib/modules/4.15.0-23-generic/updates/dkms/mlx4_en.ko
version:    4.4-1.0.0
license:    Dual BSD/GPL
description:Mellanox ConnectX HCA Ethernet driver
author:     Liran Liss, Yevgeny Petrilin
srcversion: 23E8E7A25194AE68387DC95
depends:    mlx4_core,mlx_compat,ptp,devlink
retpoline:  Y
name:       mlx4_en
vermagic:   4.15.0-23-generic SMP mod_unload
parm:       udev_dev_port_dev_id:Work with dev_id or dev_port when supported by the kernel. Range: 0 <= udev_dev_port_dev_id <= 2 (default = 0).
   0: Work with dev_port if supported by the kernel, otherwise work with dev_id.
   1: Work only with dev_id regardless of dev_port support.
   2: Work with both of dev_id and dev_port (if dev_port is supported by the kernel). (int)
parm:       udp_rss:Enable RSS for incoming UDP traffic or disabled (0) (uint)
parm:       pfctx:Priority based Flow Control policy on TX[7:0]. Per priority bit mask (uint)
parm:       pfcrx:Priority based Flow Control policy on RX[7:0]. Per priority bit mask (uint)
parm:       inline_thold:Threshold for using inline data (range: 17-104, default: 104) (uint)

 

One difference I notice between my local VM and the Azure VM is that the mlx4_en.ko module is definitely different. Azure seems to be using a specialized version of mlx4_en. Is this the reason why testpmd works on Azure but not on my local Hyper-V VM?
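For anyone comparing on their own setup, the module details above are just modinfo output on each side, along the lines of:

modinfo mlx4_en | grep -E 'filename|version|srcversion'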

 

If so, how can I get a DPDK-capable driver for MS Hyper-V?

 

Thank you!

Re: Windows RDMA QoS and WinOF-2 1.80 issues


Hi,

VLAN-tagged traffic is not required, but I do need it in a VLAN.

Sorry for the late reply, but it took me longer than expected to get back to this.

 

Anyway, I think I solved this. Apparently, after my earlier learning experience, I ended up with a mismatched DCBX config between the card and Windows. I had it enabled in firmware:

LLDP_NB_DCBX_P1

LLDP_NB_RX_MODE_P1

LLDP_NB_TX_MODE_P1

and disabled in Windows.

After resetting the card's firmware settings to default (DCBX not listening to the switch), it's OK now even with the 1.80 driver.

For resetting my card: mlxconfig.exe -d mt4115_pciconf0 reset
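If anyone wants to check for the same mismatch first, a query along these lines should list the firmware-side LLDP/DCBX settings (mt4115_pciconf0 is just my device name):

mlxconfig.exe -d mt4115_pciconf0 query | findstr LLDP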

Thanks for suggestions.

Yocto embedded build of rdma-core


Greetings,

 

I'm working with an embedded build of the rdma-core code (and rdma-perftest but I'm not that far yet).  We're doing a cross build of the rdma-core code using yocto targeted at an Altera Arria10 board which includes a dual-core ARM Cortex-A9 processor.  I've been able to successfully build the kernel 4.15 with the rxe modules.  However, when I build the userland and get to the rdma-core library I run into a number of issues.  One has been particularly vexing.   During the do_configure() stage of the yocto build for rdma-core, I get errors regarding the installation of the rdma_man_pages.  In particular, in the buildlib/rdma_man.cmake file, there is a routine: function(rdma_man_pages) that fails.  If I comment out the entire body of this function, I can get the binaries to build but that takes me to another problem during the do_install portion.  It appears that pandoc is not available at this stage.  I suppose I can try to add pandoc with a separate recipe and then try again.  Anyone have any comments on this build issue?

 

Thanks,

FM

Question about RC (read) and UD (send/recv) performance over CX3 and CX4 NIC


I am using CX3 and CX4 NICs to measure the throughput of RDMA verbs (RC Read and UD Send/Recv).

 

When I use the same test code to measure the peak throughput of small messages on CX3 and CX4, the performance of RDMA Read verbs is lower than that of Send/Recv verbs on CX3, while the comparison is reversed on CX4.

 

What is the performance trend for newer generations of NICs like CX5 or CX6?

"Priority trust-mode is not supported on your system"?


Hello, I ran into a problem when setting the trust mode for a ConnectX-3 Pro 40GbE NIC.

The system information follows:

LSB Version: :core-4.1-amd64:core-4.1-noarch

Distributor ID: CentOS

Description: CentOS Linux release 7.3.1611 (Core)

Release: 7.3.1611

Codename: Core

The ConnectX3-Pro NIC information follows:

hca_id: mlx4_1

transport: InfiniBand (0)

fw_ver: 2.40.7000

node_guid: f452:1403:0095:2280

sys_image_guid: f452:1403:0095:2280

vendor_id: 0x02c9

vendor_part_id: 4103

hw_ver: 0x0

board_id: MT_1090111023

phys_port_cnt: 2

Device ports:

port: 1

state: PORT_ACTIVE (4)

max_mtu: 4096 (5)

active_mtu: 1024 (3)

sm_lid: 0

port_lid: 0

port_lmc: 0x00

link_layer: Ethernet

port: 2

state: PORT_DOWN (1)

max_mtu: 4096 (5)

active_mtu: 1024 (3)

sm_lid: 0

port_lid: 0

port_lmc: 0x00

link_layer: Ethernet

This is the first time I have encountered this problem, so I don't know what to do.

What does this message mean? Is the system version the main cause?

Waiting for your help.

Thanks.

Re: Can't get full FDR bandwidth with Connect-IB card in PCI 2.0 x16 slot


Server:
> ib_send_bw -a -F --report_gbits
Client:
> ib_send_bw -a -F --report_gbits <serverIP>

 

Please let me know your results and thank you...

~Steve


Re: Can't get full FDR bandwidth with Connect-IB card in PCI 2.0 x16 slot


Hi Steve,

 

Here you are.  Thanks for taking an interest.  Still getting ~45 Gb/sec on both client and server.  Here is the client output:

 

[root@vx01 ~]#  ib_send_bw -a -F --report_gbits vx02

---------------------------------------------------------------------------------------

                    Send BW Test

Dual-port       : OFF Device         : mlx5_0

Number of qps   : 1 Transport type : IB

Connection type : RC Using SRQ      : OFF

TX depth        : 128

CQ Moderation   : 100

Mtu             : 4096[B]

Link type       : IB

Max inline data : 0[B]

rdma_cm QPs : OFF

Data ex. method : Ethernet

---------------------------------------------------------------------------------------

local address: LID 0x3e4 QPN 0x005e PSN 0x239861

remote address: LID 0x3e6 QPN 0x004c PSN 0xed4513

---------------------------------------------------------------------------------------

#bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]

2          1000           0.098220            0.091887            5.742968

4          1000             0.20               0.19       6.037776

8          1000             0.40               0.39       6.071169

16         1000             0.78               0.67       5.220818

32         1000             1.53               1.43       5.576730

64         1000             3.16               3.10       6.053410

128        1000             6.20               6.16       6.012284

256        1000             12.35              12.28     5.997002

512        1000             22.67              22.47     5.486812

1024       1000             38.02              36.69     4.478158

2048       1000             42.26              42.04     2.565771

4096       1000             43.82              43.68     1.332978

8192       1000             44.63              44.63     0.681005

16384      1000             44.79              44.79     0.341728

32768      1000             45.21              45.21     0.172449

65536      1000             45.35              45.35     0.086506

131072     1000             45.45              45.45     0.043342

262144     1000             45.45              45.45     0.021670

524288     1000             45.47              45.47     0.010840

1048576    1000             45.47              45.47     0.005421

2097152    1000             45.48              45.48     0.002711

4194304    1000             45.48              45.48     0.001355

8388608    1000             45.48              45.48     0.000678

---------------------------------------------------------------------------------------

 

Here is the server output:

 

[root@vx02 ~]#  ib_send_bw -a -F --report_gbits

 

 

************************************

* Waiting for client to connect... *

************************************

---------------------------------------------------------------------------------------

                    Send BW Test

Dual-port       : OFF Device         : mlx5_0

Number of qps   : 1 Transport type : IB

Connection type : RC Using SRQ      : OFF

RX depth        : 512

CQ Moderation   : 100

Mtu             : 4096[B]

Link type       : IB

Max inline data : 0[B]

rdma_cm QPs : OFF

Data ex. method : Ethernet

---------------------------------------------------------------------------------------

local address: LID 0x3e6 QPN 0x004c PSN 0xed4513

remote address: LID 0x3e4 QPN 0x005e PSN 0x239861

---------------------------------------------------------------------------------------

#bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]

2          1000           0.000000            0.099141            6.196311

4          1000             0.00               0.20       6.229974

8          1000             0.00               0.40       6.265230

16         1000             0.00               0.69       5.362016

32         1000             0.00               1.47       5.727960

64         1000             0.00               3.22       6.283794

128        1000             0.00               6.34       6.191118

256        1000             0.00               12.64     6.169975

512        1000             0.00               23.08     5.634221

1024       1000             0.00               37.53     4.581582

2048       1000             0.00               42.63     2.602155

4096       1000             0.00               44.07     1.344970

8192       1000             0.00               45.04     0.687191

16384      1000             0.00               45.04     0.343602

32768      1000             0.00               45.35     0.172994

65536      1000             0.00               45.45     0.086690

131072     1000             0.00               45.52     0.043409

262144     1000             0.00               45.51     0.021699

524288     1000             0.00               45.52     0.010852

1048576    1000             0.00               45.52     0.005427

2097152    1000             0.00               45.53     0.002714

4194304    1000             0.00               45.53     0.001357

8388608    1000             0.00               45.53     0.000678

---------------------------------------------------------------------------------------

 

Please let me know if you want any other info and I will send it straight away.

 

Regards,

 

Eric

ASAP2 Live Migration & H/W LAG


Hi,

 

Does ASAP2 OVS Offload support OpenStack live migration? If not, which ASAP2 mode should I use: OVS Acceleration or Application Acceleration (DPDK Offload)? And how can I have H/W LAG (with LACP) in each of those three modes?

 

Best regards,

The problem with RoCE connectivity between ConnectX-3 and ConnectX-4 Lx adapters


Hello.

I have a Microsoft Windows 2012 R2 cluster; some nodes have ConnectX-3 adapters and some have ConnectX-4 Lx adapters.

There is RoCE connectivity between nodes with ConnectX-4 Lx adapters, but there isn’t connectivity between nodes with different adapters.

I think it’s because ConnectX-3 adapters use RoCE  1.0 mode, but ConnectX-4 Lx adapters use RoCE 2.0 mode.

I tried to change the RoCE mode from 1.0 to 2.0 for the ConnectX-3 adapters with “Set-MlnxDriverCoreSetting -RoceMode 2”, but got the warning “SingleFunc_2_0_0: RoCE v2.0 mode was requested, but it is not supported. The NIC starts in RoCE v1.5 mode”, and RoCE connectivity still doesn’t work.

What is the best way to fix my problem?

Do ConnectX-3 adapters not work in RoCE 2.0 mode at all? How about newer FW? Today I have WinOF-5_35 with FW:

 

Image type:      FS2

FW Version:      2.40.5032

FW Release Date: 16.1.2017

Product Version: 02.40.50.32

Rom Info:        type=PXE version=3.4.747 devid=4099

Device ID:       4099

Description:     Node             Port1            Port2            Sys image

GUIDs:           ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff

MACs: e41d2ddfa540     e41d2ddfa541

VSD:

PSID:            MT_1080120023

 

 

I can’t find a way to change the ConnectX-4 Lx adapters to RoCE 1.0 mode in a Microsoft Windows environment.

Unknown symbol nvme_find_pdev_from_bdev


Hi all,

 

After installing MLNX_OFED_LINUX-4.4-1 on Ubuntu 18.04 (kernel 4.15.0-24) with "$ mlnxofedinstall --force --without-dkms --with-nvmf", I'm trying to use the RDMA tools, but:

 

- modprobe on nvme_rdma fails with "nvme_rdma: Unknown symbol nvme_delete_wq (err 0)"

- modprobe on nvmet_rdma fails with "nvmet: Unknown symbol nvme_find_pdev_from_bdev (err 0)"

 

What am I doing wrong, please?

 

I see two kernel modules loaded: nvme and nvme_core.
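(Checked with something along these lines, nothing exotic:)

lsmod | grep -E '^nvme'
modinfo nvme_rdma | grep -E 'filename|depends'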

 

This is Mellanox MCX516A-CCAT ConnectX-5 EN Network Interface Card 100GbE Dual-Port QSFP28.

 

Any inputs will be greatly appreciated

 

Thank you

Dmitri Fedorov

Ciena Canada

Re: Firmware for MHJH29 ?


> Service for this Hardware and FW has ended. So it will not be hosted on our site.

 

Thank you for your help. It seems this firmware is no newer than the one I already have, unfortunately.

Those are 10+ year old cards... no surprise they are difficult to put back into service.

 

Cordially & thanks again for the great support,
