Channel: Mellanox Interconnect Community: Message List

How to fix the IB card mode inside the driver code


I know there are some ways to configure the mode, for example by modifying the configuration file /etc/infiniband/connect.conf.

But I was wondering whether there is an interface or method to set the mode while the driver is loading, or after the driver has been loaded.


remote reboot sx6025


Hi all,

Can anybody help me?

How can I remotely reboot an SX6025 switch?

It is an unmanaged switch and cannot be accessed over Ethernet.

These switches are accessible only over InfiniBand, but sometimes that network does not work.

How can I remotely reboot this SX6025 switch over Ethernet, I2C, or any other method?

Re: remote reboot sx6025


Hi Serge,

 

You can reset the switch using the flint command:

 

flint -d lid-<switch lid> swreset

 

for example:

 

flint -d lid-32 swreset
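
If you do not know the switch LID, a quick way to find it (assuming the infiniband-diags tools are installed on a host with an active InfiniBand port) is:

# list all switches in the fabric together with their LIDs
ibswitches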

Re: remote reboot sx6025


In this case the transport for such a command (flint -d lid-<switch lid> swreset) is InfiniBand.

Will this still work if opensm at some point suddenly can no longer see the switch?

Re: ConnectX-3 Pro connecting at 10g instead of 40g


I connected the cable in loopback at the Windows server and the connection was established at 10Gb. The cable model isn't listed in the document provided, so it's likely not fully compatible. I have ordered a cable model from the list and will report back when it is installed.

Thanks.

Re: remote reboot sx6025


If the switch's routing tables were not altered after opensm went offline, then yes.

Re: Running ASAP2


Hi Francois,

I am working on an ASAP2 post; it will be available soon.


Re: ESXi 6.5U1 40GbE Performance problems


Assuming you are using the Mellanox native ESXi driver, then as far as I recall ESXi 6.5 does not support the ConnectX-3 adapter (only ConnectX-4/5).

Nevertheless, from your description it looks as though you are running a VM-to-VM test.

I would suggest that you first "dismount" the vSwitch / virtual adapter setup and try a simple "bare-metal" 40G performance test from ESXi server to ESXi server, meaning from one physical CX-3 adapter to another CX-3 adapter.
Ensure you have an identical MTU size on both the switch and the ESXi server (MTU 9000), as sketched below.

I also suggest you check the Mellanox website for the ESXi 6.5 vs. Mellanox adapter compatibility matrix:
http://www.mellanox.com/page/products_dyn?product_family=29
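
For the MTU part, a rough sketch of how to set and verify it from the ESXi shell (vSwitch1, vmk1 and the peer address 192.168.1.2 below are placeholders for your environment):

# set the vSwitch and the VMkernel interface to MTU 9000
esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000
esxcli network ip interface set --interface-name=vmk1 --mtu=9000

# verify that jumbo frames actually pass end to end
# (8972 = 9000 minus IP/ICMP headers, -d = don't fragment)
vmkping -d -s 8972 192.168.1.2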

 

Re: Running ASAP2


Thanks, this is great news.

I have SR-IOV running in my guest and legacy mode works.
I then change the card into eswitch mode, and the VFs and representor netdevs are created on the host.
I can thus set up OVS, but my current problem is that OVS does not see any traffic.
I suspect that it has something to do with the TC hooks.
Could you give me any details on the required kernel, OVS, and iproute package versions?
From the ASAP2 user guide that I have: "For the complete solution you need to install a supporting kernel, iproute2, and openvswitch packages."
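
For reference, this is roughly the sequence I am using to bring the card into eswitch mode and enable offload, in case something obvious is missing (the PCI address 0000:03:00.0 and interface name enp3s0f0 are placeholders for my setup, not values from the ASAP2 guide):

# create the VFs while the e-switch is still in legacy mode
echo 2 > /sys/class/net/enp3s0f0/device/sriov_numvfs

# switch the e-switch to switchdev mode; this creates the VF representor netdevs
devlink dev eswitch set pci/0000:03:00.0 mode switchdev

# enable TC hardware offload on the uplink interface
ethtool -K enp3s0f0 hw-tc-offload on

# tell OVS to offload datapath flows through TC, then restart the OVS service
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true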

Thanks in advance

Re: Where's the procedure of packing network protocol header in RoCE v2?


There was a bug in the way the driver/firmware generates the UDP source port. Could you please try the latest mlnx_ofed-4.1?
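
As a quick check (assuming the ofed_info script from the MLNX_OFED installation is available), you can confirm which version is currently installed with:

# print the installed MLNX_OFED version string
ofed_info -s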

Re: OL7.4 Mellanox OFED


As of now, there is no plan to support OEL 7.4.

The next release (MLNX_OFED 4.2) will support OEL 7.3.

Thanks

Re: ESXi 6.5U1 40GbE Performance problems


I tested separate VMs on local SSDs on each ESXi 6.5 host.

 

Here is a brief overview of the setup:

ESX host 01 (iPerf VM 01) - SX6036G - ESX host 02 (iPerf VM 02)

 

The switch, vSwitch, VMkernel adapter, and nmlx4_en adapter are all set to 9K MTU.

 

Also, the ESXi 6.5 Update 1 inbox driver includes nmlx4_core, which supports the Mellanox ConnectX-3, ConnectX-3 Pro, and earlier.

 

Why did you comment that based only on your personal recollection?

 

Another thread in this community said the CX-3 is supported by the ESXi 6.5 inbox driver.

 

I'll test an Arista 40Gb switch with the same configuration and then update the results.

 

Best regards,

Jae-Hoon Choi

 

Update 01. Direct connection test between two CX-3s

This test shows roughly 19Gb/s performance.

 

Update 02. Switched test through the SX6036G with Global Pause on

This test also shows roughly 19Gb/s performance.

 

Update 03. Switched test with all SX6036G ports set to 10GbE

This test shows roughly 10Gb/s performance.

 

Update 04. ESXi inbox Ethernet driver packet drop bug - ESXi host iPerf test results... :(

There must be a packet drop & error bug in the Mellanox inbox ESXi drivers.
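
For reference, this is how I check the drop/error counters on the ESXi host (vmnic4 is a placeholder for the ConnectX-3 uplink NIC name in my setup):

# per-uplink receive/transmit statistics, including errors and dropped packets
esxcli network nic stats get -n vmnic4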

What do you think about the results below?

 

Best regards,

Jae-Hoon Choi

 

ConnectX-3 WinOF 5.35 on Win2016 Multiple Partitions


I am in the process of setting up an iSCSI storage system for VMware using IPoIB as the high-speed interconnect. The switch used is a 4036 running 3.9.1. The iSCSI cluster has two nodes, each with one dual-port ConnectX-3. The second port of each HCA is directly cross-connected without a switch. The IB SM is running on a CentOS box with the following partition definition:

 

Default=0xffff , ipoib, mtu=5: ALL=full;

vmotion20=0x8014 , ipoib, mtu=5, defmember=full: ALL=full;

iscsi40=0x8028 , ipoib, mtu=5, defmember=full: ALL=full;

iscsi50=0x8032 , ipoib, mtu=5, defmember=full: ALL=full;

 

On node A of the storage cluster, I ran "part_man add IPoIB#1 ipoib_8028 8028". IPoIB#1 is configured with 10.0.0.1/24, and the new virtual interface for PKey 0x8028 has 192.168.40.1/24.

On node B of the storage cluster, I ran "part_man add IPoIB#1 ipoib_8032 8032". IPoIB#1 is configured with 10.0.0.2/24, and the new virtual interface for PKey 0x8032 has 192.168.50.1/24.

On the SM (CentOS), ib0 is configured with 10.0.0.254/24, ib0.8028 has 192.168.40.254/24, and ib0.8032 has 192.168.50.254/24.

 

The problem I am having is that connectivity to the virtual interfaces on the Windows servers is sporadic after a system restart/reboot. Pings among 10.0.0.1/10.0.0.2/10.0.0.254 are always successful. ibping and ibtracert between all three nodes are always successful. Pings on 192.168.40.0/24 and 192.168.50.0/24 are hit and miss. Sometimes a restart of the SM will reestablish connectivity. When connectivity is not there, tcpdump and Wireshark on the virtual interface show the ARP who-has packet leaving the ping originator, but it never shows up on the other end.
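
When the ping fails, one thing I can check is whether the PKey tables were actually programmed on the HCA ports after the restart. A rough sketch, assuming the infiniband-diags tools on the CentOS SM node and the WinOF part_man tool on the Windows nodes (the LID below is a placeholder):

# dump the PKey table of the HCA port that currently has LID 4
smpquery pkeys 4

# on the Windows nodes, list the IPoIB partitions configured by WinOF
part_man show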

 

However, if the SM is cross-connected to port #1 on the storage cluster node (bypassing the 4036), I was not able to reproduce the problem. Of course, by doing this the SM will see the link go down and then up, which pretty much triggers an event similar to an SM restart. As indicated above, a restart of the SM often seems to fix the connectivity on the virtual interfaces (PKey partitions).

 

Any help pointing me in the right direction will be greatly appreciated!

 

Thanks!


Re: Where's the procedure of packing network protocol header in RoCE v2?


Thanks for replying.

After updating to the latest mlnx_ofed-4.1, it seems to work normally.

But... I did some more experiments.

exp 1: create 1 QP to transfer data, capture the network packets.

exp 2: create 2 QPs to transfer data, capture the network packets.

exp 3: create 5 QPs to transfer data, capture the network packets.

exp 4: create 10 QPs to transfer data, capture the network packets.

exp 5: create 20 QPs to transfer data, capture the network packets.

Result:

exp 1: all packets have the same src UDP port.

exp 2: there are 2 distinct src UDP ports.

exp 3: there are 5 distinct src UDP ports.

exp 4: there are >5 distinct src UDP ports (the file size is too large to capture all the packets).

exp 5: there are >8 distinct src UDP ports (the file size is too large to capture all the packets).

 

It seems that as the number of QPs increases, the number of UDP src ports doesn't always increase with it.

Or is the source-port selection mechanism not one-to-one, but rather multiplexed across QPs?

---------------------------------------------------------------------------------------------------------------------------------

Update:

In exp 4 and exp 5 the number of src UDP ports never equals the number of queue pairs, no matter how long the sniffer captures network packets.
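
For reference, this is roughly how I count the distinct source ports in a capture (a sketch using tshark; roce.pcap is a placeholder capture file, and 4791 is the RoCE v2 UDP destination port):

# list how many distinct UDP source ports the RoCE v2 traffic in the capture uses
tshark -r roce.pcap -Y "udp.dstport == 4791" -T fields -e udp.srcport | sort -u | wc -l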

Re: Where's the procedure of packing network protocol header in RoCE v2?


What kind of card do you use? ConnectX-4?

Re: Where's the procedure of packing network protocol header in RoCE v2?


Hello, here are the details of my device. Both devices are the same.

CA 'mlx5_0'
        CA type: MT4119
        Number of ports: 1
        Firmware version: 16.20.1010
        Hardware version: 0
        Node GUID: 0x248a070300b59626
        System image GUID: 0x248a070300b59626
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 100
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x04010000
                Port GUID: 0x268a07fffeb59626
                Link layer: Ethernet

RDMA write failing with RAE (AETH syndrome 96)


The first column is the source and the second column is the destination.
12.12.12.10 is the Mellanox card and 12.12.12.12 is our XRNIC target. The last column is the
AETH syndrome.

 

As you can see, for every write request going out there are 2
incoming responses: one with RAE (AETH syndrome 'd96), and the next
is just a positive ACK, with not much timing difference between these
responses, only 15 us.

 

This is a protocol violation. Please suggest what could be causing this.

 

Our card:

 

[root@xhd-ipsspdk1 ~]# ibv_devinfo
hca_id: mlx4_0
        transport:          InfiniBand (0)
        fw_ver:             2.40.7000
        node_guid:          7cfe:9003:009c:2400
        sys_image_guid:     7cfe:9003:009c:2400
        vendor_id:          0x02c9
        vendor_part_id:     4103
        hw_ver:             0x0
        board_id:           MT_1090111023
        phys_port_cnt:      2
Device ports:

Re: How to fix the IB card mode inside the driver code


To change the IPoIB mode, the /etc/infiniband/openib.conf file needs to be edited (or the ifcfg file when speaking about the per-interface network configuration files). To change between InfiniBand and Ethernet mode, you need to use the mlxconfig utility from the MFT package, which is available from the Mellanox site. You only need to run the utility once, and the change will be persistent across reboots.
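
For example, a minimal sketch of changing the port type with mlxconfig (the mst device name below is a placeholder; run mst status to find the actual one, and reload the driver or reboot for the new setting to take effect):

# start the MFT tools and list the available devices
mst start
mst status

# query the current configuration
mlxconfig -d /dev/mst/mt4099_pciconf0 query

# set both ports to InfiniBand: 1 = InfiniBand, 2 = Ethernet, 3 = VPI (auto-sense)
mlxconfig -d /dev/mst/mt4099_pciconf0 set LINK_TYPE_P1=1 LINK_TYPE_P2=1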


