The iSER protocol is based on RoCE.
RoCE needs a lossless network based on PFC or Global Pause (GP).
Unfortunately, you can't do it that way.
I think you should purchase a PFC-capable Ethernet switch.
Best Regards,
Jae-Hoon Choi
Are you sure this won't work at all? Since the cards are directly connected, there is no real way for outside influences to affect the throughput; I would assume that with two direct-connected cards, one cannot out-send the other. I did the same thing with SRP without issue, and I am currently doing it with iSCSI as well. I will try anyway, but I am holding off for now since everyone is reporting they cannot get iSER to see their SCST or LIO targets.
But some information on this site gives me a hint; see below.
01.ESXi 6.5 native driver
The ESXi 6.5 native driver can't support VMkernel drivers like SRP, IPoIB and InfiniBand-based iSER.
Therefore only Ethernet drivers will be officially supported in the future.
02.InfiniBand-based SRP vs iSER
SRP just needs an SM on the target or the initiator.
Therefore you can build an SRP fabric with a direct connection between HCAs.
The same goes for InfiniBand-based iSER, too! (See the opensm sketch after this post.)
03.Ethernet-based (RoCE) iSER
RoCE requires a lossless Ethernet fabric.
For example, DCB(X) and ETS.
- These are extensions of Ethernet.
All Converged Ethernet protocols like FCoE and RoCE rely on a switched fabric only.
Ethernet iSER needs a RoCE network.
That's a basic requirement.
Best Regards,
Jae-Hoon Choi
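To illustrate item 02 above, bringing up an SM for a back-to-back InfiniBand link is just a matter of starting opensm on one of the two hosts; a minimal sketch (the port GUID is a placeholder for one of your HCA ports):
# Start opensm as a daemon on either the target or the initiator
opensm -B
# Or bind it to a specific HCA port if the host has several
opensm -B -g <port_guid>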
Hi,
For monitoring and analyzing your network, I would suggest a new tool provided by Mellanox called NEO. It is a network management interface to analyze, monitor and diagnose your Ethernet network, and it also works for RoCE.
Have a look at:
http://www.mellanox.com/page/products_dyn?product_family=278&mtag=neo_host_sw
Regards
Marc
Hmm, I wonder if there is any software I can run on the Linux target to emulate this functionality. Is there a true need for PFC in a NIC-to-NIC connection, though?
Here is a basic question about your environment.
If you want to use Ethernet iSER, you need a PFC-enabled Ethernet configuration.
You must configure the Ethernet port on your Linux target with PFC.
PFC also needs to be configured on the Ethernet switch, too!
Flow control for lossless Ethernet traffic is a basic component of an iSER fabric or RoCE (a configuration sketch follows below).
Best Regards,
Jae-Hoon Choi
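For reference, a minimal sketch of enabling PFC on the Linux target's port, assuming a Mellanox NIC with MLNX_OFED installed and an interface named eth1 (both assumptions; adjust to your setup):
# Enable PFC on priority 3 only (the priority commonly used for RoCE traffic)
mlnx_qos -i eth1 --pfc 0,0,0,1,0,0,0,0
# Show the resulting QoS/PFC state of the port
mlnx_qos -i eth1
The same PFC priority then has to be enabled on the switch port (or, in a back-to-back setup, on the peer NIC) for flow control to work end to end.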
There is no switch in the storage path in this case, just on the data network. The lldpad daemon may work; I haven't started testing because I see users with normal RoCE setups having problems connecting to SCST, so I didn't see a need to try my setup until the normal setups work =)
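For what it's worth, a rough sketch of driving PFC from lldpad in software on a back-to-back link, assuming the interface is eth1 and that the NIC lets the host control DCB (both assumptions):
# Start the DCB/LLDP agent
service lldpad start
# Turn DCB on for the port
dcbtool sc eth1 dcb on
# Enable, advertise and negotiate PFC for priority 3 only
dcbtool sc eth1 pfc e:1 a:1 w:1 pfcup:00010000
# Check the resulting PFC configuration
dcbtool gc eth1 pfc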
Hi all,
Our company has 3 HP DL380p G8 and 3 DL380p G7 servers. We plan to upgrade their IO connectivity to reach 10Gbps network or even higher. The main reason is that we are building Hyper Converged Infrastructure, especially SDS (VMware VSAN, or Microsoft S2D). We already have a pair of 10Gb Switches (JG219A).
We did a thorough search and found some useful information:
1/10/40/56Gb/s – Ethernet
I know that we also need QSFP to SFP+ adapter P/N 655874-B21, or MAM1Q00A-QSA in Mellanox world in order to use 10Gb Ethernet.
However, I am not so familiar with InfiniBand, and I am not confident about what else we need. Do we need special cables in order to work with our existing 10Gb switches? Or do we need to buy special InfiniBand switches?
Hope I explained our case clearly. Long story short, we need to verify whether it makes sense to invest in InfiniBand (6 - 10 dual-port cards) and related items.
Actually I tried to reach both the HP Sales and Presales teams in my country, but haven't received any feedback from them for days. I have no option left but to post a question here and hope someone from the community can help me out.
Thank you very much in advance!
Luminous does not support the latest Ceph RDMA code. What version of Ceph are you using? Also, if you can, please provide the ceph.conf configuration. Lastly, are you able to run other tools on this node, like ib_write_bw?
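For context, RDMA messaging in Ceph is typically switched on with ceph.conf options along these lines (a sketch only; the device name mlx5_0 and the exact option set are assumptions and depend on the Ceph version):
[global]
# Use the async messenger over RDMA instead of plain TCP
ms_type = async+rdma
# RDMA device to bind to (list devices with: ibv_devices)
ms_async_rdma_device_name = mlx5_0
A basic ib_write_bw sanity check between the two nodes (perftest package) could then look like:
# On the server node
ib_write_bw -d mlx5_0
# On the client node, pointing at the server's IP
ib_write_bw -d mlx5_0 <server_ip>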
Those are basic steps:
1. The hardware needs to be the same; this is a basic HPC requirement, so using two different servers is not recommended, and you probably need some kind of job scheduler - SLURM, Torque, LSF, etc. If you are not using identical hardware, you will most likely have performance issues.
2. Use the same network adapter (onboard Ethernet, or a high-speed adapter aka HCA). See item 1.
3. Install the same OS and drivers
4. If using IB, be sure to run OpenSM
5. If you are using Mellanox hardware, use HPC-X toolkit - http://www.mellanox.com/page/products_dyn?product_family=189&mtag=hpc-x
6. Run jobs
Your simple question is a complicated subject that includes almost everything - fabric design, networking, and performance tuning covering BIOS, OS and driver (for reference you may check the Mellanox Tuning Guide) - and it should definitely be split into separate topics.
Take, for example, this article - http://hpcugent.github.io/vsc_user_docs/pdf/intro-HPC-windows-gent.pdf (135 pages)
I would suggest starting by building and running jobs using the onboard Ethernet adapter (for HPC you can use any communication channel that exists on the host). When this phase is over, add the Mellanox adapter and you'll get much better performance.
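As a rough sketch of item 5, loading the HPC-X environment and launching a job could look roughly like this (the init script name, host names and the application binary are assumptions/placeholders; check the HPC-X README for your release):
# Load the HPC-X environment after unpacking the toolkit
source hpcx-init.sh
hpcx_load
# Run a 2-rank MPI job across the two nodes
mpirun -np 2 -host node1,node2 ./my_mpi_app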
Dear Tu Nguyen Anh,
Thank you for posting your question on the Mellanox Community.
The quickest way to get all the information about the products you need for your setup is to fill in the form on the following link: https://store.mellanox.com/customer-service/contact-us/
Then a Sales Representative will contact you as soon as possible regarding your inquiry.
Thanks and regards,
~Mellanox Technical Support
Hi,
I am trying NVMeOF with RoCE on SLES 12 SP3 using the document
HowTo Configure NVMe over Fabrics
I am noticing that whenever the initiator has more than 32 cores, it is unable to discover/connect to the target. The same procedure works fine if the number of cores is <= 32.
The dmesg output:
kernel: [ 373.418811] nvme_fabrics: unknown parameter or missing value 'hostid=a61ecf3f-2925-49a7-9304-cea147f61ae' in ctrl creation request
for a successful connection:
[51354.292021] nvme nvme0: creating 32 I/O queues.
[51354.879684] nvme nvme0: new ctrl: NQN "mcx", addr 192.168.0.1:4420
Is there any parameter that can restrict the number of cores the mlx5_core/nvme_rdma/nvmet_rdma drivers use, so that fewer I/O queues are created and the discovery/connection succeeds? I won't be able to disable cores/hyperthreading in the BIOS/UEFI since there are other applications running on the host.
Appreciate any pointers/help!
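One thing worth trying (whether it resolves this particular failure is an assumption, but the option itself is standard nvme-cli): the connect command can cap the number of I/O queues explicitly, so the initiator does not try to create one queue per core:
# Discover the target over RDMA (address/port taken from the dmesg above)
nvme discover -t rdma -a 192.168.0.1 -s 4420
# Connect while limiting the number of I/O queues to 32
nvme connect -t rdma -a 192.168.0.1 -s 4420 -n mcx --nr-io-queues=32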
I have been asked to look at the aforementioned 4036E. This is my first time with Mellanox switches.
No warning LEDS. All green. Power supply and fans ok.
Boots then crashes at different places in the boot sequence.
I am seeing a 'Warning - Bad CRC' before the switch decides to boot from the secondary flash.
The boot sequence creates 9 partitions.
When we get to the NAND device line it scans for bad blocks.
Then it creates 1 MTD partition.
Later it identifies a bad area from kernel access.
Then we just get a call trace and instruction dump and the loading process halts.
Line connection no longer responds.
I suspect a bad/faulty NAND flash chip.
Does anyone have any suggestions? Is this replaceable? Should I try reflashing the firmware?
I am not currently at that site; I will visit on Sunday, copy the full configuration, and then post back here.
I would appreciate any suggestions or ideas.
Many thanks.
Switchman.
I had saved a portion of the end of the output; a second boot attempt follows at the bottom:
============================================================================
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Using buffer write method
Using auto-unlock on power-up/resume
cfi_cmdset_0001: Erase suspend on write enabled
cmdlinepart partition parsing not available
RedBoot partition parsing not available
Creating 9 MTD partitions on "4cc000000.nor_flash":
0x00000000-0x001e0000 : "kernel"
0x001e0000-0x00200000 : "dtb"
0x00200000-0x01dc0000 : "ramdisk"
0x01dc0000-0x01fa0000 : "safe-kernel"
0x01fa0000-0x01fc0000 : "safe-dtb"
0x01fc0000-0x03b80000 : "safe-ramdisk"
0x03b80000-0x03f60000 : "config"
0x03f60000-0x03fa0000 : "u-boot env"
0x03fa0000-0x04000000 : "u-boot"
NAND device: Manufacturer ID: 0x20, Chip ID: 0xda (ST Micro NAND 256MiB 3,3V 8-bit)
Scanning device for bad blocks
Creating 1 MTD partitions on "4e0000000.ndfc.nand":
0x00000000-0x10000000 : "log"
i2c /dev entries driver
IBM IIC driver v2.1
ibm-iic(/plb/opb/i2c@ef600700): using standard (100 kHz) mode
ibm-iic(/plb/opb/i2c@ef600800): using standard (100 kHz) mode
i2c-2: Virtual I2C bus (Physical bus i2c-0, multiplexer 0x70 port 0)
i2c-3: Virtual I2C bus (Physical bus i2c-0, multiplexer 0x70 port 1)
i2c-4: Virtual I2C bus (Physical bus i2c-0, multiplexer 0x70 port 2)
i2c-5: Virtual I2C bus (Physical bus i2c-0, multiplexer 0x70 port 3)
rtc-ds1307 6-0068: rtc core: registered ds1338 as rtc0
rtc-ds1307 6-0068: 56 bytes nvram
i2c-6: Virtual I2C bus (Physical bus i2c-0, multiplexer 0x70 port 4)
i2c-7: Virtual I2C bus (Physical bus i2c-0, multiplexer 0x70 port 5)
i2c-8: Virtual I2C bus (Physical bus i2c-0, multiplexer 0x70 port 6)
i2c-9: Virtual I2C bus (Physical bus i2c-0, multiplexer 0x70 port 7)
pca954x 0-0070: registered 8 virtual busses for I2C switch pca9548
TCP cubic registered
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
IPv6 over IPv4 tunneling driver
sit0: Disabled Privacy Extensions
ip6tnl0: Disabled Privacy Extensions
NET: Registered protocol family 17
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
rtc-ds1307 6-0068: setting system clock to 2000-01-18 01:06:09 UTC (948157569)
RAMDISK: Compressed image found at block 0
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 172k init
init started: BusyBox v1.12.2 (2011-01-03 14:13:22 IST)
starting pid 15, tty '': '/etc/rc.d/rcS'
mount: no /proc/mounts
Mounting /proc and /sys
Mounting filesystems
Loading module Voltaire
Empty flash at 0x0cdcf08c ends at 0x0cdcf800
Starting crond:
Starting telnetd:
ibsw-init.sh start...
Tue Jan 18 01:06:42 UTC 2000
INSTALL FLAG 0x0
starting syslogd & klogd ...
Starting ISR: Unable to handle kernel paging request for data at address 0x0000001e
Faulting instruction address: 0xc00ec934
Oops: Kernel access of bad area, sig: 11 [#1]
Voltaire
Modules linked in: ib_is4(+) ib_umad ib_sa ib_mad ib_core memtrack Voltaire
NIP: c00ec934 LR: c00ec930 CTR: 00000000
REGS: d7bdfd10 TRAP: 0300 Not tainted (2.6.26)
MSR: 00029000 <EE,ME> CR: 24000042 XER: 20000000
DEAR: 0000001e, ESR: 00000000
TASK = d7b9c800[49] 'jffs2_gcd_mtd9' THREAD: d7bde000
GPR00: 00000001 d7bdfdc0 d7b9c800 00000000 000000d0 00000003 df823040 0000007f
GPR08: 22396d59 d9743920 c022de58 00000000 24000024 102004bc c026b9a0 c026b910
GPR16: c026b954 c026b630 c026b694 c022b790 d8938150 d8301000 c022b758 d7bdfe30
GPR24: 00000000 0000037c d8301400 00000abf d9743d80 00000000 d8938158 df823000
NIP [c00ec934] jffs2_get_inode_nodes+0xb6c/0x1020
LR [c00ec930] jffs2_get_inode_nodes+0xb68/0x1020
Call Trace:
[d7bdfdc0] [c00ec758] jffs2_get_inode_nodes+0x990/0x1020 (unreliable)
[d7bdfe20] [c00ece28] jffs2_do_read_inode_internal+0x40/0x9e8
[d7bdfe90] [c00ed838] jffs2_do_crccheck_inode+0x68/0xa4
[d7bdff00] [c00f1ed8] jffs2_garbage_collect_pass+0x160/0x664
[d7bdff50] [c00f36c8] jffs2_garbage_collect_thread+0xf0/0x118
[d7bdfff0] [c000bdb8] kernel_thread+0x44/0x60
Instruction dump:
7f805840 409c000c 801d0004 48000008 801d0008 2f800000 409effdc 2f9d0000
40be0010 48000180 4802ba05 7c7d1b78 <a01d001e> 7fa3eb78 2f800000 409effec
---[ end trace b57e19dd3d61c6af ]---
ib_is4 0000:81:00.0: ep0_dev_name 0000:81:00.0
Unable to handle kernel paging request for data at address 0x00000034
Faulting instruction address: 0xc002f3b0
Oops: Kernel access of bad area, sig: 11 [#2]
Voltaire
Modules linked in: is4_cmd_driver ib_is4 ib_umad ib_sa ib_mad ib_core memtrack Voltaire
NIP: c002f3b0 LR: c002fb00 CTR: c00f3a10
REGS: df8a3de0 TRAP: 0300 Tainted: G D (2.6.26)
MSR: 00021000 <ME> CR: 24544e88 XER: 20000000
DEAR: 00000034, ESR: 00000000
TASK = df88e800[8] 'pdflush' THREAD: df8a2000
GPR00: c002fb00 df8a3e90 df88e800 00000001 d7b9c800 d7b9c800 00000000 00000001
GPR08: 00000001 00000000 24544e22 00000002 00004b1a 67cfb19f 1ffef400 00000000
GPR16: 1ffe42d8 00000000 1ffebfa4 00000000 00000000 00000004 c0038778 c0261ac4
GPR24: 00000001 c02f0000 00000000 d7b9c800 00000001 d7b9c800 00000000 d8301400
NIP [c002f3b0] prepare_signal+0x1c/0x1a4
LR [c002fb00] send_signal+0x28/0x214
Call Trace:
[df8a3e90] [c0021bb8] check_preempt_wakeup+0xd8/0x110 (unreliable)
[df8a3eb0] [c002fb00] send_signal+0x28/0x214
[df8a3ed0] [c002fe40] send_sig_info+0x28/0x48
[df8a3ef0] [c00f35c4] jffs2_garbage_collect_trigger+0x3c/0x50
[df8a3f00] [c00f3a40] jffs2_write_super+0x30/0x5c
[df8a3f10] [c007340c] sync_supers+0x80/0xd0
[df8a3f30] [c0054dc8] wb_kupdate+0x48/0x150
[df8a3f90] [c0055434] pdflush+0x104/0x1a4
[df8a3fe0] [c00387c4] kthread+0x4c/0x88
[df8a3ff0] [c000bdb8] kernel_thread+0x44/0x60
Instruction dump:
80010034 bb810020 7c0803a6 38210030 4e800020 9421ffe0 7c0802a6 bf810010
90010024 7c9d2378 83c4034c 7c7c1b78 <801e0034> 70090008 40820100 2f83001f
---[ end trace b57e19dd3d61c6af ]---
------------[ cut here ]------------
Badness at kernel/exit.c:965
NIP: c00273f0 LR: c000a03c CTR: c013b2b4
REGS: df8a3cb0 TRAP: 0700 Tainted: G D (2.6.26)
MSR: 00021000 <ME> CR: 24544e22 XER: 20000000
TASK = df88e800[8] 'pdflush' THREAD: df8a2000
GPR00: 00000001 df8a3d60 df88e800 0000000b 00002d73 ffffffff c013e13c c02eb620
GPR08: 00000001 00000001 00002d73 00000000 24544e84 67cfb19f 1ffef400 00000000
GPR16: 1ffe42d8 00000000 1ffebfa4 00000000 00000000 00000004 c0038778 c0261ac4
GPR24: 00000001 c02f0000 00000000 d7b9c800 df8a3de0 0000000b df88e800 0000000b
NIP [c00273f0] do_exit+0x24/0x5ac
LR [c000a03c] kernel_bad_stack+0x0/0x4c
Call Trace:
[df8a3d60] [00002d41] 0x2d41 (unreliable)
[df8a3da0] [c000a03c] kernel_bad_stack+0x0/0x4c
[df8a3dc0] [c000ef90] bad_page_fault+0xb8/0xcc
[df8a3dd0] [c000c4c8] handle_page_fault+0x7c/0x80
[df8a3e90] [c0021bb8] check_preempt_wakeup+0xd8/0x110
[df8a3eb0] [c002fb00] send_signal+0x28/0x214
[df8a3ed0] [c002fe40] send_sig_info+0x28/0x48
[df8a3ef0] [c00f35c4] jffs2_garbage_collect_trigger+0x3c/0x50
[df8a3f00] [c00f3a40] jffs2_write_super+0x30/0x5c
[df8a3f10] [c007340c] sync_supers+0x80/0xd0
[df8a3f30] [c0054dc8] wb_kupdate+0x48/0x150
[df8a3f90] [c0055434] pdflush+0x104/0x1a4
[df8a3fe0] [c00387c4] kthread+0x4c/0x88
[df8a3ff0] [c000bdb8] kernel_thread+0x44/0x60
Instruction dump:
bb61000c 38210020 4e800020 9421ffc0 7c0802a6 bf010020 90010044 7c7f1b78
7c5e1378 800203e0 3160ffff 7d2b0110 <0f090000> 54290024 8009000c 5409012f
U-Boot 1.3.4.32 (Feb 6 2011 - 10:18:30)
CPU: AMCC PowerPC 460EX Rev. B at 666.666 MHz (PLB=166, OPB=83, EBC=83 MHz)
Security/Kasumi support
Bootstrap Option E - Boot ROM Location EBC (16 bits)
Internal PCI arbiter disabled
32 kB I-Cache 32 kB D-Cache
Board: 4036QDR - Voltaire 4036 QDR Switch Board
I2C: ready
DRAM: 512 MB (ECC enabled, 333 MHz, CL3)
FLASH: 64 MB
NAND: 256 MiB
*** Warning - bad CRC, using default environment
MAC Address: 00:08:F1:20:52:E8
PCIE1: successfully set as root-complex
PCIE: Bus Dev VenId DevId Class Int
01 00 15b3 bd34 0c06 00
Net: ppc_4xx_eth0
Type run flash_nfs to mount root filesystem over NFS
Hit any key to stop autoboot: 0
=> run flash_nfs
## Booting kernel from Legacy Image at fc000000 ...
Image Name: Linux-2.6.26
Image Type: PowerPC Linux Kernel Image (gzip compressed)
Data Size: 1406000 Bytes = 1.3 MB
Load Address: 00000000
Entry Point: 00000000
Verifying Checksum ... OK
Uncompressing Kernel Image ... OK
Thanks.
Can you tell me if this will take effect immediately, or whether it requires a restart of the servers?
Will it cause an interruption of traffic?
Our IB is for traffic from Cluster nodes to SAN storage. I just want to know what the impact will be.
Thanks again!
Todd
I installed the iSER drivers successfully, but I cannot add an iSER adapter. I did the following:
[root@esxi-1:~] esxcfg-module -g iser
iser enabled = 1 options = ''
[root@esxi-1:~] vmkload_mod iser
vmkload_mod: Can not load module iser: module is already loaded
[root@esxi-1:~] esxcli rdma iser add
Failed to add device: com.vmware.iser
2017-09-21T15:54:05.956Z cpu20:68115 opID=8596fd32)WARNING: Device: 1316: Failed to register device 0x43055fde3050 logical#vmkernel#com.vmware.iser0 com.vmware.iser (parent=0x130c43055fde3244): Already exists
[root@esxi-1:~]
There is no iSCSI or iSER adapter listed. I did have an iSCSI setup originally on this host, but I removed it and the VMkernel port. Any suggestions?
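As a basic sanity check (not a guaranteed fix), it may help to confirm that ESXi actually sees an RDMA-capable uplink and that a VMkernel port exists on it before re-adding the iSER adapter; something along these lines:
# List RDMA-capable devices and the uplinks they map to
esxcli rdma device list
# List VMkernel interfaces (the iSER adapter needs a vmknic on the RoCE uplink)
esxcli network ip interface list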
The error should clear once the SM's partition configuration (IPoIB MTU setting) is adjusted to eliminate the inconsistency.
For example, to enable a 4096-byte IPoIB MTU on the Subnet Manager's default partition (assuming the SM is running on a switch), perform the steps below. If more than one switch is running an SM, this change should be made on each of them.
In MLNX-OS, the path to this setting is:
In GUI:
IB SM Mgt tab
Partitions Tab
In the existing Default Partition,
IPoIB MTU can be changed from 2K to 4K.
No other settings need to be changed.
Apply changes, but be aware that this is an intrusive configuration change and will disrupt the cluster as the SM process gets restarted and the MTU changes are applied.
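If the SM happens to be opensm running on a Linux host rather than on a switch, the equivalent change goes into opensm's partition configuration; a sketch, assuming the common default file location:
# /etc/opensm/partitions.conf
# mtu=5 selects a 4096-byte IPoIB MTU for the default partition
Default=0x7fff, ipoib, mtu=5 : ALL=full;
Restart opensm afterwards so the new partition settings are distributed; as noted above, this is a disruptive change.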
I recently bought a pair of used Mellanox InfiniBand 4x QDR ConnectX-2 VPI cards. One card is single-port, the other is dual-port.
I will use them to connect my workstation to a server for HPC applications.
I am running Windows 10 on both systems. If it helps, the systems are a somewhat old platform: 2x Xeon X5670 CPUs per node.
The seller of the Mellanox cards told me that it will be difficult to reach 40G rates on Win10 unless:
1) I use a 12K 12-port switch, or
2) I use a Linux-to-Linux configuration.
I don't want to spend time learning and configuring a Linux OS, and I don't want to buy an expensive switch either!
Do you think it's possible to reach the maximum rate (40G) of the cards by any means?
Hello, I am currently using a Mellanox ConnectX-3 adapter for testing.
The ping-pong test included in the Mellanox install package (ibv_rc_pingpong) is working.
However, tests such as rping and udaddy, which were mentioned in the post HowTo Enable, Verify and Troubleshoot RDMA,
https://community.mellanox.com/docs/DOC-2086#jive_content_id_4_rping
will not run at all.
Here is the error output:
Client (c1n15):
sungho@c1n15:~$ udaddy -s 172.23.10.30
udaddy: starting client
udaddy: connecting
udaddy: event: RDMA_CM_EVENT_ADDR_ERROR, error: -19
test complete
return status -19
Server (c1n14):
sungho@c1n14:~$ udaddy
udaddy: starting server
I have two servers connected through a switch,
and the InfiniBand and Ethernet interfaces can all ping each other,
and all the interfaces are installed and running.
However, I have doubts about the ARP table,
because it doesn't look like things are connected properly (listed below).
Here is the information for the two servers.
Do you think I need to statically add ARP entries, or is there something fundamentally wrong?
server (A)
sungho@c1n14:/usr/bin$ ibstat
CA 'mlx4_0'
CA type: MT4099
Number of ports: 1
Firmware version: 2.42.5000
Hardware version: 1
Node GUID: 0x7cfe9003009a7c30
System image GUID: 0x7cfe9003009a7c33
Port 1:
State: Active
Physical state: LinkUp
Rate: 56
Base lid: 3
LMC: 0
SM lid: 3
Capability mask: 0x0251486a
Port GUID: 0x7cfe9003009a7c31
Link layer: InfiniBand
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.23.1.1 0.0.0.0 UG 0 0 0 enp1s0f0
172.23.0.0 0.0.0.0 255.255.0.0 U 0 0 0 enp1s0f0
172.23.0.0 0.0.0.0 255.255.0.0 U 0 0 0 ib0
sungho@c1n14:/usr/bin$ arp -n
Address HWtype HWaddress Flags Mask Iface
172.23.10.1 ether 0c:c4:7a:3a:35:88 C enp1s0f0
172.23.10.15 ether 0c:c4:7a:3a:35:72 C enp1s0f0
172.23.1.1 ether 00:1b:21:5b:6a:a8 C enp1s0f0
enp1s0f0 Link encap:Ethernet HWaddr 0c:c4:7a:3a:35:70
inet addr:172.23.10.14 Bcast:172.23.255.255 Mask:255.255.0.0
inet6 addr: fe80::ec4:7aff:fe3a:3570/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:12438 errors:0 dropped:5886 overruns:0 frame:0
TX packets:5861 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2356740 (2.3 MB) TX bytes:836306 (836.3 KB)
ib0 Link encap:UNSPEC HWaddr A0-00-02-20-FE-80-00-00-00-00-00-00-00-00-00-00
inet addr:172.23.10.30 Bcast:172.23.255.255 Mask:255.255.0.0
inet6 addr: fe80::7efe:9003:9a:7c31/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:0 (0.0 B) TX bytes:616 (616.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:189 errors:0 dropped:0 overruns:0 frame:0
TX packets:189 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:13912 (13.9 KB) TX bytes:13912 (13.9 KB)
server (B)
sungho@c1n15:~$ ibstat
CA 'mlx4_0'
CA type: MT4099
Number of ports: 1
Firmware version: 2.42.5000
Hardware version: 1
Node GUID: 0x7cfe9003009a6360
System image GUID: 0x7cfe9003009a6363
Port 1:
State: Active
Physical state: LinkUp
Rate: 56
Base lid: 1
LMC: 0
SM lid: 3
Capability mask: 0x02514868
Port GUID: 0x7cfe9003009a6361
Link layer: InfiniBand
sungho@c1n15:~$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.23.1.1 0.0.0.0 UG 0 0 0 enp1s0f0
172.23.0.0 0.0.0.0 255.255.0.0 U 0 0 0 enp1s0f0
172.23.0.0 0.0.0.0 255.255.0.0 U 0 0 0 ib0
sungho@c1n15:~$ arp -n
Address HWtype HWaddress Flags Mask Iface
172.23.10.14 ether 0c:c4:7a:3a:35:70 C enp1s0f0
172.23.10.1 ether 0c:c4:7a:3a:35:88 C enp1s0f0
172.23.10.30 ether 0c:c4:7a:3a:35:70 C enp1s0f0
172.23.1.1 ether 00:1b:21:5b:6a:a8 C enp1s0f0
enp1s0f0 Link encap:Ethernet HWaddr 0c:c4:7a:3a:35:72
inet addr:172.23.10.15 Bcast:172.23.255.255 Mask:255.255.0.0
inet6 addr: fe80::ec4:7aff:fe3a:3572/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:19432 errors:0 dropped:5938 overruns:0 frame:0
TX packets:8783 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8246898 (8.2 MB) TX bytes:1050793 (1.0 MB)
ib0 Link encap:UNSPEC HWaddr A0-00-02-20-FE-80-00-00-00-00-00-00-00-00-00-00
inet addr:172.23.10.31 Bcast:172.23.255.255 Mask:255.255.0.0
inet6 addr: fe80::7efe:9003:9a:6361/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:0 (0.0 B) TX bytes:1232 (1.2 KB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:109 errors:0 dropped:0 overruns:0 frame:0
TX packets:109 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:7992 (7.9 KB) TX bytes:7992 (7.9 KB)
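Looking at the output above, the likely culprit is that enp1s0f0 and ib0 sit in the same 172.23.0.0/16 subnet, so RDMA CM address resolution for 172.23.10.30 goes out the Ethernet interface (server B's ARP table even maps 172.23.10.30 to server A's Ethernet MAC), and since that interface has no RDMA device the result is RDMA_CM_EVENT_ADDR_ERROR (-19). A minimal sketch of one way around it, moving IPoIB onto its own subnet (the 10.10.10.0/24 addresses are made up for illustration):
# On server A
ip addr flush dev ib0
ip addr add 10.10.10.30/24 dev ib0
# On server B
ip addr flush dev ib0
ip addr add 10.10.10.31/24 dev ib0
# Then rerun the test against the IPoIB address:
# server: udaddy
# client: udaddy -s 10.10.10.30
Static ARP entries should not be needed once the routing is unambiguous.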