Channel: Mellanox Interconnect Community: Message List
Viewing all 6226 articles

ib_sdp {failed}

Hello folks,

    Hope all are doing well!

    I'm an HPC admin trainee and I have an issue on my cluster. One of the nodes is unable to run because "ib_sdp {failed}" is shown at boot time. I tried the following commands:

#/etc/init.d/openibd restart

Unloading ib_addr                                          [FAILED]

ERROR: Module ib_addr is in use by ib_core

#service openibd stop

Unloading ib_addr                                          [FAILED]

ERROR: Module ib_addr is in use by ib_core

#service openibd start

ls: cannot access /sys/class/infiniband/qib*: No such file or directory

Loading HCA driver and Access Layer:                       [  OK  ]

Setting up InfiniBand network interfaces:

Determining if ip address 192.168.x.x is already in use for device ib0...

Bringing up interface ib0:                                 [  OK  ]

Setting up service network . . .                           [  done  ]

Loading ib_sdp                                             [FAILED]

 

Kindly help to resolve this issue.

Thanks in advance!!
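
For anyone reading the output above, a minimal sketch of a helper that pulls out only the steps openibd reported as [FAILED]. The function name failed_steps is hypothetical, and it assumes the "Loading <module> ... [FAILED]" line layout shown in this post; it is a triage aid, not a fix.

```shell
#!/bin/sh
# Hypothetical helper: list the load/unload steps that openibd marked [FAILED].
# Reads the service output on stdin and prints "<action> <module>" per line.
failed_steps() {
    # Keep only lines such as "Loading ib_sdp   [FAILED]" and print the
    # first two words (the action and the module name).
    awk '/\[FAILED\]/ && ($1 == "Loading" || $1 == "Unloading") { print $1, $2 }'
}
```

It could be used as "service openibd start 2>&1 | failed_steps". For a module it names, "modinfo ib_sdp" and "dmesg | grep -i sdp" are the usual next checks to see whether the module exists for the running kernel and what the kernel logged when loading it.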


Re: Ethernet mode non-functional with recent CentOS7 kernels and ConnectX-2 cards?

Understood, Viki.  That may have had something to do with why I opened this topic in the Hobby and Home Users Group, where it's common to use gear sourced from (e.g.) eBay and similar.

 

Regarding the problem itself, RH staff have confirmed it, after someone else reported the same problem (I guess with a newer card).

 

A working solution is in the CentOS bug report too, in case that's of use to people.  It also works with the older ConnectX (series 1, not 2) cards, as I also tested with that.

Re: Freebsd 11.1 ConnectX-4 VPI

> What do you mean by inbox drivers?

 

"Inbox drivers" is a weird term that some vendors (mostly Mellanox it seems) use to describe the drivers provided by the OS.

 

My suspicion is that it's a shortened way to say "the drivers that came in the box".  I'd not personally heard anyone other than Mellanox use the term until recently, when it was used by (from memory) a Red Hat engineer.  Hopefully the phrase dies out though, and people use the industry standard phrases instead, e.g. "OS-provided driver".

Re: Why MFA2P10-A003 doesn't work while MCP2M00-A002 works?

Hi Yao,

 

You should contact Xilinx, as this seems to be an issue with Xilinx and the cable.

Re: Ubuntu Connected mode not working

Hi Adam,

To set up connected mode, please first try disabling IPoIB enhanced mode. To do so, follow the procedure below:

1) Disable "ipoib_enhanced":

#vi /etc/modprobe.d/ib_ipoib.conf

 

Add the following entry:

options ib_ipoib ipoib_enhanced=0

 

2) Stop and start openibd service:

#/etc/init.d/openibd stop

#/etc/init.d/openibd start

 

3) Verify parameter is disabled: (The value should be 0)

#cat /sys/module/ib_ipoib/parameters/ipoib_enhanced

 

4) Verify the mode:

#cat /sys/class/net/ib0/mode

 

Once it is disabled, please go ahead and enable connected mode.

 

Thanks,

Samer
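
Steps 3 and 4 above can be combined into one check. This is only a sketch: the function name ipoib_report and its optional second argument (an alternate sysfs root, useful for dry runs) are assumptions for illustration, while the two sysfs paths are the ones given in the procedure.

```shell
#!/bin/sh
# Hypothetical helper: print the ipoib_enhanced parameter and the interface
# mode, and succeed only if enhanced mode is off.
#   $1: interface name (default ib0)
#   $2: sysfs root (default /sys; the override is an assumption for testing)
ipoib_report() {
    iface="${1:-ib0}"
    root="${2:-/sys}"
    enhanced_file="$root/module/ib_ipoib/parameters/ipoib_enhanced"
    mode_file="$root/class/net/$iface/mode"
    # Fall back to "unknown" if a file is missing (module not loaded, no such interface).
    enhanced=$(cat "$enhanced_file" 2>/dev/null || echo unknown)
    mode=$(cat "$mode_file" 2>/dev/null || echo unknown)
    echo "ipoib_enhanced=$enhanced mode=$mode"
    # Return success only when the parameter reads 0, as step 3 expects.
    [ "$enhanced" = "0" ]
}
```

Run as "ipoib_report ib0" after restarting openibd. If it succeeds, connected mode is commonly selected with "echo connected > /sys/class/net/ib0/mode" as root, and step 4 can then be repeated to confirm the change.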

Re: mlxconfig set fails with "-E- Parameter ' value is smaller than minimum allowed 1"

Re: Proxmox 5.1 (Debian 9.2) Mellanox Connect-X2

Hi,

 

Can you send me the output of:

ifconfig -a, uname -a, arp -a, ip address show

 

Thanks

Marc

Re: mlxconfig set fails with "-E- Parameter ' value is smaller than minimum allowed 1"

Hi Samer,

 

thanks for your reply.

 

The card has the latest firmware:

 

[root@localhost:/opt/mellanox] ./bin/flint -d mt4115_pciconf0 -i fw-ConnectX4-rel-12_21_2010-MCX416A-BCA_Ax-FlexBoot-3.5.305.bin burn

 

    Current FW version on flash:  12.21.2010

    New FW version:                     12.21.2010

 

    Note: The new FW version is the same as the current FW version on flash.

 

Do you want to continue ? (y/n) [n] :

 

It must be something else.

 

Cheers

Axel


Soft RoCE not working (no errors)

Hello,

 

I am attempting to run soft RoCE and interface with a X4 card in a different computer.

 

I am running CentOS 7.4, and I installed MLNX_OFED version 4.2-1.2.0.0. The install finished without error, and I ran the service restart command when prompted. I then tried to set up soft RoCE following the directions here: HowTo Configure Soft-RoCE.

 

When I run rxe_cfg status/start, the script complains that the rdma_rxe module is not loaded (and reports no other errors, even in verbose mode). When I run lsmod | grep rdma_rxe, I see that rdma_rxe is in fact loaded, and that it is using mlx_compat. This is a small variation from the above instructions: on my system rdma_rxe is using mlx_compat, not ib_core (even though ib_core is loaded and used by mlx_compat). I figured this is some wrapper used by Mellanox in newer versions of the OFED. I have even tried running modprobe rdma_rxe and see no error messages when loading rdma_rxe, and dmesg does not show any error messages from the kernel. I have also tried reloading the module and restarting the machine.

 

After 'starting' rxe_cfg, doing rxe_cfg add <adapter_name> does nothing. It does not load any IB devices associated with the NIC, and I still see the 'rdma_rxe module is not loaded' message.

 

I looked around a bunch and could not find anything which helped. I have also tried the same stuff with version 4.2-1.0.0.0 of MLNX_OFED. This computer did have a X4 card in it when I first installed the OFED package. I took it out in case it was preventing soft RoCE from working on other NICs, restarted, re-installed OFED, and did the same troubleshooting without the Mellanox card in.

 

Any help would be appreciated.
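
To make the lsmod check described above repeatable, here is a minimal sketch that reports which modules the kernel lists as users of a given module. The function name module_users is hypothetical, and it assumes the standard lsmod layout (Module, Size, use count, Used by). It only shows the kernel's view and does not explain why rxe_cfg disagrees.

```shell
#!/bin/sh
# Hypothetical helper: print the "Used by" list for one module.
# Reads lsmod output on stdin; $1 is the module name.
# Prints "-" when the module is loaded but has no users, and
# returns non-zero when the module is not listed at all.
module_users() {
    awk -v mod="$1" '
        $1 == mod { found = 1; print ($4 == "" ? "-" : $4) }
        END       { exit !found }
    '
}
```

For example, "lsmod | module_users rdma_rxe" and "lsmod | module_users mlx_compat" show the dependency chain from both sides. A non-zero exit status means the module is not loaded as far as the kernel is concerned.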

Re: mlxconfig set fails with "-E- Parameter ' value is smaller than minimum allowed 1"

Hi Samer,

 

it was the "mlxconfig -d <device> reset" which did the trick. So I guess there was a f... up in the configuration and the reset cleaned it.

 

Thank you very much!

Axel

Re: Freebsd 11.1 ConnectX-4 VPI

"Inbox driver" is a term that Mellanox uses for the drivers that come with Ubuntu, CentOS, Debian, RHEL and other Linux distributions. It is generic across all types of Linux distros. This is their way of doing business.

My problem is that even if I don't install anything, FreeBSD 11 lists ConnectX-4 VPI drivers. My question is: what is the use of installing the Mellanox drivers from GitHub if the FreeBSD inbox drivers are already there?

Furthermore, it seems iSER-RDMA is supported only for the initiator but not the target; I concluded this only because I did not get any response on this issue. An initiator without a target is like coffee without a cookie.

Recover SN2100 switch from 'grub rescue' prompt

Hi,

 

I've just bricked an SN2100 because of a power failure during the ONIE uninstall process. Now every time I boot it, it goes to the 'grub rescue' prompt. Is there any way to recover it? Can I follow this guide to re-install ONIE using a USB drive?

 

BR,

 

Donny Hariady

Re: Recover SN2100 switch from 'grub rescue' prompt

Hi Donny,

 

Yes, you can use the USB ONIE recovery procedure.

Re: Recover SN2100 switch from 'grub rescue' prompt

Hi Eddie,

 

Thanks for your quick answer! Right now I'm still waiting for a usb-to-usb_mini converter cable, but from what I've observed I'll need a password to enter the BIOS settings. Do we need to change the boot order in order to boot from the USB drive? If so, how, or from whom, can I get the password? This is a brand new switch (about one month old) and nobody except me has ever touched it, so I'm sure the password is still the factory default.

Re: Recover SN2100 switch from 'grub rescue' prompt


Re: spectre and meltdown vulnerabilities

SR-IOV VMware ESXi 6.5 ConnectX-4 problem with more VFs

Hi,

 

My configuration:

Server Dell PowerEdge R710, 2x6 core (2x12 threads), 96GB RAM.

Mellanox ConnectX-4 Lx EN-MCX4121A-XCAT

Firmware: FW 14.21.2010, PXE 3.5.0305

VMware 6.5: DellEMC-ESXi-6.5U1-7388607-A07 (Dell)

 

Based on this documentation: http://www.mellanox.com/related-docs/prod_software/Mellanox_MLNX-NATIVE-ESX-ConnectX-4-5_Driver_for_VMware_ESXi_6.5_Rele…

 

I configured the card:

/opt/mellanox/bin/mlxconfig -d mt4117_pciconf0 set SRIOV_EN=1 NUM_OF_VFS=18 (In documentation: Firmware VF configuration must be N+1)

esxcli system module parameters set -m nmlx5_core -p max_vfs=16

 

I now have 32 VFs (16 VFs per port). How can I create more VFs, when in ESXi "esxcli system module parameters list -m nmlx5_core" shows:

max_vfs, Values : 0-16, 0 - disabled

Is ESXi limited to 16 VFs per port? How can I create more VFs?

 

Best Regards

Robert

RoCEv2 on Windows 7 using ConnectX-3 Pro Ethernet adapter

Hello,

I'm trying to figure out how to use RoCEv2 (or v1, neither seems to work...) on Windows 7 using a ConnectX-3 Pro Ethernet Adapter and WinOF v5.35.

I have enabled RoCEv2 and set the RoCEv2 Port to 4791 using the registry keys in HKLM/SYSTEM/CurrentControlSet/services/mlx4_bus/Parameters/Roce.

The WinOF User manual states to use the Microsoft "Network Direct SPI" for RoCE programming. When I do so there is no Network Direct Provider available on the system.

What can I do? Is there an alternate way of programming with RoCE? Or is there a way to install the NetworkDirectProvider?

All the configuration tutorials I have found so far are for Server Platforms, but I cannot use one.

 

If it is not possible to use RoCE in Windows 7, would it be possible in Windows 10?

 

Any help is highly appreciated!

 

Best Regards,

Dominic

Connecting 4 computers without a switch

Hi,

I have been trying to set up a small computing cluster of 4 computers using just ConnectX-5 HCAs.  We have one master computer with two cards (one dual-port and one single-port) and three slave computers, each with a single port.  When only the master and one slave are hooked up it works fine, but when I start adding more slaves, the connection to the other system drops.

Do I need to have some sort of connection manager set up?  Any advice on how to set this up would be greatly appreciated (or at least a pointer to the documentation would be nice).

Thanks in advance,

Bryan

IS5024Q InfiniScale IV firmware issue

I'm trying to update the firmware on my switch. I issued the following commands and ran into a PSID mismatch.  I'm unable to find any mention of GEM0F80110002 on the net and am not sure whether a force flash can be done.

 

C:\>mst ib add

-I- Discovering the fabric - Running: ibnetdiscover.exe

-I- Added 3 in-band devices

 

 

 

C:\>mst status

MST devices:

------------

  mt4099_pci_cr0

  mt4099_pciconf0

 

 

Inband devices:

-------------------

  CA_MT4099_LLWPHOSTP01_lid-0x0003

  CA_MT4099_LLWPHOSTP02_lid-0x0004

  SW_MT48438_0x2c90200433c38_lid-0x0002

 

 

 

C:\>flint -d /dev/mst/SW_MT48438_0x2c90200433c38_lid-0x0002 q

Image type:            FS2

FW Version:            7.4.0000

Device ID:             48438

Description:           Node             Sys image

GUIDs:                 0002c90200433c38 0002c90200433c3b

VSD:                   n/a

PSID:                  GEM0F80110002

 

C:\Tools>flint -d /dev/mst/SW_MT48438_0x2c90200433c38_lid-0x0002 -i ./fw-IS4-rel-7_4_3000-MIS5024Q_A1-A5.bin -qq b

 

 

    Current FW version on flash:  7.4.0000

    New FW version:               7.4.3000

 

 

 

 

-E- PSID mismatch. The PSID on flash (GEM0F80110002) differs from the PSID in the given image (MT_0F80110002).
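
The mismatch can be seen before attempting a burn by comparing the PSID reported for the device with the PSID of the image. Below is a minimal sketch; the function names psid_from_query and psid_check are hypothetical, and the parsing assumes the "PSID:" line format shown in the query output above. Note that the two PSIDs here differ only in their prefix (GEM versus MT_).

```shell
#!/bin/sh
# Hypothetical helper: extract the PSID from "flint ... q" output on stdin.
psid_from_query() {
    awk '$1 == "PSID:" { print $2; exit }'
}

# Hypothetical helper: compare the PSID on flash ($1) with the image PSID ($2).
# Prints the result and returns non-zero on a mismatch.
psid_check() {
    if [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch: flash=$1 image=$2"
        return 1
    fi
}
```

A possible use: capture "flint -d <device> q | psid_from_query" and "flint -i <image> q | psid_from_query" into two variables and pass them to psid_check. This only reports the difference; whether overriding the PSID is safe for this switch is a separate question for the vendor.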


