Mellanox Interconnect Community: Message List

Re: Neo, error 'Device Management Discovery' .


Setup Mellanox MSX1012B in HA environment.


Hello Community,

 

I am new to Mellanox switches. I am trying to configure two MSX1012B units in an HA environment. These switches will sit behind two Juniper firewalls serving the server farm.

I followed the configuration guide, but I am a little confused about whether I need to configure an IPL and MLAG to meet my requirement. Below is the diagram of what I want to achieve. Switch-A and Switch-B are MSX1012B.

 

Looking forward. (Attached diagram: IDC-for-community - Page 1.png)

 

Thank you!

 

 

Re: Neo, error 'Device Management Discovery' .


For Ethernet discovery to work properly, you must configure LLDP on all managed devices (e.g. MSN2700B, MSN2410):

1) Configure LLDP on the switches (see the example after this list).

2) Turn on LLDP Discovery in NEO.

3) Restart the NEO service by running the following command:

/opt/neo/neoservice restart

4) Monitor whether the same issue occurs.
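A minimal sketch of step 1 on a Mellanox Onyx/MLNX-OS switch (the prompt shown is only illustrative; LLDP is enabled with the global lldp command and the configuration is then saved):

switch > enable
switch # configure terminal
switch (config) # lldp
switch (config) # configuration write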

Re: Neo, error 'Device Management Discovery' .


Done, but the same issue still occurs.

LLDP was already configured on all the switches:

##
## LLDP configuration
##
lldp

Re: Neo, error 'Device Management Discovery' .


Hi ,

I saw that you opened support case #474466 through the IBS account.

We will continue debugging through the support case.

 

Thanks,

Samer

Re: Firmware for MHJH29 ?


Hello Romain -

   Good day to you...

Could you get the board_id with "ibv_devinfo"?

And the part number with:

> lspci | grep Mell       (note the bus:dev.func of the device)

> lspci -s bus:dev.func -xxxvvv

In the lspci output, look for:

Read-only fields:

                        [PN] Part number:
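A compact way to pull both pieces of information at once (just a convenience sketch; 15b3 is the Mellanox PCI vendor ID):

ibv_devinfo | grep board_id
lspci -d 15b3: -vvv | grep -A3 'Read-only fields'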

 

If you could update this thread with this information it would be very helpful.

thanks - steve

Re: Problem with symbol error counter


Usually, symbol errors are caused by some physical condition and in many cases are fixed by a) reseating BOTH ends of the cable or b) replacing the cable. If you are using an OEM solution, after trying to reseat the cables you might contact the hardware vendor, check whether your equipment is under warranty, and open a case with them.

To reset the fabric counters, use the 'ibdiagnet -pc' command; the same command is used to collect information about the fabric. ibqueryerrors, although it ships with Mellanox OFED, shouldn't be used, as it is no longer under active development. ibdiagnet is a Swiss army knife.
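A minimal sketch of that workflow (ibdiagnet ships with MLNX_OFED; its reports go to /var/tmp/ibdiagnet2/ by default):

# clear all port counters across the fabric
ibdiagnet -pc

# ... let traffic run for a while, then re-scan the fabric and review the reports
ibdiagnet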

missing ifup-ib in latest release?


Hi.. I have some old cluster nodes that were working fine under previous versions of CentOS 7 (I think it was CentOS 7.3 before the update), but after a recent update to CentOS 7.5 I can't get the interface to come up. I reinstalled the latest MLNX_OFED drivers (MLNX_OFED_LINUX-4.3-3.0.2.1-rhel7.5-x86_64), which installed properly. I see the card in lspci and the kernel modules seem to be loaded as well. However, I can't bring up the interface. Doing an ifup I get this:

 

ifup ib0

ERROR     : [/etc/sysconfig/network-scripts/ifup-eth] Device ib0 does not seem to be present, delaying initialization.

 

It seemed weird to me that it was trying to use the ifup-eth code instead of the ifup-ib code to bring up the interface. When I looked for the ifup-ib file, I couldn't find it on the system with the MLNX_OFED software installed. If I don't install MLNX_OFED and just leave the CentOS drivers in place, the card comes up fine. I also notice that the file comes from the rdma-core package in CentOS:

 

# rpm -qf /etc/sysconfig/network-scripts/ifup-ib

rdma-core-15-7.el7_5.x86_64

 

When I look at the machine with MLNX_OFED installed, I don't see an rdma-core package...

 

 

# rpm -qa | grep rdma

librdmacm-41mlnx1-OFED.4.2.0.1.3.43302.x86_64

librdmacm-utils-41mlnx1-OFED.4.2.0.1.3.43302.x86_64

librdmacm-devel-41mlnx1-OFED.4.2.0.1.3.43302.x86_64

 

So I'm wondering if I am missing something here. With previous versions I didn't have any issues installing and using it. Does anyone have advice on what I should look at next to figure this out? Thanks,
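For what it's worth, a generic way to ask the distro which package could provide the missing script (a diagnostic sketch, nothing MLNX_OFED-specific):

yum provides '*/ifup-ib'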


Re: ConnectX-5 EN vRouter Offload


Hi Marc,

 

Do you mean that the product brief contains an over-promising mistake? Contrail cannot use OVS.

 

Best regards,

Re: Ceph with OVS Offload

Re: igmp mlag vlan config right?


Thank you for the answers to my questions!

One correction:

Wrong command:

ip igmp snooping static-group 232.43.211.234 interface mlag-port-channel 1 source 192.168.1.1 192.168.1.2 192.168.1.3 192.168.1.4

Right command:

vlan 1 ip igmp snooping static-group 232.43.211.234 interface mlag-port-channel 1

Unfortunately, this does not work because the IPL interface Po11 cannot be added to a static group.

 

mrouter problem:

When I set the MLAG ports as mrouter ports (sketched after the configuration below), I do not get a multicast group, either dynamically or statically.

Here is the working configuration in an MLAG:

# enable IGMP snooping globally
ip igmp snooping

# enable snooping per VLAN
vlan 1 ip igmp snooping

# enable the querier per VLAN
vlan 1 ip igmp snooping querier

# set the IGMP snooping querier IP address (WebUI: off) to a free IP, 1.1.1.1
vlan 1 ip igmp snooping querier address 1.1.1.1
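For completeness, the mrouter attempt described above would look something like this in Onyx syntax (a sketch only; the exact command form is my assumption, and this is the configuration that did not produce a group):

# mark the MLAG port-channel as a static multicast-router port for VLAN 1
vlan 1 ip igmp snooping mrouter interface mlag-port-channel 1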

Re: missing ifup-ib in latest release?


Hi ,

 

Could you please check whether the ib0 interface appears under "ifconfig -a"?

If not, I suggest the following:

1) Invoke mst start -> mst status -> ifconfig and check again.

2) Try to restart the interfaces:

- /etc/init.d/openibd restart

- opensm start, or start the SM on the switch

3) If the above still does not work, create the interface manually (and then bring it up as shown after the file):

vi /etc/sysconfig/network-scripts/ifcfg-ib0

NAME="ib0"

DEVICE="ib0"

ONBOOT=yes

BOOTPROTO=static

TYPE=Infiniband

IPADDR=<ip from the same subnet>
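After saving the file, bring the interface up with the standard CentOS 7 network scripts; a minimal sketch (assumes the ifcfg-ib0 above and that openibd is running):

ifup ib0

# or restart networking entirely
systemctl restart network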

 

Thanks,

Samer

How to enable VF multi-queue for SR-IOV on KVM?


I have successfully enabled SR-IOV on KVM for ConnectX-3 (InfiniBand). Using the iperf tool, I get speeds up to 28.6 Gb/s between guest hosts, but only up to 14 Gb/s between virtual machines. I found that although the virtual machine shows multiple queues in /proc/interrupts, only one queue is actually used. I have configured smp_affinity and disabled the irqbalance service. How can I enable VF multi-queue for SR-IOV on KVM?

Thanks !

 

vm host:

[root@host-09 ~]# cat /proc/interrupts | grep mlx4

45:        106         52         58         59         59         59         54         55   PCI-MSI-edge      mlx4-async@pci:0000:00:07.0

46:    2435659    2939965      41253      26523      49013      59796      56406      70341   PCI-MSI-edge      mlx4-1@0000:00:07.0

47:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-2@0000:00:07.0

48:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-3@0000:00:07.0

49:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-4@0000:00:07.0

50:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-5@0000:00:07.0

51:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-6@0000:00:07.0

52:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-7@0000:00:07.0

53:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-8@0000:00:07.0

54:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-9@0000:00:07.0

55:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-10@0000:00:07.0

56:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-11@0000:00:07.0

57:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-12@0000:00:07.0

58:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-13@0000:00:07.0

59:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-14@0000:00:07.0

60:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-15@0000:00:07.0

61:          0          0          0          0          0          0          0          0   PCI-MSI-edge      mlx4-16@0000:00:07.0

 

[root@host-09 ~]# cat /proc/irq/46/smp_affinity

02

[root@host-09 ~]# cat /proc/irq/47/smp_affinity

04

[root@host-09 ~]# cat /proc/irq/48/smp_affinity

08

[root@host-09 ~]# cat /proc/irq/49/smp_affinity

10

[root@host-09 ~]# cat /proc/irq/50/smp_affinity

20

[root@host-09 ~]# cat /proc/irq/51/smp_affinity

40

[root@host-09 ~]# cat /proc/irq/52/smp_affinity

80

[root@host-09 ~]# cat /proc/irq/53/smp_affinity

01

[root@host-09 ~]# cat /proc/irq/54/smp_affinity

02

[root@host-09 ~]# cat /proc/irq/55/smp_affinity

04

[root@host-09 ~]# cat /proc/irq/56/smp_affinity

08

[root@host-09 ~]# cat /proc/irq/57/smp_affinity

10

[root@host-09 ~]# cat /proc/irq/58/smp_affinity

20

[root@host-09 ~]# cat /proc/irq/59/smp_affinity

40

[root@host-09 ~]# cat /proc/irq/60/smp_affinity

80

[root@host-09 ~]# cat /proc/irq/61/smp_affinity

01

 

[root@host-09 ~]# ls -la /sys/class/net/ib0/queues/

total 0

drwxr-xr-x 4 root root 0 Jun 26 12:11 .

drwxr-xr-x 5 root root 0 Jun 26 12:11 ..

drwxr-xr-x 2 root root 0 Jun 26 12:11 rx-0

drwxr-xr-x 3 root root 0 Jun 26 12:11 tx-0

Re: mlnx_qos cannot assign priority values to TCs after 8 SR-IOV devices


Hi Steve,

 

I'm using the latest FW and MOFED as well.

I've issued the command you advised and it returned the correct output, but the priority levels are all stacked under TC0, whereas by default they are spread over all the TCs.

 

The output:

mlnx_qos -i enp6s0f1 --pfc 0,0,0,1,0,0,0,0

DCBX mode: OS controlled

Priority trust state: pcp

Cable len: 7

PFC configuration:

priority    0   1   2   3   4   5   6   7

enabled     0   0   0   1   0   0   0   0  

tc: 0 ratelimit: unlimited, tsa: vendor

priority:  0

priority:  1

priority:  2

priority:  3

priority:  4

priority:  5

priority:  6

priority:  7
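For reference, an explicit priority-to-TC mapping is normally applied with the --prio_tc option of mlnx_qos; a sketch only, reusing the interface name from the output above (this is the kind of assignment that stops working once more than 8 SR-IOV devices are configured):

# map priority i back to traffic class i
mlnx_qos -i enp6s0f1 --prio_tc 0,1,2,3,4,5,6,7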

 

Thanks,

David

Re: Firmware for MHJH29 ?


Stephen Yannalfo wrote:

> Could you get the board_id with "ibv_devinfo"

 

Sure; took a little time as I had to get a system running to put the HCA in ;-)

 

When I put some IB between systems at home years ago, the PCIe would limit me to about DDR anyway (one 8x 1.0, and 4x 2.0 ...), so I used DDR boards with available firmware. I have since upgraded some bits of hardware, and am wondering if I couldn't get QDR on 8x 2.0. But it seems QDR was mostly deployed with the QSFP connector, not CX4, so the MHJH29 seems a bit of a black sheep from that era...

 

Thanks for your help !

 

 

Host is a Supermicro X9SRi, primary PCIe slot (16x 2.0), running CentOS 7.3.

 

[root@localhost ~]# ibv_devinfo

hca_id: mlx4_0

        transport:                      InfiniBand (0)

        fw_ver:                         2.6.900

        node_guid:                      0002:c903:0002:173a

        sys_image_guid:                 0002:c903:0002:173d

        vendor_id:                      0x02c9

        vendor_part_id:                 26428

        hw_ver:                         0xA0

        board_id:                       MT_04E0120005

        phys_port_cnt:                  2

                port:   1

                        state:                  PORT_DOWN (1)

                        max_mtu:                4096 (5)

                        active_mtu:             4096 (5)

                        sm_lid:                 0

                        port_lid:               0

                        port_lmc:               0x00

                        link_layer:             InfiniBand

(The second port is the same as the first; nothing is plugged in.)

2.6.900 feels old; the latest seems to be 2.9.1[000].

 

> And the part number with:

> > lspci | grep Mell       NOTE: the bus:dev.func of the device

> > lspci -s bus:dev.func -xxxvvv

 

That's one verbose lspci :-)

 

02:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev a0)

        Subsystem: Mellanox Technologies Device 0005

        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+

        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

        Latency: 0, Cache Line Size: 64 bytes

        Interrupt: pin A routed to IRQ 61

        NUMA node: 0

        Region 0: Memory at fba00000 (64-bit, non-prefetchable) [size=1M]

        Region 2: Memory at 38ffff000000 (64-bit, prefetchable) [size=8M]

        Capabilities: [40] Power Management version 3

                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)

                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

        Capabilities: [48] Vital Product Data

                Product Name: Eagle QDR

                Read-only fields:

                        [PN] Part number: MHJH29-XTC          

                        [EC] Engineering changes: X5

                        [SN] Serial number: MT0821X00122           

                        [V0] Vendor specific: PCIe Gen2 x8   

                        [RV] Reserved: checksum good, 0 byte(s) reserved

                Read/write fields:

                        [V1] Vendor specific: N/A  

                        [YA] Asset tag: N/A                            

                        [RW] Read-write area: 111 byte(s) free

                End

        Capabilities: [9c] MSI-X: Enable+ Count=256 Masked-

                Vector table: BAR=0 offset=0007c000

                PBA: BAR=0 offset=0007d000

        Capabilities: [60] Express (v2) Endpoint, MSI 00

                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited

                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 75.000W

                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-

                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-

                        MaxPayload 256 bytes, MaxReadReq 512 bytes

                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-

                LnkCap: Port #8, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited, L1 unlimited

                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-

                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-

                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-

                LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-

                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported

                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled

                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-

                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-

                         Compliance De-emphasis: -6dB

                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-

                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-

        Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)

                ARICap: MFVC- ACS-, Next Function: 1

                ARICtl: MFVC- ACS-, Function Group: 0

        Kernel driver in use: mlx4_core

        Kernel modules: mlx4_core

00: b3 15 3c 67 06 04 10 00 a0 00 06 0c 10 00 00 00

10: 04 00 a0 fb 00 00 00 00 0c 00 00 ff ff 38 00 00

20: 00 00 00 00 00 00 00 00 00 00 00 00 b3 15 05 00

30: 00 00 00 00 40 00 00 00 00 00 00 00 0b 01 00 00

40: 01 48 03 00 00 00 00 00 03 9c ff 7f 00 00 00 78

50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

60: 10 00 02 00 01 8e 2c 01 20 20 00 00 82 f4 03 08

70: 00 00 82 00 00 00 00 00 00 00 00 00 00 00 00 00

80: 00 00 00 00 1f 00 00 00 00 00 00 00 00 00 00 00

90: 02 00 00 00 00 00 00 00 00 00 00 00 11 60 ff 80

a0: 00 c0 07 00 00 d0 07 00 05 00 8a 00 00 00 00 00

b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
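Should a newer image for this PSID (MT_04E0120005) ever turn up, burning it would normally be done with the MFT flint tool; a sketch only, the mst device name and image filename below are just examples:

mst start
mst status                                   # note the /dev/mst/ device name
flint -d /dev/mst/mt26428_pci_cr0 query      # shows current FW version and PSID
flint -d /dev/mst/mt26428_pci_cr0 -i fw-MHJH29.bin burn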


Re: ConnectX-5 EN vRouter Offload


Hi Marc,

 

What is the best way to run Contrail vRouter in an HCI environment where the kernel will use the same bonded ConnectX-5 EN ports as the vRouter? I want to achieve the maximum possible offload on the ConnectX-5. Should I use SR-IOV, DPDK, SR-IOV + DPDK, or something else? If SR-IOV is involved, should I use it with a PF or a VF?

 

Best regards,

get a dump cqe when trying to invalid mr in cx4


Hi,

 

I have a problem with local-invalidate / send-with-invalidate operations on ConnectX-4 NICs.

 

[863318.002031] mlx5_0:dump_cqe:275:(pid 31419): dump error cqe

[863318.002032] 00000000 00000000 00000000 00000000

[863318.002033] 00000000 00000000 00000000 00000000

[863318.002034] 00000000 00000000 00000000 00000000

[863318.002035] 00000000 09007806 25000178 000006d2

 

ofed version:

MLNX_OFED_LINUX-4.1-1.0.2.0 (OFED-4.1-1.0.2)

 

firmware version:

Querying Mellanox devices firmware ...

 

Device #1:

----------

 

  Device Type:      ConnectX4LX

  Part Number:      MCX4121A-ACA_Ax

  Description:      ConnectX-4 Lx EN network interface card; 25GbE dual-port SFP28; PCIe3.0 x8; ROHS R6

  PSID:             MT_2420110034

  PCI Device Name:  0000:81:00.0

  Base MAC:         0000248a07b37aa2

  Versions:         Current        Available    

     FW             14.20.1010     N/A          

     PXE            3.5.0210       N/A          

 

  Status:           No matching image found

 

I am using a tool I developed; it works on other devices such as ConnectX-3 and QLogic. I've checked the input parameters and made sure they are correct. Please share any suggestions on how to fix this.

Thanks

Re: XenServer 7.2 64bit

Re: Connection between two infiniband ports

Re: mlnx_qos cannot assign priority values to TCs after 8 SR-IOV devices


Hello David -

   I hope all is well...

Could you open a case with Mellanox Support so we can take a deeper look at this issue?

 

Thank you -

Steve
