Channel: Mellanox Interconnect Community: Message List

SX1012 Connection to Cisco 4506-E

Hi All,

 

We are trying to connect an SX1012 to a 10GbE port on a Cisco 4506-E. We used a 40GbE-to-10GbE DAC cable, but the link only comes up when I change the speed of the Mellanox port to 1GbE. Is there anything missing in our configuration? Does the Cisco port need to be forced to 10GbE rather than auto-negotiate?

 

Thanks.

 

Regards,

Reggie


Re: SX1012 Connection to Cisco 4506-E

Hi Reggie,

 

Which type of cable are you using? What are the vendor and part number?

Is the Cisco switch locking out the cable on its end?

Did you configure the port on the Mellanox side to 10G?
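
For reference (an illustrative sketch, not from the original thread), forcing 10GbE on the SX1012 side typically looks something like this in the MLNX-OS CLI; the interface number is only an example and the exact keywords can vary between MLNX-OS releases:

switch (config) # interface ethernet 1/1
switch (config interface ethernet 1/1) # speed 10000
switch (config interface ethernet 1/1) # no shutdown

If a 40GbE-to-4x10GbE splitter cable is used, the QSFP port usually also has to be split first (module-type qsfp-split-4 on the SX1012), which typically requires shutting the port down first.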

Re: SX1012 Connection to Cisco 4506-E

Hi Eddie,

 

I interchanged MC2309130-005 and MC2609130-003 (the latter is the one that splits the speed down to 10Gb, if I recall correctly).

I also configured the port speed to 10GbE. See its status in the GUI:

 

Port Information.JPG

 

I will check whether they locked the cables on their end.

 

Regards,

Reggie

Re: Connecting 4 computers without a switch

I bet we are considering the same thing.

We are building a "no-hop" grid, but out of 100GbE links.

From some googling I read that one way to test the throughput / cabling is to connect the CU QSFP28 cable directly to another card in another PC and do a point-to-point transfer with dd or another tool.

This validates the cable and ports without any switch in the way. Why wouldn't that work natively, as you also suggest - no switch latency delays.
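
A rough sketch of that kind of back-to-back test (interface names and addresses are placeholders, and iperf3 is used instead of dd so the disks stay out of the measurement):

on host A:   ip addr add 192.168.100.1/24 dev enp1s0 && iperf3 -s
on host B:   ip addr add 192.168.100.2/24 dev enp1s0 && iperf3 -c 192.168.100.1 -P 4 -t 30

A dd-based copy works too, but it tends to measure storage as much as the link, so a memory-to-memory tool gives a cleaner read on the cable and ports.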

The rates we need limit the number of links to 10-12, which is doable with 5-6 ConnectX-4 cards in the right box.

 

Thoughts?

Re: OL7.4 Mellanox OFED

The newly released driver works with your previously supplied patch. Mellanox still needs to fix the compat layer for the 4.1.12-112 family of kernels.

 

Thanks

RHEL drivers (mlx5)

Hi all,

This is a rather general (noob) question, but we are trying to determine when you know you've "outgrown" the built-in drivers for RHEL, in our case 6.9 or 7.4 (which seem to use the exact same driver from a few years back). We recently built a dedicated IP SAN (leaf/spine) to run analytics workloads over NFS to an all-flash NAS. These are DL380 Gen9s with dual E5-2620s, 64GB RAM, and dual ConnectX-4 Lx EN cards talking to N9K leaf switches via 25Gb; the leafs are all 100Gb-connected with no oversubscription, and the spines connect to the NAS via 8x40Gb links.

We aren't even close to saturating any of these links, even with multiple streams, but we seem to be running into some number of drops out of the nodes, with segments not making it to the NAS, so after some amount of load things tend to break down. Lots of *nix admins seem dead set against "aftermarket" drivers until some uber-benefit is proven, so we haven't crossed that bridge yet. I'm wondering if it is rather silly to try to really push the limits of these higher-speed cards and networks without the latest drivers. I see Mellanox "recommends" the latest driver, but that seems to be the case whether it's Intel, HPE, or pick your card vendor. How does someone know whether the additional complications of driver installs will be worth it?
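
(For what it's worth, a quick way to see which driver and firmware a node is actually running before deciding - the interface name below is just an example:

ethtool -i ens2f0        # driver name, driver version, firmware version
modinfo mlx5_core | head # shows whether the module is the inbox RHEL one or from MLNX_OFED

With MLNX_OFED installed, ofed_info -s reports the OFED release as well.)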

No drops are seen on any interface, but netstat (and packet captures) indicate loss (a small percentage, but still). RH seems to think it's not an OS problem. We've tuned some things to get the systems settled down, but when pushed, nodes hang and traffic turns into a train wreck.
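
(A few counters that usually help narrow down where the loss happens - interface name again just an example:

ethtool -S ens2f0 | grep -iE 'drop|discard|pause|buffer'
netstat -s | grep -i retrans

On mlx5 NICs the rx_out_of_buffer counter in particular points at the host not draining receive buffers fast enough, as opposed to frames being lost on the wire.)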

thanks

Re: RHEL drivers (mlx5)

So I believe you answered the question yourself. You have stated you have a problem in the environment with dropped segments, etc. It seems to me that, as part of normal troubleshooting, you should come up with a way to reproduce the problem in a controllable and consistent way, install the latest drivers, and determine whether the failure still exists. If you continue to have problems, you can eliminate the driver as a possible cause, move on to the next thing, and continue to use the built-in driver. You just need to break the problem down into its parts and work your way up to find the cause. The next step would be to SPAN a port on the switch, analyze the traffic, and work your way back.
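
As a sketch of the SPAN step on a Nexus switch (interface numbers are placeholders; syntax differs slightly on other platforms):

monitor session 1
  source interface ethernet 1/1 both
  destination interface ethernet 1/2
  no shut

Then capture on the analyzer attached to the destination port and compare against a capture taken on the host.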

How to build OFED 4.2.1.2.0 for Centos 7.4 Kernel 4.8.7 or 4.10.17

Hi,

 

Any pointers on how to build OFED 4.2.1.2 for kernel version 4.8.7 or 4.10.17? We tried both, but the build fails. From the docs it seems that OFED 4.2.1.2 supports kernels 4.8.7 and 4.10.

We followed this document, HowTo Install MLNX_OFED Driver, to build OFED for 4.8.7, but it failed.

The 4.8.7 kernel was built using sources from https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.8.7.tar.xz
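
For a non-stock kernel, the usual approach is to let the installer rebuild the packages against that kernel, roughly like this (run from the extracted MLNX_OFED directory; the paths and versions below are examples):

./mlnxofedinstall --add-kernel-support --kernel 4.8.7 --kernel-sources /usr/src/linux-4.8.7

If it still fails, the build logs the installer leaves under /tmp usually show which kernel module failed to compile against the newer kernel API; that is the part worth posting here.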

 

It builds fine with kernel version 3.10.0-693.17.1.el7.x86_64, though.

Are there any performance implications for the NVMe-oF protocol when we run OFED with kernel 3.10?

Do we need to move to a newer kernel for OFED to run optimally?

 

Thanks,

Subhadeep


dpdk-18.02 testpmd disable-rss with multiple queue & mlx5

Hi,

My adapter is a ConnectX-5. When I run the dpdk-18.02 testpmd with the default RSS parameters and multiple RX queues, all RX queues receive packets:

build/app/testpmd -c 0xff00ff00 -n 4 -w 84:00.0,txq_inline=896 --socket-mem=0,8192 -- --port-numa-config=0,1 --socket-num=1 --burst=64 --txd=1024 --rxd=512 --mbcache=512 --rxq=16 --txq=16 --nb-cores=16 -i --rss-udp

set fwd rxonly

set portlist 0

start

 

But when testpmd is run with the --disable-rss parameter, only one RX queue receives packets; the other RX queues receive 0 packets:

build/app/testpmd -c 0xff00ff00 -n 4 -w 84:00.0,txq_inline=896 --socket-mem=0,8192 -- --port-numa-config=0,1 --socket-num=1 --burst=64 --txd=1024 --rxd=512 --mbcache=512 --rxq=16 --txq=16 --nb-cores=16 -i --disable-rss

set fwd rxonly

set portlist 0

start

 

Do I need to set any other parameters for dpdk testpmd or the mlx5 driver?
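
For reference: with --disable-rss all traffic lands in queue 0 unless it is steered explicitly. A hypothetical way to spread flows across queues is with testpmd flow rules, where the UDP ports and queue indexes below are only examples:

flow create 0 ingress pattern eth / ipv4 / udp dst is 4001 / end actions queue index 1 / end
flow create 0 ingress pattern eth / ipv4 / udp dst is 4002 / end actions queue index 2 / end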

Thanks,

arthas

 

Re: Does anyone know what the Max Junction temperature is for the MT27508 IC on a ConnectX-3

Hello,

 

The temperature threshold for Mellanox HCAs is 105 degrees Celsius. A temperature measured on the HCA at or below this threshold is considered normal.

You can also measure it using the mget_temp tool that comes with MFT (Mellanox Firmware Tools), which can be downloaded from the Mellanox website:

http://www.mellanox.com/page/management_tools
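
A minimal example of reading it with MFT (the device path is just an example for a ConnectX-3):

mst start
mst status                      # lists the /dev/mst/ devices
mget_temp -d /dev/mst/mt4099_pciconf0

The value printed is the junction temperature in degrees Celsius.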

 

Regards,

Viki

Re: Does anyone know what the Max Junction temperature is for the MT27508 IC on a ConnectX-3

Hi,

 

Please note that I corrected my answer above. The maximum temperature is 105 C; this is the junction temperature (chip temperature).

If you use the mget_temp utility from the Mellanox Firmware Tools (MFT), it reports the junction temperature, so 105 C and lower is OK.

This is relevant for both ConnectX-3 and ConnectX-4.

This information can be found in the adapter's datasheet; that document can be provided by Mellanox support only if an NDA is signed.

 

The 55 C temperature you see in the adapter HW user manual is the ambient temperature; that temperature cannot be measured by Mellanox tools.

For example, see page 68:

http://www.mellanox.com/related-docs/user_manuals/ConnectX-3_VPI_Single_and_Dual_QSFP+_Port_Adapter_Card_User_Manual.pdf

 

Best Regards,

Viki

Unable to configure SR-IOV on Connect-IB

Hi,

I have followed all the instructions in this article, HowTo Configure SR-IOV for Connect-IB/ConnectX-4 with KVM (InfiniBand), but I get the following error when setting the desired number of VFs using "echo 6 > /sys/class/infiniband/mlx5_0/device/mlx5_num_vfs":

-bash: echo: write error: Invalid argument

Any help would be greatly appreciated.

 

intel_iommu=on and iommu=pt have been added to the kernel boot-time parameters.

 

# cat /proc/cmdline

BOOT_IMAGE=/boot/vmlinuz-3.10.0-514.el7.x86_64 root=UUID=51777676-1b13-40a7-aed8-12e9609e4b31 ro intel_pstate=disable console=tty0 console=ttyS0,115200n8 net.ifnames=0 crashkernel=auto rhgb quiet intel_iommu=on iommu=pt

#
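
One extra check worth doing here (not part of the original output) is to confirm that the IOMMU actually initialized with those parameters:

dmesg | grep -i -e DMAR -e IOMMU

If nothing there shows the IOMMU/DMAR being enabled, SR-IOV passthrough will not work even once the VFs are created.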

 

 

mlxconfig -d /dev/mst/mt4113_pciconf0 q

 

Device #1:

----------

 

Device type:    ConnectIB      

PCI device:     /dev/mst/mt4113_pciconf0

 

Configurations:                              Next Boot

         ROCE_NEXT_PROTOCOL                  254            

         NUM_OF_VFS                          6              

         SRIOV_EN                            True(1)   


Re: Direct servers connection with ConnectX-4 25Gbe

Hi,

 

There's no problem connecting servers back-to-back with the ConnectX-4 Lx EN or any other Mellanox card.

 

BR

Marc

Re: ib_sdp {failed}

Hi,

 

Can you give me some more details:

Which adapter do you have?

Which driver?

Does it occur at boot time?

Can you send me the dmesg output?

Can you try unloading the module (modprobe -r ib_sdp) before starting/restarting the driver?

 

Thanks

Marc

Re: SX1012 Connection to Cisco 4506-E

Hi All,

 

Saw this on the Cisco forums.

This answered my question. They had already used up their 10GbE uplinks, which is why the Mellanox port auto-negotiated down to 1GbE speed.

Single Supervisor Mode

In single supervisor mode, WS-X45-SUP-7L-E supports the uplink configuration of at most either two 10-Gigabit or four 1-Gigabit ports

 

*Posting the external link for future connectivity reference.

 

*Solved: Catalyst 4510R-E 10GB SFP+ Not Working - Cisco Support Community

 

Thank you.


Re: Unable to configure SR-IOV on Connect-IB

Hi,

 

Kindly note that SR-IOV needs to be enabled in several different places:

1. Firmware level - according to the mlxconfig output, it is enabled.

2. System BIOS - verify that the "Virtualization Technology" option is enabled.

3. Operating system, in grub.conf - we can see that "intel_iommu=on" is present.

On Connect-IB you must also set FPP_EN=1 (see the example after this list).

4. Driver - set the desired number of VFs by invoking:

echo 4 > /sys/class/infiniband/mlx5_0/device/mlx5_num_vfs

cat /sys/class/infiniband/mlx5_0/device/mlx5_num_vfs

 

5. The command to enable SR-IOV support for ConnectX-4, Connect-IB and ConnectX-5 on an MLNX-OS based subnet manager is:

switch(config)# ib sm virt enable
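
A sketch of setting FPP_EN (and double-checking the SR-IOV settings) with mlxconfig, reusing the device path from the output above; the new values only take effect after a reboot or firmware reset:

mlxconfig -d /dev/mst/mt4113_pciconf0 set FPP_EN=1
mlxconfig -d /dev/mst/mt4113_pciconf0 set SRIOV_EN=1 NUM_OF_VFS=6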

 

In addition, if you still encounter the same issue after applying the above settings,

I suggest reviewing the release notes of the latest OFED 4.3: http://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_Release_Notes_4_3-1_0_1_0.pdf

and checking whether there are known SR-IOV issues in older OFED versions; if so, please upgrade the OFED version accordingly and check whether the issue is resolved.

 

Thanks,

Samer

Problem with vlan with multicast mac over SR-IOV (VMware)

Hi,

 

I have a big problem with multicast MAC addresses carrying a VLAN tag. I suspect the packets are being filtered and dropped somewhere within VMware.

In VMware, VLAN = 4095 is set for SR-IOV. I tested both modes: the VF presented as a PCI device and bypass mode (through a Port Group).

The problem does not occur when the cards are configured in passthrough mode (no SR-IOV).

Untagged multicasts pass without a problem (Precision Time Protocol).

 

Running tcpdump on the sender side gives the following log:

Frame 1: 144 bytes on wire (1152 bits), 144 bytes captured (1152 bits)

    Encapsulation type: Ethernet (1)

    Arrival Time: Mar  4, 2018 21:35:15.204041000 Central European Standard Time

    [Time shift for this packet: 0.000000000 seconds]

    Epoch Time: 1520195715.204041000 seconds

    [Time delta from previous captured frame: 0.000000000 seconds]

    [Time delta from previous displayed frame: 0.000000000 seconds]

    [Time since reference or first frame: 0.000000000 seconds]

    Frame Number: 1

    Frame Length: 144 bytes (1152 bits)

    Capture Length: 144 bytes (1152 bits)

    [Frame is marked: False]

    [Frame is ignored: False]

    [Protocols in frame: eth:ethertype:vlan:ethertype:sv]

    [Coloring Rule Name: Broadcast]

    [Coloring Rule String: eth[0] & 1]

Ethernet II, Src: Vmware_51:c4:8e (00:0c:29:51:c4:8e), Dst: Iec-Tc57_04:00:00 (01:0c:cd:04:00:00)

    Destination: Iec-Tc57_04:00:00 (01:0c:cd:04:00:00)

        Address: Iec-Tc57_04:00:00 (01:0c:cd:04:00:00)

        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)

        .... ...1 .... .... .... .... = IG bit: Group address (multicast/broadcast)

    Source: Vmware_51:c4:8e (00:0c:29:51:c4:8e)

        Address: Vmware_51:c4:8e (00:0c:29:51:c4:8e)

        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)

        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)

    Type: 802.1Q Virtual LAN (0x8100)

802.1Q Virtual LAN, PRI: 7, DEI: 0, ID: 4

    111. .... .... .... = Priority: Network Control (7)

    ...0 .... .... .... = DEI: Ineligible

    .... 0000 0000 0100 = ID: 4

    Type: IEC 61850/SV (Sampled Value Transmission (0x88ba)

IEC61850 Sampled Values

    APPID: 0x4000

    Length: 126

    Reserved 1: 0x0000 (0)

    Reserved 2: 0x0000 (0)

    savPdu

        noASDU: 1

        seqASDU: 1 item

 

In this document, http://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_ESXi_User_Manual_v2.4.0.pdf, section 3.3.1.5 states:

"Any vlan-tagged packets sent by the VF are silently dropped."

 

Configuration:

Server: 3 x Dell  PowerEdge R710 rev II (12 CPU - 24 logical, 96GB RAM)

Mellanox Card: 3 x ConnectX-4 Lx EN-MCX4121A-XCAT (latest FW 14.21.2010,  PXE 3.5.0305)

Switches: 2 x Cisco Nexus 3048TP-1GE (latest stable firmware 7.0.3.I7.3)

 

VMware: DellEMC-ESXi-6.5U1-7388607-A07 (latest from Dell)

VM OS: CentOS Linux release 7.4.1708 (Core, updated)

VMware network driver: nmlx5_core (VMware native, 4.16.10.3-1OEM.650.0.0.4598673)

VM OS network driver mlx5_core: 4.2-1.0.1 (latest Linux driver EN)

 

How can this problem be solved? Otherwise the research does not make much sense.

I am asking for support; it is very important for me and my thesis research.

 

 

Best Regards,

Robert

DGX-1V 32GB: [Mellanox CX-5 IB card] It takes a long time (7 minutes and 30 seconds) from power-on to the POST screen with CX-5 IB cards installed

It takes an unusually long time (around 7.5 minutes) after powering on a DGX-1V 32GB server to boot into the operating system (Ubuntu 16.04 in this case). We do not observe this long a delay on boot with the CX4 HCAs with the Option ROM enabled. The boot time with the CX5 HCAs improves to match that of the CX4 HCAs when the Option ROM is disabled on the CX5 HCAs, all else being the same.
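
For reference, the expansion-ROM settings can be inspected, and if desired disabled, with mlxconfig; the device path below is only an example and the exact parameter names depend on the firmware, so query first:

mst start
mlxconfig -d /dev/mst/mt4121_pciconf0 q | grep -i -e ROM -e BOOT
mlxconfig -d /dev/mst/mt4121_pciconf0 set EXP_ROM_UEFI_x86_ENABLE=0

A power cycle is needed for the change to take effect.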

 

[System Configuration]

SBIOS: S2W_3A04

BMC: 3.20.30

CX-5 IB cards: Mellanox ConnectX-5 EDR + 100GbE (Model No: CX556A)

 

[Reproduced steps]

1. Replace the existing four CX-4 IB cards with four new CX-5 IB cards in the DGX-1V 32GB server.

2. Power on the system.

 

[Observation]

It takes a long time (7 minutes and 30 seconds) from power-on to the POST screen with the CX-5 IB cards installed.

 

[Comparison]

DGX-1V + CX-4 IB cards x4: it takes 1 minute and 27 seconds from power-on to the POST screen.

Re: Problem with vlan with multicast mac over SR-IOV (VMware)

Hi,

 

I found a newer driver / firmware:

fw-ConnectX4Lx-rel-14_22_1002-MCX4121A-XCA_Ax-UEFI-14.15.19-FlexBoot-3.5.403.bin

nmst-4.9.0.38-1OEM.650.0.0.4598673.x86_64.vib

mft-4.9.0.38-10EM-650.0.0.4598673.x86_64.vib

MLNX-NATIVE-ESX-ConnectX-4-5_4.16.12.12-10EM-650.0.0-7412885.zip

 

Unfortunately, it did not help. What's worse, I found a bug in SR-IOV: max_vfs can only be set to 8 per port (max_vfs=8,8). The driver does not load when max_vfs=16,16 is set.

In the previous driver, max_vfs was set to 16 per port (max_vfs=16).

Earlier it was set globally, with one value for all ports, which allowed getting 32 VFs.
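
For completeness, the per-port values are set on the ESXi host roughly like this (module name as used by the native driver above), followed by a reboot:

esxcli system module parameters set -m nmlx5_core -p "max_vfs=8,8"
esxcli system module parameters list -m nmlx5_core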

 

Best Regards,

Robert

direct server connection with "ConnectX-4 Lx EN"

hi guys,

 

We want to set up a Hyper-V failover cluster with Storage Spaces Direct. We plan two servers, each with one ConnectX-4 Lx EN card, for direct communication of the internal storage. Is this setup possible with these adapter cards? If yes, is there a configuration guide for direct connections?

 

thanks for your help and greets (-:
