Hi Ale,
thank you for your answer!!!!
Let me try to explain it better... If I launch an instance I would like the InfiniBand NIC to be available, so that, for example, I can associate a floating IP using InfiniBand.
Alright, that's a downer... Anyway, I'll pop them in the vSphere machines then and work with them that way (if they run on ESXi 6?).
I am using Solaris 11.2 with Napp-it. Hardware Acceleration: unknown.
Not sure if these will run on the ESXi driver, won't hurt to give it a try.
As far as I know they're only certified to work on Linux boxes.
Hi,
Thanks, I have finally set this up now. It appears to be working: failover occurs and the datastores come back up after a few seconds, so it's looking good.
I have a couple of additional questions for you. For the link failover on the storage side, are you using IPMP?
Have you also worked with Windows on the storage side? I have 2 servers in a cluster (MSSQL) connected. The failover again works OK for Windows in terms of the storage being available, but randomly the storage is reported offline in the cluster manager after the pool is back online (more often if data is being written) and has to be manually enabled again. This happens the moment the storage is available again; before that it reports it's OK. Just wondering if you had seen this; it could be some of the timeouts that I had been messing around with before to see if I could fix the previous problem.
Thanks
Hi David
Glad to hear that you got it into a working condition, at least for non-Windows.
The failover in Solaris is realized with an IPMP group containing the datalinks:
.....
storage_64_0 803D net5 up ----
storage_64_1 803D net6 up ----
.....
ipadm add-ipmp -i storage_64_0 -i storage_64_1 storage61
ipadm create-addr -T static -a 192.168.61.64/24 storage61/v4addr
ipadm set-ifprop -p standby=on -m ip storage_64_1
You may set one of the datalinks as standby. Check out how it works for your installation.
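If it helps, a quick look at the group state with plain Solaris 11 ipmpstat (nothing assumed beyond the group created above) tells you which link is doing the work:
ipmpstat -g    # group state and failure-detection status
ipmpstat -i    # per-interface flags; shows which underlying interface is active and which is standby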
Some years ago we tried to use InfiniBand with Windows 2003. There was no multipathing available, so we stopped using it with Windows.
And we did not use it with the cluster functionality of Windows. You should check out the SCSI requirements of Windows Cluster, which must be supported by the storage. If my memory serves well, there were issues with SCSI-3 reservations.
Andreas
Any reason iSER's performance is far behind SRP?
Hi!
You said that your storage OS is Solaris 11.2 with ConnectX-2 VPI 2.10.700 firmware.
Does your storage execute the "reboot -p" command properly?
My OmniOS only supports "reboot -p" properly with CX-2 VPI firmware 2.9.1000.
And regarding the performance concern...
Here is my vSphere 6 ESXi host configuration.
Can you try it?
Open your ESXi console, run these commands, reboot your ESXi host, then retest the iSER target.
- I also use 4K MTU.
- Because iSER uses CM mode, the IPoIB vmkernel MTU doesn't need to be set to 4092.
- But I also use IPoIB vMotion, which is why the "mtu_4k=1" option is set in the partition configuration...
esxcli system module parameters set -m=ib_ipoib -p="ipoib_recvq_size=1024 ipoib_sendq_size=1024"
esxcli system module parameters set -m=mlx4_core -p="mtu_4k=1 msi_x=1"
esxcli system module parameters set -m=ib_iser -p="debug_level=0"
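If you want to double-check that the options were applied after the reboot, the same esxcli namespace can list them back (just a plain sanity check, nothing extra assumed):
esxcli system module parameters list -m ib_ipoib
esxcli system module parameters list -m mlx4_core
esxcli system module parameters list -m ib_iser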
Hi,
I am using a ConnectX-3 Pro NIC with OVS version 2.3.1.
With VXLAN I am achieving 9.10 to 9.30 Gbps, but with the NVGRE configuration I am getting around 3.20 to 4.00 Gbps. Please can you help me out if there is anything extra we need to configure to achieve good numbers with the NVGRE tunnel?
NVGRE configuration:
[root@localhost openvswitch-2.3.1]# ovs-vsctl show
2581c25f-7a80-436d-98fe-f420623f1c68
    Bridge "br0"
        Port "nvgre"
            Interface "nvgre"
                type: gre
                options: {remote_ip="44.44.44.10"}
        Port "br0"
            Interface "br0"
                type: internal
Host details:
Host PC: Dell T630
Operating system: CentOS 7 (3.10.0-123.el7.x86_64)
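Just a hedged guess, but the gap could come from the NIC offloading the UDP-based VXLAN encapsulation while GRE stays on the CPU; a quick way to compare the offload flags (the interface name ens2f0 is only a placeholder for your actual uplink) would be:
ethtool -k ens2f0 | grep -iE 'gre|udp_tnl'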
reboot -p works without any issue for me.
Let me try 4K MTU next for performance.
The ibutils-1.5.7.1-0.11.g616bc1e.x86_64.rpm from MLNX_OFED 2.4.1 requires libopensm.so.8()(64bit), but I failed to find an opensm-libs package that provides this version of libopensm.so.8.
Could someone point me to a location where I can download that RPM?
Thanks!
Jay Lan
I moved the questions to the technical forums
The error messages were:
Error: Package: ibutils-1.5.7.1-0.11.g616bc1e.x86_64 (centos6_MOFED)
Requires: libopensm.so.8()(64bit)
Error: Package: ibutils-1.5.7.1-0.11.g616bc1e.x86_64 (centos6_MOFED)
Requires: libopensm.so.8(OPENSM_1.5)(64bit)
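In case it is useful, a plain yum query shows whether any enabled repo provides that soname at all (nothing MOFED-specific assumed here):
yum provides 'libopensm.so.8()(64bit)'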
Hi guys,
I have a fairly complicated situation, and I need all the help I can get on it, so here goes.
Here is my situation: I have 2x SX1036 with L3 and 56GbE functionality. All my hosts have either a dual-port ConnectX-3 VPI PRO or a dual-port ConnectX-3 EN PRO.
My build will be used for virtualization, 3D VDI, cloud-related hosting, high-performance web hosting and lots of outbound connections. Some VDI users will be using 4G now and then, so I want to make sure my networking to the ISP is top-notch. I want to use only these 2 switches, I want the setup to be redundant/production-proof, and I want the switches acting in an active-active kind of configuration.
I was thinking: why not use Mellanox for my complete networking fabric? It can do SDN, has more than enough BW and has the low latency that I'm looking for. If I could set up a complete network using just these 2 switches, with the WAN cables attached directly to them, I could reach the lowest latency possible for my connection?! A lot of good things so far, but the only thing it lacks that I can think of is a connection to the BMC/IPMI ports of the servers, so management is the only thing I will need a small 1Gb UTP cabled network for. OK, I'll do that on 2 simple switches with some extra ports available should there be any future issue that would need 1G UTP connections to any of the servers. But my main strategy would be BMC/IPMI only on an isolated 1G Ethernet network and Mellanox for all the rest.
P.S. If I'm talking ms I mean MILLISECONDS, not microseconds!!!! I'll number the questions for easy quoting.
Q1 So, this gets us to my first question: what do you guys think about this idea? Please give me your thoughts, and tell me if I'm missing something huge here.
My second question is about cabling and its effect on performance (mostly latency/jitter); I want my connection to have the lowest possible latency and jitter, since my remote users would benefit from a crystal-clear link to my ISP.
Q2 So, I'm checking out providers, and most 500/500 or 1000/1000 providers install a GeneXis OCG-1012 or similar modem/fiber termination point. Some even install their own router as well, and both of these are of course RJ45-based. So, how would I go about connecting this RJ45 port to one of my SX1036 ports while adding the minimum amount of latency/delay or other issues to the connection?
Q3 I understand that I will need to split the QSFP first; would you advise splitting 1-to-4 or 1-to-1 in this case? (Any difference as far as performance/stability?)
I learned that each router/switch etc. in a network adds the most latency, so these should be avoided, right? Is there a way to convert an RJ45 cable to something that goes into an SFP+ port? Or would this be dumb because of the added copper wire for the RJ45 cable, and should we pick a way that uses fiber all the way to the RJ45 port? Does this kind of equipment exist? How much would the benefit be going this route?
Q4 I think even better (perhaps the best, read the next question) would be to find a provider that will put a custom modem with an SFP+ port at my site, so I can direct-connect the SX1036 to the modem without any converters except for the splitter from QSFP to SFP+, but I doubt I'll find one that will. Most modems I know of that do have SFP ports anyway, not SFP+, so I would still need to convert that to SFP+, right?
Q5 What is your opinion on all the solutions discussed so far; will it even matter? Would it shave off 1ms during a ping using the best solution compared with the modem -> extra router -> switch with SFP+ port -> Mellanox switch solution? Shave off 0.1? Are there other things besides latency that would benefit from a better solution? Jitter? Packet loss? I'm not that well educated in network performance beyond BW and latency.
What would you do in my case? Go for the simple solution, or the optimal solution since it will actually make a difference? (Remember we're talking about doing 3D VDI over the internet, and web hosting that needs to be as fast as possible.)
Q6 Logic tells me that converting with extra routers and switches/converters should be avoided like the plague, but I'm wondering two things: in the end, would choices like these really make a difference, or are we talking about such small numbers that it doesn't really matter at all?
Q7 What would the absolute best solution be in terms of performance/latency? Is it possible to terminate the fibers at the SX1036, thereby leaving the whole modem out of the picture? Has anyone heard of an ISP that allows such a thing? Probably not. So let's say the fiber-end unit HAS to be placed by the ISP; what would be best in this case? A direct SFP+ port in the ISP-placed modem to a QSFP splitter? Or a QSFP port in the modem and a QSFP cable to the SX1036 directly? Would the difference between these 2 even be measurable in terms of ping? 0.01ms? Or even far less than that?
Questions about hardware / way of setting this up.
Q8 Would there be any more advice you want to share on running my network this way? Are there any other people doing this? Maybe some pros/cons?
Q9 Would 2 SX1036s with L3 and 56GbE even be enough in terms of switches/gear to set up my network like this, in a redundant way? (I have far fewer nodes than ports, ofc.)
Q10 What protocol should I set up? OSPF, MLAG + whatever? What would my options be, and what are the pros/cons of each?
Q11 Most nodes as of now have one card with dual ports, and each port goes to one of the switches; how would I best cable the switches? Should they even be connected?
Q12 Is there any problem mixing ConnectX-3 VPI, ConnectX-3 VPI PRO, ConnectX-3 EN and ConnectX-3 EN PRO cards together in one setup? What card would you guys advise and why?
I'll be using GPUs in some hypervisors; if I want to use GPUDirect between nodes, that would only be possible if both nodes have a ConnectX-3 VPI PRO, right? This is not possible on the non-PRO VPI nor on either of the EN cards, correct?
I really hope some knowledgeable people can help me set this whole project up in the best way possible, so that once that is done, we have this topic filled with information for people who have plans to do more than just storage with their Mellanox gear.
Any advice or opinion is very much appreciated!!
Thanks,
CloudBuilder
So I did some testing on my home network, which uses standard RJ45 Cat6 just like the ISP-installed modem and router would, and I have to say I am quite shocked by the results.
Using HRPING average ping to:
127.0.0.1 0.174ms
The first router I am connected to 0.721ms
The gateway, (second hop and last router) 1.164ms
My external IP (same gateway) 1.532ms
So this indicates that using an RJ45 solution would indeed add latency noticeable in the 1ms range.
The scarier thing however is that if I perform all the tests and let them run for a minute or so, I start to see enormous spikes of much higher pings, even when pinging the first router I'm connected to; with no other active devices on my network these spikes could go as high as 4ms! On the gateway 7ms, and on the external IP the highest spike was even 9ms!!!
Smaller spikes (about 2x the normal range) seem to happen about once every 10 pings, while the big ones happen less often.
I guess that not only the latency but also the spikes would improve when using one of the solutions we're discussing, right? Could someone do a test for me to see if SFP+/Mellanox gear suffers from the same problem?
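Something as simple as this from a Linux host hanging off the SX1036 would already tell a lot (the address is just a placeholder for whatever answers on the other side of the switch):
ping -i 0.2 -c 600 192.168.1.1    # roughly two minutes; the min/avg/max/mdev summary line shows the jitter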
What are your thoughts on this?
Well, I am not really sure about a lot of things.
This is what I have:
3 servers with ConnectX-3 VPI PRO adapters, 56GbE, dual port
3 servers with ConnectX-3 EN adapters, 56GbE, dual port
2x SX1036 with the following licenses: 56GbE + L3
I want maximum performance (load balancing?) and redundancy for my network. And I want to keep things as simple as possible by only using these switches for all of my networking.
Each server is connected with one link to each switch. I guess I should also link the switches to each other, right? For load balancing and routing etc.?
As of now I have one subnet, but this could expand in the future; nothing crazy though.
I'm still fighting with different ISPs for the best deal and hardware, but most of them ship an FTU/modem with 1x 1Gb RJ45 port. (I'm starting at 500/500 bandwidth.) This gets me to the following questions:
1: What would be the most effective way to connect my SX1036s to the RJ45 (1G) port while minimizing added latency?
2: How much of a difference would a direct SFP+ connection make? Is this something worth fighting for?
3: Would it be possible to bypass the whole modem/FTU and put each of the pair of fibers (1 up, 1 down) in an SFP+ connector, and connect both of these SFP+ connectors to the split-out cable of one of the switches? Or would this pose a problem with the SX1036 because the WAN link consists of 2 ports (one up, one down)? Is it even possible to do this kind of thing with the SX1036 and SX6036 series? Or do you need special tech for this kind of thing?
So, I absolutely plan to use both switches as L3 routers, while also using them in L2, and actually as my all-in-one networking solution :) Which, of course, still needs a firewall.
Mellanox says the SX1036 is capable of SDN, but it seems very restricted? Are there any firewalls that can run on these boys? What kind of SDN DOES work on these switches?
If a firewall on the switches is not possible, a VM serving as a firewall on each host would be the next best thing, right? No, I think a dedicated one would probably be easier, but buying 2 dedicated switches adds a lot of cost and also adds latency, so because of that I'm thinking of using VMs with SR-IOV instead. Would a Linux-based firewall running inside a VM using SR-IOV and proper QoS tuning be faster than a dedicated FW? If not, how much latency would this add?
After my internet is connected to my host, and behind a firewall, I need to get it to the running VMs; what would be the lowest-latency way of doing this? I'm sure leveraging the adapter's capabilities can help a lot here? I was thinking SR-IOV? Or would this not work with a FW VM, since everyone connects to the adapter "directly" when using SR-IOV?
Any thoughts on how to solve the FW problem? Do I even need a dedicated firewall? Can't the switches and adapters do most of the FW work, combined with NVGRE/VXLAN tunneling?
I would really appreciate any advice on both matters.
Thank you,
Builder
1. SRP
SCSI RDMA Protocol, also known as the SCSI Remote Protocol. The secret sauce of why Mellanox performs better than others.
2. Which version of driver did you install? What's your HCA?
3. What target are you using?
It seems to be the same type of cable I used for connecting other 10Gb fiber NICs - I see "TIGER patch cord LC/PC-LC/PC, MM50/123,OM3, 2.0mm,DX" on its package.
BTW, I just tried connecting another server with an Intel 520 10Gb card using the same cable - it worked.
Thanks - That's good to know.
Is there any other mechanism I can use to debug whether InfiniBand multicast traffic is working OK? I have apps on two machines which communicate sporadically. I don't think ethtool or ifconfig counters will display InfiniBand traffic stats. Is there any other way to diagnose this until ibdump is available?
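For example, I am not sure whether the standard infiniband-diags tools are the right place to look, but would something like the extended port counters or the SM's multicast group records show anything useful here?
perfquery -x    # extended port counters, including PortMulticastXmitPkts / PortMulticastRcvPkts
saquery -g      # multicast group (MCMemberRecord) info from the subnet manager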
- Anal Patel