Channel: Mellanox Interconnect Community: Message List
Viewing all articles
Browse latest Browse all 6226

Re: rx_over_errors incrementing


Hi

 

I have the same issue in a 9-node cluster with almost exclusively multicast traffic. Two of the nodes suffer from a high number of rx_dropped (and the same number of rx_over_errors). On both of these nodes I can see that CPU1 is running at 100%, but with 0% in user and system mode, i.e. it is 100% occupied servicing interrupts.
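For anyone who wants to confirm the same symptom, here is a minimal Python sketch of how the interrupt share per CPU could be measured. It assumes the standard Linux /proc/stat layout (user, nice, system, idle, iowait, irq, softirq, steal, ...); the function names are my own and purely illustrative.

```python
def parse_cpu_times(stat_text):
    """Return {cpu_name: [jiffies, ...]} for the per-CPU lines of /proc/stat."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-CPU lines look like "cpu1 ..."; skip the aggregate "cpu" line.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            times[fields[0]] = [int(value) for value in fields[1:]]
    return times


def irq_share(before, after):
    """Percentage of elapsed time each CPU spent in irq + softirq."""
    shares = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None:
            continue
        delta = [n - o for n, o in zip(new, old)]
        # First eight fields: user nice system idle iowait irq softirq steal.
        total = sum(delta[:8])
        if total > 0:
            shares[cpu] = 100.0 * (delta[5] + delta[6]) / total
    return shares
```

To use it, read /proc/stat twice a second or so apart, pass both texts through parse_cpu_times, and hand the results to irq_share. A CPU showing close to 100 there matches what I see on CPU1.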

Thus I suspect non-optimal interrupt coalescing.

ethtool -c says Adaptive RX: on

Should I try using manual configuration instead? Any suggested values?

Any other suggestions ?

The strange thing is that the other 7 nodes seem to be coping with the same load without problems.

I have checked the PCI affinity and it seems OK: the ConnectX-3 card is on the PCI bus connected to the same socket as CPU1.
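The affinity check can be sketched roughly as below, in Python. It assumes the NIC's PCI device exposes a local_cpulist file in sysfs (for example under /sys/class/net/<interface>/device/), which lists the CPUs local to the device in a form like "0-7,16-23". The helper names are illustrative only.

```python
def parse_cpulist(text):
    """Expand a kernel CPU list such as "0-3,8" into a list of CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpu_is_local(cpu, local_cpulist_text):
    """True if the given CPU number appears in the device's local CPU list."""
    return cpu in parse_cpulist(local_cpulist_text)
```

Passing 1 and the contents of the local_cpulist file to cpu_is_local should tell whether CPU1 sits on the same socket as the card.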

 

One data point is that I have more receivers on some of the nodes. Could that affect the issue?

The Mellanox picture describes rx_over_errors as relating to the hardware buffer on the card, thus I cannot quite see why the number of consumers should matter...
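To compare nodes, the counters can be sampled with something like the following Python sketch. It assumes the per-interface statistics files under /sys/class/net/<interface>/statistics/; the interface name "eth2" in the usage note is hypothetical, and the function names are my own.

```python
from pathlib import Path


def read_rx_counters(iface, root="/sys/class/net"):
    """Read a few receive counters for one interface from sysfs."""
    stats_dir = Path(root) / iface / "statistics"
    names = ("rx_packets", "rx_dropped", "rx_over_errors")
    return {name: int((stats_dir / name).read_text()) for name in names}


def counter_deltas(before, after):
    """Difference between two counter snapshots taken some time apart."""
    return {name: after[name] - before[name] for name in after}
```

Calling read_rx_counters("eth2") twice, a few seconds apart, and feeding both snapshots to counter_deltas shows whether rx_dropped and rx_over_errors grow by the same amount, and how that compares between nodes with more and fewer receivers.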

