Good day!
I read the docs several times but I must be missing something for sure.
I have this setup: a SuperMicro server with 4x AMD Opteron 6272 (64 cores total), 256 GB RAM, a few HDDs, an Intel 82576 Gigabit network card and, of course, a Mellanox ConnectX-3 (MT27500 Family).
I am trying to set up PCI passthrough so that VMs running on the host can access the ConnectX-3 card directly, in order to run a sort of private cloud (to be extended to two more nodes).
I have tried Proxmox, a KVM-based distribution specialized in virtualization, but due to its "strange" nature it is impossible to install Mellanox OFED with the standard scripts: the distribution is based on Debian 7.4 but uses the RHEL 7 3.10.x kernel.
Stock OFED works out of the box, but I can pass only the PF to the VMs, which means that only one VM at a time can be connected to IB.
The lspci output is always the same (on Proxmox, CentOS 6.5 and CentOS 7) and is reported below:
04:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
04:00.1 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:00.2 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:00.3 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:00.4 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:00.5 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:00.6 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:00.7 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:01.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:01.1 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:01.2 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:01.3 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:01.4 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:01.5 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:01.6 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:01.7 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
04:02.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
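As a cross-check of the lspci listing, the VFs can also be enumerated from sysfs, where the PF directory holds one virtfnN symlink per VF. This is only a minimal sketch: the function name list_vfs and its optional second argument (an alternative sysfs directory, handy for trying it without the hardware) are names I made up for illustration.

```shell
#!/bin/sh
# Print "index address" for each VF that a PF exposes through its
# virtfnN symlinks in sysfs.
# Usage: list_vfs PF_ADDRESS [SYSFS_PCI_DIR]
# (list_vfs and SYSFS_PCI_DIR are hypothetical names used for illustration.)
list_vfs() {
    pf="${2:-/sys/bus/pci/devices}/$1"
    for link in "$pf"/virtfn*; do
        # Skip the unexpanded glob when the PF has no VFs.
        [ -L "$link" ] || continue
        # Keep only the numeric suffix of the link name.
        idx=${link##*virtfn}
        # The link target is the VF's own device directory.
        printf '%s %s\n' "$idx" "$(basename "$(readlink "$link")")"
    done
}
```

On this host it would be called as list_vfs 0000:04:00.0. The lines come out in glob order, so virtfn10 is listed before virtfn2.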
I switched to CentOS 6.5, installed Mellanox OFED and installed oVirt, but it was impossible to pass a PCI device (a VF) to the VM; the VM fails to boot with the following error:
Failed to assign device "hostdev0" : Permission denied
qemu-kvm: -device pci-assign,host=04:00.5,id=hostdev0,configfd=28,bus=pci.0,addr=0x8: Device 'pci-assign' could not be initialized
The configuration was done with the virt-manager GUI.
I tried disabling SELinux and running the command as root. No success. Of course, device 04:00.5 is listed in the lspci output as a virtual function.
Now I've just installed CentOS 7 to use the same kernel version as Proxmox, installed Mellanox OFED, set the mlx4_core options, etc. Everything looks good and lspci shows the Virtual Functions, but I cannot start a VM with a VF attached.
I decided to reboot with Proxmox and to try with the onboard Intel 82576 ethernet card (just to exclude a buggy PCI passthrough): one shot, one kill! It worked perfectly and I was able to attach the VFs to several VMs on the same host.
What's the difference between Mellanox and Intel?
I do not know the low-level differences, but I noticed that, with the Intel card, I can find an iommu_group link for every VF, e.g.:
for the physical device:
root@hpc001:~# readlink /sys/bus/pci/devices/0000\:02\:00.0/iommu_group
../../../../kernel/iommu_groups/11
for one virtual function:
root@hpc001:~# readlink /sys/bus/pci/devices/0000\:02\:10.0/iommu_group
../../../../kernel/iommu_groups/13
for another VF:
root@hpc001:~# readlink /sys/bus/pci/devices/0000\:02\:10.4/iommu_group
../../../../kernel/iommu_groups/15
while for Mellanox I can find an iommu_group assignment only for the PF:
root@hpc001:~# readlink /sys/bus/pci/devices/0000\:04\:00.0/iommu_group
../../../../kernel/iommu_groups/10
root@hpc001:~# readlink /sys/bus/pci/devices/0000\:04\:00.1/iommu_group
root@hpc001:~# readlink /sys/bus/pci/devices/0000\:04\:00.2/iommu_group
root@hpc001:~# readlink /sys/bus/pci/devices/0000\:04\:00.3/iommu_group
The last three readlink commands print nothing because the link does not exist at all, and this explains the "Cannot open iommu_group: No such file or directory" error I got when trying to start a VM with a VF attached.
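To make the comparison between the two cards easier to repeat, the readlink checks above can be wrapped in a small loop. This is just a sketch of what I ran by hand: the function name list_iommu_groups and its optional second argument (an alternative sysfs directory) are my own invention for illustration.

```shell
#!/bin/sh
# Print the IOMMU group of every PCI function whose address starts with
# the given prefix, or "none" when the iommu_group link is missing.
# Usage: list_iommu_groups PREFIX [SYSFS_PCI_DIR]
# (list_iommu_groups and SYSFS_PCI_DIR are hypothetical names.)
list_iommu_groups() {
    prefix="$1"
    base="${2:-/sys/bus/pci/devices}"
    for dev in "$base"/"$prefix"*; do
        # Skip the unexpanded glob when nothing matches the prefix.
        [ -e "$dev" ] || continue
        if [ -L "$dev/iommu_group" ]; then
            # The link target ends in the group number,
            # e.g. ../../../../kernel/iommu_groups/10
            group=$(basename "$(readlink "$dev/iommu_group")")
        else
            group="none"
        fi
        printf '%s %s\n' "$(basename "$dev")" "$group"
    done
}
```

Calling list_iommu_groups 0000:04: covers the Mellanox PF and all its VFs in one go, and list_iommu_groups 0000:02: does the same for the Intel card.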
I am really banging my head against the wall because I cannot understand what's wrong with this setup.
Thanks in advance for any help.
Ciao,
Roberto