The kernel panics while booting Linux if the Mellanox card is connected to the network. It boots fine if I disconnect the card.
(After a successful boot I can connect the card to the network, though it sometimes (not always) causes the host to hang when I run ping over the network. I don't have many details to post about that hang.)
Here are the details of the system:
# uname -a
Linux <hostname> 4.2.0-35-generic #40-Ubuntu SMP Tue Mar 15 22:15:45 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
# mlxup
Querying Mellanox devices firmware ...
Device #1:
----------
Device Type:      ConnectX3Pro
Part Number:      MCX312B-XCC_Ax
Description:      ConnectX-3 Pro EN network interface card; 10GigE; dual-port SFP+; PCIe3.0 x8 8GT/s; RoHS R6
PSID:             MT_1200111023
PCI Device Name:  0000:02:00.0
Port1 MAC:        e41d2db25040
Port2 MAC:        e41d2db25041
Versions:         Current        Available
   FW             2.36.5000      2.36.5000
   PXE            3.4.0718       3.4.0718
Status:           Up to date
Stack dump from crash (the dmesg file is attached):
KERNEL: /usr/lib/debug/boot/vmlinux-4.2.0-35-generic
DUMPFILE: ../201607301001/dump.201607301001 [PARTIAL DUMP]
CPUS: 8
DATE: Sat Jul 30 10:01:52 2016
UPTIME: 00:00:14
LOAD AVERAGE: 1.19, 0.25, 0.08
TASKS: 584
NODENAME: <hostname>
RELEASE: 4.2.0-35-generic
VERSION: #40-Ubuntu SMP Tue Mar 15 22:15:45 UTC 2016
MACHINE: x86_64 (3409 Mhz)
MEMORY: 16 GB
PANIC: "BUG: unable to handle kernel paging request at 0000001100000002"
PID: 1625
COMMAND: "docker"
TASK: ffff8803e1f5a940 [THREAD_INFO: ffff8803de0e8000]
CPU: 4
STATE: TASK_RUNNING (PANIC)
crash> bt
PID: 1625 TASK: ffff8803e1f5a940 CPU: 4 COMMAND: "docker"
#0 [ffff88041ed033f0] machine_kexec at ffffffff8105913b
#1 [ffff88041ed03460] crash_kexec at ffffffff81109bf2
#2 [ffff88041ed03530] oops_end at ffffffff81018ead
#3 [ffff88041ed03560] no_context at ffffffff810682a5
#4 [ffff88041ed035d0] __bad_area_nosemaphore at ffffffff81068570
#5 [ffff88041ed03620] bad_area_nosemaphore at ffffffff810686f3
#6 [ffff88041ed03630] __do_page_fault at ffffffff810689d7
#7 [ffff88041ed03690] do_page_fault at ffffffff81068d42
#8 [ffff88041ed036b0] page_fault at ffffffff817fabc8
[exception RIP: __netdev_pick_tx+102]
RIP: ffffffff816e64e6 RSP: ffff88041ed03768 RFLAGS: 00010202
RAX: ffff88040c2d97f0 RBX: 0000000000000000 RCX: ffffffff816e6480
RDX: 000000000000000c RSI: ffff8803d4359b00 RDI: ffff8803fb440000
RBP: ffff88041ed037a8 R8: ffff88041ed19b00 R9: ffff8803d4359b00
R10: 0000000000000000 R11: 0000000000000150 R12: ffff8803fb440000
R13: 0000000000000000 R14: 00000000ffffffff R15: 0000001100000002
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#9 [ffff88041ed037b0] mlx4_en_select_queue at ffffffffc0187a7f [mlx4_en]
#10 [ffff88041ed037d0] netdev_pick_tx at ffffffff816edac1
#11 [ffff88041ed03800] __dev_queue_xmit at ffffffff816edc07
#12 [ffff88041ed03860] dev_queue_xmit_sk at ffffffff816ee0e3
#13 [ffff88041ed03870] netdev_send at ffffffffc04de305 [openvswitch]
#14 [ffff88041ed038b0] ovs_vport_send at ffffffffc04ddc28 [openvswitch]
#15 [ffff88041ed038d0] do_output at ffffffffc04d0289 [openvswitch]
#16 [ffff88041ed038f0] do_execute_actions at ffffffffc04d0874 [openvswitch]
#17 [ffff88041ed039a0] ovs_execute_actions at ffffffffc04d177f [openvswitch]
#18 [ffff88041ed039d0] ovs_dp_process_packet at ffffffffc04d4f04 [openvswitch]
#19 [ffff88041ed03a60] ovs_vport_receive at ffffffffc04dd38b [openvswitch]
#20 [ffff88041ed03c10] netdev_frame_hook at ffffffffc04de5d0 [openvswitch]
#21 [ffff88041ed03c40] __netif_receive_skb_core at ffffffff816eb2d4
#22 [ffff88041ed03ce0] __netif_receive_skb at ffffffff816eb988
#23 [ffff88041ed03d00] netif_receive_skb_internal at ffffffff816eba02
#24 [ffff88041ed03d40] napi_gro_frags at ffffffff816ec4a7
#25 [ffff88041ed03d70] mlx4_en_process_rx_cq at ffffffffc0189870 [mlx4_en]
#26 [ffff88041ed03e10] mlx4_en_poll_rx_cq at ffffffffc0189db6 [mlx4_en]
#27 [ffff88041ed03e60] net_rx_action at ffffffff816ebf09
#28 [ffff88041ed03ef0] __do_softirq at ffffffff81081131
#29 [ffff88041ed03f60] irq_exit at ffffffff81081433
#30 [ffff88041ed03f70] do_IRQ at ffffffff817fb878
--- <IRQ stack> ---
#31 [ffff8803de0ebf58] ret_from_intr at ffffffff817f97eb
RIP: 000000000088d618 RSP: 000000c82024d118 RFLAGS: 00000202
RAX: 0000000073f84770 RBX: 0000000000000400 RCX: 0000000054423aca
RDX: 0000000089ecd45f RSI: 000000c820542940 RDI: 000000c820544000
RBP: 00000000e4458357 R8: 000000008847594a R9: 0000000039eb6dc2
R10: 00000000d57b5eff R11: 00000000fa36c492 R12: 0000000000000004
R13: 0000000000dd5c19 R14: 0000000000000002 R15: 0000000000000008
ORIG_RAX: ffffffffffffff3d CS: 0033 SS: 002b
crash>
Has anyone seen a similar issue?