Created December 9, 2010 14:49
Dec 9 14:27:03 acc-db-01 [248740.420900] divide error: 0000 [#1] SMP
Dec 9 14:27:03 acc-db-01 [248740.428791] last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
Dec 9 14:27:03 acc-db-01 [248740.444194] CPU 6
Dec 9 14:27:03 acc-db-01 [248740.450660] Modules linked in: btrfs zlib_deflate crc32c libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs reiserfs xfs exportfs nfs lockd nfs_acl auth_rpcgss sunrpc ipmi_devintf ipmi_si ipmi_msghandler autofs4 bonding fbcon tileblit font bitblit softcursor vga16fb vgastate bnx2 psmouse dell_wmi serio_raw joydev power_meter dcdbas lp parport ses enclosure usbhid hid megaraid_sas
Dec 9 14:27:03 acc-db-01 [248740.516499] Pid: 17864, comm: dsm_sa_snmp32d Not tainted 2.6.32-22-generic #33-Ubuntu PowerEdge R710
Dec 9 14:27:03 acc-db-01 [248740.538698] RIP: 0010:[<ffffffff8105621c>] [<ffffffff8105621c>] find_busiest_group+0x63c/0x900
Dec 9 14:27:03 acc-db-01 [248740.561223] RSP: 0018:ffff880604711b88 EFLAGS: 00010046
Dec 9 14:27:03 acc-db-01 [248740.574067] RAX: 0000000000000000 RBX: ffff880604711d54 RCX: 0000000000000001
Dec 9 14:27:03 acc-db-01 [248740.597125] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Dec 9 14:27:03 acc-db-01 [248740.621991] RBP: ffff880604711cf8 R08: ffff88034ac6fd88 R09: 0000000000000040
Dec 9 14:27:03 acc-db-01 [248740.648293] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffff
Dec 9 14:27:03 acc-db-01 [248740.676802] R13: 0000000000015bc0 R14: ffffffffffffffff R15: 0000000000000000
Dec 9 14:27:03 acc-db-01 [248740.706983] FS: 0000000000000000(0000) GS:ffff88034ac60000(0063) knlGS:00000000f56ffb70
Dec 9 14:27:03 acc-db-01 [248740.739611] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
Dec 9 14:27:03 acc-db-01 [248740.757771] CR2: 00007f593183a000 CR3: 00000002abc38000 CR4: 00000000000006e0
Dec 9 14:27:03 acc-db-01 [248740.790812] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 9 14:27:03 acc-db-01 [248740.824413] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 9 14:27:03 acc-db-01 [248740.858726] Process dsm_sa_snmp32d (pid: 17864, threadinfo ffff880604710000, task ffff88062d4a44d0)
Dec 9 14:27:03 acc-db-01 [248740.895283] Stack:
Dec 9 14:27:03 acc-db-01 [248740.910801] ffff880604711c98 ffff880604711c08 ffff880604711d40 0000000000000cce
Dec 9 14:27:03 acc-db-01 [248740.932047] <0> ffff88034ac6fc60 00000006810fae02 000000010000000e 0000000000000008
Dec 9 14:27:03 acc-db-01 [248740.967302] <0> 0000000000015bc0 0000000000015bc0 ffff88034ac6fd70 0000000000015bc0
Dec 9 14:27:03 acc-db-01 [248741.016468] Call Trace:
Dec 9 14:27:03 acc-db-01 [248741.032629] [<ffffffff8105c928>] load_balance_newidle+0xa8/0x310
Dec 9 14:27:03 acc-db-01 [248741.052237] [<ffffffff8153ea7a>] thread_return+0x35a/0x420
Dec 9 14:27:03 acc-db-01 [248741.071095] [<ffffffff8153fd4d>] do_nanosleep+0x8d/0xc0
Dec 9 14:27:03 acc-db-01 [248741.089458] [<ffffffff81089834>] hrtimer_nanosleep+0xc4/0x180
Dec 9 14:27:03 acc-db-01 [248741.108107] [<ffffffff81088550>] ? hrtimer_wakeup+0x0/0x30
Dec 9 14:27:03 acc-db-01 [248741.126252] [<ffffffff81089664>] ? hrtimer_start_range_ns+0x14/0x20
Dec 9 14:27:03 acc-db-01 [248741.144902] [<ffffffff810acee4>] compat_sys_nanosleep+0xb4/0x120
Dec 9 14:27:03 acc-db-01 [248741.163137] [<ffffffff8104870f>] sysenter_dispatch+0x7/0x2e
Dec 9 14:27:03 acc-db-01 [248741.180700] Code: ff c7 85 c4 fe ff ff 01 00 00 00 e9 95 fb ff ff 0f 1f 80 00 00 00 00 48 8b 95 e0 fe ff ff 48 8b 45 a8 8b 72 08 48 c1 e0 0a 31 d2 <48> f7 f6 48 8b 75 b0 48 89 45 a0 31 c0 48 85 f6 74 0c 48 8b 45
Dec 9 14:27:03 acc-db-01 [248741.242654] RIP [<ffffffff8105621c>] find_busiest_group+0x63c/0x900
Dec 9 14:27:03 acc-db-01 [248741.261816] RSP <ffff880604711b88>
Dec 9 14:27:03 acc-db-01 [248747.675681] ------------[ cut here ]------------
Dec 9 14:27:03 acc-db-01 [248747.692559] WARNING: at /build/buildd/linux-2.6.32/net/sched/sch_generic.c:261 dev_watchdog+0x262/0x270()
Dec 9 14:27:03 acc-db-01 [248747.725803] Hardware name: PowerEdge R710
Dec 9 14:27:03 acc-db-01 [248747.741343] NETDEV WATCHDOG: eth0 (bnx2): transmit queue 3 timed out
Dec 9 14:27:03 acc-db-01 [248747.759239] Modules linked in: btrfs zlib_deflate crc32c libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs reiserfs xfs exportfs nfs lockd nfs_acl auth_rpcgss sunrpc ipmi_devintf ipmi_si ipmi_msghandler autofs4 bonding fbcon tileblit font bitblit softcursor vga16fb vgastate bnx2 psmouse dell_wmi serio_raw joydev power_meter dcdbas lp parport ses enclosure usbhid hid megaraid_sas
Dec 9 14:27:03 acc-db-01 [248747.856877] Pid: 0, comm: swapper Not tainted 2.6.32-22-generic #33-Ubuntu
Dec 9 14:27:03 acc-db-01 [248747.875145] Call Trace:
Dec 9 14:27:03 acc-db-01 [248747.888978] <IRQ> [<ffffffff81066d0b>] warn_slowpath_common+0x7b/0xc0
Dec 9 14:27:03 acc-db-01 [248747.907068] [<ffffffff81066db1>] warn_slowpath_fmt+0x41/0x50
Dec 9 14:27:03 acc-db-01 [248747.923951] [<ffffffff814765e2>] dev_watchdog+0x262/0x270
Dec 9 14:27:03 acc-db-01 [248747.940388] [<ffffffff8108b37d>] ? sched_clock_cpu+0xcd/0x110
Dec 9 14:27:03 acc-db-01 [248747.957298] [<ffffffff8101a103>] ? native_sched_clock+0x13/0x60
Dec 9 14:27:03 acc-db-01 [248747.974338] [<ffffffff81019e59>] ? sched_clock+0x9/0x10
Dec 9 14:27:03 acc-db-01 [248747.990651] [<ffffffff81476380>] ? dev_watchdog+0x0/0x270
Dec 9 14:27:03 acc-db-01 [248748.007098] [<ffffffff81077697>] run_timer_softirq+0x197/0x340
Dec 9 14:27:03 acc-db-01 [248748.024087] [<ffffffff81094870>] ? tick_sched_timer+0x0/0xc0
Dec 9 14:27:03 acc-db-01 [248748.040982] [<ffffffff8108f523>] ? ktime_get+0x63/0xe0
Dec 9 14:27:03 acc-db-01 [248748.057391] [<ffffffff8106e3a7>] __do_softirq+0xb7/0x1e0
Dec 9 14:27:03 acc-db-01 [248748.073871] [<ffffffff8109445a>] ? tick_program_event+0x2a/0x30
Dec 9 14:27:03 acc-db-01 [248748.090929] [<ffffffff810142ec>] call_softirq+0x1c/0x30
Dec 9 14:27:03 acc-db-01 [248748.107042] [<ffffffff81015cb5>] do_softirq+0x65/0xa0
Dec 9 14:27:03 acc-db-01 [248748.122742] [<ffffffff8106e245>] irq_exit+0x85/0x90
Dec 9 14:27:03 acc-db-01 [248748.137184] [<ffffffff81545f91>] smp_apic_timer_interrupt+0x71/0x9c
Dec 9 14:27:03 acc-db-01 [248748.153075] [<ffffffff81013cb3>] apic_timer_interrupt+0x13/0x20
Dec 9 14:27:03 acc-db-01 [248748.168536] <EOI> [<ffffffff8130d337>] ? acpi_idle_enter_bm+0x28a/0x2be
Dec 9 14:27:03 acc-db-01 [248748.185007] [<ffffffff8130d330>] ? acpi_idle_enter_bm+0x283/0x2be
Dec 9 14:27:03 acc-db-01 [248748.200691] [<ffffffff81437507>] ? cpuidle_idle_call+0xa7/0x140
Dec 9 14:27:03 acc-db-01 [248748.216235] [<ffffffff81011e73>] ? cpu_idle+0xb3/0x110
Dec 9 14:27:03 acc-db-01 [248748.231001] [<ffffffff8153ad4b>] ? start_secondary+0xa8/0xaa
Dec 9 14:27:03 acc-db-01 [248748.246119] ---[ end trace d893f09a380f2ae2 ]---
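A note on reading the first trace: the "divide error" can be cross-checked against the register dump and the Code: line. The following is a minimal sketch, assuming the usual x86-64 oops layout in which the byte in angle brackets marks the faulting RIP. The helper names (`parse_registers`, `faulting_bytes`) are hypothetical, and the input strings are copied from the log above with the syslog prefix removed.

```python
import re

# Two register lines copied from the oops above (syslog prefix removed).
OOPS_LINES = [
    "RAX: 0000000000000000 RBX: ffff880604711d54 RCX: 0000000000000001",
    "RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000",
]
# Tail of the "Code:" line up to and including the faulting instruction.
CODE_TAIL = "8b 72 08 48 c1 e0 0a 31 d2 <48> f7 f6"


def parse_registers(lines):
    """Return {register_name: integer_value} from oops register lines."""
    regs = {}
    pattern = r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b"
    for line in lines:
        for name, value in re.findall(pattern, line):
            regs[name] = int(value, 16)
    return regs


def faulting_bytes(code, length=3):
    """Return `length` bytes starting at the <..> marker (the faulting RIP)."""
    tokens = code.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [int(t.strip("<>"), 16) for t in tokens[start:start + length]]


regs = parse_registers(OOPS_LINES)
insn = faulting_bytes(CODE_TAIL)

# 48 f7 f6 is the x86-64 encoding of `div %rsi`
# (REX.W prefix, opcode F7 /6, ModRM byte F6).
is_div_rsi = insn == [0x48, 0xF7, 0xF6]
print(is_div_rsi, regs["RSI"])
```

For this log the faulting bytes decode to a 64-bit divide by RSI, and RSI is zero in the dump, which matches the "divide error" reported inside find_busiest_group. The sketch only confirms that a zero divisor reached the instruction; it says nothing about why the divisor was zero.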
In lieu of patching the kernel for now, wouldn't it be possible to switch schedulers at boot to avoid the bug?
That's the exact patch we have started using.
@lusis Apparently switching schedulers does not help. Haven't tested this independently.
@tweibley Yes, I worked with the original bug reporter on this a while back. Yes, bnx2 is a piece of shit, and no, switching schedulers doesn't help.
@ice799 Know anything about C states and Ubuntu? See http://support.citrix.com/article/CTX127395 and also http://lists.us.dell.com/pipermail/linux-poweredge/2010-May/042280.html.
You need this: http://launchpadlibrarian.net/58956370/lp614853.patch