plug & crash

I had some problems with my local network today, so I had to turn off the switch. Bang! My workstation hung the moment the power cord was unplugged from the D-Link. Initially I thought that something was wrong with the grounding, but a series of painful experiments proved that electricity had nothing to do with it: the workstation hung whenever the Ethernet cable was pulled out of the socket. This looks like complete magic[1], doesn't it? Fortunately, before I went insane, I was advised to boot the kernel with nmi_watchdog=1, and with the first stack trace the problem became obvious:

  • when the cable is pulled, the rtl8139 driver receives an interrupt, and the first thing it does is grab the spin-lock protecting struct rtl8139_private...
  • after that it goes on to print a "link-down" message to the console...
  • but I am using netconsole[2], and to print that message netconsole calls back into the rtl8139 driver...
  • and the first thing it does there is grab the spin-lock... which is already held by that very thread: deadlock.
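
The chain above can be modelled in user space. What follows is only a rough sketch in Python, not the driver's code: ToyNic and all of its methods are hypothetical names invented for the illustration, threading.Lock stands in for the kernel spin-lock (both are non-recursive), and the transmit path uses a non-blocking acquire so that the sketch reports the deadlock instead of hanging the way the real machine did. The "fixed" variant, which prints only after dropping the lock, is one possible way out and an assumption on my part, not necessarily what the driver should do.

```python
import threading


class ToyNic:
    """Toy stand-in for a NIC driver guarded by one non-recursive lock."""

    def __init__(self):
        self.lock = threading.Lock()  # non-recursive, like a spin-lock
        self.events = []

    def transmit(self, payload):
        # A real spin-lock would spin forever here; the non-blocking
        # attempt lets this sketch report the problem instead of hanging.
        if not self.lock.acquire(blocking=False):
            self.events.append("deadlock: lock already held, cannot send " + repr(payload))
            return False
        try:
            self.events.append("sent " + repr(payload))
            return True
        finally:
            self.lock.release()

    def console_print(self, message):
        # netconsole-like console: printing a message means transmitting
        # it through this very device.
        return self.transmit(message)

    def link_change_buggy(self):
        # Interrupt path as described: take the lock, then print.
        with self.lock:
            return self.console_print("link down")

    def link_change_fixed(self):
        # Hypothetical fix: decide what to say under the lock,
        # print only after the lock has been released.
        with self.lock:
            message = "link down"
        return self.console_print(message)


if __name__ == "__main__":
    buggy = ToyNic()
    buggy.link_change_buggy()
    print(buggy.events)

    fixed = ToyNic()
    fixed.link_change_fixed()
    print(fixed.events)
```

In the buggy variant the print path finds the lock already taken by its own caller; in the other variant the same message goes out, because nothing holds the lock by the time the console calls back into the device.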

[1] when I have time, I'll also record a true story from reiserfs debugging: how saving a file in emacs made all processes in the system invisible to ps(1) and top(1).

[2] a console driver that sends kernel messages over the network in UDP packets, very useful for debugging.

