Maybe you are in the nice situation of running a machine with multiple CPU cores that has to crunch a lot of numbers or provide network daemons. Multiple CPU cores can boost your performance dramatically, but there are, of course, possible issues. One is the fact that interrupts are, by default, mainly handled by the first CPU. As applications that aren't able to thread correctly can also stick to this CPU, you might notice from a "cat /proc/interrupts" that something is going wrong. The following is a (compressed) /proc/interrupts from a live server that has been running for about ten weeks:
CPU0 CPU1
0: 1855681242 574 timer
7: 0 0 parport0
8: 1 0 rtc
9: 1 0 acpi
14: 65 0 ide0
58: 4 0 ehci_hcd:usb1, uhci_hcd:usb2
66: 212905125 72 3w-xxxx
74: 1094755762 0 eth0
185: 6223686 0 uhci_hcd:usb4, eth1
193: 1978 0 uhci_hcd:usb3, libata
You can see that virtually all interrupts are handled by CPU0. We want to fix this! How? Just run "aptitude install irqbalance", as the irqbalance package description promises:
Daemon to balance interrupts across multiple CPUs, which can lead to better performance and IO balance on SMP systems. This package is especially useful on systems with multi-core processors, as interrupts will typically only be serviced by the first core.
So let's check again after about one week with another "cat /proc/interrupts":
CPU0 CPU1
0: 1887385089 33827155 timer
7: 0 0 parport0
8: 1 0 rtc
9: 1 0 acpi
14: 65 0 ide0
58: 4 0 ehci_hcd:usb1, uhci_hcd:usb2
66: 212950265 11810501 3w-xxxx
74: 1310191290 0 eth0
169: 0 0 uhci_hcd:usb5
185: 6223686 228881 uhci_hcd:usb4, eth1
193: 1978 0 uhci_hcd:usb3, libata
Nice, isn't it? CPU1 has started to handle interrupts as well. If we rebooted the server, the counters would start at zero again and most of the IRQs would look balanced over time (most, not all).
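Because the counters in /proc/interrupts are cumulative since boot, the difference between two snapshots shows more clearly where interrupts went after irqbalance was installed. The following is a minimal Python sketch of that comparison, assuming the usual layout (a header line with the CPU names, then one line per IRQ). The function names parse_interrupts and delta are made up for this illustration.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {irq: [count per CPU]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())          # header line: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        irq, _, rest = ln.partition(":")
        counts = []
        for field in rest.split()[:ncpu]:
            if not field.isdigit():       # reached the controller/device name
                break
            counts.append(int(field))
        # Summary rows such as ERR carry a single counter; pad with zeros.
        counts += [0] * (ncpu - len(counts))
        table[irq.strip()] = counts
    return table


def delta(before, after):
    """Per-CPU increase for every IRQ present in both snapshots."""
    return {irq: [a - b for a, b in zip(after[irq], before[irq])]
            for irq in after if irq in before}


if __name__ == "__main__":
    # Three rows copied from the two listings above.
    before = parse_interrupts("""CPU0 CPU1
0: 1855681242 574 timer
66: 212905125 72 3w-xxxx
74: 1094755762 0 eth0
""")
    after = parse_interrupts("""CPU0 CPU1
0: 1887385089 33827155 timer
66: 212950265 11810501 3w-xxxx
74: 1310191290 0 eth0
""")
    for irq, (cpu0, cpu1) in delta(before, after).items():
        print(irq, cpu0, cpu1)
```

For these three rows, the timer increase is split almost evenly (31703847 on CPU0 versus 33826581 on CPU1), nearly all new 3w-xxxx interrupts went to CPU1 (11810429 versus 45140), and eth0 stayed entirely on CPU0.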
Please notice:
Before installing irqbalance, check your /proc/interrupts. You might not need it even though you have multiple cores, as 2.6 kernels have an option "CONFIG_IRQBALANCE" that can be turned on.
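One way to check that option is to look it up in the kernel config file. The following is a minimal Python sketch, assuming the distribution ships its config as /boot/config-<kernel release> (as Debian and Ubuntu do); the helper name config_option is chosen here for illustration only.

```python
import os
import platform


def config_option(config_text, name):
    """Return the value of a kernel config option, or None if it is not set."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(name + "="):
            return line.split("=", 1)[1]
        if line == "# %s is not set" % name:
            return None
    return None  # option not mentioned at all


if __name__ == "__main__":
    # Assumption: the config of the running kernel lives under /boot.
    path = "/boot/config-" + platform.release()
    if os.path.exists(path):
        with open(path) as fh:
            print(path, config_option(fh.read(), "CONFIG_IRQBALANCE"))
    else:
        print("no kernel config found at", path)
```

A result of "y" means the in-kernel balancer is built in; None means it is either explicitly disabled or not present in that kernel's config.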
[update]
The comments (thank you!) pointed out the following:
- There are reports of crashed systems using irqbalance (though I have never seen one myself).
- Note that irqbalance is not in main, in case you are using it on an important server.
- CONFIG_IRQBALANCE seems to be enabled in Ubuntu Hardy by default.
- There are discussions about removing CONFIG_IRQBALANCE as it is said that irqbalance is more reliable.
So it is up to you to decide which one to use!
[update2]
Actually, a glance at /boot/config-2.6.24-17-generic shows that CONFIG_IRQBALANCE does not seem to be enabled in Hardy, though the balancing seems to work. I am not one of the kernel guys, so my investigation will take its time. Any hints welcome (thank you lissyx).
Well, I am careful with irqbalance since I experienced kernel panics linked to this tool on dual-core Opterons… And it is not in main, so use at your own risk.
@Yann: thanks for your reply. I have been running irqbalance on several servers for about a year now. Your note is interesting, though, in case I run into any panics in the future.
Thanks for the tip, I’ve basically been experiencing the same thing as you
kinus@claire:~$ cat /proc/interrupts
CPU0 CPU1
0: 10453426 0 IO-APIC-edge timer
1: 6973 0 IO-APIC-edge i8042
8: 7 0 IO-APIC-edge rtc
9: 41668 0 IO-APIC-fasteoi acpi
12: 1438126 0 IO-APIC-edge i8042
14: 657783 0 IO-APIC-edge libata
Just installed irqbalance and am interested to see the effect it has.
@Kyle: Fine! Feel free to post your experiences here.
From a default updated Ubuntu 8.04 Hardy Heron install on a Dell XPS M1330:
jesse@bigredone:~$ cat /proc/interrupts
CPU0 CPU1
0: 157547 157190 IO-APIC-edge timer
1: 605 597 IO-APIC-edge i8042
8: 1 0 IO-APIC-edge rtc
9: 22 22 IO-APIC-fasteoi acpi
12: 151785 150755 IO-APIC-edge i8042
14: 7844 7911 IO-APIC-edge libata
15: 0 0 IO-APIC-edge libata
16: 43837 44032 IO-APIC-fasteoi nvidia
17: 7905 7867 IO-APIC-fasteoi libata
18: 0 0 IO-APIC-fasteoi sdhci:slot0
19: 2 0 IO-APIC-fasteoi ohci1394
20: 12 10 IO-APIC-fasteoi uhci_hcd:usb1, ehci_hcd:usb4, uhci_hcd:usb5
21: 372 402 IO-APIC-fasteoi uhci_hcd:usb2, uhci_hcd:usb6, HDA Intel
22: 13 15 IO-APIC-fasteoi ehci_hcd:usb3, uhci_hcd:usb7
505: 14985 15020 PCI-MSI-edge iwl4965
506: 2 1 PCI-MSI-edge eth0
NMI: 0 0 Non-maskable interrupts
LOC: 121081 121941 Local timer interrupts
RES: 63991 64205 Rescheduling interrupts
CAL: 179 195 function call interrupts
TLB: 1539 1672 TLB shootdowns
TRM: 0 0 Thermal event interrupts
THR: 0 0 Threshold APIC interrupts
SPU: 0 0 Spurious interrupts
ERR: 0
jesse@bigredone:~$
Looks like Ubuntu has ‚CONFIG_IRQBALANCE‘ enabled by default. Nice.
I just realized how big that terminal output was. Sorry!
Mine are equal too, and I’m on 8.04. Nice that they enabled it
Will this benefit hyper-threaded procs as well?
I have read many times on lkml that the userspace irqbalance (which you describe) is preferred over the in-kernel irqbalance. In fact, the in-kernel one may disappear soon.
I’m sorry for you guys, but you should have checked in your Ubuntu’s kernel’s config file :
/boot/config-2.6.20-16-generic:CONFIG_IRQBALANCE=y
/boot/config-2.6.22-14-generic:# CONFIG_IRQBALANCE is not set
/boot/config-2.6.24-16-generic:# CONFIG_IRQBALANCE is not set
/boot/config-2.6.24-17-generic:# CONFIG_IRQBALANCE is not set
Meaning, Hardy has disabled it!
@lissyx: Thanks for your comment. It seems you are right, though the question is why the balancing works in Hardy without irqbalance installed.
I always stumble upon things like this. Try and do something, only to realize Ubuntu already does it… *sigh* it’s popular for a reason I guess.
irqbalance is not for everyone. Laptops, for example. If you can do per core throttling and idling, it might save more power to only wake one CPU up and leave the other idle for minor processing. Of course this hurts throughput and latency.
powertop tells you: Suggestion: Disable the CONFIG_IRQBALANCE kernel configuration option.
The in-kernel irq balancer is obsolete and wakes the CPU up far more than needed.
In a recent kernel (2.6.26-rc4) even with „CONFIG_IRQBALANCE is not set“ and without the daemon:
CPU0 CPU1
0: 10799723 11144542 IO-APIC-edge timer
1: 3242 3004 IO-APIC-edge i8042
9: 842 681 IO-APIC-fasteoi acpi
12: 1200398 1168041 IO-APIC-edge i8042
14: 176001 169192 IO-APIC-edge ata_piix
15: 0 0 IO-APIC-edge ata_piix
16: 8376 8404 IO-APIC-fasteoi uhci_hcd:usb1, i915@pci:0000:00:02.0
18: 12 8 IO-APIC-fasteoi ehci_hcd:usb3, uhci_hcd:usb6
19: 0 0 IO-APIC-fasteoi uhci_hcd:usb5
21: 0 0 IO-APIC-fasteoi uhci_hcd:usb2
22: 1323453 1025204 IO-APIC-fasteoi HDA Intel
23: 4584 4436 IO-APIC-fasteoi uhci_hcd:usb4, ehci_hcd:usb7
220: 1699398 1692895 PCI-MSI-edge eth0
221: 107476 107021 PCI-MSI-edge ahci
NMI: 0 0 Non-maskable interrupts
LOC: 6378217 6929446 Local timer interrupts
RES: 6352006 7088681 Rescheduling interrupts
CAL: 17466 2619 function call interrupts
TLB: 3540 3489 TLB shootdowns
TRM: 0 0 Thermal event interrupts
SPU: 0 0 Spurious interrupts
ERR: 0
MIS: 0
@vanila-kernel:
Do you have any hint as to which module/setting is responsible for this behaviour?
Not really, I guess something new in the kernel. Maybe
30,3% ( 56,0) : Rescheduling interrupts
whatever that does 😉
mpstat is a nice tool to get a quick overview of the per-core situation:
psilo@fbx:~$ mpstat -P ALL
Linux 2.6.22-14-generic (fbx.physnet) 06/10/2008
11:18:39 PM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
11:18:39 PM all 10.41 0.02 2.43 0.26 0.03 0.15 0.00 86.71 97.62
11:18:39 PM 0 10.73 0.02 2.36 0.42 0.06 0.28 0.00 86.13 90.22
11:18:39 PM 1 10.09 0.01 2.49 0.09 0.00 0.02 0.00 87.28 7.40