Commit 18404756765c713a0be4eb1082920c04822ce588

Authored by Max Krasnyansky
Committed by Thomas Gleixner
1 parent c3b25b32e8

genirq: Expose default irq affinity mask (take 3)

The current IRQ affinity interface does not provide a way to set the
affinity of IRQs that will be allocated/activated in the future.
This patch creates /proc/irq/default_smp_affinity, which lets users set a
default affinity mask for newly allocated IRQs. Changing the default
does not affect the affinity masks of currently active IRQs; those
have to be changed explicitly.
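
For illustration, here is a minimal userspace sketch (not part of this patch)
that reads the current default mask and then restricts newly allocated IRQs
to CPUs 0-3. It assumes a kernel with this patch applied and uses the same
hex cpumask format as /proc/irq/IRQ#/smp_affinity.

#include <stdio.h>
#include <stdlib.h>

#define DEFAULT_AFFINITY "/proc/irq/default_smp_affinity"

int main(void)
{
	char buf[64];
	FILE *f = fopen(DEFAULT_AFFINITY, "r");

	if (!f) {
		perror(DEFAULT_AFFINITY);
		return EXIT_FAILURE;
	}
	if (fgets(buf, sizeof(buf), f))
		printf("current default mask: %s", buf);
	fclose(f);

	/* Writes take a hex cpumask, like /proc/irq/IRQ#/smp_affinity. */
	f = fopen(DEFAULT_AFFINITY, "w");
	if (!f) {
		perror(DEFAULT_AFFINITY);
		return EXIT_FAILURE;
	}
	fprintf(f, "0f\n");	/* CPUs 0-3 for IRQs allocated from now on */
	fclose(f);
	return EXIT_SUCCESS;
}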

Updated based on Paul J's comments and added some more documentation.

Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
Cc: pj@sgi.com
Cc: a.p.zijlstra@chello.nl
Cc: tglx@linutronix.de
Cc: rdunlap@xenotime.net
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Showing 7 changed files with 134 additions and 38 deletions

Documentation/IRQ-affinity.txt
  1 +ChangeLog:
  2 + Started by Ingo Molnar <mingo@redhat.com>
  3 + Update by Max Krasnyansky <maxk@qualcomm.com>
1 4  
2   -SMP IRQ affinity, started by Ingo Molnar <mingo@redhat.com>
  5 +SMP IRQ affinity
3 6  
4   -
5 7 /proc/irq/IRQ#/smp_affinity specifies which target CPUs are permitted
6 8 for a given IRQ source. It's a bitmask of allowed CPUs. It's not allowed
7 9 to turn off all CPUs, and if an IRQ controller does not support IRQ
8 10 affinity then the value will not change from the default 0xffffffff.
9 11  
  12 +/proc/irq/default_smp_affinity specifies the default affinity mask that applies
  13 +to all non-active IRQs. Once an IRQ is allocated/activated its affinity bitmask
  14 +will be set to the default mask. It can then be changed as described above.
  15 +The default mask is 0xffffffff.
  16 +
10 17 Here is an example of restricting IRQ44 (eth1) to CPU0-3 then restricting
11   -the IRQ to CPU4-7 (this is an 8-CPU SMP box):
  18 +it to CPU4-7 (this is an 8-CPU SMP box):
12 19  
  20 +[root@moon 44]# cd /proc/irq/44
13 21 [root@moon 44]# cat smp_affinity
14 22 ffffffff
  23 +
15 24 [root@moon 44]# echo 0f > smp_affinity
16 25 [root@moon 44]# cat smp_affinity
17 26 0000000f
18 27  
19 28  
... ... @@ -21,17 +30,27 @@
21 30 --- hell ping statistics ---
22 31 6029 packets transmitted, 6027 packets received, 0% packet loss
23 32 round-trip min/avg/max = 0.1/0.1/0.4 ms
24   -[root@moon 44]# cat /proc/interrupts | grep 44:
25   - 44: 0 1785 1785 1783 1783 1
26   -1 0 IO-APIC-level eth1
  33 +[root@moon 44]# cat /proc/interrupts | grep 'CPU\|44:'
  34 + CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
  35 + 44: 1068 1785 1785 1783 0 0 0 0 IO-APIC-level eth1
  36 +
  37 +As can be seen from the line above, IRQ44 was delivered only to the first four
  38 +processors (0-3).
  39 +Now let's restrict that IRQ to CPUs 4-7.
  40 +
27 41 [root@moon 44]# echo f0 > smp_affinity
  42 +[root@moon 44]# cat smp_affinity
  43 +000000f0
28 44 [root@moon 44]# ping -f h
29 45 PING hell (195.4.7.3): 56 data bytes
30 46 ..
31 47 --- hell ping statistics ---
32 48 2779 packets transmitted, 2777 packets received, 0% packet loss
33 49 round-trip min/avg/max = 0.1/0.5/585.4 ms
34   -[root@moon 44]# cat /proc/interrupts | grep 44:
35   - 44: 1068 1785 1785 1784 1784 1069 1070 1069 IO-APIC-level eth1
36   -[root@moon 44]#
  50 +[root@moon 44]# cat /proc/interrupts | grep 'CPU\|44:'
  51 + CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
  52 + 44: 1068 1785 1785 1783 1784 1069 1070 1069 IO-APIC-level eth1
  53 +
  54 +This time around IRQ44 was delivered only to the last four processors,
  55 +i.e. the counters for CPU0-3 did not change.
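
The hex values written to smp_affinity above are plain per-CPU bitmasks. The
following standalone C sketch (illustrative only, not part of the patch; the
helper name cpu_range_mask is made up for this example) shows how the 0f and
f0 values for CPU0-3 and CPU4-7 are formed:

#include <stdio.h>

/* Build the smp_affinity value for a contiguous CPU range. */
static unsigned long cpu_range_mask(unsigned int first, unsigned int last)
{
	unsigned long mask = 0;
	unsigned int cpu;

	for (cpu = first; cpu <= last; cpu++)
		mask |= 1UL << cpu;	/* one bit per CPU */
	return mask;
}

int main(void)
{
	printf("%lx\n", cpu_range_mask(0, 3));	/* CPU0-3 -> "f",  echo 0f above */
	printf("%lx\n", cpu_range_mask(4, 7));	/* CPU4-7 -> "f0", echo f0 above */
	return 0;
}
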
Documentation/filesystems/proc.txt
... ... @@ -380,28 +380,35 @@
380 380 Of some interest is the introduction of the /proc/irq directory to 2.4.
381 381 It could be used to set IRQ to CPU affinity, this means that you can "hook" an
382 382 IRQ to only one CPU, or to exclude a CPU of handling IRQs. The contents of the
383   -irq subdir is one subdir for each IRQ, and one file; prof_cpu_mask
  383 +irq subdir is one subdir for each IRQ, and two files; default_smp_affinity and
  384 +prof_cpu_mask.
384 385  
385 386 For example
386 387 > ls /proc/irq/
387 388 0 10 12 14 16 18 2 4 6 8 prof_cpu_mask
388   - 1 11 13 15 17 19 3 5 7 9
  389 + 1 11 13 15 17 19 3 5 7 9 default_smp_affinity
389 390 > ls /proc/irq/0/
390 391 smp_affinity
391 392  
392   -The contents of the prof_cpu_mask file and each smp_affinity file for each IRQ
393   -is the same by default:
  393 +smp_affinity is a bitmask in which you can specify which CPUs can handle the
  394 +IRQ. You can set it by doing:
394 395  
395   - > cat /proc/irq/0/smp_affinity
396   - ffffffff
  396 + > echo 1 > /proc/irq/10/smp_affinity
397 397  
398   -It's a bitmask, in which you can specify which CPUs can handle the IRQ, you can
399   -set it by doing:
  398 +This means that only the first CPU will handle the IRQ, but you can also echo
  399 +5, which means that only the first and third CPU can handle the IRQ.
400 400  
401   - > echo 1 > /proc/irq/prof_cpu_mask
  401 +The contents of each smp_affinity file are the same by default:
402 402  
403   -This means that only the first CPU will handle the IRQ, but you can also echo 5
404   -which means that only the first and fourth CPU can handle the IRQ.
  403 + > cat /proc/irq/0/smp_affinity
  404 + ffffffff
  405 +
  406 +The default_smp_affinity mask applies to all non-active IRQs, i.e. IRQs which
  407 +have not yet been allocated/activated and hence lack a /proc/irq/[0-9]*
  408 +directory.
  409 +
  410 +prof_cpu_mask specifies which CPUs are to be profiled by the system-wide
  411 +profiler. The default value is ffffffff (all CPUs).
405 412  
406 413 The way IRQs are routed is handled by the IO-APIC, and it's Round Robin
407 414 between all the CPUs which are allowed to handle it. As usual the kernel has
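
To make the bitmask semantics above concrete (for example why a value of 5
targets the first and third CPU), here is a small standalone decoder. It is
illustrative only and not part of the patch:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
	unsigned long mask;
	unsigned int cpu;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <hex mask>\n", argv[0]);
		return EXIT_FAILURE;
	}
	/* Same hex format as the smp_affinity and default_smp_affinity files. */
	mask = strtoul(argv[1], NULL, 16);

	for (cpu = 0; mask; cpu++, mask >>= 1)
		if (mask & 1)
			printf("CPU%u\n", cpu);
	return EXIT_SUCCESS;
}
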
arch/alpha/kernel/irq.c
... ... @@ -42,8 +42,7 @@
42 42 #ifdef CONFIG_SMP
43 43 static char irq_user_affinity[NR_IRQS];
44 44  
45   -int
46   -select_smp_affinity(unsigned int irq)
  45 +int irq_select_affinity(unsigned int irq)
47 46 {
48 47 static int last_cpu;
49 48 int cpu = last_cpu + 1;
... ... @@ -51,7 +50,7 @@
51 50 if (!irq_desc[irq].chip->set_affinity || irq_user_affinity[irq])
52 51 return 1;
53 52  
54   - while (!cpu_possible(cpu))
  53 + while (!cpu_possible(cpu) || !cpu_isset(cpu, irq_default_affinity))
55 54 cpu = (cpu < (NR_CPUS-1) ? cpu + 1 : 0);
56 55 last_cpu = cpu;
57 56  
include/linux/interrupt.h
... ... @@ -104,8 +104,11 @@
104 104  
105 105 #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_HARDIRQS)
106 106  
  107 +extern cpumask_t irq_default_affinity;
  108 +
107 109 extern int irq_set_affinity(unsigned int irq, cpumask_t cpumask);
108 110 extern int irq_can_set_affinity(unsigned int irq);
  111 +extern int irq_select_affinity(unsigned int irq);
109 112  
110 113 #else /* CONFIG_SMP */
111 114  
... ... @@ -118,6 +121,8 @@
118 121 {
119 122 return 0;
120 123 }
  124 +
  125 +static inline int irq_select_affinity(unsigned int irq) { return 0; }
121 126  
122 127 #endif /* CONFIG_SMP && CONFIG_GENERIC_HARDIRQS */
123 128  
include/linux/irq.h
... ... @@ -244,15 +244,6 @@
244 244 }
245 245 #endif
246 246  
247   -#ifdef CONFIG_AUTO_IRQ_AFFINITY
248   -extern int select_smp_affinity(unsigned int irq);
249   -#else
250   -static inline int select_smp_affinity(unsigned int irq)
251   -{
252   - return 1;
253   -}
254   -#endif
255   -
256 247 extern int no_irq_affinity;
257 248  
258 249 static inline int irq_balancing_disabled(unsigned int irq)
kernel/irq/manage.c
... ... @@ -17,6 +17,8 @@
17 17  
18 18 #ifdef CONFIG_SMP
19 19  
  20 +cpumask_t irq_default_affinity = CPU_MASK_ALL;
  21 +
20 22 /**
21 23 * synchronize_irq - wait for pending IRQ handlers (on other CPUs)
22 24 * @irq: interrupt number to wait for
23 25  
... ... @@ -95,8 +97,29 @@
95 97 return 0;
96 98 }
97 99  
  100 +#ifndef CONFIG_AUTO_IRQ_AFFINITY
  101 +/*
  102 + * Generic version of the affinity autoselector.
  103 + */
  104 +int irq_select_affinity(unsigned int irq)
  105 +{
  106 + cpumask_t mask;
  107 +
  108 + if (!irq_can_set_affinity(irq))
  109 + return 0;
  110 +
  111 + cpus_and(mask, cpu_online_map, irq_default_affinity);
  112 +
  113 + irq_desc[irq].affinity = mask;
  114 + irq_desc[irq].chip->set_affinity(irq, mask);
  115 +
  116 + set_balance_irq_affinity(irq, mask);
  117 + return 0;
  118 +}
98 119 #endif
99 120  
  121 +#endif
  122 +
100 123 /**
101 124 * disable_irq_nosync - disable an irq without waiting
102 125 * @irq: Interrupt to disable
... ... @@ -382,6 +405,9 @@
382 405 } else
383 406 /* Undo nested disables: */
384 407 desc->depth = 1;
  408 +
  409 + /* Set default affinity mask once everything is setup */
  410 + irq_select_affinity(irq);
385 411 }
386 412 /* Reset broken irq detection when installing new handler */
387 413 desc->irq_count = 0;
... ... @@ -570,8 +596,6 @@
570 596 action->name = devname;
571 597 action->next = NULL;
572 598 action->dev_id = dev_id;
573   -
574   - select_smp_affinity(irq);
575 599  
576 600 #ifdef CONFIG_DEBUG_SHIRQ
577 601 if (irqflags & IRQF_SHARED) {
kernel/irq/proc.c
... ... @@ -44,7 +44,7 @@
44 44 unsigned long count, void *data)
45 45 {
46 46 unsigned int irq = (int)(long)data, full_count = count, err;
47   - cpumask_t new_value, tmp;
  47 + cpumask_t new_value;
48 48  
49 49 if (!irq_desc[irq].chip->set_affinity || no_irq_affinity ||
50 50 irq_balancing_disabled(irq))
51 51  
52 52  
... ... @@ -62,17 +62,51 @@
62 62 * way to make the system unusable accidentally :-) At least
63 63 * one online CPU still has to be targeted.
64 64 */
65   - cpus_and(tmp, new_value, cpu_online_map);
66   - if (cpus_empty(tmp))
  65 + if (!cpus_intersects(new_value, cpu_online_map))
67 66 /* Special case for empty set - allow the architecture
68 67 code to set default SMP affinity. */
69   - return select_smp_affinity(irq) ? -EINVAL : full_count;
  68 + return irq_select_affinity(irq) ? -EINVAL : full_count;
70 69  
71 70 irq_set_affinity(irq, new_value);
72 71  
73 72 return full_count;
74 73 }
75 74  
  75 +static int default_affinity_read(char *page, char **start, off_t off,
  76 + int count, int *eof, void *data)
  77 +{
  78 + int len = cpumask_scnprintf(page, count, irq_default_affinity);
  79 + if (count - len < 2)
  80 + return -EINVAL;
  81 + len += sprintf(page + len, "\n");
  82 + return len;
  83 +}
  84 +
  85 +static int default_affinity_write(struct file *file, const char __user *buffer,
  86 + unsigned long count, void *data)
  87 +{
  88 + unsigned int full_count = count, err;
  89 + cpumask_t new_value;
  90 +
  91 + err = cpumask_parse_user(buffer, count, new_value);
  92 + if (err)
  93 + return err;
  94 +
  95 + if (!is_affinity_mask_valid(new_value))
  96 + return -EINVAL;
  97 +
  98 + /*
  99 + * Do not allow disabling IRQs completely - it's a too easy
  100 + * way to make the system unusable accidentally :-) At least
  101 + * one online CPU still has to be targeted.
  102 + */
  103 + if (!cpus_intersects(new_value, cpu_online_map))
  104 + return -EINVAL;
  105 +
  106 + irq_default_affinity = new_value;
  107 +
  108 + return full_count;
  109 +}
76 110 #endif
77 111  
78 112 static int irq_spurious_read(char *page, char **start, off_t off,
... ... @@ -171,6 +205,21 @@
171 205 remove_proc_entry(action->dir->name, irq_desc[irq].dir);
172 206 }
173 207  
  208 +void register_default_affinity_proc(void)
  209 +{
  210 +#ifdef CONFIG_SMP
  211 + struct proc_dir_entry *entry;
  212 +
  213 + /* create /proc/irq/default_smp_affinity */
  214 + entry = create_proc_entry("default_smp_affinity", 0600, root_irq_dir);
  215 + if (entry) {
  216 + entry->data = NULL;
  217 + entry->read_proc = default_affinity_read;
  218 + entry->write_proc = default_affinity_write;
  219 + }
  220 +#endif
  221 +}
  222 +
174 223 void init_irq_proc(void)
175 224 {
176 225 int i;
... ... @@ -179,6 +228,8 @@
179 228 root_irq_dir = proc_mkdir("irq", NULL);
180 229 if (!root_irq_dir)
181 230 return;
  231 +
  232 + register_default_affinity_proc();
182 233  
183 234 /*
184 235 * Create entries for all existing IRQs.