Commit ad5fa913991e9e0f122b021e882b0d50051fbdbc

Authored by Andi Kleen
Committed by Andi Kleen
1 parent a7420aa54d

HWPOISON: Add new SIGBUS error codes for hardware poison signals

Add new SIGBUS codes for reporting machine checks as signals. When
the hardware detects an uncorrected ECC error it can trigger these
signals.

This is needed to tell KVM's qemu about machine checks that happen to
guests, so that it can inject them, but it might also be useful for other
programs. I find it useful in my test programs.

This patch merely defines the new types.

- Define two new si_codes for SIGBUS: BUS_MCEERR_AO and BUS_MCEERR_AR.
  * BUS_MCEERR_AO is for "Action Optional" machine checks, which means that
    some corruption has been detected in the background, but nothing has been
    consumed so far. The program can ignore those if it wants (but most
    programs would already get killed).
  * BUS_MCEERR_AR is for "Action Required" machine checks. This happens
    when corrupted data is consumed or the application runs into an area
    that was already known to be corrupted. These require immediate
    action: execution cannot simply return to the faulting access. Most
    programs would kill themselves.
- They report the address of the corruption in the user address space
  in si_addr.
- Define a new si_addr_lsb field that reports the extent of the corruption
  to user space. That's currently always a (small) page. The user application
  cannot tell where in this page the corruption happened.
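
As an illustration of how a program might consume these fields, here is a minimal user-space sketch. It is not part of this patch. The helper names (poison_start, poison_length, poison_can_continue, install_sigbus_handler) and the poison_event record are hypothetical, and it assumes that si_addr_lsb holds the number of low address bits covered by the corruption, so that a full 4 KiB page would be reported as 12. The fallback defines use the user-visible values 4 and 5 from this patch, for libc headers that do not yet know the new codes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Fallbacks for libc headers that predate the new codes. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

/* Hypothetical record: written by the handler, read by normal code. */
struct poison_event {
    volatile sig_atomic_t pending;
    void *addr;
    short addr_lsb;
};
static struct poison_event last_event;

/* First byte of the corrupted range: address with the low bits cleared. */
static uintptr_t poison_start(const void *addr, short addr_lsb)
{
    assert(addr_lsb >= 0 && addr_lsb < (short)(sizeof(uintptr_t) * 8));
    return (uintptr_t)addr & ~(((uintptr_t)1 << addr_lsb) - 1);
}

/* Size of the corrupted range in bytes (assumed to be 2^addr_lsb). */
static size_t poison_length(short addr_lsb)
{
    assert(addr_lsb >= 0 && addr_lsb < (short)(sizeof(size_t) * 8));
    return (size_t)1 << addr_lsb;
}

/* Only "Action Optional" events may be survived by simply returning. */
static int poison_can_continue(int code)
{
    return code == BUS_MCEERR_AO;
}

static void on_sigbus(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    if (!poison_can_continue(info->si_code))
        _exit(1);   /* AR or any other SIGBUS: returning would fault again */
    last_event.addr = info->si_addr;
#ifdef si_addr_lsb
    last_event.addr_lsb = info->si_addr_lsb;
#else
    last_event.addr_lsb = 0;    /* header lacks the field: extent unknown */
#endif
    last_event.pending = 1;
}

/* Returns 0 on success, -1 on failure (as sigaction does). */
static int install_sigbus_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;   /* needed to receive siginfo_t */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

A program would call install_sigbus_handler() once at startup and poll last_event.pending from its main loop. Applying poison_start() and poison_length() to the recorded values then gives the range of memory it should stop using. The handler itself only records the event or exits, which keeps it restricted to async-signal-safe operations.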

AK: I plan to write a man page update before anyone asks.

Signed-off-by: Andi Kleen <ak@linux.intel.com>

Showing 1 changed file with 7 additions and 1 deletion

include/asm-generic/siginfo.h
@@ -82,6 +82,7 @@
 #ifdef __ARCH_SI_TRAPNO
 		int _trapno;	/* TRAP # which caused the signal */
 #endif
+		short _addr_lsb; /* LSB of the reported address */
 	} _sigfault;
 
 	/* SIGPOLL */
@@ -112,6 +113,7 @@
 #ifdef __ARCH_SI_TRAPNO
 #define si_trapno	_sifields._sigfault._trapno
 #endif
+#define si_addr_lsb	_sifields._sigfault._addr_lsb
 #define si_band		_sifields._sigpoll._band
 #define si_fd		_sifields._sigpoll._fd
 
@@ -192,7 +194,11 @@
 #define BUS_ADRALN	(__SI_FAULT|1)	/* invalid address alignment */
 #define BUS_ADRERR	(__SI_FAULT|2)	/* non-existant physical address */
 #define BUS_OBJERR	(__SI_FAULT|3)	/* object specific hardware error */
-#define NSIGBUS		3
+/* hardware memory error consumed on a machine check: action required */
+#define BUS_MCEERR_AR	(__SI_FAULT|4)
+/* hardware memory error detected in process but not consumed: action optional*/
+#define BUS_MCEERR_AO	(__SI_FAULT|5)
+#define NSIGBUS		5
 
 /*
  * SIGTRAP si_codes