17 Jan, 2012
1 commit
-
When suspending, there was a large list of warnings going something like:
Device 'machinecheck1' does not have a release() function, it is broken and must be fixed
This patch turns the static mce_devices into dynamically allocated, and
properly frees them when they are removed from the system. It solves
the warning messages on my laptop here.Reported-by: "Srivatsa S. Bhat"
Reported-by: Linus Torvalds
Tested-by: Djalal Harouni
Cc: Kay Sievers
Cc: Tony Luck
Cc: Borislav Petkov
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Linus Torvalds
14 Jan, 2012
1 commit
-
Commit 8a25a2fd126c ("cpu: convert 'cpu' and 'machinecheck' sysdev_class
to a regular subsystem") changed how things are dealt with in the MCE
subsystem. Some of the things that got broken due to this are CPU
hotplug and suspend/hibernate.MCE uses per_cpu allocations of struct device. So, when a CPU goes
offline and comes back online, in order to ensure that we start from a
clean slate with respect to the MCE subsystem, zero out the entire
per_cpu device structure to 0 before using it.Signed-off-by: Srivatsa S. Bhat
Signed-off-by: Linus Torvalds
08 Jan, 2012
1 commit
-
* 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (73 commits)
arm: fix up some samsung merge sysdev conversion problems
firmware: Fix an oops on reading fw_priv->fw in sysfs loading file
Drivers:hv: Fix a bug in vmbus_driver_unregister()
driver core: remove __must_check from device_create_file
debugfs: add missing #ifdef HAS_IOMEM
arm: time.h: remove device.h #include
driver-core: remove sysdev.h usage.
clockevents: remove sysdev.h
arm: convert sysdev_class to a regular subsystem
arm: leds: convert sysdev_class to a regular subsystem
kobject: remove kset_find_obj_hinted()
m86k: gpio - convert sysdev_class to a regular subsystem
mips: txx9_sram - convert sysdev_class to a regular subsystem
mips: 7segled - convert sysdev_class to a regular subsystem
sh: dma - convert sysdev_class to a regular subsystem
sh: intc - convert sysdev_class to a regular subsystem
power: suspend - convert sysdev_class to a regular subsystem
power: qe_ic - convert sysdev_class to a regular subsystem
power: cmm - convert sysdev_class to a regular subsystem
s390: time - convert sysdev_class to a regular subsystem
...Fix up conflicts with 'struct sysdev' removal from various platform
drivers that got changed:
- arch/arm/mach-exynos/cpu.c
- arch/arm/mach-exynos/irq-eint.c
- arch/arm/mach-s3c64xx/common.c
- arch/arm/mach-s3c64xx/cpu.c
- arch/arm/mach-s5p64x0/cpu.c
- arch/arm/mach-s5pv210/common.c
- arch/arm/plat-samsung/include/plat/cpu.h
- arch/powerpc/kernel/sysfs.c
and fix up cpu_is_hotpluggable() as per Greg in include/linux/cpu.h
07 Jan, 2012
3 commits
-
* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: add IRQ context simulation in module mce-inject
x86, mce, therm_throt: Don't report power limit and package level thermal throttle events in mcelog
x86, MCE: Drain mcelog buffer
x86, mce: Add wrappers for registering on the decode chain -
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Skip cpus with apic-ids >= 255 in !x2apic_mode
x86, x2apic: Allow "nox2apic" to disable x2apic mode setup by BIOS
x86, x2apic: Fallback to xapic when BIOS doesn't setup interrupt-remapping
x86, acpi: Skip acpi x2apic entries if the x2apic feature is not present
x86, apic: Add probe() for apic_flat
x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'
x86: Convert per-cpu counter icr_read_retry_count into a member of irq_stat
x86: Add per-cpu stat counter for APIC ICR read tries
pci, x86/io-apic: Allow PCI_IOAPIC to be user configurable on x86
x86: Fix the !CONFIG_NUMA build of the new CPU ID fixup code support
x86: Add NumaChip support
x86: Add x86_init platform override to fix up NUMA core numbering
x86: Make flat_init_apic_ldr() available -
This resolves the conflict in the arch/arm/mach-s3c64xx/s3c6400.c file,
and it fixes the build error in the arch/x86/kernel/microcode_core.c
file, that the merge did not catch.The microcode_core.c patch was provided by Stephen Rothwell
who was invaluable in the merge issues involved
with the large sysdev removal process in the driver-core tree.Signed-off-by: Greg Kroah-Hartman
22 Dec, 2011
1 commit
-
This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
and converts the devices to regular devices. The sysdev drivers are
implemented as subsystem interfaces now.After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.Userspace relies on events and generic sysfs subsystem infrastructure
from sysdev devices, which are made available with this conversion.Cc: Haavard Skinnemoen
Cc: Hans-Christian Egtvedt
Cc: Tony Luck
Cc: Fenghua Yu
Cc: Arnd Bergmann
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: Paul Mundt
Cc: "David S. Miller"
Cc: Chris Metcalf
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Borislav Petkov
Cc: Tigran Aivazian
Cc: Len Brown
Cc: Zhang Rui
Cc: Dave Jones
Cc: Peter Zijlstra
Cc: Russell King
Cc: Andrew Morton
Cc: Arjan van de Ven
Cc: "Rafael J. Wysocki"
Cc: "Srivatsa S. Bhat"
Signed-off-by: Kay Sievers
Signed-off-by: Greg Kroah-Hartman
21 Dec, 2011
1 commit
-
Several fields in struct cpuinfo_x86 were not defined for the
!SMP case, likely to save space. However, those fields still
have some meaning for UP, and keeping them allows some #ifdef
removal from other files. The additional size of the UP kernel
from this change is not significant enough to worry about
keeping up the distinction:text data bss dec hex filename
4737168 506459 972040 6215667 5ed7f3 vmlinux.o.before
4737444 506459 972040 6215943 5ed907 vmlinux.o.afterfor a difference of 276 bytes for an example UP config.
If someone wants those 276 bytes back badly then it should
be implemented in a cleaner way.Signed-off-by: Kevin Winchester
Cc: Steffen Persvold
Link: http://lkml.kernel.org/r/1324428742-12498-1-git-send-email-kjwinchester@gmail.com
Signed-off-by: Ingo Molnar
18 Dec, 2011
1 commit
17 Dec, 2011
1 commit
-
mce-inject provides a mechanism to simulate errors so that test
scripts can check for correct operation of the kernel without
requiring any specialized hardware to create rare events.The existing code can simulate events in normal process context
and also in NMI context - but not in IRQ context. This patch
fills that gap.Link: https://lkml.org/lkml/2011/12/7/537
Signed-off-by: Chen Gong
Signed-off-by: Tony Luck
15 Dec, 2011
2 commits
-
Thermal throttle and power limit events are not defined as MCE errors in x86
architecture and should not generate MCE errors in mcelog.Current kernel generates fake software defined MCE errors for these events.
This may confuse users because they may think the machine has real MCE errors
while actually only thermal throttle or power limit events happen.To make it worse, buggy firmware on some platforms may falsely generate
the events. Therefore, kernel reports MCE errors which users think as real
hardware errors. Although the firmware bugs should be fixed, on the other hand,
kernel should not report MCE errors either.So mcelog is not a good mechanism to report these events. To report the events, we count them in respective counters (core_power_limit_count,
package_power_limit_count, core_throttle_count, and package_throttle_count) in
/sys/devices/system/cpu/cpu#/thermal_throttle/. Users can check the counters
for each event on each CPU. Please note that all CPU's on one package report
duplicate counters. It's user application's responsibity to retrieve a package
level counter for one package.This patch doesn't report package level power limit, core level power limit, and
package level thermal throttle events in mcelog. When the events happen, only
report them in respective counters in sysfs.Since core level thermal throttle has been legacy code in kernel for a while and
users accepted it as MCE error in mcelog, core level thermal throttle is still
reported in mcelog. In the mean time, the event is counted in a counter in sysfs
as well.Signed-off-by: Fenghua Yu
Acked-by: Borislav Petkov
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/20111215001945.GA21009@linux-os.sc.intel.com
Signed-off-by: H. Peter Anvin
14 Dec, 2011
2 commits
-
Add a function which drains whatever MCEs were logged in already during
boot and before the decoder chains were registered.Signed-off-by: Borislav Petkov
-
No functionality change, this is done so that in a follow-on patch all
queued-up MCEs can be decoded after registering on the chain.Signed-off-by: Borislav Petkov
12 Dec, 2011
1 commit
-
Interrupts notify the idle exit state before calling irq_enter().
But the notifier code calls rcu_read_lock() and this is not
allowed while rcu is in an extended quiescent state. We need
to wait for irq_enter() -> rcu_idle_exit() to be called before
doing so otherwise this results in a grumpy RCU:[ 0.099991] WARNING: at include/linux/rcupdate.h:194 __atomic_notifier_call_chain+0xd2/0x110()
[ 0.099991] Hardware name: AMD690VM-FMH
[ 0.099991] Modules linked in:
[ 0.099991] Pid: 0, comm: swapper Not tainted 3.0.0-rc6+ #255
[ 0.099991] Call Trace:
[ 0.099991] [] warn_slowpath_common+0x7a/0xb0
[ 0.099991] [] warn_slowpath_null+0x15/0x20
[ 0.099991] [] __atomic_notifier_call_chain+0xd2/0x110
[ 0.099991] [] atomic_notifier_call_chain+0x11/0x20
[ 0.099991] [] exit_idle+0x43/0x50
[ 0.099991] [] smp_apic_timer_interrupt+0x39/0xa0
[ 0.099991] [] apic_timer_interrupt+0x13/0x20
[ 0.099991] [] ? default_idle+0xa7/0x350
[ 0.099991] [] ? default_idle+0xa5/0x350
[ 0.099991] [] amd_e400_idle+0x8b/0x110
[ 0.099991] [] ? rcu_enter_nohz+0x8f/0x160
[ 0.099991] [] cpu_idle+0xb0/0x110
[ 0.099991] [] rest_init+0xe5/0x140
[ 0.099991] [] ? rest_init+0x48/0x140
[ 0.099991] [] start_kernel+0x3d1/0x3dc
[ 0.099991] [] x86_64_start_reservations+0x131/0x135
[ 0.099991] [] x86_64_start_kernel+0xed/0xf4Signed-off-by: Frederic Weisbecker
Cc: Paul E. McKenney
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: H. Peter Anvin
Cc: Andy Henroid
Signed-off-by: Paul E. McKenney
Reviewed-by: Josh Triplett
08 Nov, 2011
1 commit
-
Arjan would like to make struct file_operations const, but
mce-inject directly writes to the mce_chrdev_ops to install its
write handler. In an ideal world mce-inject would have its own
character device, but we have a sizable legacy of test scripts
that hardwire "/dev/mcelog", so it would be painful to switch to
a separate device now. Instead, this patch switches to a stub
function in the mce code, with a registration helper that
mce-inject can call when it is loaded.Note that this would also allow for a sane process to allow
mce-inject to be unloaded again (with an unregister function,
and appropriate module_{get,put}() calls), but that is left for
potential future patches.Reported-by: Arjan van de Ven
Signed-off-by: Tony Luck
Link: http://lkml.kernel.org/r/4eb2e1971326651a3b@agluck-desktop.sc.intel.com
Signed-off-by: Ingo Molnar
07 Nov, 2011
1 commit
-
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
Revert "tracing: Include module.h in define_trace.h"
irq: don't put module.h into irq.h for tracking irqgen modules.
bluetooth: macroize two small inlines to avoid module.h
ip_vs.h: fix implicit use of module_get/module_put from module.h
nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
include: replace linux/module.h with "struct module" wherever possible
include: convert various register fcns to macros to avoid include chaining
crypto.h: remove unused crypto_tfm_alg_modname() inline
uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
pm_runtime.h: explicitly requires notifier.h
linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
miscdevice.h: fix up implicit use of lists and types
stop_machine.h: fix implicit use of smp.h for smp_processor_id
of: fix implicit use of errno.h in include/linux/of.h
of_platform.h: delete needless include
acpi: remove module.h include from platform/aclinux.h
miscdevice.h: delete unnecessary inclusion of module.h
device_cgroup.h: delete needless include
net: sch_generic remove redundant use of
net: inet_timewait_sock doesnt need
...Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
- drivers/media/dvb/frontends/dibx000_common.c
- drivers/media/video/{mt9m111.c,ov6650.c}
- drivers/mfd/ab3550-core.c
- include/linux/dmaengine.h
03 Nov, 2011
1 commit
-
* 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: (21 commits)
MAINTAINERS: add an entry for Edac Sandy Bridge driver
edac: tag sb_edac as EXPERIMENTAL, as it requires more testing
EDAC: Fix incorrect edac mode reporting in sb_edac
edac: sb_edac: Add it to the building system
edac: Add an experimental new driver to support Sandy Bridge CPU's
i7300_edac: Fix error cleanup logic
i7core_edac: Initialize memory name with cpu, channel, bank
i7core_edac: Fix compilation on 32 bits arch
i7core_edac: scrubbing fixups
EDAC: Correct Kconfig dependencies
i7core_edac: return -ENODEV if no MC is found
i7core_edac: use edac's own way to print errors
MAINTAINERS: remove dropped edac_mce.* from the file
i7core_edac: Drop the edac_mce facility
x86, MCE: Use notifier chain only for MCE decoding
EDAC i7core: Use mce socketid for better compatibility
i7core_edac: Don't enable memory scrubbing for Xeon 35xx
i7core_edac: Add scrubbing support
edac: Move edac main structs to include/linux/edac.h
i7core_edac: Fix oops when trying to inject errors
...
01 Nov, 2011
3 commits
-
Remove edac_mce pieces and use the normal MCE decoder notifier chain by
retaining the same functionality with considerably less code.Signed-off-by: Borislav Petkov
Signed-off-by: Mauro Carvalho Chehab -
These files were implicitly getting EXPORT_SYMBOL via device.h
which was including module.h, but that will be fixed up shortly.By fixing these now, we can avoid seeing things like:
arch/x86/kernel/rtc.c:29: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL’
arch/x86/kernel/pci-dma.c:20: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL’
arch/x86/kernel/e820.c:69: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL_GPL’[ with input from Randy Dunlap and also
from Stephen Rothwell ]Signed-off-by: Paul Gortmaker
-
Drop the edac_mce custom hook in favor of the generic notifier
mechanism. Also, do not log the error to mcelog if the notified agent
was able to decode it.Signed-off-by: Borislav Petkov
Acked-by: Ingo Molnar
Signed-off-by: Mauro Carvalho Chehab
28 Oct, 2011
1 commit
-
* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, microcode, AMD: Add microcode revision to /proc/cpuinfo
x86, microcode: Correct microcode revision format
coretemp: Get microcode revision from cpu_data
x86, intel: Use c->microcode for Atom errata check
x86, intel: Output microcode revision in /proc/cpuinfo
x86, microcode: Don't request microcode from userspace unnecessarilyFix up trivial conflicts in arch/x86/kernel/cpu/amd.c (conflict between
moving AMD BSP code to cpu_dev helper function and adding AMD microcode
revision to /proc/cpuinfo code)
26 Oct, 2011
1 commit
-
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (121 commits)
perf symbols: Increase symbol KSYM_NAME_LEN size
perf hists browser: Refuse 'a' hotkey on non symbolic views
perf ui browser: Use libslang to read keys
perf tools: Fix tracing info recording
perf hists browser: Elide DSO column when it is set to just one DSO, ditto for threads
perf hists: Don't consider filtered entries when calculating column widths
perf hists: Don't decay total_period for filtered entries
perf hists browser: Honour symbol_conf.show_{nr_samples,total_period}
perf hists browser: Do not exit on tab key with single event
perf annotate browser: Don't change selection line when returning from callq
perf tools: handle endianness of feature bitmap
perf tools: Add prelink suggestion to dso update message
perf script: Fix unknown feature comment
perf hists browser: Apply the dso and thread filters when merging new batches
perf hists: Move the dso and thread filters from hist_browser
perf ui browser: Honour the xterm colors
perf top tui: Give color hints just on the percentage, like on --stdio
perf ui browser: Make the colors configurable and change the defaults
perf tui: Remove unneeded call to newtCls on startup
perf hists: Don't format the percentage on hist_entry__snprintf
...Fix up conflicts in arch/x86/kernel/kprobes.c manually.
Ingo's tree did the insane "add volatile to const array", which just
doesn't make sense ("volatile const"?). But we could remove the const
*and* make the array volatile to make doubly sure that gcc doesn't
optimize it away..Also fix up kernel/trace/ring_buffer.c non-data-conflicts manually: the
reader_lock has been turned into a raw lock by the core locking merge,
and there was a new user of it introduced in this perf core merge. Make
sure that new use also uses the raw accessor functions.
19 Oct, 2011
1 commit
-
506ed6b53e00 ("x86, intel: Output microcode revision in /proc/cpuinfo")
added microcode revision format to /proc/cpuinfo and the MCE handler in
decimal format but both AMD and Intel patch levels are handled as hex
numbers. Fix it.Acked-by: Andi Kleen
Signed-off-by: Borislav Petkov
14 Oct, 2011
1 commit
-
I got a request to make it easier to determine the microcode
update level on Intel CPUs. This patch adds a new "microcode"
field to /proc/cpuinfo.The microcode level is also outputed on fatal machine checks
together with the other CPUID model information.I removed the respective code from the microcode update driver,
it just reads the field from cpu_data. Also when the microcode
is updated it fills in the new values too.I had to add a memory barrier to native_cpuid to prevent it
being optimized away when the result is not used.This turns out to clean up further code which already got this
information manually. This is done in followon patches.Signed-off-by: Andi Kleen
Acked-by: H. Peter Anvin
Link: http://lkml.kernel.org/r/1318466795-7393-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar
10 Oct, 2011
1 commit
-
Just convert all the files that have an nmi handler to the new routines.
Most of it is straight forward conversion. A couple of places needed some
tweaking like kgdb which separates the debug notifier from the nmi handler
and mce removes a call to notify_die.[Thanks to Ying for finding out the history behind that mce call
https://lkml.org/lkml/2010/5/27/114
And Boris responding that he would like to remove that call because of it
https://lkml.org/lkml/2011/9/21/163]
The things that get converted are the registeration/unregistration routines
and the nmi handler itself has its args changed along with code removal
to check which list it is on (most are on one NMI list except for kgdb
which has both an NMI routine and an NMI Unknown routine).Signed-off-by: Don Zickus
Signed-off-by: Peter Zijlstra
Acked-by: Corey Minyard
Cc: Jason Wessel
Cc: Andi Kleen
Cc: Robert Richter
Cc: Huang Ying
Cc: Corey Minyard
Cc: Jack Steiner
Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar
14 Sep, 2011
1 commit
-
del_timer_sync() can cause a deadlock when called in interrupt context.
It is used with on_each_cpu() in some parts for sysfs files like bank*,
check_interval, cmci_disabled and ignore_ce.However, use of on_each_cpu() results in calling the function passed
as the argument in interrupt context. This causes a flood of nested
warnings from del_timer_sync() (it runs on each CPU) caused even by a
simple file access like:$ echo 300 > /sys/devices/system/machinecheck/machinecheck0/check_interval
Fortunately, these MCE-specific files are rarely used and AFAIK only few
MCE geeks experience this warning.To remove the warning, move timer deletion outside of the interrupt
context.Signed-off-by: Hidetoshi Seto
Signed-off-by: Borislav Petkov
13 Sep, 2011
1 commit
-
The cmci_discover_lock can be taken in atomic context (cpu bring
up sequence) and therefore cannot be preempted on -rt.In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
16 Jun, 2011
12 commits
-
There are many functions named mce_* so use a new prefix for the subset
of functions related to sysfs support.And since f3c6ea1b06c71b43f751b36bd99345369fe911af introduces
syscore_ops, use the prefix mce_syscore for some functions related to
power management which were in sysdev_class before.Before: After:
mce_device mce_sysdev
mce_sysclass mce_sysdev_class
mce_attrs mce_sysdev_attrs
mce_dev_initialized mce_sysdev_initialized
mce_create_device mce_sysdev_create
mce_remove_device mce_sysdev_removemce_suspend mce_syscore_suspend
mce_shutdown mce_syscore_shutdown
mce_resume mce_syscore_resumeSigned-off-by: Hidetoshi Seto
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/4DEED81B.8020506@jp.fujitsu.com
Signed-off-by: Borislav Petkov -
There are many functions named mce_* so use a new prefix for the subset
of functions dealing with the character device /dev/mcelog.This change doesn't impact the mce-inject module because the exported
symbol mce_chrdev_ops already has the prefix, therefore it is left
unchanged.Before: After:
mce_wait mce_chrdev_wait
mce_state_lock mce_chrdev_state_lock
open_count mce_chrdev_open_count
open_exclu mce_chrdev_open_exclu
mce_open mce_chrdev_open
mce_release mce_chrdev_release
mce_read_mutex mce_chrdev_read_mutex
mce_read mce_chrdev_read
mce_poll mce_chrdev_poll
mce_ioctl mce_chrdev_ioctl
mce_log_device mce_chrdev_deviceSigned-off-by: Hidetoshi Seto
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/4DEED7CD.3040500@jp.fujitsu.com
Signed-off-by: Borislav Petkov -
Use a temporary local variable m to simplify the code. No change in
logic.Signed-off-by: Hidetoshi Seto
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/4DEED7A8.8020307@jp.fujitsu.com
Signed-off-by: Borislav Petkov -
Use temporary local variable sysdev to simplify the code. No change in
logic.Signed-off-by: Hidetoshi Seto
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/4DEED777.7080205@jp.fujitsu.com
Signed-off-by: Borislav Petkov -
Because "ancient CPUs" like p5 and winchip don't have X86_FEATURE_MCA
(I suppose so), mcheck_cpu_init() on such CPUs will return at check of
mce_available() after __mcheck_cpu_ancient_init().It is hard to know this implicit behavior without knowing the CPUs
well. So make it clear that we leave mcheck_cpu_init() when the CPU is
initialized in __mcheck_cpu_ancient_init().Signed-off-by: Hidetoshi Seto
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/4DEED74B.20502@jp.fujitsu.com
Signed-off-by: Borislav Petkov -
This patch introduces mce_gather_info() which is to be called at the
beginning of error handling and gathers minimum error information from
proper error registers (and saved registers).As the result of mce_get_rip() is integrated, unnecessary zeroing
is removed. This also takes care of saving RIP which is required to
make some decision about error severity for SRAR errors, instead of
retrieving it later in the handler.Signed-off-by: Hidetoshi Seto
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/4DEED71A.1060906@jp.fujitsu.com
Signed-off-by: Borislav Petkov -
Follow other MCi register defines. Plus define MCI_MISC_ADDR_LSB() and
MCI_MISC_ADDR_MODE().Signed-off-by: Hidetoshi Seto
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/4DEED6E8.9090509@jp.fujitsu.com
Signed-off-by: Borislav Petkov -
The MCE handler uses a special vector for self IPI to invoke
post-emergency processing in an interrupt context, e.g. call an
NMI-unsafe function, wakeup loggers, schedule time-consuming work for
recovery, etc.This mechanism is now generalized by the following commit:
> e360adbe29241a0194e10e20595360dd7b98a2b3
> Author: Peter Zijlstra
> Date: Thu Oct 14 14:01:34 2010 +0800
>
> irq_work: Add generic hardirq context callbacks
>
> Provide a mechanism that allows running code in IRQ context. It is
> most useful for NMI code that needs to interact with the rest of the
> system -- like wakeup a task to drain buffers.
:So change to use provided generic mechanism.
Signed-off-by: Hidetoshi Seto
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/4DEED6B2.6080005@jp.fujitsu.com
Signed-off-by: Borislav Petkov -
More specifically:
- sort bits in the macros
- use BITCLR/BITSET
- coordinate message pattern
- use m for struct mce
- cleanup for severities_debugfs_init()No functional change.
Signed-off-by: Hidetoshi Seto
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/4DEED679.9090503@jp.fujitsu.com
Signed-off-by: Borislav Petkov -
The current format of an item in this table is:
condition(param, ..., level, message [, condition2 ...])So we have to check both an item's head and tail to find the conditions
which match the item.Format them in a more straight forward manner:
item(level, message, condition [, condition2 ...])Signed-off-by: Hidetoshi Seto
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/4DEED61F.5010502@jp.fujitsu.com
Signed-off-by: Borislav Petkov -
The table looks very complicated and hard to read for people other than
skilled developers. So let's clean it up a bit. At first, change format
to ease reading elements in the table.Signed-off-by: Hidetoshi Seto
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/4DEED5EB.6050400@jp.fujitsu.com
Signed-off-by: Borislav Petkov -
The "Spurious not enabled" entry is redundant: the "Not enabled" entry
earlier in the table will cover this case.The "Action required; unknown MCACOD" entry shouldn't specify MCACOD in
the .mask field. Current code will only match for mcacod==0 rather than
all AR=1 entries.Signed-off-by: Tony Luck
Signed-off-by: Hidetoshi Seto
Link: http://lkml.kernel.org/r/4DEED5BC.8030703@jp.fujitsu.com
Signed-off-by: Borislav Petkov