17 Jan, 2012

1 commit

  • When suspending, there was a large list of warnings going something like:

    Device 'machinecheck1' does not have a release() function, it is broken and must be fixed

    This patch turns the static mce_devices into dynamically allocated
    ones and properly frees them when they are removed from the system.
    It solves the warning messages on my laptop here.

    Reported-by: "Srivatsa S. Bhat"
    Reported-by: Linus Torvalds
    Tested-by: Djalal Harouni
    Cc: Kay Sievers
    Cc: Tony Luck
    Cc: Borislav Petkov
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Linus Torvalds

    Greg Kroah-Hartman
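
    The ownership pattern behind this fix can be sketched in userspace C: a
    heap-allocated object carries a reference count and a release() hook, and
    the last put frees it. This is only an illustration under stated
    assumptions. Every toy_* name below is hypothetical and none of this is the
    kernel's struct device API.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a device: a refcount plus a release() hook. */
struct toy_device {
    int refcount;
    void (*release)(struct toy_device *dev);
    unsigned int cpu;
};

/* Counts release() calls so the behaviour can be observed. */
static int toy_released_count;

/* The release hook owns the memory: it runs once, when the last ref drops. */
static void toy_mce_release(struct toy_device *dev)
{
    toy_released_count++;
    free(dev);
}

/* Allocate a device dynamically instead of using a static object. */
struct toy_device *toy_mce_device_create(unsigned int cpu)
{
    struct toy_device *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return NULL;
    dev->refcount = 1;
    dev->release = toy_mce_release;
    dev->cpu = cpu;
    return dev;
}

void toy_device_get(struct toy_device *dev)
{
    dev->refcount++;
}

/* Drop one reference; returns 1 if this call released the device. */
int toy_device_put(struct toy_device *dev)
{
    if (--dev->refcount > 0)
        return 0;
    dev->release(dev);   /* dev must not be touched after this point */
    return 1;
}
```

    A static object cannot be handed to free(), which is why a release()
    hook only makes sense once the object is allocated dynamically.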
     

14 Jan, 2012

1 commit

  • Commit 8a25a2fd126c ("cpu: convert 'cpu' and 'machinecheck' sysdev_class
    to a regular subsystem") changed how things are dealt with in the MCE
    subsystem. Some of the things that got broken due to this are CPU
    hotplug and suspend/hibernate.

    MCE uses per_cpu allocations of struct device. So, when a CPU goes
    offline and comes back online, zero out the entire per_cpu device
    structure before using it, to ensure that we start from a clean
    slate with respect to the MCE subsystem.

    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Linus Torvalds

    Srivatsa S. Bhat
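
    A minimal userspace sketch of the "clean slate" idea, assuming a fixed
    array standing in for the per-CPU storage. The toy_* names and fields are
    hypothetical and only illustrate the memset-before-reuse step.

```c
#include <string.h>

#define TOY_NR_CPUS 4

/* Hypothetical per-CPU device record. */
struct toy_mce_device {
    int registered;       /* nonzero while the device is live */
    unsigned int cpu;
    unsigned long stale;  /* stands in for state left over from a previous life */
};

/* One statically sized slot per CPU, reused across offline/online cycles. */
static struct toy_mce_device toy_mce_devices[TOY_NR_CPUS];

/* Called when a CPU comes (back) online; returns 0, or -1 for a bad index. */
int toy_mce_device_online(unsigned int cpu)
{
    struct toy_mce_device *dev;

    if (cpu >= TOY_NR_CPUS)
        return -1;

    dev = &toy_mce_devices[cpu];
    /* Wipe whatever the previous online period left behind. */
    memset(dev, 0, sizeof(*dev));
    dev->cpu = cpu;
    dev->registered = 1;
    return 0;
}
```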
     

08 Jan, 2012

1 commit

  • * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (73 commits)
    arm: fix up some samsung merge sysdev conversion problems
    firmware: Fix an oops on reading fw_priv->fw in sysfs loading file
    Drivers:hv: Fix a bug in vmbus_driver_unregister()
    driver core: remove __must_check from device_create_file
    debugfs: add missing #ifdef HAS_IOMEM
    arm: time.h: remove device.h #include
    driver-core: remove sysdev.h usage.
    clockevents: remove sysdev.h
    arm: convert sysdev_class to a regular subsystem
    arm: leds: convert sysdev_class to a regular subsystem
    kobject: remove kset_find_obj_hinted()
    m86k: gpio - convert sysdev_class to a regular subsystem
    mips: txx9_sram - convert sysdev_class to a regular subsystem
    mips: 7segled - convert sysdev_class to a regular subsystem
    sh: dma - convert sysdev_class to a regular subsystem
    sh: intc - convert sysdev_class to a regular subsystem
    power: suspend - convert sysdev_class to a regular subsystem
    power: qe_ic - convert sysdev_class to a regular subsystem
    power: cmm - convert sysdev_class to a regular subsystem
    s390: time - convert sysdev_class to a regular subsystem
    ...

    Fix up conflicts with 'struct sysdev' removal from various platform
    drivers that got changed:
    - arch/arm/mach-exynos/cpu.c
    - arch/arm/mach-exynos/irq-eint.c
    - arch/arm/mach-s3c64xx/common.c
    - arch/arm/mach-s3c64xx/cpu.c
    - arch/arm/mach-s5p64x0/cpu.c
    - arch/arm/mach-s5pv210/common.c
    - arch/arm/plat-samsung/include/plat/cpu.h
    - arch/powerpc/kernel/sysfs.c
    and fix up cpu_is_hotpluggable() as per Greg in include/linux/cpu.h

    Linus Torvalds
     

07 Jan, 2012

3 commits

  • * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86: add IRQ context simulation in module mce-inject
    x86, mce, therm_throt: Don't report power limit and package level thermal throttle events in mcelog
    x86, MCE: Drain mcelog buffer
    x86, mce: Add wrappers for registering on the decode chain

    Linus Torvalds
     
  • * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86: Skip cpus with apic-ids >= 255 in !x2apic_mode
    x86, x2apic: Allow "nox2apic" to disable x2apic mode setup by BIOS
    x86, x2apic: Fallback to xapic when BIOS doesn't setup interrupt-remapping
    x86, acpi: Skip acpi x2apic entries if the x2apic feature is not present
    x86, apic: Add probe() for apic_flat
    x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'
    x86: Convert per-cpu counter icr_read_retry_count into a member of irq_stat
    x86: Add per-cpu stat counter for APIC ICR read tries
    pci, x86/io-apic: Allow PCI_IOAPIC to be user configurable on x86
    x86: Fix the !CONFIG_NUMA build of the new CPU ID fixup code support
    x86: Add NumaChip support
    x86: Add x86_init platform override to fix up NUMA core numbering
    x86: Make flat_init_apic_ldr() available

    Linus Torvalds
     
  • This resolves the conflict in the arch/arm/mach-s3c64xx/s3c6400.c file,
    and it fixes the build error in the arch/x86/kernel/microcode_core.c
    file, that the merge did not catch.

    The microcode_core.c patch was provided by Stephen Rothwell, who was
    invaluable in the merge issues involved with the large sysdev removal
    process in the driver-core tree.

    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

22 Dec, 2011

1 commit

  • This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
    and converts the devices to regular devices. The sysdev drivers are
    implemented as subsystem interfaces now.

    After all sysdev classes are ported to regular driver core entities, the
    sysdev implementation will be entirely removed from the kernel.

    Userspace relies on events and generic sysfs subsystem infrastructure
    from sysdev devices, which are made available with this conversion.

    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Arnd Bergmann
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Paul Mundt
    Cc: "David S. Miller"
    Cc: Chris Metcalf
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Borislav Petkov
    Cc: Tigran Aivazian
    Cc: Len Brown
    Cc: Zhang Rui
    Cc: Dave Jones
    Cc: Peter Zijlstra
    Cc: Russell King
    Cc: Andrew Morton
    Cc: Arjan van de Ven
    Cc: "Rafael J. Wysocki"
    Cc: "Srivatsa S. Bhat"
    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
     

21 Dec, 2011

1 commit

  • Several fields in struct cpuinfo_x86 were not defined for the
    !SMP case, likely to save space. However, those fields still
    have some meaning for UP, and keeping them allows some #ifdef
    removal from other files. The additional size of the UP kernel
    from this change is not significant enough to worry about
    keeping up the distinction:

       text    data     bss      dec     hex  filename
    4737168  506459  972040  6215667  5ed7f3  vmlinux.o.before
    4737444  506459  972040  6215943  5ed907  vmlinux.o.after

    for a difference of 276 bytes for an example UP config.

    If someone wants those 276 bytes back badly then it should
    be implemented in a cleaner way.

    Signed-off-by: Kevin Winchester
    Cc: Steffen Persvold
    Link: http://lkml.kernel.org/r/1324428742-12498-1-git-send-email-kjwinchester@gmail.com
    Signed-off-by: Ingo Molnar

    Kevin Winchester
     

18 Dec, 2011

1 commit


17 Dec, 2011

1 commit

  • mce-inject provides a mechanism to simulate errors so that test
    scripts can check for correct operation of the kernel without
    requiring any specialized hardware to create rare events.

    The existing code can simulate events in normal process context
    and also in NMI context - but not in IRQ context. This patch
    fills that gap.

    Link: https://lkml.org/lkml/2011/12/7/537
    Signed-off-by: Chen Gong
    Signed-off-by: Tony Luck

    Chen Gong
     

15 Dec, 2011

2 commits

  • Ingo Molnar
     
  • Thermal throttle and power limit events are not defined as MCE errors in the
    x86 architecture and should not generate MCE errors in mcelog.

    The current kernel generates fake, software-defined MCE errors for these
    events. This may confuse users, who may think the machine has real MCE
    errors when only thermal throttle or power limit events have occurred.

    To make matters worse, buggy firmware on some platforms may generate these
    events falsely. The kernel then reports MCE errors that users take for real
    hardware errors. The firmware bugs should be fixed, but the kernel should
    not report these events as MCE errors either.

    So mcelog is not a good mechanism for reporting these events. Instead, they
    are counted in their respective counters (core_power_limit_count,
    package_power_limit_count, core_throttle_count, and package_throttle_count)
    in /sys/devices/system/cpu/cpu#/thermal_throttle/. Users can check the
    counters for each event on each CPU. Note that all CPUs on one package
    report duplicate counters. It is the user application's responsibility to
    retrieve a package-level counter only once per package.

    With this patch, package-level power limit, core-level power limit, and
    package-level thermal throttle events are no longer reported in mcelog. When
    these events happen, they are only counted in the respective sysfs counters.

    Since core-level thermal throttle reporting has been legacy behavior in the
    kernel for a while and users have accepted it as an MCE error in mcelog,
    core-level thermal throttle events are still reported in mcelog. At the
    same time, the event is counted in a sysfs counter as well.

    Signed-off-by: Fenghua Yu
    Acked-by: Borislav Petkov
    Acked-by: Tony Luck
    Link: http://lkml.kernel.org/r/20111215001945.GA21009@linux-os.sc.intel.com
    Signed-off-by: H. Peter Anvin

    Fenghua Yu
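
    Because every CPU on a package reports the same package-level counter, a
    monitoring tool has to count each package once. The sketch below shows one
    way to do that on values already read from sysfs. The struct, the
    TOY_MAX_PACKAGES limit, and the function name are assumptions made for
    illustration, and the file reading itself is left out.

```c
#include <stddef.h>

#define TOY_MAX_PACKAGES 64   /* arbitrary limit chosen for this sketch */

/* One sample per logical CPU, as a tool might collect it from sysfs. */
struct toy_cpu_sample {
    unsigned int package_id;
    unsigned long package_throttle_count;
};

/*
 * Sum the package-level counter across the system, taking the value from
 * the first CPU seen on each package and ignoring the duplicates.
 */
unsigned long toy_sum_package_counts(const struct toy_cpu_sample *cpus, size_t n)
{
    unsigned char seen[TOY_MAX_PACKAGES] = {0};
    unsigned long total = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned int pkg = cpus[i].package_id;

        /* Skip out-of-range ids and packages that were already counted. */
        if (pkg >= TOY_MAX_PACKAGES || seen[pkg])
            continue;
        seen[pkg] = 1;
        total += cpus[i].package_throttle_count;
    }
    return total;
}
```

    Summing the per-CPU files naively would multiply each package's count by
    the number of CPUs on that package.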
     

14 Dec, 2011

2 commits


12 Dec, 2011

1 commit

  • Interrupts notify the idle exit state before calling irq_enter().
    But the notifier code calls rcu_read_lock(), which is not allowed
    while RCU is in an extended quiescent state. We need to wait for
    irq_enter() -> rcu_idle_exit() to be called before doing so;
    otherwise this results in a grumpy RCU:

    [ 0.099991] WARNING: at include/linux/rcupdate.h:194 __atomic_notifier_call_chain+0xd2/0x110()
    [ 0.099991] Hardware name: AMD690VM-FMH
    [ 0.099991] Modules linked in:
    [ 0.099991] Pid: 0, comm: swapper Not tainted 3.0.0-rc6+ #255
    [ 0.099991] Call Trace:
    [ 0.099991] [] warn_slowpath_common+0x7a/0xb0
    [ 0.099991] [] warn_slowpath_null+0x15/0x20
    [ 0.099991] [] __atomic_notifier_call_chain+0xd2/0x110
    [ 0.099991] [] atomic_notifier_call_chain+0x11/0x20
    [ 0.099991] [] exit_idle+0x43/0x50
    [ 0.099991] [] smp_apic_timer_interrupt+0x39/0xa0
    [ 0.099991] [] apic_timer_interrupt+0x13/0x20
    [ 0.099991] [] ? default_idle+0xa7/0x350
    [ 0.099991] [] ? default_idle+0xa5/0x350
    [ 0.099991] [] amd_e400_idle+0x8b/0x110
    [ 0.099991] [] ? rcu_enter_nohz+0x8f/0x160
    [ 0.099991] [] cpu_idle+0xb0/0x110
    [ 0.099991] [] rest_init+0xe5/0x140
    [ 0.099991] [] ? rest_init+0x48/0x140
    [ 0.099991] [] start_kernel+0x3d1/0x3dc
    [ 0.099991] [] x86_64_start_reservations+0x131/0x135
    [ 0.099991] [] x86_64_start_kernel+0xed/0xf4

    Signed-off-by: Frederic Weisbecker
    Cc: Paul E. McKenney
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: H. Peter Anvin
    Cc: Andy Henroid
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Frederic Weisbecker
     

08 Nov, 2011

1 commit

  • Arjan would like to make struct file_operations const, but
    mce-inject directly writes to the mce_chrdev_ops to install its
    write handler. In an ideal world mce-inject would have its own
    character device, but we have a sizable legacy of test scripts
    that hardwire "/dev/mcelog", so it would be painful to switch to
    a separate device now. Instead, this patch switches to a stub
    function in the mce code, with a registration helper that
    mce-inject can call when it is loaded.

    Note that this would also allow for a sane process to allow
    mce-inject to be unloaded again (with an unregister function,
    and appropriate module_{get,put}() calls), but that is left for
    potential future patches.

    Reported-by: Arjan van de Ven
    Signed-off-by: Tony Luck
    Link: http://lkml.kernel.org/r/4eb2e1971326651a3b@agluck-desktop.sc.intel.com
    Signed-off-by: Ingo Molnar

    Luck, Tony
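
    The stub-plus-registration approach can be sketched as follows. The
    operations table stays const and always points at a stub, and the stub
    forwards to whatever handler has been registered. The toy_* names and the
    simplified signatures are assumptions for illustration, not the kernel's
    file_operations interface.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified write handler type for this sketch. */
typedef long (*toy_write_fn)(const char *buf, size_t len);

/* NULL until an injector module registers its handler. */
static toy_write_fn toy_mce_write;

/* Registration helper an injector would call when it is loaded. */
void toy_register_mce_write_callback(toy_write_fn fn)
{
    toy_mce_write = fn;
}

/* Fixed stub: forwards to the registered handler, or rejects the write. */
long toy_mce_chrdev_write(const char *buf, size_t len)
{
    if (toy_mce_write)
        return toy_mce_write(buf, len);
    return -EINVAL;
}

/* The table itself never needs to be written to, so it can be const. */
struct toy_file_operations {
    long (*write)(const char *buf, size_t len);
};

static const struct toy_file_operations toy_mce_chrdev_ops = {
    .write = toy_mce_chrdev_write,
};

/* Example handler standing in for the injector's write routine. */
long toy_inject_write(const char *buf, size_t len)
{
    (void)buf;
    return (long)len;   /* pretend every byte was consumed */
}
```

    An unregister helper would simply set the pointer back to NULL, which is
    the follow-up the commit message leaves for later patches.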
     

07 Nov, 2011

1 commit

  • * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
    Revert "tracing: Include module.h in define_trace.h"
    irq: don't put module.h into irq.h for tracking irqgen modules.
    bluetooth: macroize two small inlines to avoid module.h
    ip_vs.h: fix implicit use of module_get/module_put from module.h
    nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
    include: replace linux/module.h with "struct module" wherever possible
    include: convert various register fcns to macros to avoid include chaining
    crypto.h: remove unused crypto_tfm_alg_modname() inline
    uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
    pm_runtime.h: explicitly requires notifier.h
    linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
    miscdevice.h: fix up implicit use of lists and types
    stop_machine.h: fix implicit use of smp.h for smp_processor_id
    of: fix implicit use of errno.h in include/linux/of.h
    of_platform.h: delete needless include
    acpi: remove module.h include from platform/aclinux.h
    miscdevice.h: delete unnecessary inclusion of module.h
    device_cgroup.h: delete needless include
    net: sch_generic remove redundant use of
    net: inet_timewait_sock doesnt need
    ...

    Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
    - drivers/media/dvb/frontends/dibx000_common.c
    - drivers/media/video/{mt9m111.c,ov6650.c}
    - drivers/mfd/ab3550-core.c
    - include/linux/dmaengine.h

    Linus Torvalds
     

03 Nov, 2011

1 commit

  • * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: (21 commits)
    MAINTAINERS: add an entry for Edac Sandy Bridge driver
    edac: tag sb_edac as EXPERIMENTAL, as it requires more testing
    EDAC: Fix incorrect edac mode reporting in sb_edac
    edac: sb_edac: Add it to the building system
    edac: Add an experimental new driver to support Sandy Bridge CPU's
    i7300_edac: Fix error cleanup logic
    i7core_edac: Initialize memory name with cpu, channel, bank
    i7core_edac: Fix compilation on 32 bits arch
    i7core_edac: scrubbing fixups
    EDAC: Correct Kconfig dependencies
    i7core_edac: return -ENODEV if no MC is found
    i7core_edac: use edac's own way to print errors
    MAINTAINERS: remove dropped edac_mce.* from the file
    i7core_edac: Drop the edac_mce facility
    x86, MCE: Use notifier chain only for MCE decoding
    EDAC i7core: Use mce socketid for better compatibility
    i7core_edac: Don't enable memory scrubbing for Xeon 35xx
    i7core_edac: Add scrubbing support
    edac: Move edac main structs to include/linux/edac.h
    i7core_edac: Fix oops when trying to inject errors
    ...

    Linus Torvalds
     

01 Nov, 2011

3 commits

  • Remove the edac_mce pieces and use the normal MCE decoder notifier chain
    instead, retaining the same functionality with considerably less code.

    Signed-off-by: Borislav Petkov
    Signed-off-by: Mauro Carvalho Chehab

    Borislav Petkov
     
  • These files were implicitly getting EXPORT_SYMBOL via device.h
    which was including module.h, but that will be fixed up shortly.

    By fixing these now, we can avoid seeing things like:

    arch/x86/kernel/rtc.c:29: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL’
    arch/x86/kernel/pci-dma.c:20: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL’
    arch/x86/kernel/e820.c:69: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL_GPL’

    [ with input from Randy Dunlap and also
    from Stephen Rothwell ]

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     
  • Drop the edac_mce custom hook in favor of the generic notifier
    mechanism. Also, do not log the error to mcelog if the notified agent
    was able to decode it.

    Signed-off-by: Borislav Petkov
    Acked-by: Ingo Molnar
    Signed-off-by: Mauro Carvalho Chehab

    Borislav Petkov
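
    A rough userspace model of "notify first, log only if nobody decoded it".
    The array-based chain, the TOY_NOTIFY_* values, and all toy_* functions are
    hypothetical simplifications. The kernel's notifier API is richer than this.

```c
#include <stddef.h>

enum { TOY_NOTIFY_DONE = 0, TOY_NOTIFY_STOP = 1 };

struct toy_mce {
    unsigned long long status;
};

typedef int (*toy_decoder_fn)(const struct toy_mce *m);

#define TOY_MAX_DECODERS 8

static toy_decoder_fn toy_decoders[TOY_MAX_DECODERS];
static size_t toy_nr_decoders;
static unsigned int toy_mcelog_entries;   /* stands in for the mcelog buffer */

/* Add a decoder to the chain; returns -1 when the chain is full. */
int toy_register_decoder(toy_decoder_fn fn)
{
    if (toy_nr_decoders == TOY_MAX_DECODERS)
        return -1;
    toy_decoders[toy_nr_decoders++] = fn;
    return 0;
}

/* Offer the event to each decoder; fall back to mcelog if none claims it. */
void toy_mce_log(const struct toy_mce *m)
{
    for (size_t i = 0; i < toy_nr_decoders; i++) {
        if (toy_decoders[i](m) == TOY_NOTIFY_STOP)
            return;   /* decoded by an agent: do not log to mcelog */
    }
    toy_mcelog_entries++;
}

/* Example decoder: claims events that have bit 0 set in status. */
int toy_edac_decode(const struct toy_mce *m)
{
    return (m->status & 1) ? TOY_NOTIFY_STOP : TOY_NOTIFY_DONE;
}
```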
     

28 Oct, 2011

1 commit

  • * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86, microcode, AMD: Add microcode revision to /proc/cpuinfo
    x86, microcode: Correct microcode revision format
    coretemp: Get microcode revision from cpu_data
    x86, intel: Use c->microcode for Atom errata check
    x86, intel: Output microcode revision in /proc/cpuinfo
    x86, microcode: Don't request microcode from userspace unnecessarily

    Fix up trivial conflicts in arch/x86/kernel/cpu/amd.c (conflict between
    moving AMD BSP code to cpu_dev helper function and adding AMD microcode
    revision to /proc/cpuinfo code)

    Linus Torvalds
     

26 Oct, 2011

1 commit

  • * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (121 commits)
    perf symbols: Increase symbol KSYM_NAME_LEN size
    perf hists browser: Refuse 'a' hotkey on non symbolic views
    perf ui browser: Use libslang to read keys
    perf tools: Fix tracing info recording
    perf hists browser: Elide DSO column when it is set to just one DSO, ditto for threads
    perf hists: Don't consider filtered entries when calculating column widths
    perf hists: Don't decay total_period for filtered entries
    perf hists browser: Honour symbol_conf.show_{nr_samples,total_period}
    perf hists browser: Do not exit on tab key with single event
    perf annotate browser: Don't change selection line when returning from callq
    perf tools: handle endianness of feature bitmap
    perf tools: Add prelink suggestion to dso update message
    perf script: Fix unknown feature comment
    perf hists browser: Apply the dso and thread filters when merging new batches
    perf hists: Move the dso and thread filters from hist_browser
    perf ui browser: Honour the xterm colors
    perf top tui: Give color hints just on the percentage, like on --stdio
    perf ui browser: Make the colors configurable and change the defaults
    perf tui: Remove unneeded call to newtCls on startup
    perf hists: Don't format the percentage on hist_entry__snprintf
    ...

    Fix up conflicts in arch/x86/kernel/kprobes.c manually.

    Ingo's tree did the insane "add volatile to const array", which just
    doesn't make sense ("volatile const"?). But we could remove the const
    *and* make the array volatile to make doubly sure that gcc doesn't
    optimize it away..

    Also fix up kernel/trace/ring_buffer.c non-data-conflicts manually: the
    reader_lock has been turned into a raw lock by the core locking merge,
    and there was a new user of it introduced in this perf core merge. Make
    sure that new use also uses the raw accessor functions.

    Linus Torvalds
     

19 Oct, 2011

1 commit


14 Oct, 2011

1 commit

  • I got a request to make it easier to determine the microcode
    update level on Intel CPUs. This patch adds a new "microcode"
    field to /proc/cpuinfo.

    The microcode level is also output on fatal machine checks
    together with the other CPUID model information.

    I removed the respective code from the microcode update driver;
    it now just reads the field from cpu_data. When the microcode
    is updated, the driver fills in the new value too.

    I had to add a memory barrier to native_cpuid to prevent it
    being optimized away when the result is not used.

    This turns out to clean up further code which already got this
    information manually. This is done in follow-on patches.

    Signed-off-by: Andi Kleen
    Acked-by: H. Peter Anvin
    Link: http://lkml.kernel.org/r/1318466795-7393-1-git-send-email-andi@firstfloor.org
    Signed-off-by: Ingo Molnar

    Andi Kleen
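
    From userspace, the new field can be picked out of the /proc/cpuinfo text
    with a small parser such as the sketch below. It assumes a line of the form
    "microcode : <number>" and accepts decimal or 0x-prefixed values. The
    function name is hypothetical, and reading the file into a buffer is left
    to the caller.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Scan cpuinfo text for the first line starting with "microcode" and parse
 * the number after the colon. Returns 0 on success, -1 if nothing was found.
 */
int toy_parse_microcode(const char *cpuinfo, unsigned long *rev)
{
    const char *line = cpuinfo;

    while (line && *line) {
        if (strncmp(line, "microcode", 9) == 0) {
            const char *colon = strchr(line, ':');
            const char *eol = strchr(line, '\n');

            /* Only accept a colon that belongs to this line. */
            if (colon && (!eol || colon < eol)) {
                const char *p = colon + 1;
                char *end;
                unsigned long v;

                while (*p == ' ' || *p == '\t')
                    p++;
                v = strtoul(p, &end, 0);   /* base 0: decimal or 0x hex */
                if (end != p) {
                    *rev = v;
                    return 0;
                }
            }
        }
        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}
```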
     

10 Oct, 2011

1 commit

  • Just convert all the files that have an nmi handler to the new routines.
    Most of it is straightforward conversion. A couple of places needed some
    tweaking, like kgdb, which separates the debug notifier from the nmi
    handler, and mce, which removes a call to notify_die.

    [Thanks to Ying for finding out the history behind that mce call

    https://lkml.org/lkml/2010/5/27/114

    And Boris responding that he would like to remove that call because of it

    https://lkml.org/lkml/2011/9/21/163]

    The things that get converted are the registration/unregistration routines,
    and the nmi handler itself has its args changed along with the removal of
    code that checks which list it is on (most are on one NMI list, except for
    kgdb, which has both an NMI routine and an NMI Unknown routine).

    Signed-off-by: Don Zickus
    Signed-off-by: Peter Zijlstra
    Acked-by: Corey Minyard
    Cc: Jason Wessel
    Cc: Andi Kleen
    Cc: Robert Richter
    Cc: Huang Ying
    Cc: Corey Minyard
    Cc: Jack Steiner
    Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.com
    Signed-off-by: Ingo Molnar

    Don Zickus
     

14 Sep, 2011

1 commit

  • del_timer_sync() can cause a deadlock when called in interrupt context.
    It is used with on_each_cpu() in some parts for sysfs files like bank*,
    check_interval, cmci_disabled and ignore_ce.

    However, use of on_each_cpu() results in calling the function passed
    as the argument in interrupt context. This causes a flood of nested
    warnings from del_timer_sync() (it runs on each CPU) caused even by a
    simple file access like:

    $ echo 300 > /sys/devices/system/machinecheck/machinecheck0/check_interval

    Fortunately, these MCE-specific files are rarely used and AFAIK only a few
    MCE geeks experience this warning.

    To remove the warning, move timer deletion outside of the interrupt
    context.

    Signed-off-by: Hidetoshi Seto
    Signed-off-by: Borislav Petkov

    Hidetoshi Seto
     

13 Sep, 2011

1 commit

  • The cmci_discover_lock can be taken in atomic context (cpu bring
    up sequence) and therefore cannot be preempted on -rt.

    In mainline this change documents the low level nature of
    the lock - otherwise there's no functional difference. Lockdep
    and Sparse checking will work as usual.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

16 Jun, 2011

12 commits

  • There are many functions named mce_* so use a new prefix for the subset
    of functions related to sysfs support.

    And since f3c6ea1b06c71b43f751b36bd99345369fe911af introduces
    syscore_ops, use the prefix mce_syscore for some functions related to
    power management which were in sysdev_class before.

    Before:                 After:
    mce_device              mce_sysdev
    mce_sysclass            mce_sysdev_class
    mce_attrs               mce_sysdev_attrs
    mce_dev_initialized     mce_sysdev_initialized
    mce_create_device       mce_sysdev_create
    mce_remove_device       mce_sysdev_remove

    mce_suspend             mce_syscore_suspend
    mce_shutdown            mce_syscore_shutdown
    mce_resume              mce_syscore_resume

    Signed-off-by: Hidetoshi Seto
    Acked-by: Tony Luck
    Link: http://lkml.kernel.org/r/4DEED81B.8020506@jp.fujitsu.com
    Signed-off-by: Borislav Petkov

    Hidetoshi Seto
     
  • There are many functions named mce_* so use a new prefix for the subset
    of functions dealing with the character device /dev/mcelog.

    This change doesn't impact the mce-inject module because the exported
    symbol mce_chrdev_ops already has the prefix, therefore it is left
    unchanged.

    Before:                 After:
    mce_wait                mce_chrdev_wait
    mce_state_lock          mce_chrdev_state_lock
    open_count              mce_chrdev_open_count
    open_exclu              mce_chrdev_open_exclu
    mce_open                mce_chrdev_open
    mce_release             mce_chrdev_release
    mce_read_mutex          mce_chrdev_read_mutex
    mce_read                mce_chrdev_read
    mce_poll                mce_chrdev_poll
    mce_ioctl               mce_chrdev_ioctl
    mce_log_device          mce_chrdev_device

    Signed-off-by: Hidetoshi Seto
    Acked-by: Tony Luck
    Link: http://lkml.kernel.org/r/4DEED7CD.3040500@jp.fujitsu.com
    Signed-off-by: Borislav Petkov

    Hidetoshi Seto
     
  • Use a temporary local variable m to simplify the code. No change in
    logic.

    Signed-off-by: Hidetoshi Seto
    Acked-by: Tony Luck
    Link: http://lkml.kernel.org/r/4DEED7A8.8020307@jp.fujitsu.com
    Signed-off-by: Borislav Petkov

    Hidetoshi Seto
     
  • Use temporary local variable sysdev to simplify the code. No change in
    logic.

    Signed-off-by: Hidetoshi Seto
    Acked-by: Tony Luck
    Link: http://lkml.kernel.org/r/4DEED777.7080205@jp.fujitsu.com
    Signed-off-by: Borislav Petkov

    Hidetoshi Seto
     
  • Because "ancient CPUs" like p5 and winchip don't have X86_FEATURE_MCA
    (I suppose so), mcheck_cpu_init() on such CPUs will return at check of
    mce_available() after __mcheck_cpu_ancient_init().

    It is hard to know this implicit behavior without knowing the CPUs
    well. So make it clear that we leave mcheck_cpu_init() when the CPU is
    initialized in __mcheck_cpu_ancient_init().

    Signed-off-by: Hidetoshi Seto
    Acked-by: Tony Luck
    Link: http://lkml.kernel.org/r/4DEED74B.20502@jp.fujitsu.com
    Signed-off-by: Borislav Petkov

    Hidetoshi Seto
     
  • This patch introduces mce_gather_info() which is to be called at the
    beginning of error handling and gathers minimum error information from
    proper error registers (and saved registers).

    As a result, mce_get_rip() is integrated and unnecessary zeroing
    is removed. This also takes care of saving RIP, which is required to
    make some decisions about error severity for SRAR errors, instead of
    retrieving it later in the handler.

    Signed-off-by: Hidetoshi Seto
    Acked-by: Tony Luck
    Link: http://lkml.kernel.org/r/4DEED71A.1060906@jp.fujitsu.com
    Signed-off-by: Borislav Petkov

    Hidetoshi Seto
     
  • Follow other MCi register defines. Plus define MCI_MISC_ADDR_LSB() and
    MCI_MISC_ADDR_MODE().

    Signed-off-by: Hidetoshi Seto
    Acked-by: Tony Luck
    Link: http://lkml.kernel.org/r/4DEED6E8.9090509@jp.fujitsu.com
    Signed-off-by: Borislav Petkov

    Hidetoshi Seto
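
    How such accessor macros are typically used can be sketched like this. The
    bit positions (LSB in bits 5:0 and address mode in bits 8:6 of MCi_MISC)
    are an assumption taken from the architectural register layout, not from
    this log. The TOY_ prefix marks everything here as illustrative, not the
    kernel's definitions.

```c
#include <stdint.h>

/* Assumed layout: bits 5:0 = least significant valid address bit. */
#define TOY_MCI_MISC_ADDR_LSB(m)   ((unsigned int)((m) & 0x3f))
/* Assumed layout: bits 8:6 = address mode. */
#define TOY_MCI_MISC_ADDR_MODE(m)  ((unsigned int)(((m) >> 6) & 7))
/* Assumed encoding for "physical address" mode. */
#define TOY_MCI_MISC_ADDR_PHYS     2

/* Clear the low address bits that the MISC register marks as not valid. */
uint64_t toy_mce_usable_addr(uint64_t addr, uint64_t misc)
{
    unsigned int lsb = TOY_MCI_MISC_ADDR_LSB(misc);

    return (addr >> lsb) << lsb;
}
```

    For example, with an LSB of 12 the reported address is only meaningful at
    4 KiB granularity, so the low 12 bits are dropped.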
     
  • The MCE handler uses a special vector for self IPI to invoke
    post-emergency processing in an interrupt context, e.g. call an
    NMI-unsafe function, wakeup loggers, schedule time-consuming work for
    recovery, etc.

    This mechanism is now generalized by the following commit:

    > e360adbe29241a0194e10e20595360dd7b98a2b3
    > Author: Peter Zijlstra
    > Date: Thu Oct 14 14:01:34 2010 +0800
    >
    > irq_work: Add generic hardirq context callbacks
    >
    > Provide a mechanism that allows running code in IRQ context. It is
    > most useful for NMI code that needs to interact with the rest of the
    > system -- like wakeup a task to drain buffers.
    :

    So change to use provided generic mechanism.

    Signed-off-by: Hidetoshi Seto
    Acked-by: Tony Luck
    Link: http://lkml.kernel.org/r/4DEED6B2.6080005@jp.fujitsu.com
    Signed-off-by: Borislav Petkov

    Hidetoshi Seto
     
  • More specifically:

    - sort bits in the macros
    - use BITCLR/BITSET
    - coordinate message pattern
    - use m for struct mce
    - cleanup for severities_debugfs_init()

    No functional change.

    Signed-off-by: Hidetoshi Seto
    Acked-by: Tony Luck
    Link: http://lkml.kernel.org/r/4DEED679.9090503@jp.fujitsu.com
    Signed-off-by: Borislav Petkov

    Hidetoshi Seto
     
  • The current format of an item in this table is:
    condition(param, ..., level, message [, condition2 ...])

    So we have to check both an item's head and tail to find the conditions
    which match the item.

    Format them in a more straightforward manner:
    item(level, message, condition [, condition2 ...])

    Signed-off-by: Hidetoshi Seto
    Acked-by: Tony Luck
    Link: http://lkml.kernel.org/r/4DEED61F.5010502@jp.fujitsu.com
    Signed-off-by: Borislav Petkov

    Hidetoshi Seto
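
    The item(level, message, condition ...) layout maps naturally onto a
    variadic macro with designated initializers. The sketch below is a
    simplified model under stated assumptions. The rule struct, the three
    severity levels, the two status bits, and all TOY_ names are made up for
    illustration and are not the kernel's table.

```c
#include <stddef.h>
#include <stdint.h>

enum toy_severity { TOY_KEEP, TOY_SOME, TOY_PANIC };

/* A rule matches when (status & mask) == result. */
struct toy_rule {
    enum toy_severity sev;
    const char *msg;
    uint64_t mask;
    uint64_t result;
};

#define TOY_STATUS_VAL (1ULL << 63)   /* illustrative "valid" bit */
#define TOY_STATUS_UC  (1ULL << 61)   /* illustrative "uncorrected" bit */

/* Conditions expand to designated initializers for mask and result. */
#define TOY_BITCLR(x) .mask = (x), .result = 0
#define TOY_BITSET(x) .mask = (x), .result = (x)

/* Level and message come first; the conditions follow at the tail. */
#define TOY_SEV(level, message, ...) \
    { .sev = (level), .msg = (message), __VA_ARGS__ }

static const struct toy_rule toy_rules[] = {
    TOY_SEV(TOY_KEEP,  "Invalid",     TOY_BITCLR(TOY_STATUS_VAL)),
    TOY_SEV(TOY_PANIC, "Uncorrected", TOY_BITSET(TOY_STATUS_UC)),
    TOY_SEV(TOY_SOME,  "Default",     TOY_BITSET(0)),   /* matches everything */
};

/* Return the first rule that matches, scanning the table top to bottom. */
const struct toy_rule *toy_classify(uint64_t status)
{
    for (size_t i = 0; i < sizeof(toy_rules) / sizeof(toy_rules[0]); i++) {
        if ((status & toy_rules[i].mask) == toy_rules[i].result)
            return &toy_rules[i];
    }
    return NULL;
}
```

    With the level and message at the head of every entry, a reader can scan
    the left edge of the table to see the outcome and then read rightwards for
    the conditions.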
     
  • The table looks very complicated and is hard to read for people other than
    skilled developers. So let's clean it up a bit. As a first step, change the
    format to make the elements in the table easier to read.

    Signed-off-by: Hidetoshi Seto
    Acked-by: Tony Luck
    Link: http://lkml.kernel.org/r/4DEED5EB.6050400@jp.fujitsu.com
    Signed-off-by: Borislav Petkov

    Hidetoshi Seto
     
  • The "Spurious not enabled" entry is redundant: the "Not enabled" entry
    earlier in the table will cover this case.

    The "Action required; unknown MCACOD" entry shouldn't specify MCACOD in
    the .mask field. Current code will only match for mcacod==0 rather than
    all AR=1 entries.

    Signed-off-by: Tony Luck
    Signed-off-by: Hidetoshi Seto
    Link: http://lkml.kernel.org/r/4DEED5BC.8030703@jp.fujitsu.com
    Signed-off-by: Borislav Petkov

    Tony Luck
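
    The MCACOD problem is easy to reproduce in isolation. A table entry matches
    when (status & mask) == result, so any bit placed in the mask but not in
    the result must be zero in the status. The sketch below assumes an AR flag
    at bit 55 and a 16-bit MCACOD field in the low bits. Those positions and
    the toy_* names are assumptions for illustration, and the real entry
    carries more bits than shown here.

```c
#include <stdint.h>

#define TOY_STATUS_AR   (1ULL << 55)   /* assumed "action required" bit */
#define TOY_MCACOD_MASK 0xffffULL      /* assumed error-code field */

/* Generic mask/result test as used by a severity table. */
int toy_rule_matches(uint64_t status, uint64_t mask, uint64_t result)
{
    return (status & mask) == result;
}

/* Buggy form: MCACOD is in the mask, so only MCACOD == 0 can ever match. */
int toy_ar_unknown_buggy(uint64_t status)
{
    return toy_rule_matches(status, TOY_STATUS_AR | TOY_MCACOD_MASK,
                            TOY_STATUS_AR);
}

/* Fixed form: only the AR bit is tested, so every AR=1 status matches. */
int toy_ar_unknown_fixed(uint64_t status)
{
    return toy_rule_matches(status, TOY_STATUS_AR, TOY_STATUS_AR);
}
```

    A status with AR set and a nonzero error code slips past the buggy rule
    but is caught by the fixed one.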