27 Sep, 2010

1 commit

  • f4347553b30ec66530bfe63c84530afea3803396 removed the edac polling
    mechanism in favor of using a notifier chain for conveying MCE
    information to edac. However, the module removal path didn't test
    whether the driver had set up the polling function workqueue at all and
    the rmmod process was hanging in the kernel at try_to_del_timer_sync()
    in the cancel_delayed_work() path, trying to cancel an uninitialized
    work struct.

    Fix that by adding a balancing check to the workqueue removal path.
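
The balancing check can be pictured with a small userspace model. This is only a sketch: `struct fake_mci`, its fields, and the `fake_*` helpers are hypothetical stand-ins, not the driver's actual types or functions, and the real teardown path cancels a delayed work item rather than updating a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a controller structure; not the kernel's type. */
struct fake_mci {
    void (*poll_fn)(struct fake_mci *mci); /* NULL when the driver never polls */
    bool work_initialized;                 /* models an initialized delayed work */
    int cancel_calls;                      /* counts cancellations, for the demo */
};

/* A do-nothing poll callback used to mark a controller as "polling". */
void fake_poll(struct fake_mci *mci)
{
    (void)mci;
}

/* Setup path: the polling work only exists when a poll function is set. */
void fake_workq_setup(struct fake_mci *mci)
{
    assert(mci != NULL);
    if (mci->poll_fn)
        mci->work_initialized = true;
}

/* Teardown path: test the same condition as setup, so work that was
 * never initialized is never cancelled. */
void fake_workq_teardown(struct fake_mci *mci)
{
    assert(mci != NULL);
    if (!mci->poll_fn)
        return;
    mci->cancel_calls++;
    mci->work_initialized = false;
}
```

In this model, tearing down a controller whose `poll_fn` is NULL returns early and leaves `cancel_calls` at zero, which is the behaviour the fix aims for.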

    Signed-off-by: Borislav Petkov

    Borislav Petkov
     

08 Dec, 2009

1 commit

  • Instead of using deeply-nested conditionals for dumping the DIMM type in
    debug mode, add a strings array of the supported DIMM types.

    This is useful in cases where an edac driver supports multiple DRAM
    types and is only defined in debug builds.
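
A strings array indexed by the type enum might look like the sketch below. The enum values and names here are hypothetical and cover only a few types; the actual array in the patch is tied to the EDAC memory type enum and is compiled only in debug builds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of DIMM types; the real enum has more entries. */
enum demo_mem_type {
    DEMO_MEM_EMPTY,
    DEMO_MEM_DDR,
    DEMO_MEM_DDR2,
    DEMO_MEM_DDR3,
    DEMO_MEM_TYPE_COUNT /* number of entries, keep last */
};

/* One string per type, indexed by the enum value. */
static const char *const demo_mem_type_names[DEMO_MEM_TYPE_COUNT] = {
    [DEMO_MEM_EMPTY] = "Empty",
    [DEMO_MEM_DDR]   = "DDR",
    [DEMO_MEM_DDR2]  = "DDR2",
    [DEMO_MEM_DDR3]  = "DDR3",
};

/* A single table lookup replaces a chain of nested conditionals. */
const char *demo_mem_type_name(enum demo_mem_type type)
{
    if ((unsigned int)type >= (unsigned int)DEMO_MEM_TYPE_COUNT)
        return "Unknown";
    assert(demo_mem_type_names[type] != NULL);
    return demo_mem_type_names[type];
}
```

Adding a new type then means adding one enum value and one table entry, rather than another branch in every debug print.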

    Signed-off-by: Borislav Petkov

    Borislav Petkov
     

24 Sep, 2009

1 commit

  • Module edac_core.ko uses call_rcu() callbacks in edac_device.c, edac_mc.c
    and edac_pci.c.

    They all use a wait_for_completion() scheme, but this scheme is not 100%
    safe on multiple CPUs. See the _rcu_barrier() implementation which
    explains why extra precaution is needed.

    The patch adds a comment about rcu_barrier() and as a precaution calls
    rcu_barrier(). A maintainer needs to look at removing the
    wait_for_completion code.

    [dougthompson@xmission.com: remove the wait_for_completion code]
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: Doug Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesper Dangaard Brouer
     

14 Apr, 2009

1 commit

  • The edac-core driver includes code which assumes that the work_struct
    which is included in every delayed_work is the first member of that
    structure. This is currently the case but might change in the future, so
    use to_delayed_work() instead, which doesn't make such an assumption.

    linux-2.6.30-rc1 has the to_delayed_work() function that will allow this
    patch to work.
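
The difference between a plain cast and a member-offset conversion can be shown in userspace. The sketch below uses hypothetical `demo_*` types, with the embedded work item deliberately placed after another field; it illustrates the container-of idea and is not the kernel's definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for work_struct and delayed_work. */
struct demo_work {
    int pending;
};

struct demo_delayed_work {
    long timer_expires;    /* placed first on purpose */
    struct demo_work work; /* embedded work item, NOT the first member */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Works wherever 'work' sits inside the structure. */
struct demo_delayed_work *demo_to_delayed_work(struct demo_work *work)
{
    assert(work != NULL);
    return demo_container_of(work, struct demo_delayed_work, work);
}
```

With this layout a direct cast from `struct demo_work *` to `struct demo_delayed_work *` would point past the start of the structure, while the offset-based helper still returns the right address.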

    Signed-off-by: Jean Delvare
    Signed-off-by: Doug Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jean Delvare
     

07 Jan, 2009

1 commit

  • This patch is part of a larger patch series which will remove the "char
    bus_id[20]" name string from struct device. The device name is managed in
    the kobject anyway, and without any size limitation, and just needlessly
    copied into "struct device".

    [akpm@linux-foundation.org: coding-style fixes]
    Acked-by: Greg Kroah-Hartman
    Acked-by: Doug Thompson
    Signed-off-by: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kay Sievers
     

06 May, 2008

1 commit

  • Commit 06916639e2fed9ee475efef2747a1b7429f8fe76 ("driver-core: add
    dev_name() to help transition away from using bus_id") added a static
    inline dev_name() and used it in dev_printk.

    Unfortunately, drivers/edac/edac_core.h defines a macro called
    dev_name(). Rename the latter.

    Diagnosis by Tony Breeds and Michael Ellerman.

    Signed-off-by: Stephen Rothwell
    Acked-by: Doug Thompson
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
     

29 Apr, 2008

2 commits

  • Collection of patches, merged into one, from Adrian that do the following:

    1) This patch makes the following needlessly global functions static:
    - edac_pci_get_log_pe()
    - edac_pci_get_log_npe()
    - edac_pci_get_panic_on_pe()
    - edac_pci_unregister_sysfs_instance_kobj()
    - edac_pci_main_kobj_setup()

    2) Remove unneeded function edac_device_find()

    3) Added #if 0 around function edac_pci_find()

    4) make the needlessly global edac_pci_generic_check() static

    5) Removed function edac_check_mc_devices()

    Doug Thompson modified Adrian's patches, to better represent
    the direction of EDAC, and make them one patch.

    Cc: Alan Cox
    Signed-off-by: Adrian Bunk
    Signed-off-by: Doug Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Signed-off-by: Robert P. J. Day
    Acked-by: Doug Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
     

27 Jul, 2007

1 commit

  • This fixes a deadlock that could occur on a 'setup' and 'teardown' sequence of
    the workq for an edac_mc control structure instance. A similar fix was
    previously implemented for the edac_device code.

    In addition, the edac_mc device code was missing the code that allows the
    workq period value to be altered via sysfs control.

    This patch applies that fix to the code, and allows for the changing of the
    period value as well.

    Cc: Alan Cox
    Signed-off-by: Doug Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Thompson
     

20 Jul, 2007

18 commits

  • Fix mutex locking deadlock on the device controller linked list. The code was
    taking a lock and then calling a function that could take the same lock. Moved
    the cancel workq function to outside the lock.

    Added some short circuit logic in the workq code

    Added comments of description

    Code tidying

    Signed-off-by: Doug Thompson
    Cc: Greg KH
    Cc: Alan Cox
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Thompson
     
  • This patch refactors the 'releasing' of kobjects for the edac_mc type of
    device. The correct pattern of kobject release is followed.

    As internal kobjs are allocated they bump a ref count on the top level kobj.
    It in turn has a module ref count on the edac_core module. When internal
    kobjects are released, they dec the ref count on the top level kobj. When the
    top level kobj reaches zero, it decrements the ref count on the edac_core
    object, allowing it to be unloaded, as all resources have now been released.
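
The reference chain described above can be modelled with plain counters. This is a simplified sketch with hypothetical `demo_*` names; real kobjects use kobject_get()/kobject_put() with release callbacks, which this model does not reproduce.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the module and the top-level kobject. */
struct demo_module {
    int refs;
};

struct demo_top_kobj {
    int refs;
    struct demo_module *owner;
};

/* Creating the top-level object takes one reference on the module. */
void demo_top_init(struct demo_top_kobj *top, struct demo_module *owner)
{
    assert(top != NULL && owner != NULL);
    top->owner = owner;
    top->refs = 1;
    owner->refs++;
}

/* Each internal object pins the top-level object while it exists. */
void demo_child_create(struct demo_top_kobj *top)
{
    assert(top != NULL && top->refs > 0);
    top->refs++;
}

/* Drop one reference; the last drop releases the module reference. */
void demo_top_put(struct demo_top_kobj *top)
{
    assert(top != NULL && top->refs > 0);
    if (--top->refs == 0)
        top->owner->refs--;
}

/* Releasing an internal object drops its reference on the top level. */
void demo_child_release(struct demo_top_kobj *top)
{
    demo_top_put(top);
}
```

In this model the module count only returns to zero after the top-level object and every internal object have dropped their references, regardless of the order in which they do so.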

    Cc: Alan Cox alan@lxorguk.ukuu.org.uk
    Signed-off-by: Doug Thompson
    Acked-by: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Thompson
     
  • Refactoring of sysfs code necessitated the refactoring of the edac_mc_alloc()
    and edac_mc_add_mc() APIs, moving the index value to the alloc() function.
    This patch alters the in-tree drivers to utilize this new API signature.

    Having the index value assigned later created a chicken-and-the-egg issue.
    Moving it to the alloc() function allows for creating the necessary sysfs
    entries with the proper index number.

    Cc: Alan Cox alan@lxorguk.ukuu.org.uk
    Signed-off-by: Doug Thompson
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Thompson
     
  • Refactor the edac_align_ptr() function to reduce the noise of casting the
    aligned pointer to the various types of data objects and modified its callers
    to its new signature

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     
  • This patch fixes some remnant spaces inserted by the use of Lindent.
    It seems Lindent adds some spaces when it shouldn't. These have been fixed.
    In addition, goto targets had issues; these have also been fixed
    in this patch.

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     
  • The origin of this code comes from patches at sourceforge, that
    allow EDAC to be updated to various kernels. With kernel version 2.6.20 a
    new workq system was installed, thus the patches needed to be modified
    based on the kernel version. For submitting to the latest kernel.org,
    those #ifdefs are removed.

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     
  • Run the EDAC CORE files through Lindent for cleanup

    Signed-off-by: Douglas Thompson
    Signed-off-by: Dave Jiang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     
  • Fixup poll values for MC and PCI.
    Also make mc function names unique to mc.

    Signed-off-by: Dave Jiang
    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jiang
     
  • Change error check and clear variable from an atomic to an int

    Signed-off-by: Dave Jiang
    Signed-off-by: Douglas Thompson
    Signed-off-by: Linus Torvalds

    Dave Jiang
     
  • Move the memory controller object from the kernel thread based
    implementation to a work queue based one.

    Signed-off-by: Dave Jiang
    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jiang
     
  • Move dev_name() macro to a more generic interface since it's not possible
    to determine whether a device is pci, platform, or of_device easily.

    Now each low level driver sets the name into the control structure, and
    the EDAC core references the control structure for the information.

    Better abstraction.

    Signed-off-by: Dave Jiang
    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jiang
     
  • In the refactoring of edac_mc.c into several subsystem files,
    the header file edac_mc.h became meaningless. A new header file
    edac_core.h was created. All the files that previously included
    "edac_mc.h" are changed to include "edac_core.h".

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     
  • Provides a way for NMI reported errors on x86 to notify the EDAC
    subsystem of pending ECC errors by writing to a software state variable.

    Here's the reworked patch. I added an EDAC stub to the kernel so we can
    have variables that are in the kernel even if EDAC is a module. I also
    implemented the idea of using the chip driver to select error detection
    mode via module parameter and eliminate the kernel compile option.
    Please review/test. Thx!

    Also, I only made changes to some of the chipset drivers since I am
    unfamiliar with the other ones. We can add similar changes as we go.

    Signed-off-by: Dave Jiang
    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jiang
     
  • The EDAC core code uses a semaphore as a mutex. Use the mutex API
    instead of the (binary) semaphore.

    Matthias wrote this, but since I had some patches ahead of it,
    I needed to modify it to follow my patches.

    Signed-off-by: Matthias Kaehlcke
    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthias Kaehlcke
     
  • This patch adds the new 'class' of object to be managed, named: 'edac_device'.

    As a peer of the 'edac_mc' class of object, it provides a non-memory centric
    view of an ERROR DETECTING device in hardware. It provides a sysfs interface
    and an abstraction for various EDAC type devices.

    Multiple 'instances' within the class are possible, with each 'instance'
    able to have multiple 'blocks', and each 'block' having 'attributes'.

    At the 'block' level there are the 'ce_count' and 'ue_count' fields
    which the device driver can update and/or call edac_device_handle_XX()
    functions. At each higher level are additional 'total' count fields,
    which are a summation of counts below that level.
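
The instance/block hierarchy and its summed counters can be sketched as below. The structure layout, sizes, and the `demo_handle_ce()` helper are hypothetical and only illustrate how a block-level event might propagate into the totals above it; they are not the actual edac_device definitions.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_INSTANCES 2 /* hypothetical sizes for the model */
#define DEMO_BLOCKS    4

struct demo_block {
    unsigned int ce_count; /* correctable errors seen in this block */
    unsigned int ue_count; /* uncorrectable errors seen in this block */
};

struct demo_instance {
    unsigned int ce_total; /* sum over this instance's blocks */
    unsigned int ue_total;
    struct demo_block blocks[DEMO_BLOCKS];
};

struct demo_device {
    unsigned int ce_total; /* sum over all instances */
    unsigned int ue_total;
    struct demo_instance instances[DEMO_INSTANCES];
};

/* Record one correctable error and update every level above the block.
 * Returns 0 on success, -1 when an index is out of range. */
int demo_handle_ce(struct demo_device *dev, unsigned int inst, unsigned int blk)
{
    assert(dev != NULL);
    if (inst >= DEMO_INSTANCES || blk >= DEMO_BLOCKS)
        return -1;
    dev->instances[inst].blocks[blk].ce_count++;
    dev->instances[inst].ce_total++;
    dev->ce_total++;
    return 0;
}
```

An uncorrectable-error helper would follow the same shape using the `ue_*` fields.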

    This 'edac_device' has been used to capture and present ECC errors
    which are found in an L1 and L2 system on a per CORE/CPU basis.

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     
  • This is a large patch to refactor the original EDAC module in the kernel
    and to break it up into better file granularity, such that each source
    file contains a given subsystem of the EDAC CORE.

    Originally, the EDAC 'core' was contained in one source file: edac_mc.c
    with its corresponding edac_mc.h file.

    Now, there are the following files:

    edac_module.c     The main module init/exit function and other overhead
    edac_mc.c         Code handling the edac_mc class of object
    edac_mc_sysfs.c   Code handling for sysfs presentation
    edac_pci_sysfs.c  Code handling for PCI sysfs presentation
    edac_core.h       CORE .h include file for 'edac_mc' and 'edac_device' drivers
    edac_module.h     Internal CORE .h include file

    This forms a foundation upon which a later patch can create the 'edac_device'
    class of object code in a new file 'edac_device.c'.

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     
  • This patch makes needlessly global code static, in the edac core

    Signed-off-by: Adrian Bunk
    Cc: Doug Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • This simple patch adds an important CORE API for EDAC that EDAC drivers can
    use to find their edac_mc control structure by passing a mem_ctl_info
    'instance' value

    Needed for subsequent patches

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     

18 Jul, 2007

1 commit

  • Currently, the freezer treats all tasks as freezable, except for the kernel
    threads that explicitly set the PF_NOFREEZE flag for themselves. This
    approach is problematic, since it requires every kernel thread to either
    set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
    care for the freezing of tasks at all.

    It seems better to only require the kernel threads that want to or need to
    be frozen to use some freezer-related code and to remove any
    freezer-related code from the other (nonfreezable) kernel threads, which is
    done in this patch.

    The patch causes all kernel threads to be nonfreezable by default (ie. to
    have PF_NOFREEZE set by default) and introduces the set_freezable()
    function that should be called by the freezable kernel threads in order to
    unset PF_NOFREEZE. It also makes all of the currently freezable kernel
    threads call set_freezable(), so it shouldn't cause any (intentional)
    change of behaviour to appear. Additionally, it updates documentation to
    describe the freezing of tasks more accurately.
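
The inverted default can be illustrated with a flag bit on a toy task structure. The names and the bit value below are hypothetical; the sketch only models the opt-in logic and does not reproduce the kernel's task flags or freezer code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PF_NOFREEZE 0x1u /* hypothetical bit value for the model */

struct demo_task {
    unsigned int flags;
};

/* New kernel threads start out nonfreezable by default. */
void demo_kthread_create(struct demo_task *task)
{
    assert(task != NULL);
    task->flags = DEMO_PF_NOFREEZE;
}

/* A thread that wants to be frozen opts in by clearing the flag. */
void demo_set_freezable(struct demo_task *task)
{
    assert(task != NULL);
    task->flags &= ~DEMO_PF_NOFREEZE;
}

/* The freezer only considers tasks without the flag. */
bool demo_is_freezable(const struct demo_task *task)
{
    assert(task != NULL);
    return (task->flags & DEMO_PF_NOFREEZE) == 0;
}
```

Under this model, a thread that never touches freezer code stays nonfreezable, which is the property the patch is after.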

    [akpm@linux-foundation.org: build fixes]
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Nigel Cunningham
    Cc: Pavel Machek
    Cc: Oleg Nesterov
    Cc: Gautham R Shenoy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

13 Feb, 2007

2 commits

  • Eric Wollesen ported the Bluesmoke Memory Controller driver for the Intel
    5000X/V/P (Blackford/Greencreek) chipset to the in kernel EDAC model.

    This patch incorporates those required changes to the edac_mc.c and edac_mc.h
    core files by adding a new Fully Buffered DIMM interface to the EDAC Core module.

    Signed-off-by: eric wollesen
    Signed-off-by: doug thompson
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    eric wollesen
     
  • This is an attempt at providing an interface for memory scrubbing control in
    EDAC.

    This patch modifies the EDAC Core to provide the Interface for memory
    controller modules to implement.

    The following things are still outstanding:

    - K8 is the first implementation,

    The patch provides a method of configuring the K8 hardware memory scrubber
    via the 'mcX' sysfs directory. There should be some fallback to a generic
    scrubber implemented in software if the hardware does not support
    scrubbing.

    Or .. the scrubbing sysfs entry should not be visible at all.

    - Only works with SDRAM, not cache,

    The K8 can scrub cache and l2cache also - but I think this is not so
    useful as the cache is busy all the time (one hopes).

    One would also expect that cache scrubbing requires hardware support.

    - Error Handling,

    I would like that errors are returned to the user in "terms of file
    system".

    - Presentation,

    I chose Bandwidth in Bytes/Second as a representation of the scrubbing
    rate for the following reasons:

    I like that the sysfs entries are sort-of textual, related to something
    that makes sense instead of magical values that must be looked up.

    "My People" want "% main memory scrubbed per hour"; others prefer "%
    memory bandwidth used" as representation. "Bandwidth used" makes it easy to
    calculate both versions in one-liner scripts.

    If one later wants to scrub cache, the scaling becomes weird for K8
    changing from "blocks of 64 byte memory" to "blocks of 64 cache lines" to
    "blocks of 64 bit". Using "bandwidth used" makes sense in all three cases,
    (I.M.O. anyway ;-).

    - Discovery,

    There is no way to discover the possible settings and what they do
    without reading the code and the documentation.

    *I* do not know how to make that work in a practical way.

    - Bugs(??),

    other tools can set invalid values in the memory scrub control register,
    those will read back as '-1', requiring the user to reset the scrub rate.
    This is how *I* think it should be.

    - Afflicting other areas of code,

    I made changes to edac_mc.c and edac_mc.h which will show up globally -
    this is not nice, it would be better that the memory scrubbing functionality
    and interface could be entirely contained within the memory controller it
    applies to.

    Frithiof Jensen

    edac_mc.c and its .h file are a CORE helper module for EDAC
    driver modules. This provides the abstraction for device specific
    drivers. It is fine to modify this CORE to provide help for
    new features of the drivers.

    doug thompson
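
The claim under "Presentation" that a bytes/second rate converts easily into either percentage can be checked with a little arithmetic. The helper names below are hypothetical and the functions are not part of the patch; they use integer division, so results are truncated.

```c
#include <assert.h>
#include <stdint.h>

/* Percent of main memory scrubbed per hour at a given scrub bandwidth.
 * Integer arithmetic; the result is truncated toward zero. */
uint64_t demo_pct_memory_per_hour(uint64_t scrub_bytes_per_sec,
                                  uint64_t memory_bytes)
{
    assert(memory_bytes != 0);
    return scrub_bytes_per_sec * 3600u * 100u / memory_bytes;
}

/* Percent of the total memory bandwidth consumed by scrubbing. */
uint64_t demo_pct_bandwidth_used(uint64_t scrub_bytes_per_sec,
                                 uint64_t total_bytes_per_sec)
{
    assert(total_bytes_per_sec != 0);
    return scrub_bytes_per_sec * 100u / total_bytes_per_sec;
}
```

For example, scrubbing at 1 MiB/s on a machine with 1 GiB of memory covers the whole of memory about three and a half times per hour, so the first helper reports 351.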

    Signed-off-by: Frithiof Jensen
    Signed-off-by: doug thompson
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frithiof Jensen
     

08 Dec, 2006

1 commit


04 Nov, 2006

1 commit

  • Call sysdev_class_unregister() on failure in edac_sysfs_memctrl_setup()
    and decrease indentation level for clearer logic.

    Acked-by: Doug Thompson
    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     

11 Jul, 2006

1 commit

  • When EDAC was first introduced into the kernel it had a sysfs interface,
    but due to some problems it was disabled in 2.6.16 and remained disabled in
    2.6.17.

    Several of the control and attribute files of that interface received good
    constructive feedback. PCI Blacklist/Whitelist was a major set which had
    design issues and it has been removed in this patch. Instead
    of storing PCI broken parity status in EDAC, it has been moved to the
    pci_dev structure itself by a previous PCI patch. A future patch will
    enable that feature in EDAC by utilizing the pci_dev info.

    The sysfs is now enabled in this patch, with a minimal set of control and
    attribute files for examining EDAC state and for enabling/disabling the
    memory and PCI operations.

    The Documentation for EDAC has also been updated to reflect the new state
    of EDAC operation.

    Signed-off-by: Doug Thompson
    Cc: Greg KH
    Cc: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Thompson
     

01 Jul, 2006

4 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
    Remove obsolete #include
    remove obsolete swsusp_encrypt
    arch/arm26/Kconfig typos
    Documentation/IPMI typos
    Kconfig: Typos in net/sched/Kconfig
    v9fs: do not include linux/version.h
    Documentation/DocBook/mtdnand.tmpl: typo fixes
    typo fixes: specfic -> specific
    typo fixes in Documentation/networking/pktgen.txt
    typo fixes: occuring -> occurring
    typo fixes: infomation -> information
    typo fixes: disadvantadge -> disadvantage
    typo fixes: aquire -> acquire
    typo fixes: mecanism -> mechanism
    typo fixes: bandwith -> bandwidth
    fix a typo in the RTC_CLASS help text
    smb is no longer maintained

    Manually merged trivial conflict in arch/um/kernel/vmlinux.lds.S

    Linus Torvalds
     
  • Remove add_mc_to_global_list(). In next patch, this function will be
    reimplemented with different semantics.

    1 Reimplement add_mc_to_global_list() with semantics that allow the caller to
    determine the ID number for a mem_ctl_info structure. Then modify
    edac_mc_add_mc() so that the caller specifies the ID number for the new
    mem_ctl_info structure. Platform-specific code should be able to assign the
    ID numbers in a platform-specific manner. For instance, on Opteron it makes
    sense to have the ID of the mem_ctl_info structure match the ID of the node
    that the memory controller belongs to.

    2 Modify callers of edac_mc_add_mc() so they use the new semantics.
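
One way to picture caller-assigned IDs is a list insert that takes the ID from the structure instead of generating it. The sketch below is hypothetical: the `demo_*` names, the sorted ordering, and the rejection of duplicate IDs are assumptions made for illustration, not details stated in the patch description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a controller entry on a global list. */
struct demo_mci {
    int mc_idx;            /* ID chosen by the caller, e.g. a node number */
    struct demo_mci *next;
};

/* Insert 'mci' using the ID the caller already stored in it.
 * Assumption: the list is kept sorted by ID and duplicates are refused.
 * Returns 0 on success, -1 if the ID is already in use. */
int demo_add_mc_to_list(struct demo_mci **head, struct demo_mci *mci)
{
    struct demo_mci **pos = head;

    assert(head != NULL && mci != NULL);

    /* Walk to the first entry whose ID is not smaller than the new one. */
    while (*pos != NULL && (*pos)->mc_idx < mci->mc_idx)
        pos = &(*pos)->next;

    if (*pos != NULL && (*pos)->mc_idx == mci->mc_idx)
        return -1;

    mci->next = *pos;
    *pos = mci;
    return 0;
}
```

A platform driver in this model would set `mc_idx` to whatever numbering makes sense for it before calling the insert.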

    Signed-off-by: Doug Thompson
    Cc: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Thompson
     
  • Change MC drivers from using CVS revision strings for their version number.
    Now each driver has its own local string.

    Remove some PCI dependencies from the core EDAC module. Made the code 'struct
    device' centric instead of 'struct pci_dev' centric. Most of the code changes
    here are from a patch by Dave Jiang. It may be best to eventually move the
    PCI-specific code into a separate source file.

    Signed-off-by: Doug Thompson
    Cc: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Thompson
     
  • Signed-off-by: Jörn Engel
    Signed-off-by: Adrian Bunk

    Jörn Engel
     

29 Mar, 2006

1 commit


27 Mar, 2006

2 commits