21 Feb, 2013

1 commit

  • For ghes_edac to work when it is built in, the EDAC core must
    be initialized earlier; otherwise the ghes_edac driver initializes
    before edac_mc_sysfs_init() is called:

    ...
    [ 4.998373] EDAC MC0: Giving out device to 'ghes_edac.c' 'ghes_edac': DEV ghes
    ...
    [ 4.998373] EDAC MC1: Giving out device to 'ghes_edac.c' 'ghes_edac': DEV ghes
    [ 6.519495] EDAC MC: Ver: 3.0.0
    [ 6.523749] EDAC DEBUG: edac_mc_sysfs_init: device mc created

    The net result is that no EDAC sysfs nodes will appear.

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     

28 Nov, 2012

1 commit


12 Jun, 2012

5 commits

  • Create a single, top-level "edac" directory for debugfs. An "mc[0-N]"
    directory is then created for each memory controller. Individual drivers
    can create additional entries such as h/w error injection control.

    Signed-off-by: Rob Herring
    Signed-off-by: Mauro Carvalho Chehab

    Rob Herring
     
  • Many changes were introduced that justify bumping the version to
    3.0.0:

    - the EDAC core was redesigned to represent all types of
    memory controllers;

    - the EDAC API was redesigned to properly represent the memory
    controller hierarchy;

    - a tracepoint-based API was added to report memory errors.

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • Use a more common debugging style.

    Remove __FILE__ uses, add missing newlines,
    coalesce formats and align arguments.

    Signed-off-by: Joe Perches
    Signed-off-by: Mauro Carvalho Chehab

    Joe Perches
     
  • The debug macro already adds that. Most of the work here was
    done by this small script:

    $f .=$_ while (<>);

    $f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*": /\1"/g;
    $f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*/\1/g;
    $f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*"MC: /\1"/g;

    $f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
    $f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;
    $f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
    $f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;

    $f =~ s/\"MC\: \\n\"/"MC:\\n"/g;

    print $f;

    After running the script, manual cleanups were done to fix the remaining
    places.

    While here, removed __LINE__ in most places, as it rarely gives
    useful information.

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
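
To make the effect of the script in the commit above easier to follow, here is a minimal Python sketch that ports three of its substitutions (the first two `__FILE__` rules and the first `__func__` rule) and applies them in the same order. It is an illustration only, not the tool used for the commit: the function name `cleanup_debug_calls` and the sample `debugf0()` line are assumptions made up for this example, and redundant backslash escapes from the Perl patterns have been dropped.

```python
import re

# Substitutions ported from the Perl script, applied in the same order.
# Redundant backslash escapes are dropped; the patterns match the same text.
RULES = [
    # debugfN(__FILE__ ": ...   ->  debugfN("...
    (re.compile(r'(debugf[0-9]\s*\(\s*)__FILE__\s*": '), r'\1"'),
    # debugfN(__FILE__ ...      ->  debugfN(...
    (re.compile(r'(debugf[0-9]\s*\(\s*)__FILE__\s*'), r'\1'),
    # debugfN("%s(): fmt", __func__, args  ->  debugfN("fmt", args
    (re.compile(r'(debugf[0-9]\s*\(")%s[:,()]*\s*([^"]*\s*[^)]+)__func__\s*,\s*'),
     r'\1\2'),
]


def cleanup_debug_calls(source: str) -> str:
    """Apply each substitution globally, like the /g flag in the Perl script."""
    for pattern, replacement in RULES:
        source = pattern.sub(replacement, source)
    return source


if __name__ == "__main__":
    # Hypothetical input line, written only for this example.
    sample = r'debugf0(__FILE__ ": %s(): idx=%d\n", __func__, idx);'
    print(cleanup_debug_calls(sample))
```

On the hypothetical sample line, the first rule drops `__FILE__ ": ` and the third drops the `%s(): ` prefix together with the `__func__` argument, leaving only the format string and `idx`. Calls where `__func__` is the last argument are handled by the other rules of the original script and are not covered here.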
     
  • The EDAC subsystem uses the old struct sysdev approach,
    creating all nodes using the raw sysfs API. This is bad,
    as the API is deprecated.

    As we'll be changing the EDAC API, let's first port the existing
    code to struct device.

    There's one drawback to this patch: driver-specific sysfs
    nodes, used by mpc85xx_edac, amd64_edac and i7core_edac
    won't be created anymore. While it would be possible to
    also port the device-specific code, that would mix kobj with
    struct device, which is not recommended. Also, it is easier and nicer
    to move the code to the drivers instead, as the core can get rid
    of some complex logic that just emulates what device_add()
    and device_create_file() already do.

    The next patches will convert the driver-specific code to use
    the device-specific calls. Then, the remaining bits of the old
    sysfs API will be removed.

    NOTE: a per-MC bus is required, otherwise devices with more than
    one memory controller will hit a bug like the one below:

    [ 819.094946] EDAC DEBUG: find_mci_by_dev: find_mci_by_dev()
    [ 819.094948] EDAC DEBUG: edac_create_sysfs_mci_device: edac_create_sysfs_mci_device() idx=1
    [ 819.094952] EDAC DEBUG: edac_create_sysfs_mci_device: edac_create_sysfs_mci_device(): creating device mc1
    [ 819.094967] EDAC DEBUG: edac_create_sysfs_mci_device: edac_create_sysfs_mci_device creating dimm0, located at channel 0 slot 0
    [ 819.094984] ------------[ cut here ]------------
    [ 819.100142] WARNING: at fs/sysfs/dir.c:481 sysfs_add_one+0xc1/0xf0()
    [ 819.107282] Hardware name: S2600CP
    [ 819.111078] sysfs: cannot create duplicate filename '/bus/edac/devices/dimm0'
    [ 819.119062] Modules linked in: sb_edac(+) edac_core ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc sunrpc binfmt_misc dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan tun kvm microcode pcspkr iTCO_wdt iTCO_vendor_support igb i2c_i801 i2c_core sg ioatdma dca sr_mod cdrom sd_mod crc_t10dif ahci libahci isci libsas libata scsi_transport_sas scsi_mod wmi dm_mod [last unloaded: scsi_wait_scan]
    [ 819.175748] Pid: 10902, comm: modprobe Not tainted 3.3.0-0.11.el7.v12.2.x86_64 #1
    [ 819.184113] Call Trace:
    [ 819.186868] [] warn_slowpath_common+0x7f/0xc0
    [ 819.193573] [] warn_slowpath_fmt+0x46/0x50
    [ 819.200000] [] sysfs_add_one+0xc1/0xf0
    [ 819.206025] [] sysfs_do_create_link+0x135/0x220
    [ 819.212944] [] ? sysfs_create_group+0x13/0x20
    [ 819.219656] [] sysfs_create_link+0x13/0x20
    [ 819.226109] [] bus_add_device+0xe6/0x1b0
    [ 819.232350] [] device_add+0x2db/0x460
    [ 819.238300] [] edac_create_dimm_object+0x84/0xf0 [edac_core]
    [ 819.246460] [] edac_create_sysfs_mci_device+0xe8/0x290 [edac_core]
    [ 819.255215] [] edac_mc_add_mc+0x5a/0x2c0 [edac_core]
    [ 819.262611] [] sbridge_register_mci+0x1bc/0x279 [sb_edac]
    [ 819.270493] [] sbridge_probe+0xef/0x175 [sb_edac]
    [ 819.277630] [] ? pm_runtime_enable+0x58/0x90
    [ 819.284268] [] local_pci_probe+0x5c/0xd0
    [ 819.290508] [] __pci_device_probe+0xf1/0x100
    [ 819.297117] [] pci_device_probe+0x3a/0x60
    [ 819.303457] [] really_probe+0x73/0x270
    [ 819.309496] [] driver_probe_device+0x4e/0xb0
    [ 819.316104] [] __driver_attach+0xab/0xb0
    [ 819.322337] [] ? driver_probe_device+0xb0/0xb0
    [ 819.329151] [] bus_for_each_dev+0x56/0x90
    [ 819.335489] [] driver_attach+0x1e/0x20
    [ 819.341534] [] bus_add_driver+0x1b0/0x2a0
    [ 819.347884] [] ? 0xffffffffa0346fff
    [ 819.353641] [] driver_register+0x76/0x140
    [ 819.359980] [] ? printk+0x51/0x53
    [ 819.365524] [] ? 0xffffffffa0346fff
    [ 819.371291] [] __pci_register_driver+0x56/0xd0
    [ 819.378096] [] sbridge_init+0x54/0x1000 [sb_edac]
    [ 819.385231] [] do_one_initcall+0x3f/0x170
    [ 819.391577] [] sys_init_module+0xbe/0x230
    [ 819.397926] [] system_call_fastpath+0x16/0x1b
    [ 819.404633] ---[ end trace 1654fdd39556689f ]---

    This happens because the bus is not being properly initialized.
    Instead of putting the memory sub-devices inside the memory controller,
    it is putting everything under the same directory:

    $ tree /sys/bus/edac/
    /sys/bus/edac/
    ├── devices
    │ ├── all_channel_counts -> ../../../devices/system/edac/mc/mc0/all_channel_counts
    │ ├── csrow0 -> ../../../devices/system/edac/mc/mc0/csrow0
    │ ├── csrow1 -> ../../../devices/system/edac/mc/mc0/csrow1
    │ ├── csrow2 -> ../../../devices/system/edac/mc/mc0/csrow2
    │ ├── dimm0 -> ../../../devices/system/edac/mc/mc0/dimm0
    │ ├── dimm1 -> ../../../devices/system/edac/mc/mc0/dimm1
    │ ├── dimm3 -> ../../../devices/system/edac/mc/mc0/dimm3
    │ ├── dimm6 -> ../../../devices/system/edac/mc/mc0/dimm6
    │ ├── inject_addrmatch -> ../../../devices/system/edac/mc/mc0/inject_addrmatch
    │ ├── mc -> ../../../devices/system/edac/mc
    │ └── mc0 -> ../../../devices/system/edac/mc/mc0
    ├── drivers
    ├── drivers_autoprobe
    ├── drivers_probe
    └── uevent

    On a multi-memory controller system, the names "csrow%d" and "dimm%d"
    should be under "mc%d", and not at the main hierarchy level.

    So, we need to create a per-MC bus, in order to have its own namespace.

    Reviewed-by: Aristeu Rozanski
    Cc: Doug Thompson
    Cc: Greg K H
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
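
The namespace problem described in the commit above can be modelled in a few lines of userspace Python. This is only a sketch under stated assumptions, not kernel code: the `Namespace` class, both `register_*` helpers and the per-MC path label are hypothetical names invented for the illustration. It shows why a single shared bus rejects the second controller's `dimm0`, while one bus per memory controller gives each controller its own name space.

```python
class Namespace:
    """A flat set of names, standing in for the 'devices' directory of one bus."""

    def __init__(self, label):
        self.label = label
        self.names = set()

    def add(self, name):
        if name in self.names:
            raise FileExistsError(
                f"cannot create duplicate filename '{self.label}/{name}'")
        self.names.add(name)


def register_shared_bus(controllers):
    """All controllers and their DIMMs land in one namespace."""
    bus = Namespace("/bus/edac/devices")
    for mc, dimms in controllers.items():
        bus.add(mc)
        for dimm in dimms:
            bus.add(dimm)
    return bus


def register_per_mc_bus(controllers):
    """Each controller gets its own namespace (path label is hypothetical)."""
    buses = {}
    for mc, dimms in controllers.items():
        bus = Namespace(f"/bus/{mc}/devices")
        for dimm in dimms:
            bus.add(dimm)
        buses[mc] = bus
    return buses


if __name__ == "__main__":
    layout = {"mc0": ["dimm0", "dimm1"], "mc1": ["dimm0"]}
    try:
        register_shared_bus(layout)
    except FileExistsError as err:
        print("shared bus:", err)
    print("per-MC buses:", sorted(register_per_mc_bus(layout)))
```

With the shared bus, registering `mc1` fails on `dimm0` because `mc0` already claimed that name, which mirrors the sysfs warning in the trace. With per-MC buses both controllers register cleanly.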
     

19 Apr, 2011

1 commit

  • The kernel already prints its build timestamp during boot, no need to
    repeat it in random drivers and produce different object files each
    time.

    Cc: Doug Thompson
    Cc: bluesmoke-devel@lists.sourceforge.net
    Cc: linux-edac@vger.kernel.org
    Acked-by: Mauro Carvalho Chehab
    Signed-off-by: Michal Marek

    Michal Marek
     

21 Oct, 2010

1 commit


25 Jan, 2008

1 commit


20 Jul, 2007

12 commits

  • Change EXPORT_SYMBOL to EXPORT_SYMBOL_GPL
    Tidy changes: blank lines, inline removal, add comment

    Signed-off-by: Doug Thompson
    Cc: Greg KH
    Cc: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Thompson
     
  • This patch refactors the 'releasing' of kobjects for the edac_mc type of
    device. The correct pattern of kobject release is followed.

    As internal kobjs are allocated, they bump a ref count on the top level kobj.
    It in turn holds a module ref count on the edac_core module. When internal
    kobjects are released, they decrement the ref count on the top level kobj. When
    the top level kobj's ref count reaches zero, it decrements the ref count on the
    edac_core module, allowing it to be unloaded, as all resources have now been
    released.

    Cc: Alan Cox alan@lxorguk.ukuu.org.uk
    Signed-off-by: Doug Thompson
    Acked-by: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Doug Thompson
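
The release chain described in the commit above can be sketched as a small userspace model. This is not the kernel kobject API: `Module`, `KObj`, `make_top_level` and `make_child` are hypothetical names used only to illustrate the ordering, assuming each object starts with one reference held by its creator.

```python
class Module:
    """Stand-in for the edac_core module and its reference count."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0

    def can_unload(self):
        return self.refcount == 0


class KObj:
    """Reference-counted object that runs a release callback at zero."""

    def __init__(self, name, on_release):
        self.name = name
        self.refcount = 1  # reference held by the creator
        self._on_release = on_release

    def get(self):
        self.refcount += 1
        return self

    def put(self):
        self.refcount -= 1
        if self.refcount == 0:
            self._on_release()


def make_top_level(module):
    """The top level kobj pins the module until it is released."""
    module.refcount += 1

    def release():
        module.refcount -= 1

    return KObj("mc", release)


def make_child(top, name):
    """An internal kobj pins the top level kobj until it is released."""
    top.get()
    return KObj(name, top.put)


if __name__ == "__main__":
    core = Module("edac_core")
    top = make_top_level(core)
    child = make_child(top, "csrow0")
    top.put()                  # creator drops its reference; child still pins top
    print(core.can_unload())   # module is still pinned
    child.put()                # releases child, then top, then the module ref
    print(core.can_unload())
```

The point of the pattern is that the module reference is only dropped from the top level object's release callback, so the module cannot be unloaded while any internal object is still alive.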
     
  • Renamed the function edac_op_state_toString() to edac_op_state_to_string()
    for consistent style, and updated its callers

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     
  • Patches to conform to coding style: namely, static variables don't need to
    be initialized to NULL or '0', as that is the default

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     
  • This patch fixes some remnant spaces inserted by the use of Lindent.
    It seems Lindent adds some spaces where it shouldn't. These have been fixed.
    In addition, goto targets had formatting issues, which have also been fixed
    in this patch.

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     
  • Run the EDAC CORE files through Lindent for cleanup

    Signed-off-by: Douglas Thompson
    Signed-off-by: Dave Jiang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     
  • Moving PCI to a per-instance device model

    This should include the correct sysfs setup as well. Please review.

    Signed-off-by: Dave Jiang
    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jiang
     
  • Move the memory controller object from a kernel-thread-based implementation
    to a workqueue-based one.

    Signed-off-by: Dave Jiang
    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jiang
     
  • In the refactoring of edac_mc.c into several subsystem files,
    the header file edac_mc.h became meaningless. A new header file
    edac_core.h was created. All the files that previously included
    "edac_mc.h" are changed to include "edac_core.h".

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
     
  • Provides a way for NMI-reported errors on x86 to notify the EDAC
    subsystem of pending ECC errors by writing to a software state variable.

    Here's the reworked patch. I added an EDAC stub to the kernel so we can
    have variables that are in the kernel even if EDAC is a module. I also
    implemented the idea of using the chip driver to select error detection
    mode via module parameter and eliminate the kernel compile option.
    Please review/test. Thx!

    Also, I only made changes to some of the chipset drivers since I am
    unfamiliar with the other ones. We can add similar changes as we go.

    Signed-off-by: Dave Jiang
    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jiang
     
  • This patch adds the new 'class' of object to be managed, named: 'edac_device'.

    As a peer of the 'edac_mc' class of object, it provides a non-memory centric
    view of an ERROR DETECTING device in hardware. It provides a sysfs interface
    and an abstraction for various EDAC type devices.

    Multiple 'instances' within the class are possible, with each 'instance'
    able to have multiple 'blocks', and each 'block' having 'attributes'.

    At the 'block' level there are the 'ce_count' and 'ue_count' fields,
    which the device driver can update directly and/or through the
    edac_device_handle_XX() functions. At each higher level are additional
    'total' count fields, which are a summation of counts below that level.

    This 'edac_device' has been used to capture and present ECC errors
    which are found in an L1 and L2 system on a per CORE/CPU basis.

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson
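
The instance/block/count hierarchy described in the commit above can be sketched as follows. This is a minimal model, not the kernel's data structures: the class names, the `*_total` properties and the sample device are assumptions made for the illustration, and totals are computed on demand here rather than maintained as stored counters.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """Lowest level: holds the correctable/uncorrectable error counters."""
    name: str
    ce_count: int = 0
    ue_count: int = 0


@dataclass
class Instance:
    """One instance of the device, made up of several blocks."""
    name: str
    blocks: List[Block] = field(default_factory=list)

    @property
    def ce_total(self) -> int:
        return sum(b.ce_count for b in self.blocks)

    @property
    def ue_total(self) -> int:
        return sum(b.ue_count for b in self.blocks)


@dataclass
class EdacDevice:
    """Top level: totals are the sum of everything below."""
    name: str
    instances: List[Instance] = field(default_factory=list)

    @property
    def ce_total(self) -> int:
        return sum(i.ce_total for i in self.instances)

    @property
    def ue_total(self) -> int:
        return sum(i.ue_total for i in self.instances)


if __name__ == "__main__":
    # Hypothetical layout: one instance per CPU, one block per cache level.
    cpu0 = Instance("cpu0", [Block("L1", ce_count=2), Block("L2", ue_count=1)])
    cpu1 = Instance("cpu1", [Block("L1"), Block("L2", ce_count=3)])
    dev = EdacDevice("cache", [cpu0, cpu1])
    print(dev.ce_total, dev.ue_total)
```

A driver in this model would only ever touch the block counters; every higher-level total follows from them.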
     
  • This is a large patch to refactor the original EDAC module in the kernel
    and to break it up into better file granularity, such that each source
    file contains a given subsystem of the EDAC CORE.

    Originally, the EDAC 'core' was contained in one source file, edac_mc.c,
    with its corresponding edac_mc.h file.

    Now, there are the following files:

    edac_module.c      The main module init/exit function and other overhead
    edac_mc.c          Code handling the edac_mc class of object
    edac_mc_sysfs.c    Code handling for sysfs presentation
    edac_pci_sysfs.c   Code handling for PCI sysfs presentation
    edac_core.h        CORE .h include file for 'edac_mc' and 'edac_device' drivers
    edac_module.h      Internal CORE .h include file

    This forms a foundation upon which a later patch can create the 'edac_device'
    class of object code in a new file 'edac_device.c'.

    Signed-off-by: Douglas Thompson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Thompson