11 Jun, 2012

1 commit

  • Some edac drivers register themselves as mce decoders via a
    notifier_chain. However, the current notifier_chain implementation
    does not support registering the same notifier twice: doing so
    corrupts the list when elements are added or removed. For example,
    on a SandyBridge platform, removing the sb_edac module and then
    triggering an error hits an oops, because no mce decoder is
    registered any more but the notifier_chain still points to an
    invalid callback function. Here is an example:

    Call Trace:
    [] atomic_notifier_call_chain+0x1a/0x20
    [] mce_log+0x46/0x180
    [] apei_mce_report_mem_error+0x4a/0x60
    [] ghes_do_proc+0x192/0x210
    [] ghes_proc+0x46/0x70
    [] ghes_notify_sci+0x48/0x80
    [] notifier_call_chain+0x55/0x80
    [] __blocking_notifier_call_chain+0x5a/0x80
    [] ? acpi_os_wait_events_complete+0x23/0x23
    [] blocking_notifier_call_chain+0x16/0x20
    [] acpi_hed_notify+0x19/0x1b
    [] acpi_device_notify+0x19/0x1b
    [] acpi_ev_notify_dispatch+0x67/0x7f
    [] acpi_os_execute_deferred+0x29/0x36
    [] process_one_work+0x132/0x450
    [] worker_thread+0x17b/0x3c0
    [] ? manage_workers+0x120/0x120
    [] kthread+0x9e/0xb0
    [] kernel_thread_helper+0x4/0x10
    [] ? kthread_freezable_should_stop+0x70/0x70
    [] ? gs_change+0x13/0x13
    Code: f3 49 89 d4 45 85 ed 4d 89 c6 48 8b 0f 74 48 48 85 c9 75 17 eb 41
    0f 1f 80 00 00 00 00 41 83 ed 01 4c 89 f9 74 22 4d 85 ff 74 1d 8b
    79 08 4c 89 e2 48 89 de 48 89 cf ff 11 4d 85 f6 74 04 41
    RIP [] notifier_call_chain+0x46/0x80
    RSP
    CR2: ffffffffa01af838
    ---[ end trace 0100930068e73e6f ]---
    BUG: unable to handle kernel paging request at fffffffffffffff8
    IP: [] kthread_data+0x10/0x20
    PGD 1a0d067 PUD 1a0e067 PMD 0
    Oops: 0000 [#2] SMP

    Only i7core_edac and sb_edac have this issue, because they handle more
    than one memory controller and therefore register the mce decoder
    several times.

    Cc: # 3.2 and upper
    Signed-off-by: Chen Gong
    Signed-off-by: Mauro Carvalho Chehab

    Chen Gong
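The failure mode described above can be reproduced in user space with a minimal sketch. It assumes a priority-ordered, singly linked chain with no duplicate check; `struct notifier`, `chain_register()`, `chain_unregister()`, `decoder_get()` and `decoder_put()` are hypothetical names for illustration, not kernel APIs. The use count at the end is only one possible way to register a shared notifier once, and is not necessarily the approach this commit takes.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a notifier chain element (hypothetical). */
struct notifier {
	int priority;
	struct notifier *next;
};

/* Priority-ordered insert with no check for duplicates. */
static void chain_register(struct notifier **head, struct notifier *n)
{
	while (*head != NULL && n->priority <= (*head)->priority)
		head = &(*head)->next;
	n->next = *head;
	*head = n;
}

/* Unlink the first occurrence of n, if any. */
static void chain_unregister(struct notifier **head, struct notifier *n)
{
	while (*head != NULL) {
		if (*head == n) {
			*head = n->next;
			return;
		}
		head = &(*head)->next;
	}
}

/* Register the shared notifier for the first user only. */
static void decoder_get(struct notifier **head, struct notifier *n, int *users)
{
	if ((*users)++ == 0)
		chain_register(head, n);
}

/* Unregister the shared notifier when the last user goes away. */
static void decoder_put(struct notifier **head, struct notifier *n, int *users)
{
	if (--(*users) == 0)
		chain_unregister(head, n);
}
```

Registering the same element twice makes it point to itself, so unregistering it leaves the chain head still referring to it. Once the module that owns the callback is unloaded, that stale entry is what gets called.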
     

30 May, 2012

1 commit

  • Pull EDAC internal API changes from Mauro Carvalho Chehab:
    "This changeset is the first part of a series of patches that fixes the
    EDAC subsystem. This set changes the kernel EDAC API in order
    to properly represent the Intel i3/i5/i7, Xeon 3xxx/5xxx/7xxx, and
    Intel E5-xxxx memory controllers.

    The EDAC core used to assume that:

    - the DRAM chip select pin is directly accessed by the memory
    controller

    - when multiple channels are used, they're all filled with the
    same type of memory.

    Neither of the above premises has been true on Intel memory controllers
    since 2002, when RAMBUS and FB-DIMMs were introduced and the Advanced
    Memory Buffer (or some similar technology) started hiding direct access
    to the DRAM pins.

    So, the existing drivers for those chipsets had to lie to the EDAC
    core, generally by reporting that just one channel is filled. That
    produces some hard-to-understand error messages like:

    EDAC MC0: CE row 3, channel 0, label "DIMM1": 1 Unknown error(s): memory read error on FATAL area : cpu=0 Err=0008:00c2 (ch=2), addr = 0xad1f73480 => socket=0, Channel=0(mask=2), rank=1

    The location information there (row 3, channel 0) is completely bogus:
    it has no physical meaning and is just a set of arbitrary values that
    the driver uses to talk with the EDAC core. The error actually happened
    at CPU socket 0, channel 0, slot 1, but this is not reported anywhere,
    as the EDAC core doesn't know anything about the memory layout. So,
    only advanced users who know how the EDAC driver works and who test
    their systems to see how DIMMs are mapped can actually benefit from
    such error logs.

    This patch series fixes the error report logic, allowing the EDAC
    drivers to expose the memory architecture they use to the EDAC core.
    As the EDAC core now understands how the memory is organized, it
    can provide a useful report:

    EDAC MC0: CE memory read error on DIMM1 (channel:0 slot:1 page:0x364b1b offset:0x600 grain:32 syndrome:0x0 - count:1 area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:4)

    The location of the DIMM where the error happened is reported by "MC0"
    (cpu socket #0), at "channel:0 slot:1" location, and matches the
    physical location of the DIMM.

    There are two remaining issues not covered by this patch series:

    - The EDAC sysfs API will still report bogus values. So,
    userspace tools like edac-utils will still use the bogus data;

    - A new tracepoint-based way to get the binary information
    about the errors still needs to be added.

    Those are on a second series of patches (also at -next), but will
    probably miss the train for 3.5, due to the slow review process."

    Fix up trivial conflict (due to spelling correction of removed code) in
    drivers/edac/edac_device.c

    * git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: (42 commits)
    i7core: fix ranks information at the per-channel struct
    i5000: Fix the fatal error handling
    i5100_edac: Fix a warning when compiled with 32 bits
    i82975x_edac: Test nr_pages earlier to save a few CPU cycles
    e752x_edac: provide more info about how DIMMS/ranks are mapped
    i5000_edac: Fix the logic that retrieves memory information
    i5400_edac: improve debug messages to better represent the filled memory
    edac: Cleanup the logs for i7core and sb edac drivers
    edac: Initialize the dimm label with the known information
    edac: Remove the legacy EDAC ABI
    x38_edac: convert driver to use the new edac ABI
    tile_edac: convert driver to use the new edac ABI
    sb_edac: convert driver to use the new edac ABI
    r82600_edac: convert driver to use the new edac ABI
    ppc4xx_edac: convert driver to use the new edac ABI
    pasemi_edac: convert driver to use the new edac ABI
    mv64x60_edac: convert driver to use the new edac ABI
    mpc85xx_edac: convert driver to use the new edac ABI
    i82975x_edac: convert driver to use the new edac ABI
    i82875p_edac: convert driver to use the new edac ABI
    ...

    Linus Torvalds
     

29 May, 2012

8 commits

  • There is a flag in the per-channel struct that indicates whether there
    are any 4R DIMMs on it. The way the presence of this flag was reported
    is not OK, as it might give the false idea that the channel is filled
    with 2R memories:

    [ 580.588701] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f7431): 2 ranks, UDIMMs
    [ 580.588704] EDAC DEBUG: get_dimm_config: dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400

    (in this case, just one 1R memory is filled on channel 1)

    So, use a better way to represent the per-channel ranks information.
    After the patch, it will show:

    [ 2002.233978] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f7431): UDIMMs
    [ 2002.233982] EDAC DEBUG: get_dimm_config: dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
    [ 2002.233988] EDAC DEBUG: get_dimm_config: dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400

    (in this case, there aren't any 4R memories)

    Reported-by: Borislav Petkov
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • Remove some information that is duplicated in the MCE log and is of
    little use for the error report. This data will be added again when
    creating a trace function that outputs both memory errors and MCE
    fields.

    Cc: Aristeu Rozanski
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • Now that all drivers got converted to use the new ABI, we can
    drop the old one.

    Acked-by: Chris Metcalf
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • The legacy edac ABI is going to be removed. Port the driver to use
    and benefit from the new API functionality.

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • The number of pages is a dimm property. Move it to the dimm struct.

    After this change, it is possible to add sysfs nodes for the DIMM's that
    will properly represent the DIMM stick properties, including its size.

    A TODO fix here is to properly represent dual-rank/quad-rank DIMMs when
    the memory controller represents the memory via chip select rows.

    Reviewed-by: Aristeu Rozanski
    Acked-by: Borislav Petkov
    Acked-by: Chris Metcalf
    Cc: Doug Thompson
    Cc: Mark Gross
    Cc: Jason Uhlenkott
    Cc: Tim Small
    Cc: Ranganathan Desikan
    Cc: "Arvind R."
    Cc: Olof Johansson
    Cc: Egor Martovetsky
    Cc: Michal Marek
    Cc: Jiri Kosina
    Cc: Joe Perches
    Cc: Dmitry Eremin-Solenikov
    Cc: Benjamin Herrenschmidt
    Cc: Hitoshi Mitake
    Cc: Andrew Morton
    Cc: "Niklas Söderlund"
    Cc: Shaohui Xie
    Cc: Josh Boyer
    Cc: linuxppc-dev@lists.ozlabs.org
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • Almost all edac drivers initialize csrow_info->first_page,
    csrow_info->last_page and csrow_info->page_mask. Those vars are
    used inside the EDAC core, in order to calculate the csrow affected
    by an error, by using the routine edac_mc_find_csrow_by_page().

    However, very few drivers actually use it:
    e752x_edac.c
    e7xxx_edac.c
    i3000_edac.c
    i82443bxgx_edac.c
    i82860_edac.c
    i82875p_edac.c
    i82975x_edac.c
    r82600_edac.c

    There are also a few other drivers that use those vars in their own
    internal calculations.

    All the others are just wasting time by initializing that
    data.

    While initializing data without using it won't cause any trouble, this
    information is stored in the wrong place (the csrows structure), so it
    is better to remove what is unused, in order to simplify the next patch.

    Reviewed-by: Aristeu Rozanski
    Acked-by: Borislav Petkov
    Acked-by: Chris Metcalf
    Cc: Doug Thompson
    Cc: Hitoshi Mitake
    Cc: Andrew Morton
    Cc: "Niklas Söderlund"
    Cc: Josh Boyer
    Cc: Jiri Kosina
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
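As a rough illustration of how first_page, last_page and page_mask can locate the csrow affected by an error, here is a user-space sketch. It assumes a row matches when the page lies within [first_page, last_page] and agrees with first_page on the bits set in page_mask; `struct csrow_range` and `find_csrow_by_page()` are hypothetical simplifications, not the actual edac_mc_find_csrow_by_page() code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in holding only the three fields discussed above. */
struct csrow_range {
	unsigned long first_page;
	unsigned long last_page;
	unsigned long page_mask;
};

/* Return the index of the first row that contains page, or -1 if none does. */
static int find_csrow_by_page(const struct csrow_range *rows, size_t n,
			      unsigned long page)
{
	for (size_t i = 0; i < n; i++) {
		const struct csrow_range *r = &rows[i];

		/* Assumed match rule: in range, and equal on the masked bits. */
		if (page >= r->first_page && page <= r->last_page &&
		    (page & r->page_mask) == (r->first_page & r->page_mask))
			return (int)i;
	}
	return -1;
}
```

A driver that never reports errors through such a lookup gains nothing from filling in these fields, which is the point the commit makes.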
     
  • On systems based on chip select rows, all channels need to use memories
    with the same properties, otherwise the memories on channels A and B
    won't be recognized.

    However, such an assumption is not true for all types of memory
    controllers.

    Controllers for FB-DIMM's don't have such requirements.

    Also, modern Intel controllers seem to be capable of handling such
    differences.

    So, we need to stop storing the DIMM information in per-csrow
    data and store it in the right place instead.

    The first step is to move grain, mtype, dtype and edac_mode to the
    per-dimm struct.

    Reviewed-by: Aristeu Rozanski
    Reviewed-by: Borislav Petkov
    Acked-by: Chris Metcalf
    Cc: Doug Thompson
    Cc: Borislav Petkov
    Cc: Mark Gross
    Cc: Jason Uhlenkott
    Cc: Tim Small
    Cc: Ranganathan Desikan
    Cc: "Arvind R."
    Cc: Olof Johansson
    Cc: Egor Martovetsky
    Cc: Michal Marek
    Cc: Jiri Kosina
    Cc: Joe Perches
    Cc: Dmitry Eremin-Solenikov
    Cc: Benjamin Herrenschmidt
    Cc: Hitoshi Mitake
    Cc: Andrew Morton
    Cc: James Bottomley
    Cc: "Niklas Söderlund"
    Cc: Shaohui Xie
    Cc: Josh Boyer
    Cc: Mike Williams
    Cc: linuxppc-dev@lists.ozlabs.org
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
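The move of grain, mtype, dtype and edac_mode to a per-DIMM struct can be pictured with the following sketch. The types, enum values and the `csrow_is_uniform()` helper are hypothetical simplifications made up for illustration; the kernel's real structures carry more fields and different definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified enums; the kernel's own types are richer. */
enum mem_type { MEM_EMPTY, MEM_DDR2, MEM_DDR3 };
enum dev_type { DEV_UNKNOWN, DEV_X4, DEV_X8 };
enum edac_mode { EDAC_NONE, EDAC_SECDED };

/* Properties that describe one DIMM stick live with the DIMM. */
struct dimm {
	unsigned int grain;        /* granularity of reported errors */
	enum mem_type mtype;
	enum dev_type dtype;
	enum edac_mode edac_mode;
};

#define NUM_CHANNELS 2

/* A csrow only points at the DIMM behind each channel. */
struct csrow {
	struct dimm *channel[NUM_CHANNELS];
};

/* Return 1 when every populated channel of the row has the same mtype. */
static int csrow_is_uniform(const struct csrow *row)
{
	const struct dimm *first = NULL;

	for (size_t i = 0; i < NUM_CHANNELS; i++) {
		const struct dimm *d = row->channel[i];

		if (d == NULL)
			continue;
		if (first == NULL)
			first = d;
		else if (d->mtype != first->mtype)
			return 0;
	}
	return 1;
}
```

With the properties stored per DIMM, a row whose channels hold different memory types can be described faithfully, which a single per-csrow set of fields could not do.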
     
  • The way a DIMM is currently represented implies that DIMMs are
    linked into a per-csrow struct. However, some drivers don't see
    csrows, as they're hidden behind some chip, like the AMBs
    on FB-DIMMs, for example.

    This forced drivers to fake^Wvirtualize a csrow struct, and to make
    a mess of the original csrow/channel concept.

    Move the DIMM labels into a per-DIMM struct, and add there
    the real location of the socket, in terms of csrow/channel.
    Later patches will modify the location to properly represent the
    memory architecture.

    All other drivers will use a per-csrow type of location.
    Some of those drivers will require a later conversion, as
    they also fake the csrows internally.

    TODO: While this patch doesn't change the existing behavior, on
    csrows-based memory controllers, a csrow/channel pair points to a memory
    rank. There's a known bug at the EDAC core that allows having different
    labels for the same DIMM, if it has more than one rank. A later patch
    is needed to merge the several ranks for a DIMM into the same dimm_info
    struct, in order to avoid having different labels for the same DIMM.

    The edac_mc_alloc() will now contain a per-dimm initialization loop that
    will be changed by later patches in order to match other types of
    memory architectures.

    Reviewed-by: Aristeu Rozanski
    Reviewed-by: Borislav Petkov
    Cc: Doug Thompson
    Cc: Ranganathan Desikan
    Cc: "Arvind R."
    Cc: "Niklas Söderlund"
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     

30 Apr, 2012

1 commit


19 Mar, 2012

1 commit

  • These const tables are currently marked __devinitdata, but
    Documentation/PCI/pci.txt says:

    "o The ID table array should be marked __devinitconst; this is done
    automatically if the table is declared with DEFINE_PCI_DEVICE_TABLE()."

    So use DEFINE_PCI_DEVICE_TABLE(x).

    Based on PaX and earlier work by Andi Kleen.

    Signed-off-by: Lionel Debroux
    Signed-off-by: Borislav Petkov

    Lionel Debroux
     

14 Dec, 2011

1 commit


01 Nov, 2011

11 commits


19 Aug, 2011

1 commit


27 May, 2011

1 commit

  • * 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
    gfs2: Drop __TIME__ usage
    isdn/diva: Drop __TIME__ usage
    atm: Drop __TIME__ usage
    dlm: Drop __TIME__ usage
    wan/pc300: Drop __TIME__ usage
    parport: Drop __TIME__ usage
    hdlcdrv: Drop __TIME__ usage
    baycom: Drop __TIME__ usage
    pmcraid: Drop __DATE__ usage
    edac: Drop __DATE__ usage
    rio: Drop __DATE__ usage
    scsi/wd33c93: Drop __TIME__ usage
    scsi/in2000: Drop __TIME__ usage
    aacraid: Drop __TIME__ usage
    media/cx231xx: Drop __TIME__ usage
    media/radio-maxiradio: Drop __TIME__ usage
    nozomi: Drop __TIME__ usage
    cyclades: Drop __TIME__ usage

    Linus Torvalds
     

19 Apr, 2011

1 commit

  • The kernel already prints its build timestamp during boot, no need to
    repeat it in random drivers and produce different object files each
    time.

    Cc: Doug Thompson
    Cc: bluesmoke-devel@lists.sourceforge.net
    Cc: linux-edac@vger.kernel.org
    Acked-by: Mauro Carvalho Chehab
    Signed-off-by: Michal Marek

    Michal Marek
     

31 Mar, 2011

1 commit


28 Dec, 2010

1 commit


24 Oct, 2010

11 commits

  • Due to the nature of i7core, we need to probe and attach all PCI
    devices used by this driver the first time probe is called.
    However, the PCI core will call the probe routine once for each CPU
    socket. If we return -EINVAL to those calls, it would seem that the
    driver failed, when, in fact, there are no more devices left to initialize.

    Changing the return code to -ENODEV solves this issue.

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
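The return-code choice can be sketched in user space as follows. The names `probe_once()`, `attach_all_devices()` and `devices_attached` are hypothetical stand-ins, and the attach step is a stub; only the -ENODEV versus success distinction is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>

/* Set once the first probe call has claimed every device (hypothetical). */
static int devices_attached;

/* Stub for the step that attaches all devices of all sockets at once. */
static int attach_all_devices(void)
{
	return 0;
}

/* Called once per CPU socket; only the first call has work to do. */
static int probe_once(void)
{
	int rc;

	if (devices_attached)
		return -ENODEV;	/* nothing left to claim, not a failure */

	rc = attach_all_devices();
	if (rc < 0)
		return rc;

	devices_attached = 1;
	return 0;
}
```

Returning -ENODEV on the later calls tells the caller that there is no device for this probe to handle, rather than suggesting that the driver itself is broken.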
     
  • pci_xeon_fixup() expects a null-terminated table, while
    i7core_get_all_devices() just loops over 0..ARRAY_SIZE. As the other
    tables are zero-terminated, change this one to be terminated with 0 as
    well. This fixes a bug where the code could run past the end of the
    table.

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
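The difference between the two iteration styles can be shown with a small sketch. The `struct dev_descr` layout and the IDs in the table are made up for illustration and do not correspond to real device IDs or to the driver's actual tables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor; dev_id == 0 marks the end of a table. */
struct dev_descr {
	int dev;
	int func;
	int dev_id;
};

/* Made-up entries followed by the zero terminator. */
static const struct dev_descr example_table[] = {
	{ 3, 0, 0x1001 },
	{ 3, 1, 0x1002 },
	{ 0, 0, 0 },	/* terminator */
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Walk the table until the terminator; works without knowing its size. */
static size_t count_entries(const struct dev_descr *t)
{
	size_t n = 0;

	while (t[n].dev_id != 0)
		n++;
	return n;
}
```

A sentinel-based walker such as count_entries() reads past the end of any table that lacks the terminator, so every table handed to it has to end with a zero entry. Code that loops up to ARRAY_SIZE instead must skip the terminator once it is added.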
     
  • That's a nasty bug that took me a lot of time to track down, and whose
    fix took just one line. The best fragrances and the worst
    poisons are shipped in the smallest bottles.

    drivers/pci/search.c implements the pci_get_device function. The normal
    behavior is that you call it, and the function returns a pdev pointer
    and increments pdev->kobj.kref.refcount of the pci device. However,
    if you want to keep searching, you need to pass the previous
    pdev pointer to the search.

    When you pass a non-NULL pointer in the "from" argument, pci_get_device
    will decrement pdev->kobj.kref.refcount, assuming that the driver won't
    be using the previous pdev.

    The solution is simple: we just need to call pci_dev_get() manually,
    for the pdev's that the driver will actually use.

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
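The reference-count behavior described above can be mimicked with a user-space mock. Everything here (`struct mock_dev`, `mock_get_device()`, `mock_dev_get()`, `mock_dev_put()`, `find_and_keep()`) is a hypothetical model of the semantics in the commit message, not the PCI core's code.

```c
#include <assert.h>
#include <stddef.h>

/* Mock device with an explicit reference count (hypothetical). */
struct mock_dev {
	int id;
	int refcount;
};

/* Each device starts with one reference held by the mock "bus". */
static struct mock_dev bus[] = { { 1, 1 }, { 2, 1 }, { 3, 1 } };
#define NDEV (sizeof(bus) / sizeof(bus[0]))

static struct mock_dev *mock_dev_get(struct mock_dev *d)
{
	if (d)
		d->refcount++;
	return d;
}

static void mock_dev_put(struct mock_dev *d)
{
	if (d)
		d->refcount--;
}

/*
 * Return the device after 'from' (or the first one when 'from' is NULL)
 * with a reference taken, and drop the reference held on 'from'.
 */
static struct mock_dev *mock_get_device(struct mock_dev *from)
{
	size_t next = from ? (size_t)(from - bus) + 1 : 0;
	struct mock_dev *found = next < NDEV ? mock_dev_get(&bus[next]) : NULL;

	mock_dev_put(from);
	return found;
}

/* Walk all devices and keep a usable reference to the one matching id. */
static struct mock_dev *find_and_keep(int id)
{
	struct mock_dev *d = NULL, *kept = NULL;

	while ((d = mock_get_device(d)) != NULL)
		if (d->id == id)
			kept = mock_dev_get(d);	/* survives the next iteration */
	return kept;
}
```

Without the extra mock_dev_get() call, the next loop iteration would drop the only reference the caller holds on the device it intends to keep using.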
     
  • Probably due to a bug or some testing logic at PCI level, device
    refcount for :00.0 device is decremented at the end of the
    pci_get_device, made by i7core_get_all_devices(). The fact is that
    the first versions of the driver relied on those devices to probe
    for Nehalem, but the current versions don't use it at all.

    So, let's just remove those devices from the driver, making it simpler
    and fixing the bug.

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • i7core_unregister_mci() checks internally whether mci is NULL. There's
    no need to test it outside.

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • Changeset c91d57ba9ce5b5c93a7077e2f72510eb1f9131c4 moved the init
    of the priv pointer to the end of the probe routine. However, we need
    it before that; otherwise, we hit an OOPS:

    [ 67.743453] EDAC DEBUG: mci_bind_devs: Associated fn 0.0, dev = ffff88011b46e000, socket 0
    [ 67.751861] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    [ 67.759685] IP: [] i7core_probe+0x979/0x130c [i7core_edac]
    [ 67.766721] PGD 10bd38067 PUD 10bd37067 PMD 0
    [ 67.771178] Oops: 0000 [#1] SMP
    [ 67.774414] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
    [ 67.782213] CPU 1
    [ 67.784042] Modules linked in: i7core_edac(+) edac_core cpufreq_ondemand binfmt_misc dm_multipath video output pci_slot snd_hda_codd

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • Signed-off-by: Hidetoshi Seto
    Signed-off-by: Mauro Carvalho Chehab

    Hidetoshi Seto
     
  • A local is enough.

    Signed-off-by: Hidetoshi Seto
    Signed-off-by: Mauro Carvalho Chehab

    Hidetoshi Seto
     
  • We can check the number of channels in i7core_register_mci.

    Signed-off-by: Hidetoshi Seto
    Signed-off-by: Mauro Carvalho Chehab

    Hidetoshi Seto
     
  • In i7core_probe, when the mci setup for the 2nd or a later socket fails,
    we should clean up the mci already prepared for the earlier sockets
    before the "put" of all devices.

    So let's have an i7core_unregister_mci that can be shared between here
    and i7core_remove.

    While here fix a typo "hanler".

    Signed-off-by: Hidetoshi Seto
    Signed-off-by: Mauro Carvalho Chehab

    Hidetoshi Seto
     
  • We already have saved pointers. Use shorter ones.

    Signed-off-by: Hidetoshi Seto
    Signed-off-by: Mauro Carvalho Chehab

    Hidetoshi Seto