29 May, 2012

4 commits

  • The number of pages is a dimm property. Move it to the dimm struct.

    After this change, it is possible to add sysfs nodes for the DIMMs that
    will properly represent the DIMM stick properties, including its size.

    A TODO fix here is to properly represent dual-rank/quad-rank DIMMs when
    the memory controller represents the memory via chip select rows.

    Reviewed-by: Aristeu Rozanski
    Acked-by: Borislav Petkov
    Acked-by: Chris Metcalf
    Cc: Doug Thompson
    Cc: Mark Gross
    Cc: Jason Uhlenkott
    Cc: Tim Small
    Cc: Ranganathan Desikan
    Cc: "Arvind R."
    Cc: Olof Johansson
    Cc: Egor Martovetsky
    Cc: Michal Marek
    Cc: Jiri Kosina
    Cc: Joe Perches
    Cc: Dmitry Eremin-Solenikov
    Cc: Benjamin Herrenschmidt
    Cc: Hitoshi Mitake
    Cc: Andrew Morton
    Cc: "Niklas Söderlund"
    Cc: Shaohui Xie
    Cc: Josh Boyer
    Cc: linuxppc-dev@lists.ozlabs.org
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
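
    A minimal sketch of what a per-DIMM page count makes possible: the
    size of a DIMM can be derived from the DIMM's own data. The struct
    dimm_sketch, the helper dimm_size_mib() and the 4 KiB page size are
    assumptions made for illustration; they are not the kernel
    definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch (4 KiB); the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SHIFT 12

/* Simplified stand-in for the per-DIMM struct; not the kernel definition. */
struct dimm_sketch {
    uint32_t nr_pages;  /* number of pages, kept as a DIMM property */
};

/* Size of one DIMM in MiB, computed from its own page count. */
uint64_t dimm_size_mib(const struct dimm_sketch *dimm)
{
    assert(dimm != NULL);
    return ((uint64_t)dimm->nr_pages << SKETCH_PAGE_SHIFT) >> 20;
}
```

    With these assumptions, a DIMM holding 262144 pages is reported as
    1024 MiB.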
     
  • Almost all edac drivers initialize csrow_info->first_page,
    csrow_info->last_page and csrow_info->page_mask. Those vars are
    used inside the EDAC core, in order to calculate the csrow affected
    by an error, by using the routine edac_mc_find_csrow_by_page().

    However, very few drivers actually use it:
    e752x_edac.c
    e7xxx_edac.c
    i3000_edac.c
    i82443bxgx_edac.c
    i82860_edac.c
    i82875p_edac.c
    i82975x_edac.c
    r82600_edac.c

    There are also a few other drivers that use those vars in their
    own internal calculations.

    All the others are just wasting time by initializing that
    data.

    While initializing data without using it won't cause any trouble, that
    information is stored in the wrong place (in the csrows structure), so
    it is better to remove what is unused, in order to simplify the next
    patch.

    Reviewed-by: Aristeu Rozanski
    Acked-by: Borislav Petkov
    Acked-by: Chris Metcalf
    Cc: Doug Thompson
    Cc: Hitoshi Mitake
    Cc: Andrew Morton
    Cc: "Niklas Söderlund"
    Cc: Josh Boyer
    Cc: Jiri Kosina
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
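
    To show why only some drivers need first_page, last_page and
    page_mask, here is a simplified model of a lookup by page in the
    spirit of edac_mc_find_csrow_by_page(). The struct csrow_sketch and
    the function find_csrow_by_page() are stand-ins written for this
    example, not the EDAC core code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the per-csrow fields discussed above. */
struct csrow_sketch {
    unsigned long first_page;  /* first page covered by the row */
    unsigned long last_page;   /* last page covered by the row */
    unsigned long page_mask;   /* bits used to tell interleaved rows apart */
    unsigned long nr_pages;    /* 0 means the row is empty */
};

/* Return the index of the row that contains 'page', or -1 if none does. */
int find_csrow_by_page(const struct csrow_sketch *rows, size_t nr_rows,
                       unsigned long page)
{
    assert(rows != NULL || nr_rows == 0);

    for (size_t i = 0; i < nr_rows; i++) {
        const struct csrow_sketch *row = &rows[i];

        if (row->nr_pages == 0)
            continue;  /* skip empty rows */

        if (page >= row->first_page && page <= row->last_page &&
            (page & row->page_mask) == (row->first_page & row->page_mask))
            return (int)i;
    }
    return -1;
}
```

    A driver that never asks the core to map a page to a csrow gains
    nothing from filling these fields.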
     
  • On systems based on chip select rows, all channels need to use memories
    with the same properties, otherwise the memories on channels A and B
    won't be recognized.

    However, that assumption does not hold for all types of memory
    controllers.

    Controllers for FB-DIMMs don't have such requirements.

    Also, modern Intel controllers seem to be capable of handling such
    differences.

    So, we need to stop storing the DIMM information in per-csrow
    data and store it in the right place instead.

    The first step is to move grain, mtype, dtype and edac_mode to the
    per-dimm struct.

    Reviewed-by: Aristeu Rozanski
    Reviewed-by: Borislav Petkov
    Acked-by: Chris Metcalf
    Cc: Doug Thompson
    Cc: Borislav Petkov
    Cc: Mark Gross
    Cc: Jason Uhlenkott
    Cc: Tim Small
    Cc: Ranganathan Desikan
    Cc: "Arvind R."
    Cc: Olof Johansson
    Cc: Egor Martovetsky
    Cc: Michal Marek
    Cc: Jiri Kosina
    Cc: Joe Perches
    Cc: Dmitry Eremin-Solenikov
    Cc: Benjamin Herrenschmidt
    Cc: Hitoshi Mitake
    Cc: Andrew Morton
    Cc: James Bottomley
    Cc: "Niklas Söderlund"
    Cc: Shaohui Xie
    Cc: Josh Boyer
    Cc: Mike Williams
    Cc: linuxppc-dev@lists.ozlabs.org
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • The way a DIMM is currently represented implies that it is
    linked into a per-csrow struct. However, some drivers don't see
    csrows, as they're hidden behind some chip, like the AMBs
    on FBDIMMs, for example.

    This forced drivers to fake^Wvirtualize a csrow struct, and to make
    a mess of the original csrow/channel concept.

    Move the DIMM labels into a per-DIMM struct, and add there
    the real location of the socket, in terms of csrow/channel.
    Later patches will modify the location to properly represent the
    memory architecture.

    All other drivers will use a per-csrow type of location.
    Some of those drivers will require a later conversion, as
    they also fake the csrows internally.

    TODO: While this patch doesn't change the existing behavior, on
    csrows-based memory controllers, a csrow/channel pair points to a memory
    rank. There's a known bug in the EDAC core that allows having different
    labels for the same DIMM, if it has more than one rank. A later patch
    is needed to merge the several ranks of a DIMM into the same dimm_info
    struct, in order to avoid having different labels for the same DIMM.

    edac_mc_alloc() will now contain a per-dimm initialization loop that
    will be changed by later patches in order to match other types of
    memory architectures.

    Reviewed-by: Aristeu Rozanski
    Reviewed-by: Borislav Petkov
    Cc: Doug Thompson
    Cc: Ranganathan Desikan
    Cc: "Arvind R."
    Cc: "Niklas Söderlund"
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
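
    The per-dimm initialization loop described above can be sketched
    roughly as follows: one DIMM entry per csrow/channel pair, each one
    recording its own location. The struct dimm_sketch and the function
    init_dimms() are illustrative stand-ins, not the code added to
    edac_mc_alloc().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a per-DIMM struct holding label and location. */
struct dimm_sketch {
    int csrow;       /* chip select row of the socket */
    int cschannel;   /* channel within that row */
    char label[32];  /* DIMM label, empty until a driver or user sets it */
};

/*
 * Fill 'dimms' (room for nr_csrows * nr_chans entries) with one entry
 * per csrow/channel pair and return the number of entries written.
 */
size_t init_dimms(struct dimm_sketch *dimms, int nr_csrows, int nr_chans)
{
    size_t n = 0;

    assert(dimms != NULL && nr_csrows >= 0 && nr_chans >= 0);

    for (int row = 0; row < nr_csrows; row++) {
        for (int chn = 0; chn < nr_chans; chn++) {
            struct dimm_sketch *dimm = &dimms[n++];

            dimm->csrow = row;
            dimm->cschannel = chn;
            dimm->label[0] = '\0';
        }
    }
    return n;
}
```

    In this model each rank of a multi-rank DIMM still gets its own
    entry, which is the limitation the TODO above refers to.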
     

29 Mar, 2012

1 commit

  • Pull EDAC fixes from Mauro Carvalho Chehab:
    "A series of EDAC driver fixes. It also has one core fix at the
    documentation, and a rename patch, fixing the name of the struct that
    contains the rank information."

    * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
    edac: rename channel_info to rank_info
    i5400_edac: Avoid calling pci_put_device() twice
    edac: i5100 ack error detection register after each read
    edac: i5100 fix erroneous define for M1Err
    edac: sb_edac: Fix a wrong value setting for the previous value
    edac: sb_edac: Fix a INTERLEAVE_MODE() misuse
    edac: sb_edac: Let the driver depend on PCI_MMCONFIG
    edac: Improve the comments to better describe the memory concepts
    edac/ppc4xx_edac: Fix compilation
    Fix sb_edac compilation with 32 bits kernels

    Linus Torvalds
     

22 Mar, 2012

3 commits

  • By the driver's design, the variable limit is meant to be compared
    with its previous value, so we should store the value of limit, not
    the value of tmp_mb, in the variable prev.

    Signed-off-by: Hui Wang
    Signed-off-by: Mauro Carvalho Chehab

    Hui Wang
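
    A simplified model of the comparison this fix restores: each limit
    is checked against the previous limit, so prev has to hold a limit
    and not the derived megabyte count. The function count_valid_rules()
    and its array argument are assumptions for illustration; only the
    names limit, prev and tmp_mb come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the leading rules whose limit is strictly above the previous one. */
size_t count_valid_rules(const uint64_t *limits, size_t nr_rules)
{
    uint64_t prev = 0;
    size_t n;

    assert(limits != NULL || nr_rules == 0);

    for (n = 0; n < nr_rules; n++) {
        uint64_t limit = limits[n];
        uint64_t tmp_mb = (limit + 1) >> 20;  /* size in MiB, for reporting only */

        (void)tmp_mb;

        if (limit <= prev)
            break;     /* no progress: the remaining rules are unused */

        prev = limit;  /* remember the limit itself, not tmp_mb */
    }
    return n;
}
```

    If prev were set to tmp_mb instead, a repeated limit such as
    0x7fffffff would still compare greater than the stored 2048 and the
    loop would not stop where it should.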
     
  • We can identify the DRAM interleave mode from the Dram Rule register
    rather than from the Dram Interleave list register.

    In this context, the reg passed to INTERLEAVE_MODE(reg) holds the Dram
    Interleave list register, from which the interleave mode can't be read,
    while the variable interleave_mode holds the mode obtained from the
    Dram Rule register. So use that variable instead of
    INTERLEAVE_MODE(reg) here.

    Signed-off-by: Hui Wang
    Signed-off-by: Mauro Carvalho Chehab

    Hui Wang
     
  • As reported by Josh Boyer:
    > drivers/edac/sb_edac.c: In function 'get_memory_error_data':
    > drivers/edac/sb_edac.c:861:2: warning: left shift count >= width of type
    > [enabled by default]
    >
    > ERROR: "__udivdi3" [drivers/edac/sb_edac.ko] undefined!
    > make[1]: *** [__modpost] Error 1
    > make: *** [modules] Error 2

    PS.: compile-tested only

    Reported-by: Josh Boyer
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
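
    The two diagnostics quoted above are typical of 32-bit builds: a
    shift applied to a 32-bit type, and a plain 64-bit division that
    the compiler turns into a call to __udivdi3. The sketch below is
    not the actual sb_edac fix; bitmask64() and div_pow2() are
    hypothetical helpers showing one common way to avoid both. Kernel
    code normally routes a general 64-bit division through div_u64()
    or do_div().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a mask covering bits lo..hi. The constant is 64 bits wide, so
 * the shift stays valid even where 'unsigned long' has only 32 bits.
 */
uint64_t bitmask64(unsigned int lo, unsigned int hi)
{
    assert(lo <= hi && hi < 64);

    uint64_t width = (uint64_t)hi - lo + 1;
    uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);

    return mask << lo;
}

/* Divide by a power of two with a shift, so no 64-bit divide is emitted. */
uint64_t div_pow2(uint64_t value, unsigned int log2_divisor)
{
    assert(log2_divisor < 64);
    return value >> log2_divisor;
}
```

    Using a shift only works when the divisor is known to be a power of
    two; other divisors need one of the kernel's 64-bit division
    helpers.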
     

19 Mar, 2012

1 commit

  • These const tables are currently marked __devinitdata, but
    Documentation/PCI/pci.txt says:

    "o The ID table array should be marked __devinitconst; this is done
    automatically if the table is declared with DEFINE_PCI_DEVICE_TABLE()."

    So use DEFINE_PCI_DEVICE_TABLE(x).

    Based on PaX and earlier work by Andi Kleen.

    Signed-off-by: Lionel Debroux
    Signed-off-by: Borislav Petkov

    Lionel Debroux
     

07 Jan, 2012

1 commit


21 Dec, 2011

1 commit

  • Several fields in struct cpuinfo_x86 were not defined for the
    !SMP case, likely to save space. However, those fields still
    have some meaning for UP, and keeping them allows some #ifdef
    removal from other files. The additional size of the UP kernel
    from this change is not significant enough to worry about
    keeping up the distinction:

       text    data     bss     dec     hex filename
    4737168  506459  972040 6215667  5ed7f3 vmlinux.o.before
    4737444  506459  972040 6215943  5ed907 vmlinux.o.after

    for a difference of 276 bytes for an example UP config.

    If someone wants those 276 bytes back badly then it should
    be implemented in a cleaner way.

    Signed-off-by: Kevin Winchester
    Cc: Steffen Persvold
    Link: http://lkml.kernel.org/r/1324428742-12498-1-git-send-email-kjwinchester@gmail.com
    Signed-off-by: Ingo Molnar

    Kevin Winchester
     

14 Dec, 2011

1 commit


01 Nov, 2011

3 commits

  • The edac driver for Sandy Bridge was found to be reporting "FPM"
    for edac_mode, which clearly doesn't make sense. It was found that
    sb_edac.c:get_dimm_config was reusing a variable for both mem_type
    and edac_type, and thus was overwriting the value after setting
    it correctly. This patch fixes that issue.

    Before the patch:
    /sys/devices/system/edac/mc/mc0/csrow0/edac_mode:FPM
    /sys/devices/system/edac/mc/mc0/csrow1/edac_mode:FPM
    /sys/devices/system/edac/mc/mc0/csrow2/edac_mode:FPM
    /sys/devices/system/edac/mc/mc0/csrow3/edac_mode:FPM

    After:
    /sys/devices/system/edac/mc/mc0/csrow0/edac_mode:S4ECD4ED
    /sys/devices/system/edac/mc/mc0/csrow1/edac_mode:S4ECD4ED
    /sys/devices/system/edac/mc/mc0/csrow2/edac_mode:S4ECD4ED
    /sys/devices/system/edac/mc/mc0/csrow3/edac_mode:S4ECD4ED

    Signed-off-by: Mark A. Grondona
    Signed-off-by: Mauro Carvalho Chehab

    Mark A. Grondona
     
  • Some changes to it were required by changeset cd90cc84c6bf0, which
    changed the glue with the MCE logic.

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • This driver is known to work on my and Tony's test environments,
    using software error injection, and a partial hardware/software
    error injection tool.

    There has been no broader testing yet to double check that the error
    decoding logic actually points to the right DIMM, so use it with care.
    More tests are required to be sure that the driver works on all
    the different types of memory configurations.

    If you're willing to risk using it, I suggest you enable EDAC debug
    on your test machines, as the debug logs help to track what's going on
    inside the driver.

    Please send me bug reports if you notice that the driver
    is misbehaving.

    Tested-by: Tony Luck
    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab