14 Nov, 2018

3 commits

  • commit 8f18973877204dc8ca4ce1004a5d28683b9a7086 upstream.

    The code "lchan = (lchan << 1) | ~lchan" for logical channel
    intermediate decoding is wrong. The wrong intermediate decoding
    result is {0xffffffff, 0xfffffffe}.

    Fix it by replacing '~' with '!'. The correct intermediate
    decoding result is {0x1, 0x2}.

    Signed-off-by: Qiuxu Zhuo
    Signed-off-by: Tony Luck
    Signed-off-by: Borislav Petkov
    CC: Aristeu Rozanski
    CC: Mauro Carvalho Chehab
    CC: linux-edac
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/20181009172025.18594-1-tony.luck@intel.com
    Signed-off-by: Greg Kroah-Hartman

    Qiuxu Zhuo
     
  • commit 432de7fd7630c84ad24f1c2acd1e3bb4ce3741ca upstream.

    The count of errors is picked up from bits 52:38 of the machine check
    bank status register. But this is the count of *corrected* errors. If an
    uncorrected error is being logged, the h/w sets this field to 0. Which
    means that when edac_mc_handle_error() is called, the EDAC core will
    carefully add zero to the appropriate uncorrected error counts.

    Signed-off-by: Tony Luck
    [ Massage commit message. ]
    Signed-off-by: Borislav Petkov
    Cc: stable@vger.kernel.org
    Cc: Aristeu Rozanski
    Cc: Mauro Carvalho Chehab
    Cc: Qiuxu Zhuo
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20180928213934.19890-1-tony.luck@intel.com
    Signed-off-by: Greg Kroah-Hartman

    Tony Luck
     
  • commit 8960de4a5ca7980ed1e19e7ca5a774d3b7a55c38 upstream.

    Add new device IDs for family 17h, models 10h-2fh.

    This is required by amd64_edac_mod in order to properly detect PCI
    device functions 0 and 6.

    Signed-off-by: Michael Jin
    Reviewed-by: Yazen Ghannam
    Cc:
    Link: http://lkml.kernel.org/r/20180816192840.31166-1-mikhail.jin@gmail.com
    Signed-off-by: Borislav Petkov
    Signed-off-by: Greg Kroah-Hartman

    Michael Jin
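
As a small illustration of the logical channel decoding fix in commit 8f18973877204dc8ca4ce1004a5d28683b9a7086 above, the sketch below contrasts the two operators on the inputs 0 and 1. The helper names are hypothetical and the input values are an assumption inferred from the two-element results quoted in the commit message; this is not the driver code itself.

```c
#include <stdint.h>

/* Hypothetical helper: one decoding step as written before the fix. */
static uint32_t lchan_step_buggy(uint32_t lchan)
{
    /* Bitwise NOT flips all 32 bits, so the high bits end up set. */
    return (lchan << 1) | ~lchan;
}

/* Hypothetical helper: the same step with logical NOT, as in the fix. */
static uint32_t lchan_step_fixed(uint32_t lchan)
{
    /* Logical NOT yields only 0 or 1, so just bit 0 is affected. */
    return (lchan << 1) | !lchan;
}
```

For inputs 0 and 1 the buggy step returns 0xffffffff and 0xfffffffe, while the fixed step returns 0x1 and 0x2, matching the values in the commit message.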
     

04 Oct, 2018

2 commits

  • [ Upstream commit 4708aa85d50cc6e962dfa8acf5ad4e0d290a21db ]

    Make sure to use put_device() to free the initialised struct device so
    that resources managed by driver core also gets released in the event of
    a registration failure.

    Signed-off-by: Johan Hovold
    Cc: Denis Kirjanov
    Cc: Mauro Carvalho Chehab
    Cc: linux-edac
    Fixes: 2d56b109e3a5 ("EDAC: Handle error path in edac_mc_sysfs_init() properly")
    Link: http://lkml.kernel.org/r/20180612124335.6420-1-johan@kernel.org
    Signed-off-by: Borislav Petkov
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
     
  • [ Upstream commit 6c974d4dfafe5e9ee754f2a6fba0eb1864f1649e ]

    Make sure to free and deregister the addrmatch and chancounts devices
    allocated during probe in all error paths. Also fix use-after-free in a
    probe error path and in the remove success path where the devices were
    being put before deregistration.

    Signed-off-by: Johan Hovold
    Cc: Mauro Carvalho Chehab
    Cc: linux-edac
    Fixes: 356f0a30860d ("i7core_edac: change the mem allocation scheme to make Documentation/kobject.txt happy")
    Link: http://lkml.kernel.org/r/20180612124335.6420-2-johan@kernel.org
    Signed-off-by: Borislav Petkov
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
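
The put_device() rule from the first commit in this group can be pictured with a user-space toy model. Everything here (struct toy_device, toy_put_device(), toy_device_add()) is a hypothetical stand-in, not the driver core API; the sketch only assumes the general pattern that an initialised, refcounted object must be dropped through its put routine so that its release callback frees everything attached to it.

```c
#include <stdlib.h>

/* Toy stand-in for a refcounted device; not the kernel's struct device. */
struct toy_device {
    int refcount;
    char *name;                              /* resource owned by the "core" */
    void (*release)(struct toy_device *dev);
};

static int toy_release_calls;                /* counts release() invocations */

static void toy_release(struct toy_device *dev)
{
    free(dev->name);                         /* core-managed resource */
    free(dev);
    toy_release_calls++;
}

static struct toy_device *toy_device_alloc(void)
{
    struct toy_device *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return NULL;
    dev->refcount = 1;                       /* "initialised" state */
    dev->release = toy_release;
    dev->name = malloc(8);                   /* attached during initialisation */
    return dev;
}

static void toy_put_device(struct toy_device *dev)
{
    if (--dev->refcount == 0)
        dev->release(dev);
}

/* Registration that always fails, to exercise the error path. */
static int toy_device_add(struct toy_device *dev)
{
    (void)dev;
    return -1;
}

static int toy_probe(void)
{
    struct toy_device *dev = toy_device_alloc();
    int err;

    if (!dev)
        return -12;
    err = toy_device_add(dev);
    if (err) {
        /* A bare free(dev) here would leak dev->name. */
        toy_put_device(dev);
        return err;
    }
    return 0;
}
```

In this model, calling free(dev) directly on the failure path would skip the release callback and leak the name buffer, which is the class of leak the commit describes.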
     

24 Aug, 2018

1 commit

  • commit b748f2de4b2f578599f46c6000683a8da755bf68 upstream.

    The edac_mem_types[] array misses a MEM_LRDDR4 entry, which leads to
    NULL pointer dereference when accessed via sysfs or such.

    Signed-off-by: Takashi Iwai
    Cc: Mauro Carvalho Chehab
    Cc: Yazen Ghannam
    Cc: linux-edac
    Cc:
    Link: http://lkml.kernel.org/r/20180810141426.8918-1-tiwai@suse.de
    Fixes: 1e8096bb2031 ("EDAC: Add LRDDR4 DRAM type")
    Signed-off-by: Borislav Petkov
    Signed-off-by: Greg Kroah-Hartman

    Takashi Iwai
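
A minimal sketch of why a missing table entry turns into a NULL pointer: with designated initialisers, any slot that is not named is zero-initialised. The enum and table below are hypothetical miniatures, not the real edac_mem_types[]. The upstream fix adds the missing entry; the guarded lookup shown here is only an extra illustration and is not part of that commit.

```c
#include <stddef.h>

/* Hypothetical miniature of a memory-type enum. */
enum toy_mem_type {
    TOY_MEM_DDR4,
    TOY_MEM_RDDR4,
    TOY_MEM_LRDDR4,
    TOY_MEM_NTYPES
};

/* LRDDR4 is deliberately left out, so its slot is implicitly NULL. */
static const char * const toy_mem_types[TOY_MEM_NTYPES] = {
    [TOY_MEM_DDR4]  = "DDR4",
    [TOY_MEM_RDDR4] = "RDDR4",
};

/* Defensive lookup (an addition for illustration, not the upstream fix). */
static const char *toy_mem_type_name(enum toy_mem_type type)
{
    if ((unsigned int)type >= TOY_MEM_NTYPES || !toy_mem_types[type])
        return "Unknown";
    return toy_mem_types[type];
}
```

Passing toy_mem_types[TOY_MEM_LRDDR4] straight to a string formatter would dereference NULL, which mirrors the sysfs crash described above.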
     

03 Aug, 2018

1 commit

  • [ Upstream commit 9ef20753e044f7468c4113e5aecd785419b0b3cc ]

    The kbuild test robot reported the following warning:

    drivers/edac/altera_edac.c: In function 'ocram_free_mem':
    drivers/edac/altera_edac.c:1410:42: warning: cast from pointer to integer
    of different size [-Wpointer-to-int-cast]
    gen_pool_free((struct gen_pool *)other, (u32)p, size);
    ^

    After adding support for ARM64 architectures, the unsigned long
    parameter is 64 bits and causes a build warning on 64-bit configs. Fix
    by casting to the correct size (unsigned long) instead of u32.

    Reported-by: kbuild test robot
    Signed-off-by: Thor Thayer
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-edac
    Fixes: c3eea1942a16 ("EDAC, altera: Add Altera L2 cache and OCRAM support")
    Link: http://lkml.kernel.org/r/1526317441-4996-1-git-send-email-thor.thayer@linux.intel.com
    Signed-off-by: Borislav Petkov
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Thor Thayer
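
To make the truncation concrete, the sketch below pushes a 64-bit address through a 32-bit cast. The function names are hypothetical, and uint64_t stands in for the kernel's unsigned long so that the example behaves the same on any host; in the kernel, unsigned long is pointer-sized on the supported architectures.

```c
#include <stdint.h>

/* What a (u32) cast does to a 64-bit address: the upper half is dropped. */
static uint64_t toy_addr_via_u32(uint64_t addr)
{
    return (uint32_t)addr;
}

/* A cast to a type as wide as the address keeps every bit. */
static uint64_t toy_addr_via_u64(uint64_t addr)
{
    return (uint64_t)addr;
}
```

With the address 0x0000000123456000, the 32-bit path yields 0x23456000, so a free routine would be handed the wrong address.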
     

19 Apr, 2018

1 commit

  • commit 68627a697c195937672ce07683094c72b1174786 upstream.

    Currently, bank 4 is reserved on Fam17h, so we chose not to initialize
    bank 4 in the smca_banks array. This means that when we check if a bank
    is initialized, like during boot or resume, we will see that bank 4 is
    not initialized and try to initialize it.

    This will cause a call trace, when resuming from suspend, due to
    rdmsr_*on_cpu() calls in the init path. The rdmsr_*on_cpu() calls issue
    an IPI but we're running with interrupts disabled. This triggers:

    WARNING: CPU: 0 PID: 11523 at kernel/smp.c:291 smp_call_function_single+0xdc/0xe0
    ...

    Reserved banks will be read-as-zero, so their MCA_IPID register will be
    zero. So, like the smca_banks array, the threshold_banks array will not
    have an entry for a reserved bank since all its MCA_MISC* registers will
    be zero.

    Enumerate a "Reserved" bank type that matches on a HWID_MCATYPE of 0,0.

    Use the "Reserved" type when checking if a bank is reserved. It's
    possible that other bank numbers may be reserved on future systems.

    Don't try to find the block address on reserved banks.

    Signed-off-by: Yazen Ghannam
    Signed-off-by: Borislav Petkov
    Cc: # 4.14.x
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20180221101900.10326-7-bp@alien8.de
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Yazen Ghannam
     

12 Apr, 2018

1 commit

  • [ Upstream commit 68fa24f9121c04ef146b5158f538c8b32f285be5 ]

    We should not call edac_mc_del_mc() if a corresponding call to
    edac_mc_add_mc() has not been performed yet.

    So here, we should go to err instead of err2 to branch at the right
    place of the error handling path.

    Signed-off-by: Christophe JAILLET
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20180107205400.14068-1-christophe.jaillet@wanadoo.fr
    Signed-off-by: Borislav Petkov
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Christophe JAILLET
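
The label choice matters because goto-based unwinding runs every cleanup step below the label it jumps to. The following toy probe function is a sketch with hypothetical names (toy_add_mc(), toy_del_mc(), toy_mc_probe()); it only models the ordering issue, not the actual driver.

```c
static int toy_add_calls;
static int toy_del_calls;

static int toy_add_mc(void)
{
    toy_add_calls++;
    return 0;
}

static void toy_del_mc(void)
{
    toy_del_calls++;
}

static int toy_mc_probe(int fail_before_add, int fail_after_add)
{
    int ret;

    if (fail_before_add) {
        ret = -1;
        goto err;            /* nothing registered yet: skip toy_del_mc() */
    }

    ret = toy_add_mc();
    if (ret)
        goto err;

    if (fail_after_add) {
        ret = -2;
        goto err2;           /* registered: must be undone */
    }
    return 0;

err2:
    toy_del_mc();
err:
    return ret;
}
```

Jumping to err2 from the first failure would call the delete routine for something that was never added, which is the mistake the commit corrects.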
     

09 Mar, 2018

1 commit

  • commit bf8486709ac7fad99e4040dea73fe466c57a4ae1 upstream.

    Commit

    3286d3eb906c ("EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4")

    decreased NUM_CHANNELS from 8 to 4, but this is not enough for Knights
    Landing which supports up to 6 channels.

    This caused out-of-bounds writes to the pvt->mirror_mode and pvt->tolm
    variables, which don't play a critical role on the KNL code path, so the
    memory corruption wasn't causing any visible driver failures.

    The easiest way of fixing it is to change NUM_CHANNELS to 6. Do that.

    An alternative solution would be to restructure the KNL part of the
    driver to 2MC/3channel representation.

    Reported-by: Dan Carpenter
    Signed-off-by: Anna Karbownik
    Cc: Mauro Carvalho Chehab
    Cc: Tony Luck
    Cc: jim.m.snow@intel.com
    Cc: krzysztof.paliswiat@intel.com
    Cc: lukasz.odzioba@intel.com
    Cc: qiuxu.zhuo@intel.com
    Cc: linux-edac
    Cc:
    Fixes: 3286d3eb906c ("EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4")
    Link: http://lkml.kernel.org/r/1519312693-4789-1-git-send-email-anna.karbownik@intel.com
    [ Massage commit message. ]
    Signed-off-by: Borislav Petkov
    Signed-off-by: Greg Kroah-Hartman

    Anna Karbownik
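
One way to see why writes past a 4-entry channel array can land in neighbouring fields is to compute where channel index 4 and 5 would fall. The struct below is a hypothetical toy layout (plain int members placed directly after the array), assumed only for illustration; the real private structure is larger and laid out differently. The code computes offsets and never performs an out-of-bounds access.

```c
#include <stddef.h>

#define TOY_NUM_CHANNELS_OLD 4   /* array size before the fix */

/* Toy layout: two fields sit directly behind the channel array. */
struct toy_pvt_old {
    int channel[TOY_NUM_CHANNELS_OLD];
    int mirror_mode;
    int tolm;
};

/* Byte offset that an access to channel[idx] would target. */
static size_t toy_channel_offset(size_t idx)
{
    return offsetof(struct toy_pvt_old, channel) + idx * sizeof(int);
}
```

In this toy layout, indices 4 and 5 map onto mirror_mode and tolm, so a six-channel part silently overwrites them unless the array is sized for six entries.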
     

22 Feb, 2018

1 commit

  • commit b399151cb48db30ad1e0e93dd40d68c6d007b637 upstream.

    x86_mask is a confusing name which is hard to associate with the
    processor's stepping.

    Additionally, correct an indent issue in lib/cpu.c.

    Signed-off-by: Jia Zhang
    [ Updated it to more recent kernels. ]
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bp@alien8.de
    Cc: tony.luck@intel.com
    Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Jia Zhang
     

17 Feb, 2018

1 commit

  • commit 544e92581a2ac44607d7cc602c6b54d18656f56d upstream.

    Fix an uninitialized variable warning in the Octeon EDAC driver, as seen
    in MIPS cavium_octeon_defconfig builds since v4.14 with Codescape GNU
    Tools 2016.05-03:

    drivers/edac/octeon_edac-lmc.c In function ‘octeon_lmc_edac_poll_o2’:
    drivers/edac/octeon_edac-lmc.c:87:24: warning: ‘((long unsigned int*)&int_reg)[1]’ may \
    be used uninitialized in this function [-Wmaybe-uninitialized]
    if (int_reg.s.sec_err || int_reg.s.ded_err) {
    ^

    Initialise the whole int_reg variable to zero before the conditional
    assignments in the error injection case.

    Signed-off-by: James Hogan
    Acked-by: David Daney
    Cc: linux-edac
    Cc: linux-mips@linux-mips.org
    Fixes: 1bc021e81565 ("EDAC: Octeon: Add error injection support")
    Link: http://lkml.kernel.org/r/20171113161206.20990-1-james.hogan@mips.com
    Signed-off-by: Borislav Petkov
    Signed-off-by: Greg Kroah-Hartman

    James Hogan
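
The pattern behind this warning can be sketched as follows: a register union is only partially written on the injection path, then read in full. The union layout, field widths and function name are hypothetical simplifications, not the Octeon definitions.

```c
#include <stdint.h>

/* Toy register: a 64-bit value overlaid with two error bit-fields. */
union toy_int_reg {
    uint64_t u64;
    struct {
        unsigned int sec_err : 4;
        unsigned int ded_err : 4;
    } s;
};

/* Returns nonzero if either error field is set. */
static int toy_poll(int inject, int inject_sec, int inject_ded,
                    uint64_t hw_value)
{
    union toy_int_reg int_reg;

    int_reg.u64 = 0;                 /* the fix: define every bit up front */
    if (inject) {
        /* Each field is assigned only conditionally on this path. */
        if (inject_sec)
            int_reg.s.sec_err = 1;
        if (inject_ded)
            int_reg.s.ded_err = 1;
    } else {
        int_reg.u64 = hw_value;      /* normal path: whole register read */
    }
    return int_reg.s.sec_err || int_reg.s.ded_err;
}
```

Without the zeroing line, the injection path would test fields that were never written whenever a flag is clear, which is what the compiler flags as maybe-uninitialized.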
     

10 Dec, 2017

1 commit

  • [ Upstream commit a8e9b186f153a44690ad0363a56716e7077ad28c ]

    Add missing break statement in order to prevent the code from falling
    through.

    Signed-off-by: Gustavo A. R. Silva
    Cc: Qiuxu Zhuo
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20171016174029.GA19757@embeddedor.com
    Signed-off-by: Borislav Petkov
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Gustavo A. R. Silva
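
A short sketch of the effect of a missing break, using hypothetical case values and return codes rather than the driver's own switch:

```c
/* Buggy variant: case 0 falls into case 1 and its result is overwritten. */
static int toy_width_buggy(int code)
{
    int width = 0;

    switch (code) {
    case 0:
        width = 4;
        /* missing break: execution continues into case 1 */
    case 1:
        width = 8;
        break;
    default:
        width = -1;
    }
    return width;
}

/* Fixed variant: each case ends with its own break. */
static int toy_width_fixed(int code)
{
    int width = 0;

    switch (code) {
    case 0:
        width = 4;
        break;
    case 1:
        width = 8;
        break;
    default:
        width = -1;
    }
    return width;
}
```

For code 0 the buggy variant returns 8 instead of 4; the other inputs behave identically in both variants.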
     

21 Nov, 2017

1 commit

  • commit 15cc3ae001873845b5d842e212478a6570c7d938 upstream.

    Yi Zhang reported the following failure on a 2-socket Haswell (E5-2603v3)
    server (DELL PowerEdge 730xd):

    EDAC sbridge: Some needed devices are missing
    EDAC MC: Removed device 0 for sb_edac.c Haswell SrcID#0_Ha#0: DEV 0000:7f:12.0
    EDAC MC: Removed device 1 for sb_edac.c Haswell SrcID#1_Ha#0: DEV 0000:ff:12.0
    EDAC sbridge: Couldn't find mci handler
    EDAC sbridge: Couldn't find mci handler
    EDAC sbridge: Failed to register device with error -19.

    The refactored sb_edac driver creates the IMC1 (the 2nd memory
    controller) if any IMC1 device is present. In this case only
    HA1_TA of IMC1 was present, but the driver expected to find
    HA1/HA1_TM/HA1_TAD[0-3] devices too, leading to the above failure.

    The document [1] says the 'E5-2603 v3' CPU has 4 memory channels max. Yi
    Zhang inserted one DIMM per channel for each CPU, and did random error
    address injection test with this patch:

    4024 addresses fell in TOLM hole area
    12715 addresses fell in CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
    12774 addresses fell in CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
    12798 addresses fell in CPU_SrcID#0_Ha#0_Chan#2_DIMM#0
    12913 addresses fell in CPU_SrcID#0_Ha#0_Chan#3_DIMM#0
    12674 addresses fell in CPU_SrcID#1_Ha#0_Chan#0_DIMM#0
    12686 addresses fell in CPU_SrcID#1_Ha#0_Chan#1_DIMM#0
    12882 addresses fell in CPU_SrcID#1_Ha#0_Chan#2_DIMM#0
    12934 addresses fell in CPU_SrcID#1_Ha#0_Chan#3_DIMM#0
    106400 addresses were injected totally.

    The test result shows that all the 4 channels belong to IMC0 per CPU, so
    the server really only has one IMC per CPU.

    In the 1st page of chapter 2 in datasheet [2], it also says 'E5-2600 v3'
    implements either one or two IMCs. For CPUs with one IMC, IMC1 is not
    used and should be ignored.

    Thus, do not create a second memory controller if the key HA1 is absent.

    [1] http://ark.intel.com/products/83349/Intel-Xeon-Processor-E5-2603-v3-15M-Cache-1_60-GHz
    [2] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf

    Reported-and-tested-by: Yi Zhang
    Signed-off-by: Qiuxu Zhuo
    Cc: Tony Luck
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170913104214.7325-1-qiuxu.zhuo@intel.com
    [ Massage commit message. ]
    Signed-off-by: Borislav Petkov
    Signed-off-by: Greg Kroah-Hartman

    Qiuxu Zhuo
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

21 Aug, 2017

3 commits


20 Aug, 2017

1 commit

  • Make these const as they are only stored in the type field of a device
    structure, which is const.

    Done using Coccinelle.

    Signed-off-by: Bhumika Goyal
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/1503130946-2854-2-git-send-email-bhumirks@gmail.com
    Signed-off-by: Borislav Petkov

    Bhumika Goyal
     

19 Aug, 2017

5 commits

  • Properly handle hidden state of P2SB PCI device (DEV:D, FUN:0) for
    Apollo Lake.

    Signed-off-by: Qiuxu Zhuo
    Cc: Tony Luck
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170814154905.21707-1-qiuxu.zhuo@intel.com
    Signed-off-by: Borislav Petkov

    Qiuxu Zhuo
     
  • On Denverton server, the P2SB PCI device (DEV:1F, FUN:1) is used by multiple
    device drivers.

    If it's hidden by some device driver (e.g. with the i801 I2C driver,
    the commit

    9424693035a5 ("i2c: i801: Create iTCO device on newer Intel PCHs")

    unconditionally hid the P2SB PCI device wrongly) it will make the
    pnd2_edac driver read out an invalid BAR value of 0xffffffff and then
    fail on ioremap().

    Therefore, store the presence state of P2SB PCI device before unhiding
    it for reading BAR and restore the presence state after reading BAR.

    Signed-off-by: Qiuxu Zhuo
    Cc: Tony Luck
    Cc: linux-edac
    Cc: linux-i2c@vger.kernel.org
    Link: http://lkml.kernel.org/r/20170814154845.21663-1-qiuxu.zhuo@intel.com
    Signed-off-by: Borislav Petkov

    Qiuxu Zhuo
     
  • Bit[0] of BAR is always zero. Bit[2:1] and bit[3] of BAR carry the
    'type' and 'prefetchable' information, respectively. Therefore,
    mask the lower four bits to retrieve the actual base address of a BAR.

    Signed-off-by: Qiuxu Zhuo
    Cc: Tony Luck
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170814154813.21619-1-qiuxu.zhuo@intel.com
    Signed-off-by: Borislav Petkov

    Qiuxu Zhuo
     
  • Return the proper error value if ioremap() fails (and not 0).

    Signed-off-by: Christophe JAILLET
    Cc: David Daney
    Cc: Ralf Baechle
    Cc: linux-edac
    Cc: linux-mips@linux-mips.org
    Link: http://lkml.kernel.org/r/20170816045821.14165-1-christophe.jaillet@wanadoo.fr
    [ Massage commit message, remove newline. ]
    Signed-off-by: Borislav Petkov

    Christophe JAILLET
     
  • Return the proper error value if devm_ioremap() fails (and not 0).

    Signed-off-by: Christophe JAILLET
    Acked-by: Thor Thayer
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170816050506.14541-1-christophe.jaillet@wanadoo.fr
    [ Massage commit message. ]
    Signed-off-by: Borislav Petkov

    Christophe JAILLET
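
The BAR masking commit in this group comes down to a few bit operations. The sketch below follows the bit positions given in that commit message (bit 0 zero, bits 2:1 type, bit 3 prefetchable) for a 32-bit memory BAR; the helper names and the sample value are hypothetical.

```c
#include <stdint.h>

/* Base address: clear the four low flag bits. */
static uint32_t toy_bar_base(uint32_t bar)
{
    return bar & ~0xfu;
}

/* Bits 2:1 hold the BAR type. */
static unsigned int toy_bar_type(uint32_t bar)
{
    return (bar >> 1) & 0x3;
}

/* Bit 3 is the prefetchable flag. */
static unsigned int toy_bar_prefetchable(uint32_t bar)
{
    return (bar >> 3) & 0x1;
}
```

For a raw value of 0xfed1000c, the base is 0xfed10000, the type field is 2, and the prefetchable flag is 1. Using the raw value as an address without masking would be off by 0xc.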
     

04 Aug, 2017

1 commit

  • I've been waiting a long time for the generic sideband driver to
    appear. Patience has run out, so include the minimum here to
    just read registers.

    Signed-off-by: Tony Luck
    Cc: Aristeu Rozanski
    Cc: Mauro Carvalho Chehab
    Cc: Patrick Geary
    Cc: Qiuxu Zhuo
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170803210536.5662-1-tony.luck@intel.com
    Signed-off-by: Borislav Petkov

    Tony Luck
     

02 Aug, 2017

1 commit

  • Basically, there are full memory mirroring and address range partial
    memory mirroring (supported by Haswell EX and Broadwell EX) modes.

    a) In full memory mirroring, the memory behind each memory controller
    is mirrored, i.e. the memory is split into two identical mirrors
    (primary and secondary), half of the memory is reserved for redundancy.

    b) In address range partial memory mirroring, the memory size (range)
    of primary and secondary behind each memory controller can be user
    defined by the TAD0 register. The rest of memory ranges defined by
    TAD1/TAD2/... in that memory controller are non-mirrored.

    For more detail on memory mirroring, see the following link written by Tony Luck:

    https://01.org/lkp/blogs/tonyluck/2016/address-range-partial-memory-mirroring-linux

    Currently the sb_edac driver only supports address decoding in full
    memory mirroring and non-mirroring modes. In address range partial
    memory mirroring mode, it may fail to decode an address that falls in a
    non-mirroring area. One such failure log:

    mce: Uncorrected hardware memory error in user-access at 566d53a400
    Memory failure: 0x566d53a: Killing einj_mem_uc:4647 due to hardware memory corruption
    Memory failure: 0x566d53a: recovery action for dirty LRU page: Recovered
    mce: [Hardware Error]: Machine check events logged
    EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
    EDAC sbridge MC1: CPU 48: Machine Check Event: 0 Bank 7: ec00000000010090
    EDAC sbridge MC1: TSC 4b914aa5a99dab
    EDAC sbridge MC1: ADDR 566d53a400
    EDAC sbridge MC1: MISC 1443a0c86
    EDAC sbridge MC1: PROCESSOR 0:406f1 TIME 1499712764 SOCKET 2 APIC 80
    EDAC MC1: 0 UE Can't discover the memory rank for ch addr 0x7fb54e900 on any memory ( page:0x0 offset:0x0 grain:32)
    mce: [Hardware Error]: Machine check events logged

    Therefore, classify memory mirroring modes and make the address decoding
    in address range partial memory mode correct.

    Signed-off-by: Qiuxu Zhuo
    Cc: Tony Luck
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170730180651.30060-1-qiuxu.zhuo@intel.com
    Signed-off-by: Borislav Petkov

    Qiuxu Zhuo
     

19 Jul, 2017

1 commit

  • Now that we have a custom printf format specifier, convert users of
    full_name to use %pOF instead. This is preparation to remove storing of
    the full path string for each node.

    Signed-off-by: Rob Herring
    Cc: devicetree@vger.kernel.org
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170718214339.7774-19-robh@kernel.org
    Signed-off-by: Borislav Petkov

    Rob Herring
     

17 Jul, 2017

3 commits

  • It is a write-only variable so get rid of it.

    Signed-off-by: Borislav Petkov
    Acked-by: Robert Richter
    Acked-by: Michal Simek
    Acked-by: Thor Thayer
    Acked-by: Tony Luck
    Cc: Mark Gross
    Cc: Tim Small
    Cc: Ranganathan Desikan
    Cc: "Arvind R."
    Cc: Jason Baron
    Cc: "Sören Brinkmann"
    Cc: Ralf Baechle
    Cc: David Daney
    Cc: Loc Ho
    Cc: linux-edac@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-mips@linux-mips.org

    Borislav Petkov
     
  • attribute_groups are not supposed to change at runtime. All functions
    working with attribute_groups take a const attribute_group. So mark
    the non-const structs as const.

    Signed-off-by: Arvind Yadav
    CC: linux-edac@vger.kernel.org
    Link: http://lkml.kernel.org/r/776cb8265509054abd01b0b551624cc0da3b88e7.1499078335.git.arvind.yadav.cs@gmail.com
    Signed-off-by: Borislav Petkov

    Arvind Yadav
     
  • Using the homegrown amd_get_nb_id() to find a node ID on AMD was fine
    while the L3 to node mapping was 1:1. And Zen topology broke this. So
    let's start slowly moving away from it and use the topology interfaces
    instead.

    Signed-off-by: Yazen Ghannam
    Cc: linux-edac
    Cc: x86-ml
    Link: http://lkml.kernel.org/r/1490041614-90057-2-git-send-email-Yazen.Ghannam@amd.com
    [ Massage commit message. ]
    Signed-off-by: Borislav Petkov

    Yazen Ghannam
     

29 Jun, 2017

2 commits

  • Non-existent or empty DIMM slots result in error return from
    RD_REGP(). But we shouldn't give up on failure.

    So long as we find at least one DIMM we can continue.

    Signed-off-by: Tony Luck
    Cc: Qiuxu Zhuo
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170628234407.21521-1-tony.luck@intel.com
    Signed-off-by: Borislav Petkov

    Tony Luck
     
  • In the i5000 and i5400 drivers, the NRECMEMB register is defined as a
    16-bit value, which results in wrong shifts in the code, as reported by
    sparse.

    In the datasheets ([1], section 3.9.22.20 and [2], section 3.9.22.21),
    this register is a 32-bit register. A u32 value for the register fixes
    the wrong-shift warnings and matches the datasheet.

    Also fix the mask used to access the CAS bits [27:16] in the i5000 driver.

    [1]: https://www.intel.com/content/dam/doc/datasheet/5000p-5000v-5000z-chipset-memory-controller-hub-datasheet.pdf
    [2]: https://www.intel.se/content/dam/doc/datasheet/5400-chipset-memory-controller-hub-datasheet.pdf

    Signed-off-by: Jérémy Lefaure
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170629005729.8478-1-jeremy.lefaure@lse.epita.fr
    Signed-off-by: Borislav Petkov

    Jérémy Lefaure
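
The NRECMEMB problem can be reproduced in miniature: a field in bits 27:16 cannot survive storage in a 16-bit variable. The function names and the sample register value below are hypothetical; only the bit range comes from the commit message.

```c
#include <stdint.h>

/* Extract bits 27:16 from a 32-bit register value (12-bit field). */
static unsigned int toy_cas_from_u32(uint32_t nrecmemb)
{
    return (nrecmemb >> 16) & 0xfff;
}

/*
 * Same expression on a 16-bit value: the argument is promoted to int
 * before the shift, but the upper bits were already lost when the
 * register was stored in 16 bits, so the result is always 0.
 */
static unsigned int toy_cas_from_u16(uint16_t nrecmemb)
{
    return (nrecmemb >> 16) & 0xfff;
}
```

With a register value of 0x0abc1234, the 32-bit version returns 0xabc, while the 16-bit version sees only 0x1234 and returns 0.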
     

26 Jun, 2017

1 commit

  • The function sbi_send() is local to just pnd2_edac.c and does not need
    to be in global scope, so make it static.

    Signed-off-by: Colin Ian King
    Cc: Tony Luck
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170623084855.9197-1-colin.king@canonical.com
    Signed-off-by: Borislav Petkov

    Colin Ian King
     

23 Jun, 2017

1 commit

  • Add a code comment to make it clear that the fall-through is intentional,
    and OR ret with its previous value to avoid overwriting it so that
    callers can check the correct return value.

    Signed-off-by: Gustavo A. R. Silva
    Cc: Qiuxu Zhuo
    Cc: Tony Luck
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170622220535.GA4896@embeddedgus
    [ Massage a bit. ]
    Signed-off-by: Borislav Petkov

    Gustavo A. R. Silva
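
A minimal sketch of the two ideas in this commit, an annotated intentional fall-through and accumulating the return value with OR. The step function, case values and parameters are hypothetical and do not correspond to the driver's register reads.

```c
/* Hypothetical step that either succeeds (0) or fails (-1). */
static int toy_step(int fail)
{
    return fail ? -1 : 0;
}

static int toy_get_regs(int type, int fail_first, int fail_second)
{
    int ret = 0;

    switch (type) {
    case 2:
        ret = toy_step(fail_first);
        /* fall through */
    case 1:
        /* OR keeps an earlier failure instead of overwriting it. */
        ret |= toy_step(fail_second);
        break;
    default:
        ret = -1;
    }
    return ret;
}
```

With a plain assignment in case 1, a failure in the first step followed by a success in the second would be reported as success; with the OR, toy_get_regs(2, 1, 0) stays nonzero.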
     

14 Jun, 2017

2 commits

  • Use of_address_to_resource() and resource_size() instead of manually
    parsing the "reg" property from the "memory" node(s).

    Signed-off-by: Chris Packham
    Tested-by: Thor Thayer
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170606235500.22772-3-chris.packham@alliedtelesis.co.nz
    Signed-off-by: Borislav Petkov

    Chris Packham
     
  • Xiaolong Ye reported the following failure on Broadwell D server:

    EDAC sbridge: Some needed devices are missing
    EDAC MC: Removed device 0 for sbridge_edac.c Broadwell SrcID#0_Ha#0: DEV 0000:ff:12.0
    EDAC sbridge: Couldn't find mci handler
    EDAC sbridge: Failed to register device with error -19.

    Broadwell D (only IMC0 per socket) and Broadwell X (IMC0 and IMC1 per
    socket) use the same PCI device IDs for IMC0 per socket, then they
    share pci_dev_descr_broadwell_table (n_imcs_per_sock=2). In this case,
    Broadwell D wrongly creates the nonexistent SOCK EDAC memory controller
    and reports above error messages, since it has no IMC1 per socket.

    Avoid creating the nonexistent SOCK memory controller.

    Reported-and-tested-by: Xiaolong Ye
    Signed-off-by: Qiuxu Zhuo
    Cc: Tony Luck
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170608113351.25323-1-qiuxu.zhuo@intel.com
    [ Massage. ]
    Signed-off-by: Borislav Petkov

    Qiuxu Zhuo
     

13 Jun, 2017

1 commit


09 Jun, 2017

1 commit

  • edac_op_state is a module parameter which affects the behaviour of
    the driver probe which can potentially be invoked as soon as the
    platform driver registration happens. Because of this we need to
    ensure that we sanity check the module parameter before calling
    platform_register_drivers().

    Signed-off-by: Chris Packham
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170607215530.8604-1-chris.packham@alliedtelesis.co.nz
    Signed-off-by: Borislav Petkov

    Chris Packham
     

02 Jun, 2017

1 commit

  • Compare the number of debugfs entries created by
    thunderx_create_debugfs_nodes() with the requested number of entries to
    properly determine whether to print a warning.

    Signed-off-by: Vadim Lomovtsev
    Cc: linux-edac
    Cc: linux-mips@linux-mips.org
    Link: http://lkml.kernel.org/r/20170531155157.93583-1-stemerkhanov@cavium.com
    Signed-off-by: Sergey Temerkhanov
    Signed-off-by: Borislav Petkov

    Vadim Lomovtsev
     

30 May, 2017

1 commit

  • Check the return status of platform_driver_register() in
    mv64x60_edac_init(). Only output messages and initialise the
    edac_op_state if the registration is successful.

    Signed-off-by: Chris Packham
    Cc: linux-edac
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: http://lkml.kernel.org/r/20170529212142.25572-2-chris.packham@alliedtelesis.co.nz
    Signed-off-by: Borislav Petkov

    Chris Packham