14 Feb, 2017

1 commit


28 Jan, 2017

2 commits

  • Match one of the devices in amd64_cpuids[] before loading the module.
    This is an additional sanity check against users trying to load
    amd64_edac_mod on unsupported systems.

    Signed-off-by: Yazen Ghannam
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/1485537863-2707-9-git-send-email-Yazen.Ghannam@amd.com
    [ Get rid of err_ret label, make it a bit more readable this way. ]
    Signed-off-by: Borislav Petkov

    Yazen Ghannam
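
The check described above can be pictured as a table lookup that runs before any other init work. What follows is only a minimal user-space sketch: `struct cpu_id`, `VENDOR_AMD`, `supported_cpuids[]`, `match_cpu()` and `module_init_sketch()` are hypothetical stand-ins, and the families listed are illustrative, not a copy of amd64_cpuids[].

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a CPU ID table entry. */
struct cpu_id {
    unsigned int vendor;
    unsigned int family;
};

enum { VENDOR_AMD = 2 }; /* illustrative value, not the kernel's constant */

/* Illustrative table; an entry with family 0 terminates it. */
static const struct cpu_id supported_cpuids[] = {
    { VENDOR_AMD, 0x0f },
    { VENDOR_AMD, 0x10 },
    { VENDOR_AMD, 0x15 },
    { VENDOR_AMD, 0x16 },
    { VENDOR_AMD, 0x17 },
    { 0, 0 },
};

/* Return the matching table entry, or NULL if the CPU is not listed. */
static const struct cpu_id *match_cpu(const struct cpu_id *table,
                                      unsigned int vendor,
                                      unsigned int family)
{
    assert(table != NULL);
    for (const struct cpu_id *id = table; id->family != 0; id++)
        if (id->vendor == vendor && id->family == family)
            return id;
    return NULL;
}

/* Refuse to initialise on a CPU that is not in the table. */
static int module_init_sketch(unsigned int vendor, unsigned int family)
{
    if (!match_cpu(supported_cpuids, vendor, family))
        return -ENODEV;
    /* ... the rest of the driver setup would follow here ... */
    return 0;
}
```

Doing the lookup first means an unsupported system gets a clean -ENODEV before any hardware is touched.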
     
  • amd64_{debug,notice} don't have any users, so remove them.

    Signed-off-by: Yazen Ghannam
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/1485537863-2707-6-git-send-email-Yazen.Ghannam@amd.com
    Signed-off-by: Borislav Petkov

    Yazen Ghannam
     

15 Dec, 2016

1 commit


01 Dec, 2016

1 commit


30 Nov, 2016

2 commits

  • UMC errors are decoded differently from bus errors, so define a new
    function for them. We also need a way to determine the UMC channel,
    since there is no guaranteed fixed relation between channel and MCA
    bank.

    Signed-off-by: Yazen Ghannam
    Cc: Aravind Gopalakrishnan
    Cc: linux-edac
    Cc: x86-ml
    Link: http://lkml.kernel.org/r/1480359593-80369-1-git-send-email-Yazen.Ghannam@amd.com
    [ Fold in decode_synd_reg(), simplify. ]
    Signed-off-by: Borislav Petkov

    Yazen Ghannam
     
  • Read a few more UMC registers and provide debug output in order to be as
    similar as possible to older AMD systems.

    Signed-off-by: Yazen Ghannam
    Cc: Aravind Gopalakrishnan
    Cc: linux-edac
    Cc: x86-ml
    Link: http://lkml.kernel.org/r/1480344621-14966-1-git-send-email-Yazen.Ghannam@amd.com
    [ Remove unneeded K8 check and comments, fixup others. ]
    Signed-off-by: Borislav Petkov

    Yazen Ghannam
     

29 Nov, 2016

3 commits

  • Fam17h has new register offsets and fields for setting up the DRAM
    scrubber, so add support for them.

    Signed-off-by: Yazen Ghannam
    Cc: Aravind Gopalakrishnan
    Cc: linux-edac
    Cc: x86-ml
    Link: http://lkml.kernel.org/r/1479423463-8536-17-git-send-email-Yazen.Ghannam@amd.com
    Signed-off-by: Borislav Petkov

    Yazen Ghannam
     
  • Fam17h has a different set of registers and bitfields. Most of these
    registers are read through SMN (System Management Network) rather
    than PCI config space. Also, the derivation of various values is now
    different.

    Update amd64_edac to read the appropriate registers and extract the
    correct values for Fam17h.

    Signed-off-by: Yazen Ghannam
    Cc: Aravind Gopalakrishnan
    Cc: linux-edac
    Cc: x86-ml
    Link: http://lkml.kernel.org/r/1479423463-8536-12-git-send-email-Yazen.Ghannam@amd.com
    [ Save us the indentation level in read_mc_regs(), add defines ]
    Signed-off-by: Borislav Petkov

    Yazen Ghannam
     
  • Fam17h needs PCI device functions 0 and 6 instead of 1 and 2 as on older
    systems. Update struct amd64_pvt to hold the new functions and reserve
    them if on Fam17h.

    Also, allocate an array of UMC structs within our newly allocated PVT
    struct.

    Signed-off-by: Yazen Ghannam
    Cc: Aravind Gopalakrishnan
    Cc: linux-edac
    Cc: x86-ml
    Link: http://lkml.kernel.org/r/1479423463-8536-11-git-send-email-Yazen.Ghannam@amd.com
    [ init_one_instance() error handling, shorten lines, unbreak >80 cols lines. ]
    Signed-off-by: Borislav Petkov

    Yazen Ghannam
     

25 Nov, 2016

2 commits

  • Add a family type and associated ops for Fam17h. Define a struct to hold
    all the UMC registers that we need. Make this a part of struct amd64_pvt
    in order to maximize code reuse in the rest of the driver.

    Signed-off-by: Yazen Ghannam
    Cc: Aravind Gopalakrishnan
    Cc: linux-edac
    Cc: x86-ml
    Link: http://lkml.kernel.org/r/1479423463-8536-10-git-send-email-Yazen.Ghannam@amd.com
    Signed-off-by: Borislav Petkov

    Yazen Ghannam
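
A per-family descriptor with attached ops is commonly written as a table of function pointers indexed by family type. The sketch below illustrates that pattern under stated assumptions: every identifier in it (`family_desc`, `family_ops`, `reg_path`, `select_family()`) is hypothetical and does not reproduce the driver's real structures or field names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical family indices; the real driver has more entries. */
enum family_type { FAM_OLDER, FAM_17H, NUM_FAMILIES };

/* Per-family operations: here a single callback naming the access path. */
struct family_ops {
    const char *(*reg_path)(void);
};

/* Per-family descriptor: a control name plus the ops used for that family. */
struct family_desc {
    const char *ctl_name;
    struct family_ops ops;
};

static const char *pci_cfg_path(void) { return "PCI config space"; }
static const char *smn_path(void)     { return "SMN"; }

static const struct family_desc family_types[NUM_FAMILIES] = {
    [FAM_OLDER] = { .ctl_name = "older", .ops = { .reg_path = pci_cfg_path } },
    [FAM_17H]   = { .ctl_name = "F17h",  .ops = { .reg_path = smn_path } },
};

/* Pick the descriptor once; the rest of the code only goes through ops. */
static const struct family_desc *select_family(unsigned int family)
{
    const struct family_desc *d =
        (family == 0x17) ? &family_types[FAM_17H] : &family_types[FAM_OLDER];

    assert(d->ops.reg_path != NULL);
    return d;
}
```

Selecting the descriptor once at probe time keeps family checks out of the common code paths, which is the code-reuse benefit the commit message refers to.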
     
  • Update the ecc_enabled() function to work on Fam17h. This entails
    reading a different set of registers and using the SMN (System
    Management Network) rather than PCI devices.

    Signed-off-by: Yazen Ghannam
    Cc: Aravind Gopalakrishnan
    Cc: linux-edac
    Cc: x86-ml
    Link: http://lkml.kernel.org/r/1479423463-8536-9-git-send-email-Yazen.Ghannam@amd.com
    [ Fixup ecc_en assignment and get_umc_base(). ]
    Signed-off-by: Borislav Petkov

    Yazen Ghannam
     

10 May, 2016

1 commit

  • - Remove the homegrown instance counting.
    - Take the F3 PCI device from the amd_nb cache instead of F2, which was
      used with the PCI core.

    With those changes, the driver doesn't need to register a PCI driver and
    relies on the northbridge caching, which we do anyway on AMD.

    Signed-off-by: Borislav Petkov
    Cc: Yazen Ghannam

    Borislav Petkov
     

29 Sep, 2015

2 commits

  • Git provides all the changelogs anyway, so trim the comments section
    here. Update the copyright info while at it.

    Signed-off-by: Aravind Gopalakrishnan
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/1443440593-2316-3-git-send-email-Aravind.Gopalakrishnan@amd.com
    Signed-off-by: Borislav Petkov

    Aravind Gopalakrishnan
     
  • The scrub rate control register has moved to function 2 in PCI config
    space and is at a different offset on family 0x15, models 0x60 and
    later. The minimum recommended scrub rate has also changed. (Refer to
    D18F2x1c9_dct[1:0][DramScrub] in Fam15hM60h BKDG).

    Adjust set_scrub_rate() and get_scrub_rate() functions to accommodate
    this.

    Tested on F15hM60h, Fam15h, models 00h-0fh and Fam10h systems.

    Signed-off-by: Aravind Gopalakrishnan
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/1443440593-2316-2-git-send-email-Aravind.Gopalakrishnan@amd.com
    [ Cleanup conditionals. ]
    Signed-off-by: Borislav Petkov

    Aravind Gopalakrishnan
     

23 Feb, 2015

1 commit

  • Instead of calling device_create_file() and device_remove_file()
    manually, pass the static attribute groups with the new
    edac_mc_add_mc_with_groups(). The conditional creation of inject sysfs
    files is done by a proper is_visible callback.

    Signed-off-by: Takashi Iwai
    Link: http://lkml.kernel.org/r/1423046938-18111-4-git-send-email-tiwai@suse.de
    Signed-off-by: Borislav Petkov

    Takashi Iwai
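
The idea behind an is_visible callback is that the attribute list stays static and a single function decides, per attribute, whether the file is created at all (a returned mode of 0 hides it). The following user-space model is a sketch only: `struct attr`, `struct attr_group`, `struct driver_ctx` and `count_created()` are simplified, hypothetical stand-ins for the sysfs types, and the `inject_supported` condition is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a sysfs attribute. */
struct attr {
    const char *name;
    unsigned int mode;
};

/* Simplified stand-in for an attribute group with a visibility hook. */
struct attr_group {
    const struct attr *const *attrs; /* NULL-terminated */
    unsigned int (*is_visible)(const struct attr *a, const void *ctx);
};

/* Hypothetical driver state consulted by the callback. */
struct driver_ctx {
    bool inject_supported;
};

/* Returning 0 hides the attribute; otherwise keep its declared mode. */
static unsigned int inject_is_visible(const struct attr *a, const void *ctx)
{
    const struct driver_ctx *d = ctx;

    return d->inject_supported ? a->mode : 0;
}

/* Illustrative attribute names. */
static const struct attr inject_read  = { "inject_read",  0200 };
static const struct attr inject_write = { "inject_write", 0200 };
static const struct attr *const inject_attrs[] = {
    &inject_read, &inject_write, NULL,
};
static const struct attr_group inject_group = {
    inject_attrs, inject_is_visible,
};

/* Model of group registration: count the attributes that would be created. */
static size_t count_created(const struct attr_group *g, const void *ctx)
{
    size_t n = 0;

    assert(g != NULL && g->attrs != NULL);
    for (size_t i = 0; g->attrs[i] != NULL; i++) {
        const struct attr *a = g->attrs[i];
        unsigned int mode = g->is_visible ? g->is_visible(a, ctx) : a->mode;

        if (mode != 0)
            n++;
    }
    return n;
}
```

Compared with manual create/remove calls, the condition lives in one place and there is no matching teardown path to keep in sync.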
     

30 Oct, 2014

1 commit

  • This patch adds support for ECC error decoding on the F15h M60h processor.
    Aside from the usual changes, the patch adds support for some new features
    in the processor:
    - DDR4 (unbuffered, registered) and LRDIMM DDR3 support
      - relevant debug messages have been modified/added to report these
        memory types
    - new dbam_to_cs mappers
      - if (F15h M60h && LRDIMM), we need a 'multiplier' value to find
        cs_size. This multiplier value is obtained from the per-DIMM
        DCSM register, so change the interface to accept a 'cs_mask_nr'
        value to facilitate this calculation
    - switch-casing determine_memory_type()
      - done to cleanse the function of too many if-else statements
        and improve readability
      - this is now called early in read_mc_regs() to cache dram_type

    Misc cleanup:
    - amd64_pci_table[] is condensed by using PCI_VDEVICE macro.

    Testing details:
    Tested the patch by injecting 'ECC' type errors using mce_amd_inj
    and error decoding works fine.

    Signed-off-by: Aravind Gopalakrishnan
    Link: http://lkml.kernel.org/r/1414617483-4941-1-git-send-email-Aravind.Gopalakrishnan@amd.com
    [ Boris: determine_memory_type() cleanups ]
    Signed-off-by: Borislav Petkov

    Aravind Gopalakrishnan
     

23 Sep, 2014

1 commit

  • Rationale behind this change:
    - F2x1xx addresses are no longer mapped explicitly to DCT1 from F15h
      (OR) onwards. Those families use the _dct[0:1] mechanism to access
      the registers, so we should move away from using address ranges to
      select the DCT for them.
    - On newer processors, the address ranges used to indicate DCT1 (0x140,
      0x1a0) have different meanings than what is currently assumed.

    Changes introduced:
    - amd64_read_dct_pci_cfg() now takes a dct value and uses it for
      'selecting the dct'
    - Update usage of the function. Keep in mind that different families
      have specific handling requirements
    - Remove [k8|f10]_read_dct_pci_cfg() as they don't do much different
      from amd64_read_pci_cfg()
    - Move the k8 specific check to amd64_read_pci_cfg()
    - Remove f15_read_dct_pci_cfg() and move its logic to
      amd64_read_dct_pci_cfg()
    - Remove the now needless .read_dct_pci_cfg

    Testing:
    - Tested on Fam 10h; Fam15h Models: 00h, 30h; Fam16h using 'EDAC_DEBUG'
      and mce_amd_inj
    - driver obtains info from F2x registers and caches it in pvt
      structures correctly
    - ECC decoding works fine

    Signed-off-by: Aravind Gopalakrishnan
    Link: http://lkml.kernel.org/r/1410799058-3149-1-git-send-email-aravind.gopalakrishnan@amd.com
    Signed-off-by: Borislav Petkov

    Aravind Gopalakrishnan
     

28 Feb, 2014

1 commit

  • Extend ECC decoding support for F16h M30h. Tested on F16h M30h with ECC
    turned on using the mce_amd_inj module; the patch works fine.

    Signed-off-by: Aravind Gopalakrishnan
    Link: http://lkml.kernel.org/r/1392913726-16961-1-git-send-email-Aravind.Gopalakrishnan@amd.com
    Tested-by: Arindam Nath
    Acked-by: H. Peter Anvin
    Signed-off-by: Borislav Petkov

    Aravind Gopalakrishnan
     

22 Oct, 2013

1 commit

  • GENMASK creates a contiguous bitmask ([hi:lo]). It is implemented twice
    in the current kernel: once in the EDAC driver and once in the SiS/XGI
    FB driver. Move it to a more generic place so that other code can use
    it.

    Signed-off-by: Chen, Gong
    Cc: Borislav Petkov
    Cc: Thomas Winischhofer
    Cc: Jean-Christophe Plagniol-Villard
    Cc: Tomi Valkeinen
    Acked-by: Borislav Petkov
    Acked-by: Mauro Carvalho Chehab
    Signed-off-by: Tony Luck

    Chen, Gong
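
To make the [hi:lo] notation concrete, here is a minimal user-space sketch of a contiguous-bitmask helper. The names `genmask64()` and `field_get()` are hypothetical and chosen for this example; they are not the kernel macro, and the argument order (hi first, then lo) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask with bits lo..hi (inclusive) set, for 0 <= lo <= hi <= 63. */
static inline uint64_t genmask64(unsigned int hi, unsigned int lo)
{
    assert(lo <= hi && hi < 64);
    /* Clear everything below lo, then clear everything above hi. */
    return (~UINT64_C(0) << lo) & (~UINT64_C(0) >> (63 - hi));
}

/* Extract the bitfield [hi:lo] from a register value. */
static inline uint64_t field_get(uint64_t reg, unsigned int hi, unsigned int lo)
{
    return (reg & genmask64(hi, lo)) >> lo;
}
```

Building the mask from two shifts of an all-ones value avoids the undefined 64-bit shift that a naive `(1 << width) - 1` form would hit when the field spans the whole word.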
     

12 Aug, 2013

2 commits

  • Now that we cache (family, model, stepping) locally, use them instead of
    boot_cpu_data.

    No functionality change.

    Signed-off-by: Borislav Petkov

    Borislav Petkov
     
  • On newer models, support has been included for up to 4 DCTs; however,
    only DCT0 and DCT3 are currently configured (cf. BKDG Section 2.10).
    Also, the DRAM request routing algorithm is different for F15h M30h.
    Thus it is cleaner to use a brand new function rather than adding quirks
    to the more generic f1x_match_to_this_node(). Refer to "2.10.5 DRAM
    Routing Requests" in the BKDG for further info.

    Tested on Fam15h M30h with ECC turned on using mce_amd_inj facility and
    verified to be functionally correct.

    While at it, verify if erratum workarounds for E505 and E637 still hold.
    From email conversations within AMD, the current status of the errata
    is:

    * Erratum 505: fixed in model 0x1, stepping 0x1 and later.
    * Erratum 637: not fixed.

    Signed-off-by: Aravind Gopalakrishnan
    [ Cleanups, corrections ]
    Signed-off-by: Borislav Petkov

    Aravind Gopalakrishnan
     

19 Apr, 2013

1 commit


10 Jan, 2013

2 commits

  • Use appropriate types for northbridge IDs and memory ranges. Mark
    immutable data on related structures const and keep it within the
    compilation unit.

    Signed-off-by: Daniel J Blueman
    Link: http://lkml.kernel.org/r/1354265060-22956-2-git-send-email-daniel@numascale-asia.com
    [Boris: Drop arg change to node_to_amd_nb]
    Signed-off-by: Borislav Petkov

    Daniel J Blueman
     
  • Fix get_node_id to match northbridge IDs from the array of detected
    ones, allowing multi-server support such as with Numascale's
    NumaConnect. Rename it to 'amd_get_node_id' for consistency.

    Signed-off-by: Daniel J Blueman
    Link: http://lkml.kernel.org/r/1353997932-8475-1-git-send-email-daniel@numascale-asia.com
    [Boris: shorten lines to fit 80 cols]
    Signed-off-by: Borislav Petkov

    Daniel J Blueman
     

28 Nov, 2012

4 commits


30 Oct, 2012

1 commit


12 Jun, 2012

1 commit


26 Apr, 2011

2 commits

  • F15h CPUs may report a non-DRAM address when reporting an error address
    belonging to a CC6 state save area. Add a workaround to detect this
    condition and compute the actual DRAM address of the error as documented
    in the Revision Guide for AMD Family 15h Models 00h-0Fh Processors.

    Signed-off-by: Borislav Petkov

    Borislav Petkov
     
  • F15h and later use a portion of DRAM as a CC6 storage area. BIOS
    programs D18F1x[17C:140,7C:40] DRAM Base/Limit accordingly by
    subtracting the storage area from the DRAM limit setting. However, for
    EDAC to consider that part of DRAM too, we need to include it in the
    per-node range.

    Signed-off-by: Borislav Petkov

    Borislav Petkov
     

17 Mar, 2011

7 commits