14 Feb, 2017
1 commit
-
Last time we did that was when we enabled Bulldozer. Now, we enabled Zen
so it is only natural ... :-)
Signed-off-by: Borislav Petkov
Cc: Yazen Ghannam
28 Jan, 2017
2 commits
-
Match one of the devices in amd64_cpuids[] before loading the module.
This is an additional sanity check against users trying to load
amd64_edac_mod on unsupported systems.
Signed-off-by: Yazen Ghannam
Cc: linux-edac
Link: http://lkml.kernel.org/r/1485537863-2707-9-git-send-email-Yazen.Ghannam@amd.com
[ Get rid of err_ret label, make it a bit more readable this way. ]
Signed-off-by: Borislav Petkov
-
amd64_{debug,notice} don't have any users, so remove them.
Signed-off-by: Yazen Ghannam
Cc: linux-edac
Link: http://lkml.kernel.org/r/1485537863-2707-6-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov
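The amd64_cpuids[] check from the first commit of this date can be pictured roughly as follows. This is a minimal userspace sketch, not the driver's code: demo_cpu_id, demo_cpuids[], demo_cpu_supported() and demo_module_init() are hypothetical names, and the family list is illustrative only.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table entry; the driver matches against its own ID type. */
struct demo_cpu_id {
	unsigned int family;
};

/* Illustrative family list, assumed for this sketch. */
static const struct demo_cpu_id demo_cpuids[] = {
	{ 0x0f }, { 0x10 }, { 0x15 }, { 0x16 }, { 0x17 },
};

/* Return true if the given CPU family appears in the table. */
static bool demo_cpu_supported(unsigned int family)
{
	size_t i;

	for (i = 0; i < sizeof(demo_cpuids) / sizeof(demo_cpuids[0]); i++)
		if (demo_cpuids[i].family == family)
			return true;
	return false;
}

/* Refuse to load on anything that is not in the table. */
static int demo_module_init(unsigned int family)
{
	if (!demo_cpu_supported(family))
		return -ENODEV;

	/* ... the rest of the initialization would go here ... */
	return 0;
}
```

Doing the table lookup first means an unsupported system fails with a single early return before any other setup runs.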
15 Dec, 2016
1 commit
-
Now, all left at edac_core.h are at drivers/edac/edac_mc.c,
so rename it to edac_mc.h.
Signed-off-by: Mauro Carvalho Chehab
01 Dec, 2016
1 commit
-
Prefix the warn and error macros with the respective string so that
callers don't have to say "Error" or "Warning". We save us string length
this way in the actual calls.
While at it, shorten the calls in reserve_mc_sibling_devs().
Signed-off-by: Borislav Petkov
Cc: Dan Carpenter
Cc: Yazen Ghannam
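The idea of prefixing the severity inside the macro can be sketched in userspace like this. The names demo_warn() and demo_err() are hypothetical, and they format into a buffer with snprintf() only so the result can be inspected; the driver's macros wrap the kernel's printing facilities instead.

```c
#include <stdio.h>

/*
 * Hypothetical stand-ins for the driver's macros. Adjacent string
 * literals are concatenated by the compiler, so the severity word is
 * glued onto the caller's format string. The format argument must
 * therefore be a string literal.
 */
#define demo_warn(buf, len, ...) \
	snprintf((buf), (len), "Warning: " __VA_ARGS__)
#define demo_err(buf, len, ...) \
	snprintf((buf), (len), "Error: " __VA_ARGS__)
```

A call such as demo_err(buf, sizeof(buf), "F%d not found", 3) yields a message that starts with "Error: " without the call site spelling it out.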
30 Nov, 2016
2 commits
-
How we need to decode UMC errors is different from how we decode bus
errors, so let's define a new function for this. We also need a way to
determine the UMC channel since we're not guaranteed that there is a
fixed relation between channel and MCA bank.
Signed-off-by: Yazen Ghannam
Cc: Aravind Gopalakrishnan
Cc: linux-edac
Cc: x86-ml
Link: http://lkml.kernel.org/r/1480359593-80369-1-git-send-email-Yazen.Ghannam@amd.com
[ Fold in decode_synd_reg(), simplify. ]
Signed-off-by: Borislav Petkov
-
Read a few more UMC registers and provide debug output in order to be as
similar as possible to older AMD systems.
Signed-off-by: Yazen Ghannam
Cc: Aravind Gopalakrishnan
Cc: linux-edac
Cc: x86-ml
Link: http://lkml.kernel.org/r/1480344621-14966-1-git-send-email-Yazen.Ghannam@amd.com
[ Remove unneeded K8 check and comments, fixup others. ]
Signed-off-by: Borislav Petkov
29 Nov, 2016
3 commits
-
Fam17h has new register offsets and fields for setting up the DRAM
scrubber so add support for this.
Signed-off-by: Yazen Ghannam
Cc: Aravind Gopalakrishnan
Cc: linux-edac
Cc: x86-ml
Link: http://lkml.kernel.org/r/1479423463-8536-17-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov
-
Fam17h has a different set of registers and bitfields. Most of these
registers are read through SMN (System Management Network) rather
than PCI config space. Also, the derivation of various values is now
different.
Update amd64_edac to read the appropriate registers and extract the
correct values for Fam17h.
Signed-off-by: Yazen Ghannam
Cc: Aravind Gopalakrishnan
Cc: linux-edac
Cc: x86-ml
Link: http://lkml.kernel.org/r/1479423463-8536-12-git-send-email-Yazen.Ghannam@amd.com
[ Save us the indentation level in read_mc_regs(), add defines ]
Signed-off-by: Borislav Petkov
-
Fam17h needs PCI device functions 0 and 6 instead of 1 and 2 as on older
systems. Update struct amd64_pvt to hold the new functions and reserve
them if on Fam17h.
Also, allocate an array of UMC structs within our newly allocated PVT
struct.
Signed-off-by: Yazen Ghannam
Cc: Aravind Gopalakrishnan
Cc: linux-edac
Cc: x86-ml
Link: http://lkml.kernel.org/r/1479423463-8536-11-git-send-email-Yazen.Ghannam@amd.com
[ init_one_instance() error handling, shorten lines, unbreak >80 cols lines. ]
Signed-off-by: Borislav Petkov
25 Nov, 2016
2 commits
-
Add a family type and associated ops for Fam17h. Define a struct to hold
all the UMC registers that we need. Make this a part of struct amd64_pvt
in order to maximize code reuse in the rest of the driver.
Signed-off-by: Yazen Ghannam
Cc: Aravind Gopalakrishnan
Cc: linux-edac
Cc: x86-ml
Link: http://lkml.kernel.org/r/1479423463-8536-10-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov
-
Update the ecc_enabled() function to work on Fam17h. This entails
reading a different set of registers and using the SMN (System
Management Network) rather than PCI devices.
Signed-off-by: Yazen Ghannam
Cc: Aravind Gopalakrishnan
Cc: linux-edac
Cc: x86-ml
Link: http://lkml.kernel.org/r/1479423463-8536-9-git-send-email-Yazen.Ghannam@amd.com
[ Fixup ecc_en assignment and get_umc_base(). ]
Signed-off-by: Borislav Petkov
10 May, 2016
1 commit
-
- remove homegrown instances counting.
- take F3 PCI device from amd_nb caching instead of F2 which was used with the
PCI core.
With those changes, the driver doesn't need to register a PCI driver and
relies on the northbridges caching which we do anyway on AMD.
Signed-off-by: Borislav Petkov
Cc: Yazen Ghannam
29 Sep, 2015
2 commits
-
Git provides us all the changelogs anyway. So trim the comments section
here. Update the copyrights info while at it.
Signed-off-by: Aravind Gopalakrishnan
Cc: linux-edac
Link: http://lkml.kernel.org/r/1443440593-2316-3-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov
-
The scrub rate control register has moved to function 2 in PCI config
space and is at a different offset on family 0x15, models 0x60 and
later. The minimum recommended scrub rate has also changed. (Refer to
D18F2x1c9_dct[1:0][DramScrub] in Fam15hM60h BKDG).
Adjust set_scrub_rate() and get_scrub_rate() functions to accommodate
this.
Tested on F15hM60h, Fam15h, models 00h-0fh and Fam10h systems.
Signed-off-by: Aravind Gopalakrishnan
Cc: linux-edac
Link: http://lkml.kernel.org/r/1443440593-2316-2-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Cleanup conditionals. ]
Signed-off-by: Borislav Petkov
23 Feb, 2015
1 commit
-
Instead of calling device_create_file() and device_remove_file()
manually, pass the static attribute groups with the new
edac_mc_add_mc_with_groups(). The conditional creation of inject sysfs
files is done by a proper is_visible callback.
Signed-off-by: Takashi Iwai
Link: http://lkml.kernel.org/r/1423046938-18111-4-git-send-email-tiwai@suse.de
Signed-off-by: Borislav Petkov
30 Oct, 2014
1 commit
-
This patch adds support for ECC error decoding for F15h M60h processor.
Aside from the usual changes, the patch adds support for some new features
in the processor:
- DDR4(unbuffered, registered); LRDIMM DDR3 support
- relevant debug messages have been modified/added to report these
memory types
- new dbam_to_cs mappers
- if (F15h M60h && LRDIMM); we need a 'multiplier' value to find
cs_size. This multiplier value is obtained from the per-dimm
DCSM register. So, change the interface to accept a 'cs_mask_nr'
value to facilitate this calculation
- switch-casing determine_memory_type()
- done to cleanse the function of too many if-else statements
and improve readability
- This is now called early in read_mc_regs() to cache dram_type
Misc cleanup:
- amd64_pci_table[] is condensed by using PCI_VDEVICE macro.
Testing details:
Tested the patch by injecting 'ECC' type errors using mce_amd_inj
and error decoding works fine.
Signed-off-by: Aravind Gopalakrishnan
Link: http://lkml.kernel.org/r/1414617483-4941-1-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Boris: determine_memory_type() cleanups ]
Signed-off-by: Borislav Petkov
23 Sep, 2014
1 commit
-
Rationale behind this change:
- F2x1xx addresses were stopped from being mapped explicitly to DCT1
from F15h (OR) onwards. They use _dct[0:1] mechanism to access the
registers. So we should move away from using address ranges to select
DCT for these families.
- On newer processors, the address ranges used to indicate DCT1 (0x140,
0x1a0) have different meanings than what is assumed currently.
Changes introduced:
- amd64_read_dct_pci_cfg() now takes in dct value and uses it for
'selecting the dct'
- Update usage of the function. Keep in mind that different families
have specific handling requirements
- Remove [k8|f10]_read_dct_pci_cfg() as they don't do much different
from amd64_read_pci_cfg()
- Move the k8 specific check to amd64_read_pci_cfg
- Remove f15_read_dct_pci_cfg() and move logic to amd64_read_dct_pci_cfg()
- Remove now needless .read_dct_pci_cfg
Testing:
- Tested on Fam 10h; Fam15h Models: 00h, 30h; Fam16h using 'EDAC_DEBUG'
and mce_amd_inj
- driver obtains info from F2x registers and caches it in pvt
structures correctly
- ECC decoding works fine
Signed-off-by: Aravind Gopalakrishnan
Link: http://lkml.kernel.org/r/1410799058-3149-1-git-send-email-aravind.gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov
28 Feb, 2014
1 commit
-
Extend ECC decoding support for F16h M30h. Tested on F16h M30h with ECC
turned on using mce_amd_inj module and the patch works fine.
Signed-off-by: Aravind Gopalakrishnan
Link: http://lkml.kernel.org/r/1392913726-16961-1-git-send-email-Aravind.Gopalakrishnan@amd.com
Tested-by: Arindam Nath
Acked-by: H. Peter Anvin
Signed-off-by: Borislav Petkov
22 Oct, 2013
1 commit
-
GENMASK is used to create a contiguous bitmask([hi:lo]). It is
implemented twice in current kernel. One is in EDAC driver, the other
is in SiS/XGI FB driver. Move it to a more generic place for other
usage.
Signed-off-by: Chen, Gong
Cc: Borislav Petkov
Cc: Thomas Winischhofer
Cc: Jean-Christophe Plagniol-Villard
Cc: Tomi Valkeinen
Acked-by: Borislav Petkov
Acked-by: Mauro Carvalho Chehab
Signed-off-by: Tony Luck
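As an illustration of what a contiguous [hi:lo] bitmask looks like, here is a minimal userspace sketch. demo_genmask64() and demo_field_get() are hypothetical helpers written for this example; they are not the kernel's GENMASK macro and make no claim about its argument order or implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical 64-bit helper: build a mask with bits lo..hi (inclusive)
 * set and every other bit clear.
 */
static inline uint64_t demo_genmask64(unsigned int hi, unsigned int lo)
{
	assert(lo <= hi && hi < 64);
	/* Keep bits 0..hi, then clear everything below lo. */
	return (~UINT64_C(0) >> (63 - hi)) & (~UINT64_C(0) << lo);
}

/* Extract the field [hi:lo] of a register value, right-aligned. */
static inline uint64_t demo_field_get(uint64_t reg, unsigned int hi,
				      unsigned int lo)
{
	return (reg & demo_genmask64(hi, lo)) >> lo;
}
```

For example, demo_genmask64(7, 4) gives 0xF0, and demo_field_get(0xABCD, 11, 8) gives 0xB.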
12 Aug, 2013
2 commits
-
Now that we cache (family, model, stepping) locally, use them instead of
boot_cpu_data.
No functionality change.
Signed-off-by: Borislav Petkov
-
On newer models, support has been included for upto 4 DCT's, however,
only DCT0 and DCT3 are currently configured (cf BKDG Section 2.10).
Also, the routing DRAM Requests algorithm is different for F15h M30h.
Thus it is cleaner to use a brand new function rather than adding quirks
to the more generic f1x_match_to_this_node(). Refer to "2.10.5 DRAM
Routing Requests" in the BKDG for further info.
Tested on Fam15h M30h with ECC turned on using mce_amd_inj facility and
verified to be functionally correct.
While at it, verify if erratum workarounds for E505 and E637 still hold.
From email conversations within AMD, the current status of the errata
is:
* Erratum 505: fixed in model 0x1, stepping 0x1 and later.
* Erratum 637: not fixed.
Signed-off-by: Aravind Gopalakrishnan
[ Cleanups, corrections ]
Signed-off-by: Borislav Petkov
19 Apr, 2013
1 commit
-
Add code to handle DRAM ECC errors decoding for Fam16h.
Tested on Fam16h with ECC turned on using the mce_amd_inj facility and
works fine.
Signed-off-by: Aravind Gopalakrishnan
[ Boris: cleanups and clarifications ]
Signed-off-by: Borislav Petkov
10 Jan, 2013
2 commits
-
Use appropriate types for northbridge IDs and memory ranges. Mark
immutable data const and keep within compilation unit on related
structures.
Signed-off-by: Daniel J Blueman
Link: http://lkml.kernel.org/r/1354265060-22956-2-git-send-email-daniel@numascale-asia.com
[Boris: Drop arg change to node_to_amd_nb]
Signed-off-by: Borislav Petkov
-
Fix get_node_id to match northbridge IDs from the array of detected
ones, allowing multi-server support such as with Numascale's
NumaConnect, renaming to 'amd_get_node_id' for consistency.
Signed-off-by: Daniel J Blueman
Link: http://lkml.kernel.org/r/1353997932-8475-1-git-send-email-daniel@numascale-asia.com
[Boris: shorten lines to fit 80 cols]
Signed-off-by: Borislav Petkov
28 Nov, 2012
4 commits
-
Instead of open-coding it, use the DBAM_DIMM macro in
amd64_csrow_nr_pages() which we have already.
Signed-off-by: Borislav Petkov
-
Rewrite CE/UE paths so that they use the same code and drop additional
code duplication in handle_ue. Add a struct err_info which collects
required info for the error reporting. This, in turn, helps slimming all
edac_mc_handle_error() calls down to one.
Signed-off-by: Borislav Petkov
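The shape of that refactoring can be sketched as below: both the CE and UE paths fill one descriptor, and a single function reports it. This is a userspace illustration only. struct demo_err_info, its fields and demo_report_error() are hypothetical and are not taken from the driver; the real code hands the collected data to the EDAC core rather than formatting a string.

```c
#include <stdint.h>
#include <stdio.h>

enum demo_err_type { DEMO_ERR_CE, DEMO_ERR_UE };

/* Hypothetical descriptor collecting everything the report needs. */
struct demo_err_info {
	enum demo_err_type type;
	int csrow;
	int channel;
	uint16_t syndrome;
	uint32_t page;
	uint32_t offset;
};

/*
 * Single reporting point shared by the CE and UE paths. Writes the
 * message into buf and returns what snprintf() returns.
 */
static int demo_report_error(const struct demo_err_info *err,
			     char *buf, size_t len)
{
	return snprintf(buf, len,
			"%s: csrow %d, channel %d, page 0x%x, offset 0x%x, syndrome 0x%x",
			err->type == DEMO_ERR_CE ? "CE" : "UE",
			err->csrow, err->channel,
			(unsigned int)err->page, (unsigned int)err->offset,
			(unsigned int)err->syndrome);
}
```

Because the error type is just a field of the descriptor, the two paths differ only in how they fill it in, not in how they report.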
-
When injecting DRAM ECC errors over the F3xB[8,C] interface, the machine
does this by injecting the error in the next non-cached access. This
takes relatively long time on a normal system so that in order for us to
expedite it, we disable the caches around the injection.
Signed-off-by: Borislav Petkov
-
Invert kstrtoul return value testing and win one indentation level.
Also, shorten up macro names so that the lines can fit into 80 cols. No
functional change.
Signed-off-by: Borislav Petkov
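Inverting the return value test amounts to bailing out on failure first, so the success path is not nested inside an if block. A minimal userspace sketch, assuming a hypothetical demo_parse_ul() built on strtoul() that is only loosely modeled on kstrtoul() (0 on success, negative errno on failure), and a hypothetical demo_store_section() handler with an illustrative range limit:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical parser: 0 on success, -ERANGE or -EINVAL on failure. */
static int demo_parse_ul(const char *s, unsigned int base,
			 unsigned long *res)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(s, &end, (int)base);
	if (errno == ERANGE)
		return -ERANGE;
	/* Reject empty input and trailing garbage. */
	if (end == s || *end != '\0')
		return -EINVAL;

	*res = val;
	return 0;
}

/* Hypothetical store handler: the error is tested first. */
static int demo_store_section(const char *buf, unsigned long *section)
{
	unsigned long value;
	int ret;

	ret = demo_parse_ul(buf, 10, &value);
	if (ret < 0)
		return ret;

	/* Success path continues at the outer indentation level. */
	if (value > 3)		/* illustrative limit, not from the driver */
		return -EINVAL;

	*section = value;
	return 0;
}
```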
30 Oct, 2012
1 commit
-
My @amd.com address will be invalid soon so move to private
email address.
Signed-off-by: Borislav Petkov
Link: http://lkml.kernel.org/r/1351532410-4887-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar
12 Jun, 2012
1 commit
-
Now that the EDAC core supports struct device, there's no sense
on having any logic at the EDAC core to simulate it. So, instead
of adding such logic there, change the logic at amd64_edac to
use it.
Reviewed-by: Aristeu Rozanski
Cc: Doug Thompson
Cc: Borislav Petkov
Signed-off-by: Mauro Carvalho Chehab
26 Apr, 2011
2 commits
-
F15h CPUs may report a non-DRAM address when reporting an error address
belonging to a CC6 state save area. Add a workaround to detect this
condition and compute the actual DRAM address of the error as documented
in the Revision Guide for AMD Family 15h Models 00h-0Fh Processors.
Signed-off-by: Borislav Petkov
-
F15h and later use a portion of DRAM as a CC6 storage area. BIOS
programs D18F1x[17C:140,7C:40] DRAM Base/Limit accordingly by
subtracting the storage area from the DRAM limit setting. However, in
order for edac to consider that part of DRAM too, we need to include it
into the per-node range.
Signed-off-by: Borislav Petkov
17 Mar, 2011
7 commits
-
Return unsigned u8 values only.
Signed-off-by: Borislav Petkov
-
A node id can never be negative since we use it as an index into
the DRAM ranges array. This also makes one of the BUG_ON conditions
redundant.
Signed-off-by: Borislav Petkov
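Why an unsigned node id makes a lower-bound check redundant can be shown with a small userspace sketch. The names demo_dram_range, DEMO_DRAM_RANGES and demo_get_range() are hypothetical, and the 8-bit index type and array size are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_DRAM_RANGES 8	/* illustrative array size */

struct demo_dram_range {
	uint64_t base;
	uint64_t limit;
};

/*
 * With an unsigned index, "nid < 0" can never be true, so only the
 * upper bound needs to be checked before indexing the array.
 */
static const struct demo_dram_range *
demo_get_range(const struct demo_dram_range *ranges, uint8_t nid)
{
	if (nid >= DEMO_DRAM_RANGES)
		return NULL;
	return &ranges[nid];
}
```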
-
Those were moved to the mce_amd.h header.
Signed-off-by: Borislav Petkov
-
Add the PCI device ids required for driver registration. Remove
pvt->ctl_name and use the family descriptor directly, instead. Then,
bump driver version and fixup its format. Finally, enable DRAM ECC
decoding on F15h.
Signed-off-by: Borislav Petkov
-
F15h has the same ECC symbol size options as F10h revD and later so
adjust checks to that. Simplify code a bit.
Signed-off-by: Borislav Petkov
-
Drop per-instance variable and compute min scrubrate dynamically.
Signed-off-by: Borislav Petkov
-
Drop static tables which map the bits in F2x80 to a chip select size in
favor of functions doing the mapping with some bit fiddling. Also, add
F15 support.
Signed-off-by: Borislav Petkov