30 Dec, 2020
1 commit
-
[ Upstream commit 8de0c9917cc1297bc5543b61992d5bdee4ce621a ]
The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
later systems. This function is used in amd64_edac_mod to do
system-specific decoding for DRAM ECC errors. The function takes a
"NodeId" as a parameter.

In AMD documentation, NodeId is used to identify a physical die in a
system. This can be used to identify a node in the AMD_NB code and also
it is used with umc_normaddr_to_sysaddr().

However, the input used for decode_dram_ecc() is currently the NUMA node
of a logical CPU. In the default configuration, the NUMA node and
physical die will be equivalent, so this doesn't have an impact.

But the NUMA node configuration can be adjusted with optional memory
interleaving modes. This will cause the NUMA node enumeration to not
match the physical die enumeration. The mismatch will cause the address
translation function to fail or report incorrect results.

Use struct cpuinfo_x86.cpu_die_id for the node_id parameter to ensure the
physical ID is used.

Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID")
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20201109210659.754018-4-Yazen.Ghannam@amd.com
Signed-off-by: Sasha Levin
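The difference between the two inputs can be illustrated with a small user-space sketch. It is not the kernel code: struct cpu_topo and both helper functions are hypothetical names made up for this example, and only the field name cpu_die_id is taken from the commit message.

```c
/*
 * Hypothetical per-CPU topology record. This is NOT the kernel's
 * struct cpuinfo_x86; it only mirrors the two values discussed above.
 */
struct cpu_topo {
    int numa_node;   /* what a cpu_to_node()-style lookup would report */
    int cpu_die_id;  /* physical die, i.e. the "NodeId" in AMD documentation */
};

/* Behaviour before the fix: the NUMA node is handed to the decoder. */
int node_id_from_numa(const struct cpu_topo *c)
{
    return c->numa_node;
}

/* Behaviour after the fix: the physical die ID is handed to the decoder. */
int node_id_from_die(const struct cpu_topo *c)
{
    return c->cpu_die_id;
}
```

With a default configuration such as { .numa_node = 1, .cpu_die_id = 1 } both helpers agree. With an interleaved configuration such as { .numa_node = 0, .cpu_die_id = 1 } only the die-based helper still names the physical die.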
13 Oct, 2020
1 commit
-
Pull RAS updates from Borislav Petkov:
- Extend the recovery from MCE in kernel space also to processes which
encounter an MCE in kernel space but while copying from user memory
by sending them a SIGBUS on return to user space and unmapping the
faulty memory, by Tony Luck and Youquan Song.
- memcpy_mcsafe() rework by splitting the functionality into
copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables
support for new hardware which can recover from a machine check
encountered during a fast string copy and makes that the default and
lets the older hardware which does not support that advanced recovery,
opt in to use the old, fragile, slow variant, by Dan Williams.
- New AMD hw enablement, by Yazen Ghannam and Akshay Gupta.
- Do not use MSR-tracing accessors in #MC context and flag any fault
while accessing MCA architectural MSRs as an architectural violation
with the hope that such hw/fw misdesigns are caught early during the
hw eval phase and they don't make it into production.
- Misc fixes, improvements and cleanups, as always.
* tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Allow for copy_mc_fragile symbol checksum to be generated
x86/mce: Decode a kernel instruction to determine if it is copying from user
x86/mce: Recover from poison found while copying from user space
x86/mce: Avoid tail copy when machine check terminated a copy from user
x86/mce: Add _ASM_EXTABLE_CPY for copy user access
x86/mce: Provide method to find out the type of an exception handler
x86/mce: Pass pointer to saved pt_regs to severity calculation routines
x86/copy_mc: Introduce copy_mc_enhanced_fast_string()
x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()
x86/mce: Drop AMD-specific "DEFERRED" case from Intel severity rule list
x86/mce: Add Skylake quirk for patrol scrub reported errors
RAS/CEC: Convert to DEFINE_SHOW_ATTRIBUTE()
x86/mce: Annotate mce_rd/wrmsrl() with noinstr
x86/mce/dev-mcelog: Do not update kflags on AMD systems
x86/mce: Stop mce_reign() from re-computing severity for every CPU
x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR
x86/mce: Increase maximum number of banks to 64
x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()
x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
RAS/CEC: Fix cec_init() prototype
20 Aug, 2020
1 commit
-
The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type
was intended to be used by the kernel to filter out invalid error codes
on a system. However, this is unnecessary after a few product releases
because the hardware will only report valid error codes. Thus, there's
no need for it with future systems.

Remove the xec_bitmap field and all references to it.
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20200720145353.43924-1-Yazen.Ghannam@amd.com
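As a rough illustration of what dropping the bitmap means for a decoder, the sketch below contrasts a lookup that filters on a per-bank bitmap with one that only bounds-checks the extended error code. struct bank_descs and both function names are assumptions for this example; they are not the kernel's data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table of error descriptions for one bank type. */
struct bank_descs {
    const char * const *descs;
    unsigned int num_descs;
};

/* Old style: a code is decoded only if its bit is set in the bitmap. */
const char *lookup_with_bitmap(const struct bank_descs *b,
                               uint64_t xec_bitmap, unsigned int xec)
{
    if (xec >= 64 || xec >= b->num_descs)
        return NULL;
    if (!(xec_bitmap & (1ULL << xec)))
        return NULL;
    return b->descs[xec];
}

/* New style: trust the hardware and only bounds-check the code. */
const char *lookup_plain(const struct bank_descs *b, unsigned int xec)
{
    return xec < b->num_descs ? b->descs[xec] : NULL;
}
```

The second variant has one less piece of per-bank data to keep in sync with the documentation, which is the simplification the commit describes.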
17 Aug, 2020
1 commit
-
A few existing MCA bank types will have new error types in future SMCA
systems.

Add the descriptions for the new error types.
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20200708153515.1911642-1-Yazen.Ghannam@amd.com
23 Jun, 2020
1 commit
-
Print the Protected Processor Identification Number (PPIN) on processors
which support it.

[ bp: Massage. ]
Signed-off-by: Smita Koralahalli
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20200623130059.8870-1-Smita.KoralahalliChannabasappa@amd.com
14 Apr, 2020
2 commits
-
If the handler took any action to log or deal with the error, set a bit
in mce->kflags so that the default handler at the end of the machine
check chain can see what has been done.

Get rid of NOTIFY_STOP returns. Make the EDAC and dev-mcelog handlers
skip over errors already processed by CEC.

Signed-off-by: Tony Luck
Signed-off-by: Borislav Petkov
Tested-by: Tony Luck
Link: https://lkml.kernel.org/r/20200214222720.13168-5-tony.luck@intel.com
-
... because no one should be interested in spurious MCEs anyway. Make
the filtering unconditional and move it to amd_filter_mce().

Signed-off-by: Borislav Petkov
Tested-by: Tony Luck
Link: https://lkml.kernel.org/r/20200407163414.18058-2-bp@alien8.de
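The kflags scheme from the first of the two commits above can be sketched in user space as follows. Every name here (struct mce_record, the HANDLED_* bits, the handler functions) is hypothetical, and the rule that the CEC-style handler claims every corrected error is a simplifying assumption for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_UC     (1ULL << 61)  /* uncorrected-error bit in MCA_STATUS */
#define HANDLED_CEC   (1ULL << 0)   /* hypothetical kflags bit */
#define HANDLED_EDAC  (1ULL << 1)   /* hypothetical kflags bit */

struct mce_record {
    uint64_t status;
    uint64_t kflags;  /* one bit per handler that acted on the record */
};

/* CEC-style handler: claims corrected errors (simplified assumption). */
void cec_handler(struct mce_record *m)
{
    if (!(m->status & STATUS_UC))
        m->kflags |= HANDLED_CEC;
}

/* EDAC-style handler: skips records that CEC already processed. */
void edac_handler(struct mce_record *m)
{
    if (m->kflags & HANDLED_CEC)
        return;
    m->kflags |= HANDLED_EDAC;
}

/* Default handler at the end of the chain: acts only if nobody else did. */
bool default_handler_must_print(const struct mce_record *m)
{
    return m->kflags == 0;
}

/* Every handler runs; none of them stops the chain. */
void run_chain(struct mce_record *m)
{
    cec_handler(m);
    edac_handler(m);
}
```

Because no handler stops the chain, the last handler always runs and decides from the accumulated bits whether anything is left to report.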
19 Feb, 2020
1 commit
-
This warning is output for every virtual CPU in a guest on an EPYC 2
system because KVM doesn't enable SMCA. Once is enough too.

[ bp: Massage. ]
Signed-off-by: Prarit Bhargava
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20200217134627.19765-1-prarit@redhat.com
17 Jan, 2020
3 commits
-
... and do not kmalloc a three-pointer struct. Which simplifies
mce_amd_init() a bit.

No functional changes.
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20200116163403.GF27148@zn.tnic
-
MCA error decoding on SMCA systems is not dependent on family. Return
success early if the system supports the SMCA feature.

Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20200110015651.14887-3-Yazen.Ghannam@amd.com
-
Add support for a new version of the Load Store unit bank type as
indicated by its McaType value, which will be present in future SMCA
systems.

Add the new (HWID, MCATYPE) tuple. Reuse the same name, since this is
logically the same to the user.

Also, add the new error descriptions to edac_mce_amd.
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20200110015651.14887-2-Yazen.Ghannam@amd.com
21 May, 2019
1 commit
-
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have MODULE_LICENCE("GPL*") inside which was used in the initial
scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

GPL-2.0-only
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
24 Apr, 2019
1 commit
-
AMD family 17h Models 10h-2Fh may report a high number of L1 BTB MCA
errors under certain conditions. The errors are benign and can safely be
ignored. However, the high error rate may cause the MCA threshold
counter to overflow causing a high rate of thresholding interrupts.

In addition, users may see the errors reported through the AMD MCE
decoder module, even with the interrupt disabled, due to MCA polling.

Clear the "Counter Present" bit in the Instruction Fetch bank's
MCA_MISC0 register. This will prevent enabling MCA thresholding on this
bank which will prevent the high interrupt rate due to this error.

Define an AMD-specific function to filter these errors from the MCE
event pool so that they don't get reported during early boot.

Rename filter function in EDAC/mce_amd to avoid a naming conflict, while
at it.

[ bp: Move function prototype to the internal header and
massage/cleanup, fix typos. ]

Reported-by: Rafał Miłecki
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: "H. Peter Anvin"
Cc: "clemej@gmail.com"
Cc: Arnd Bergmann
Cc: Ingo Molnar
Cc: James Morse
Cc: Kees Cook
Cc: Mauro Carvalho Chehab
Cc: Pu Wen
Cc: Qiuxu Zhuo
Cc: Shirish S
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vishal Verma
Cc: linux-edac
Cc: x86-ml
Cc: # 5.0.x: c95b323dcd35: x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models
Cc: # 5.0.x: 30aa3d26edb0: x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk
Cc: # 5.0.x: 9308fd407455: x86/MCE: Group AMD function prototypes in
Cc: # 5.0.x
Link: https://lkml.kernel.org/r/20190325163410.171021-2-Yazen.Ghannam@amd.com
15 Feb, 2019
2 commits
-
Sort the MCA_STATUS bits in decode output to follow how they are defined
in the register.

The order is as follows:

Bit | Decode
------------
62 | Over
61 | UC
59 | MiscV
58 | AddrV
57 | PCC
55 | TCC
53 | SyndV
46 | CECC
45 | UECC
44 | Deferred
43 | Poison
40 | Scrub

[ bp: Massage a bit. ]
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: Mauro Carvalho Chehab
Cc: linux-edac
Cc: x86@kernel.org
Link: https://lkml.kernel.org/r/20190212212417.107049-2-Yazen.Ghannam@amd.com
-
Previous AMD systems have had a bit in MCA_STATUS to indicate that an
error was detected on a scrub operation. However, this bit was defined
differently within different banks and families/models.

Starting with Family 17h, MCA_STATUS[40] is either Reserved/Read-as-Zero
or defined as "Scrub", for all MCA banks and CPU models. Therefore, this
bit can be defined as the "Scrub" bit.

Define MCA_STATUS[40] as "Scrub" and decode it in the AMD MCE decoding
module for Family 17h and newer systems.

Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: James Morse
Cc: linux-edac
Cc: Mauro Carvalho Chehab
Cc: Pu Wen
Cc: Qiuxu Zhuo
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vishal Verma
Cc: x86-ml
Link: https://lkml.kernel.org/r/20190212212417.107049-1-Yazen.Ghannam@amd.com
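The bit ordering and the Family 17h "Scrub" rule from the two commits above can be sketched as a table-driven decoder. This is a user-space illustration, not the edac_mce_amd code: decode_status() and its output format are assumptions, and other family-specific conditions the real decoder may apply are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct status_bit {
    unsigned int bit;
    const char *name;
};

/* MCA_STATUS bits in register order, highest bit first. */
static const struct status_bit status_bits[] = {
    { 62, "Over" },  { 61, "UC" },   { 59, "MiscV" },    { 58, "AddrV" },
    { 57, "PCC" },   { 55, "TCC" },  { 53, "SyndV" },    { 46, "CECC" },
    { 45, "UECC" },  { 44, "Deferred" }, { 43, "Poison" }, { 40, "Scrub" },
};

/*
 * Write a '|'-separated decode of 'status' into 'buf'. A clear bit is
 * shown as "-", except bit 61, which is shown as "CE" when clear.
 * Bit 40 is only decoded for family 0x17 and newer.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int decode_status(uint64_t status, unsigned int family, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return -1;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(status_bits) / sizeof(status_bits[0]); i++) {
        const struct status_bit *sb = &status_bits[i];
        bool set = (status & (1ULL << sb->bit)) != 0;
        const char *text;
        int n;

        if (sb->bit == 40 && family < 0x17)
            continue;

        if (sb->bit == 61)
            text = set ? "UC" : "CE";
        else
            text = set ? sb->name : "-";

        n = snprintf(buf + used, len - used, "%s%s", used ? "|" : "", text);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

For the status value 0xd82000000002080b and family 0x17 this yields "Over|CE|MiscV|-|-|-|SyndV|-|-|-|-|-"; for a family below 0x17 the trailing Scrub field is omitted.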
05 Feb, 2019
1 commit
-
Save a log line by printing the extended error code and the description
on a single line. This is similar to how errors are printed in other
subsystems, e.g. "#, description". If we don't have a valid description
then only the number/code is printed.

Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: linux-edac
Cc: Mauro Carvalho Chehab
Cc: Tony Luck
Cc: x86@kernel.org
Link: https://lkml.kernel.org/r/20190201225534.8177-6-Yazen.Ghannam@amd.com
03 Feb, 2019
4 commits
-
Update the error descriptions to match the latest documentation for
easier searching. In some cases the changes are small and in other cases
the changes may be total rewording of the description.

No functional changes.
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: linux-edac
Cc: Mauro Carvalho Chehab
Cc: Tony Luck
Cc: x86@kernel.org
Link: https://lkml.kernel.org/r/20190201225534.8177-5-Yazen.Ghannam@amd.com
-
Some SMCA bank types on future systems will report new error types even
though the bank type is not treated as a new version. These new error
types will be reported by bits that are reserved in past systems.

Add the new error descriptions to the lists in edac_mce_amd.
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Kees Cook
Cc: linux-edac
Cc: Mauro Carvalho Chehab
Cc: Shirish S
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: x86-ml
Link: https://lkml.kernel.org/r/20190201225534.8177-4-Yazen.Ghannam@amd.com
-
The existing CS, PSP, and SMU SMCA bank types will see new versions (as
indicated by their McaTypes) in future SMCA systems.

Add the new (HWID, MCATYPE) tuples for these new versions. Reuse the
same names as the older versions, since they are logically the same to
the user. SMCA systems won't mix and match IP blocks with different
McaType versions in the same system, so there isn't a need to
distinguish them. The MCA_IPID register is saved when logging an MCA
error, and that can be used to triage the error.

Also, add the new error descriptions to edac_mce_amd. Some error types
(positions in the list) are overloaded compared to the previous
McaTypes. Therefore, just create new lists of the error descriptions to
keep things simple even if some of the error descriptions are the same
between versions.

Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: Arnd Bergmann
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Kees Cook
Cc: linux-edac
Cc: Mauro Carvalho Chehab
Cc: Pu Wen
Cc: Qiuxu Zhuo
Cc: Shirish S
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vishal Verma
Cc: x86-ml
Link: https://lkml.kernel.org/r/20190201225534.8177-3-Yazen.Ghannam@amd.com
-
Add the (HWID, MCATYPE) tuples and names for the new MP5, NBIO, and
PCIE SMCA bank types.

Also, add their respective error descriptions to the MCE decoding module
edac_mce_amd.

Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: Arnd Bergmann
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Kees Cook
Cc: linux-edac
Cc: Mauro Carvalho Chehab
Cc: Pu Wen
Cc: Qiuxu Zhuo
Cc: Shirish S
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vishal Verma
Cc: x86-ml
Link: https://lkml.kernel.org/r/20190201225534.8177-2-Yazen.Ghannam@amd.com
28 Sep, 2018
1 commit
-
Add support for Hygon Dhyana CPU to EDAC.
Signed-off-by: Pu Wen
Signed-off-by: Borislav Petkov
Cc: mchehab@kernel.org
Cc: tglx@linutronix.de
Cc: mingo@redhat.com
Cc: hpa@zytor.com
Cc: thomas.lendacky@amd.com
Cc: linux-edac@vger.kernel.org
Link: https://lkml.kernel.org/r/9d71061301177822bc55b3bfd44f91057458d886.1537533369.git.puwen@hygon.cn
22 Feb, 2018
1 commit
-
Currently, bank 4 is reserved on Fam17h, so we chose not to initialize
bank 4 in the smca_banks array. This means that when we check if a bank
is initialized, like during boot or resume, we will see that bank 4 is
not initialized and try to initialize it.

This will cause a call trace, when resuming from suspend, due to
rdmsr_*on_cpu() calls in the init path. The rdmsr_*on_cpu() calls issue
an IPI but we're running with interrupts disabled. This triggers:

WARNING: CPU: 0 PID: 11523 at kernel/smp.c:291 smp_call_function_single+0xdc/0xe0
...

Reserved banks will be read-as-zero, so their MCA_IPID register will be
zero. So, like the smca_banks array, the threshold_banks array will not
have an entry for a reserved bank since all its MCA_MISC* registers will
be zero.

Enumerate a "Reserved" bank type that matches on a HWID_MCATYPE of 0,0.
Use the "Reserved" type when checking if a bank is reserved. It's
possible that other bank numbers may be reserved on future systems.

Don't try to find the block address on reserved banks.
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Cc: # 4.14.x
Cc: Borislav Petkov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: linux-edac
Link: http://lkml.kernel.org/r/20180221101900.10326-7-bp@alien8.de
Signed-off-by: Ingo Molnar
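A minimal sketch of the "Reserved" check follows. It assumes that the hardware ID sits in MCA_IPID bits 43:32 and the McaType in bits 63:48, and that the two are combined as (hwid << 16) | mcatype; those positions and the helper names ipid_to_hwid_mcatype() and bank_is_reserved() are assumptions made for this example, not something this commit states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Combine a hardware ID and a McaType into one comparable value. */
#define HWID_MCATYPE(hwid, mcatype) (((uint32_t)(hwid) << 16) | (uint32_t)(mcatype))

/* Extract the (HWID, MCATYPE) tuple from a raw MCA_IPID value. */
uint32_t ipid_to_hwid_mcatype(uint64_t ipid)
{
    uint32_t high = (uint32_t)(ipid >> 32);
    uint32_t hwid = high & 0xfff;              /* assumed: IPID bits 43:32 */
    uint32_t mcatype = (high >> 16) & 0xffff;  /* assumed: IPID bits 63:48 */

    return HWID_MCATYPE(hwid, mcatype);
}

/* A read-as-zero bank reports the tuple (0, 0), which marks it "Reserved". */
bool bank_is_reserved(uint64_t ipid)
{
    return ipid_to_hwid_mcatype(ipid) == HWID_MCATYPE(0, 0);
}
```

Code that walks the banks can then skip any bank for which bank_is_reserved() is true instead of hard-coding bank 4, which also covers other bank numbers that may be reserved on future systems.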
21 Aug, 2017
3 commits
-
... and use the macro for that.
No functionality change.
Signed-off-by: Borislav Petkov
-
struct mce.cpuid contains CPUID(1).EAX which contains family, model and
stepping and thus has enough information for our purposes. Thus get rid
of some external dependencies which are not really needed.

No functionality change.
Signed-off-by: Borislav Petkov
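For reference, family, model and stepping can be recovered from a CPUID(1).EAX value with the standard x86 encoding, as in the sketch below. struct cpu_sig and decode_cpuid1_eax() are names invented for this example; the kernel has its own helpers for this.

```c
#include <stdint.h>

struct cpu_sig {
    unsigned int family;
    unsigned int model;
    unsigned int stepping;
};

/* Decode a CPUID leaf 1 EAX signature into family, model and stepping. */
struct cpu_sig decode_cpuid1_eax(uint32_t eax)
{
    struct cpu_sig s;
    unsigned int base_family = (eax >> 8) & 0xf;

    s.stepping = eax & 0xf;
    s.model = (eax >> 4) & 0xf;
    s.family = base_family;

    /* The extended family (bits 27:20) is added when the base family is 0xf. */
    if (base_family == 0xf)
        s.family += (eax >> 20) & 0xff;

    /* The extended model (bits 19:16) supplies the upper model nibble. */
    if (s.family >= 0x6)
        s.model |= ((eax >> 16) & 0xf) << 4;

    return s;
}
```

A signature of 0x00800f11, for example, decodes to family 0x17, model 0x1, stepping 0x1, so a decoder that only has the saved signature can still make its family checks.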
-
Singular fits better because it decodes a single error.
No functionality change.
Signed-off-by: Borislav Petkov
17 Jul, 2017
1 commit
-
Using the homegrown amd_get_nb_id() to find a node ID on AMD was fine
while the L3 to node mapping was 1:1. And Zen topology broke this. So
let's start slowly moving away from it and use the topology interfaces
instead.

Signed-off-by: Yazen Ghannam
Cc: linux-edac
Cc: x86-ml
Link: http://lkml.kernel.org/r/1490041614-90057-2-git-send-email-Yazen.Ghannam@amd.com
[ Massage commit message. ]
Signed-off-by: Borislav Petkov
13 Jun, 2017
1 commit
-
Fix typo in "poison consumption" error description.
Signed-off-by: Yazen Ghannam
Cc: linux-edac
Link: http://lkml.kernel.org/r/1497286703-62853-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov
21 Feb, 2017
1 commit
-
Pull RAS updates from Ingo Molnar:
"The main changes in this cycle were:
- Assign notifier chain priorities for all RAS related handlers to
make the ordering explicit (Borislav Petkov)
- Improve the AMD MCA banks sysfs output (Yazen Ghannam)
- Various cleanups and restructuring of the x86 RAS code (Borislav
Petkov)"
* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ras, EDAC, acpi: Assign MCE notifier handlers a priority
x86/ras: Get rid of mce_process_work()
EDAC/mce/amd: Dump TSC value
EDAC/mce/amd: Unexport amd_decode_mce()
x86/ras/amd/inj: Change dependency
x86/ras: Flip the TSC-adding logic
x86/ras/amd: Make sysfs names of banks more user-friendly
x86/ras/therm_throt: Do not log a fake MCE for thermal events
x86/ras/inject: Make it depend on X86_LOCAL_APIC=y
16 Feb, 2017
1 commit
-
Currently, the IPID and Syndrome are printed on the same line as the
Address. There are cases when we can have a valid Syndrome but not a
valid Address.

For example, the MCA_SYND register can be used to hold more detailed
error info that the hardware folks can use. It's not just DRAM ECC
syndromes. There are some error types that aren't related to memory that
may have valid syndromes, like some errors related to links in the Data
Fabric, etc.

In these cases, the IPID and Syndrome are not printed at the same log
level as the rest of the stanza, so users won't see them on the console.

Console:
[Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
[Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Dmesg:
[Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
, Syndrome: 0x000000010b404000, IPID: 0x0001002e00000002
[Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Print the IPID first and on a new line. The IPID should always be
printed on SMCA systems. The Syndrome will then be printed with the IPID
and at the same log level when valid:

[Hardware Error]: CPU:16 (17:1:0) MC22_STATUS[Over|CE|MiscV|-|-|-|-|SyndV|-]: 0xd82000000002080b
[Hardware Error]: IPID: 0x0001002e00000002, Syndrome: 0x000000010b404000
[Hardware Error]: Power, Interrupts, etc. Extended Error Code: 2

Signed-off-by: Yazen Ghannam
Cc: linux-edac
Link: http://lkml.kernel.org/r/1487192182-2474-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov
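The new line layout can be mimicked in user space as below. format_ipid_line() is a hypothetical helper that builds the string with snprintf(); the real driver uses the kernel's printing facilities, and the log-level handling that motivated the change is not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build the IPID line: the IPID is always present, the Syndrome is
 * appended on the same line only when it is valid.
 * Returns the snprintf() result.
 */
int format_ipid_line(char *buf, size_t len, uint64_t ipid,
                     uint64_t synd, bool synd_valid)
{
    if (synd_valid)
        return snprintf(buf, len,
                        "[Hardware Error]: IPID: 0x%016llx, Syndrome: 0x%016llx",
                        (unsigned long long)ipid, (unsigned long long)synd);

    return snprintf(buf, len, "[Hardware Error]: IPID: 0x%016llx",
                    (unsigned long long)ipid);
}
```

Calling it with the values from the log above (IPID 0x0001002e00000002, Syndrome 0x000000010b404000, valid) reproduces the second line of the new output.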
28 Jan, 2017
1 commit
-
Users may not be familiar with the concept of deferred errors. There is
no action for users to take on this type of error, so give more context
in the error message to make this clearer.

Signed-off-by: Yazen Ghannam
Cc: linux-edac
Link: http://lkml.kernel.org/r/1485297149-13733-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov
24 Jan, 2017
3 commits
-
Assign all notifiers on the MCE decode chain a priority so that they get
called in the correct order.

Suggested-by: Thomas Gleixner
Signed-off-by: Borislav Petkov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Tony Luck
Cc: Yazen Ghannam
Cc: linux-edac
Link: http://lkml.kernel.org/r/20170123183514.13356-10-bp@alien8.de
Signed-off-by: Ingo Molnar
-
Dump the TSC value of the time when the MCE got logged.
Signed-off-by: Borislav Petkov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Yazen Ghannam
Cc: linux-edac
Link: http://lkml.kernel.org/r/20170123183514.13356-8-bp@alien8.de
Signed-off-by: Ingo Molnar
-
It is not used outside of the driver anymore.
Signed-off-by: Borislav Petkov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Yazen Ghannam
Cc: linux-edac
Link: http://lkml.kernel.org/r/20170123183514.13356-7-bp@alien8.de
Signed-off-by: Ingo Molnar
29 Nov, 2016
1 commit
-
MCA_STATUS[43] has been defined as "Poison" or "Reserved" for every bank
since Fam15h except for Fam15h, bank 4 in which case it's defined as
part of the McaStatSubCache bitfield.

Filter out that case.
Reported-by: Dean Liberty
Signed-off-by: Yazen Ghannam
Cc: Aravind Gopalakrishnan
Cc: linux-edac
Cc: x86-ml
Link: http://lkml.kernel.org/r/1479478222-19896-1-git-send-email-Yazen.Ghannam@amd.com
[ Split an almost unparseable ternary conditional, add a comment. ]
Signed-off-by: Borislav Petkov
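The exception described above amounts to a small predicate. The sketch below is a user-space illustration; status_reports_poison() is a hypothetical name and the surrounding decode logic is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

#define MCA_STATUS_POISON (1ULL << 43)

/*
 * Decide whether MCA_STATUS[43] may be decoded as "Poison".
 * The bit only has that meaning from family 0x15 on, and on family
 * 0x15 bank 4 it belongs to the McaStatSubCache bitfield instead.
 */
bool status_reports_poison(unsigned int family, unsigned int bank,
                           uint64_t status)
{
    if (family < 0x15)
        return false;

    if (family == 0x15 && bank == 4)
        return false;

    return (status & MCA_STATUS_POISON) != 0;
}
```

Splitting the condition into separate early returns keeps each exclusion readable, in the spirit of the maintainer note about breaking up the ternary.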
24 Nov, 2016
1 commit
-
tip:ras/core contains the respective Fam17h x86 RAS bits which
amd64_edac is going to use. So merge it into the EDAC branch.

Signed-off-by: Borislav Petkov
21 Nov, 2016
1 commit
-
nb_bus_decoder() is only used for DRAM ECC errors so rename it so that
the name is more generic and descriptive.

Also, call it for DRAM ECC errors on SMCA systems.

[ Boris: rename it to real function name with a verb in it. ]
Signed-off-by: Yazen Ghannam
Cc: Aravind Gopalakrishnan
Cc: linux-edac
Link: http://lkml.kernel.org/r/1479423463-8536-4-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov
09 Nov, 2016
3 commits
-
Add accessor functions and hide the smca_names array. Also, add a
sanity-check to bank HWID assignment in get_smca_bank_info().

Signed-off-by: Borislav Petkov
Link: http://lkml.kernel.org/r/20161104152317.5r276t35df53qk76@pd.tnic
Signed-off-by: Thomas Gleixner
-
Make it differ more from struct smca_bank_name for better readability.
Signed-off-by: Borislav Petkov
Tested-by: Yazen Ghannam
Link: http://lkml.kernel.org/r/20161103125556.15482-3-bp@alien8.de
Signed-off-by: Thomas Gleixner
-
Call it simply smca_hwid and call local variables "hwid". More readable.
Signed-off-by: Borislav Petkov
Tested-by: Yazen Ghannam
Link: http://lkml.kernel.org/r/20161103125556.15482-2-bp@alien8.de
Signed-off-by: Thomas Gleixner
13 Sep, 2016
1 commit
-
Bank 4 is reserved on family 0x17 and shouldn't generate any MCE
records. However, broken hardware and software is not something unheard
of so warn about bank 4 errors. They shouldn't be coming from bank 4
naturally but users can still use mce_amd_inj to simulate errors from it
for testing purposes.

Also, avoid special handling in the injector mce_amd_inj like it is
being done on the older families.

[ bp: Rewrite commit message and merge into one patch. Use boot_cpu_data. ]
Signed-off-by: Yazen Ghannam
Signed-off-by: Borislav Petkov
Reviewed-by: Aravind Gopalakrishnan
Link: http://lkml.kernel.org/r/1473384591-5323-1-git-send-email-Yazen.Ghannam@amd.com
Link: http://lkml.kernel.org/r/1473384591-5323-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner