02 Sep, 2020

4 commits

  • Implementation of ORC requires some definitions that are currently
    provided by the target architecture headers. Do not depend on these
    definitions when the orc subcommand is not implemented.

    This avoids requiring arches with no orc implementation to provide dummy
    orc definitions.

    Signed-off-by: Julien Thierry
    Reviewed-by: Miroslav Benes
    Signed-off-by: Josh Poimboeuf

    Julien Thierry
     
  • Orc generation is only done for text sections, but some instructions
    can be found in non-text sections (e.g. .discard.text sections).

    Skip setting the orc section for such instructions, since their
    containing sections will be skipped for orc generation anyway.

    Reviewed-by: Miroslav Benes
    Signed-off-by: Julien Thierry
    Signed-off-by: Josh Poimboeuf

    Julien Thierry
     
  • Now that the objtool_file can be obtained outside of the check function,
    the orc generation builtin no longer requires check to explicitly call
    its orc-related functions.

    Signed-off-by: Julien Thierry
    Reviewed-by: Miroslav Benes
    Signed-off-by: Josh Poimboeuf

    Julien Thierry
     
  • The objtool_file structure can be used by different subcommands. In fact
    it already is, by check and orc.

    Provide a function that initializes the objtool_file, which builtins
    can call without relying on check to do the correct setup for them and
    explicitly hand the objtool_file to them.

    Reviewed-by: Miroslav Benes
    Signed-off-by: Julien Thierry
    Signed-off-by: Josh Poimboeuf

    Julien Thierry
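
    The entry above describes a shared initializer for objtool_file. A
    minimal sketch of what such a helper could look like follows; the
    function name, header names and setup details are illustrative
    assumptions, not objtool's actual code.

        /* Hypothetical sketch: one entry point that any objtool builtin can
         * call to obtain the shared objtool_file, instead of relying on
         * check() to set it up and hand it over. Headers are approximate. */
        #include <fcntl.h>
        #include <linux/hashtable.h>
        #include <linux/list.h>

        #include "objtool.h"
        #include "elf.h"

        struct objtool_file *objtool_file_open(const char *objname)
        {
                static struct objtool_file file;

                file.elf = elf_open_read(objname, O_RDWR);
                if (!file.elf)
                        return NULL;

                INIT_LIST_HEAD(&file.insn_list);
                hash_init(file.insn_hash);

                return &file;
        }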
     

01 Sep, 2020

18 commits

  • Replace many of the indirect calls with static_call().

    The average PMI time, as measured by perf_sample_event_took()*:

    PRE: 3283.03 [ns]
    POST: 3145.12 [ns]

    Which is a ~138 [ns] win per PMI, or a ~4.2% decrease.

    [*] on an IVB-EP, using: 'perf record -a -e cycles -- make O=defconfig-build/ -j80'

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Cc: Linus Torvalds
    Link: https://lore.kernel.org/r/20200818135805.338001015@infradead.org

    Peter Zijlstra
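
    The entry above converts perf's PMU dispatch from indirect calls to
    static_call(). A hedged sketch of the shape of that conversion; the
    method-table struct, handler names and init hook are simplified
    assumptions rather than the exact arch/x86/events code.

        #include <linux/init.h>
        #include <linux/ptrace.h>
        #include <linux/static_call.h>

        /* Assumed for illustration: a PMU method table and default handler. */
        struct pmu_ops { int (*handle_irq)(struct pt_regs *regs); };
        static struct pmu_ops x86_pmu;
        static int pmu_handle_irq_default(struct pt_regs *regs) { return 0; }

        /* Declare the call with its default target ... */
        DEFINE_STATIC_CALL(x86_pmu_handle_irq, pmu_handle_irq_default);

        /* ... retarget it once, when the real PMU driver is selected ... */
        static void __init pmu_static_call_init(void)
        {
                static_call_update(x86_pmu_handle_irq, x86_pmu.handle_irq);
        }

        /* ... so the PMI hot path is a direct, patched call instead of an
         * indirect call (or retpoline) through x86_pmu.handle_irq. */
        static int pmu_dispatch_nmi(struct pt_regs *regs)
        {
                return static_call(x86_pmu_handle_irq)(regs);
        }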
     
  • Currently the tracepoint site will iterate a vector and issue indirect
    calls to however many handlers are registered (i.e. however long the
    vector is).

    Using static_call() it is possible to optimize this for the common
    case of only having a single handler registered. In this case the
    static_call() can directly call this handler. Otherwise, if the vector
    is longer than 1, call a function that iterates the whole vector like
    the current code.

    [peterz: updated to new interface]

    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Cc: Linus Torvalds
    Link: https://lore.kernel.org/r/20200818135805.279421092@infradead.org

    Steven Rostedt (VMware)
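
    The optimization above keys the dispatch on the number of registered
    handlers. A hedged model of that policy in plain C; the real logic lives
    in the tracepoint macros and registration code, not in helpers like
    these.

        #include <linux/static_call.h>

        struct tp_func {
                void (*func)(void *data, int arg);
                void *data;
        };

        /* Default target: walk the (NULL-terminated) handler vector, as the
         * current code does for every tracepoint hit. */
        static void tp_iterate(void *priv, int arg)
        {
                struct tp_func *funcs = priv;

                for (; funcs->func; funcs++)
                        funcs->func(funcs->data, arg);
        }

        DEFINE_STATIC_CALL(tp_dispatch, tp_iterate);

        /* On (un)registration: with exactly one handler, aim the static call
         * straight at it; otherwise fall back to the iterator. */
        static void tp_update_dispatch(struct tp_func *funcs, int nr)
        {
                if (nr == 1)
                        static_call_update(tp_dispatch, funcs[0].func);
                else
                        static_call_update(tp_dispatch, tp_iterate);
        }

        /* Tracepoint site: the common single-handler case becomes one direct
         * call, carrying that handler's own data pointer. */
        static void tp_fire(struct tp_func *funcs, int nr, int arg)
        {
                static_call(tp_dispatch)(nr == 1 ? funcs[0].data : funcs, arg);
        }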
     
  • In order to use static_call() to wire up x86_pmu, we need to
    initialize earlier, specifically before memory allocation works; copy
    some of the tricks from jump_label to enable this.

    Primarily we overload key->next to store a sites pointer when there
    are no modules; this avoids having to use kmalloc() to initialize the
    sites and allows us to run much earlier.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Reviewed-by: Steven Rostedt (VMware)
    Link: https://lore.kernel.org/r/20200818135805.220737930@infradead.org

    Peter Zijlstra
     
  • Verify the text we're about to change is as we expect it to be.

    Requested-by: Steven Rostedt

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lore.kernel.org/r/20200818135805.161974981@infradead.org

    Peter Zijlstra
     
  • GCC can turn our static_call(name)(args...) into a tail call, in which
    case we get a JMP.d32 into the trampoline (which then does a further
    tail-call).

    Teach objtool to recognise and mark these in .static_call_sites and
    adjust the code patching to deal with this.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Cc: Linus Torvalds
    Link: https://lore.kernel.org/r/20200818135805.101186767@infradead.org

    Peter Zijlstra
     
  • Extend the static_call infrastructure to optimize the following common
    pattern:

    if (func_ptr)
            func_ptr(args...)

    For the trampoline (which is in effect a tail-call), we patch the
    JMP.d32 into a RET, which then directly consumes the trampoline call.

    For the in-line sites we replace the CALL with a NOP5.

    NOTE: this is 'obviously' limited to functions with a 'void' return type.

    NOTE: DEFINE_STATIC_COND_CALL() only requires a typename, as opposed
    to a full function.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Cc: Linus Torvalds
    Link: https://lore.kernel.org/r/20200818135805.042977182@infradead.org

    Peter Zijlstra
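
    A sketch of the pattern this extends, with heavy caveats: the
    DEFINE_STATIC_COND_CALL() name comes from the text above, but its exact
    arguments and the call-site/update forms shown here are assumptions for
    illustration only.

        #include <linux/static_call.h>

        /* The pattern being optimized: a NULL-checked, void-returning hook. */
        static void (*on_event_hook)(int cpu);

        static void fire_event_old(int cpu)
        {
                if (on_event_hook)
                        on_event_hook(cpu);
        }

        /* Conditional static call version (assumed spelling). With no target
         * installed, the out-of-line trampoline is a bare RET and the inline
         * call site a NOP5, so the "if (func_ptr)" test vanishes from the
         * fast path. */
        typedef void (*on_event_fn)(int cpu);
        DEFINE_STATIC_COND_CALL(on_event, on_event_fn);

        static void fire_event_new(int cpu)
        {
                static_call(on_event)(cpu); /* no-op until a target is set */
        }

        static void install_event_hook(on_event_fn fn)
        {
                static_call_update(on_event, fn);
        }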
     
  • Future patches will need to poke a RET instruction; provide the
    infrastructure required for this.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Reviewed-by: Steven Rostedt (VMware)
    Cc: Masami Hiramatsu
    Link: https://lore.kernel.org/r/20200818135804.982214828@infradead.org

    Peter Zijlstra
     
  • Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lore.kernel.org/r/20200818135804.922581202@infradead.org

    Peter Zijlstra
     
  • Add the inline static call implementation for x86-64. The generated code
    is identical to the out-of-line case, except we move the trampoline into
    its own section.

    Objtool uses the trampoline naming convention to detect all the call
    sites. It then annotates those call sites in the .static_call_sites
    section.

    During boot (and module init), the call sites are patched to call
    directly into the destination function. The temporary trampoline is
    then no longer used.

    [peterz: merged trampolines, put trampoline in section]

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Cc: Linus Torvalds
    Link: https://lore.kernel.org/r/20200818135804.864271425@infradead.org

    Josh Poimboeuf
     
  • Add the x86 out-of-line static call implementation. For each key, a
    permanent trampoline is created which is the destination for all static
    calls for the given key. The trampoline has a direct jump which gets
    patched by static_call_update() when the destination function changes.

    [peterz: fixed trampoline, rewrote patching code]

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Cc: Linus Torvalds
    Link: https://lore.kernel.org/r/20200818135804.804315175@infradead.org

    Josh Poimboeuf
     
  • Similar to how we disallow kprobes on any other dynamic text
    (ftrace/jump_label), also disallow kprobes on inline static_call()s.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Link: https://lore.kernel.org/r/20200818135804.744920586@infradead.org

    Peter Zijlstra
     
  • Add infrastructure for an arch-specific CONFIG_HAVE_STATIC_CALL_INLINE
    option, which is a faster version of CONFIG_HAVE_STATIC_CALL. At
    runtime, the static call sites are patched directly, rather than using
    the out-of-line trampolines.

    Compared to out-of-line static calls, the performance benefits are more
    modest, but still measurable. Steven Rostedt did some tracepoint
    measurements:

    https://lkml.kernel.org/r/20181126155405.72b4f718@gandalf.local.home

    This code is heavily inspired by the jump label code (aka "static
    jumps"), as some of the concepts are very similar.

    For more details, see the comments in include/linux/static_call.h.

    [peterz: simplified interface; merged trampolines]

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Reviewed-by: Steven Rostedt (VMware)
    Cc: Linus Torvalds
    Link: https://lore.kernel.org/r/20200818135804.684334440@infradead.org

    Josh Poimboeuf
     
  • Static calls are a replacement for global function pointers. They use
    code patching to allow direct calls to be used instead of indirect
    calls. They give the flexibility of function pointers, but with
    improved performance. This is especially important for cases where
    retpolines would otherwise be used, as retpolines can significantly
    impact performance.

    The concept and code are an extension of previous work done by Ard
    Biesheuvel and Steven Rostedt:

    https://lkml.kernel.org/r/20181005081333.15018-1-ard.biesheuvel@linaro.org
    https://lkml.kernel.org/r/20181006015110.653946300@goodmis.org

    There are two implementations, depending on arch support:

    1) out-of-line: patched trampolines (CONFIG_HAVE_STATIC_CALL)
    2) basic function pointers

    For more details, see the comments in include/linux/static_call.h.

    [peterz: simplified interface]

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Reviewed-by: Steven Rostedt (VMware)
    Cc: Linus Torvalds
    Link: https://lore.kernel.org/r/20200818135804.623259796@infradead.org

    Josh Poimboeuf
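
    The usage model documented in include/linux/static_call.h boils down to
    define, call, update. A minimal sketch with illustrative names, assuming
    two functions with identical prototypes:

        #include <linux/static_call.h>

        static int func_a(int arg1, int arg2)
        {
                return arg1 + arg2;
        }

        static int func_b(int arg1, int arg2)
        {
                return arg1 * arg2;
        }

        /* Define a 'my_key' static call, initially targeting func_a(). */
        DEFINE_STATIC_CALL(my_key, func_a);

        static int example(void)
        {
                int r;

                /* Direct call to func_a(): a patched call site or a patched
                 * trampoline, depending on arch support. */
                r = static_call(my_key)(2, 3);

                /* Retarget the key; every call site now reaches func_b(). */
                static_call_update(my_key, &func_b);

                return r + static_call(my_key)(2, 3);
        }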
     
  • The __ADDRESSABLE() macro uses the __LINE__ macro to create a temporary
    symbol which has a unique name. However, if the macro is used multiple
    times from within another macro, the line number will always be the
    same, resulting in duplicate symbols.

    Make the temporary symbols truly unique by using __UNIQUE_ID instead of
    __LINE__.

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Acked-by: Ard Biesheuvel
    Link: https://lore.kernel.org/r/20200818135804.564436253@infradead.org

    Josh Poimboeuf
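
    A small userspace illustration (not the kernel macros themselves) of why
    __LINE__-based names collide when nested inside another macro, while a
    counter-based scheme like the one __UNIQUE_ID relies on stays unique:

        #define PASTE_(a, b)    a##b
        #define PASTE(a, b)     PASTE_(a, b)

        /* __LINE__-based "unique" symbol: fine when used on distinct lines... */
        #define DECL_BY_LINE()  static int PASTE(sym_, __LINE__)

        /* ...but a wrapper macro expands all of its uses on ONE line, so both
         * copies get the same name and the build fails with a redefinition
         * error:
         *
         *      #define DECL_TWO()      DECL_BY_LINE(); DECL_BY_LINE()
         *      DECL_TWO();
         */

        /* A counter-based name (__UNIQUE_ID builds on __COUNTER__) stays
         * unique even within a single expansion. */
        #define DECL_UNIQUE()   static int PASTE(sym_, __COUNTER__)
        #define DECL_TWO_OK()   DECL_UNIQUE(); DECL_UNIQUE()

        DECL_TWO_OK();  /* two distinct symbols, e.g. sym_0 and sym_1 */

        int main(void)
        {
                return 0;
        }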
     
  • Nothing ensures the module exists while we're iterating
    mod->jump_entries in __jump_label_mod_text_reserved(); take a module
    reference to ensure the module sticks around.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Reviewed-by: Steven Rostedt (VMware)
    Link: https://lore.kernel.org/r/20200818135804.504501338@infradead.org

    Peter Zijlstra
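
    A hedged sketch of the fix's pattern: pin the module with a reference
    before walking its per-module tables so it cannot be unloaded
    mid-iteration. The overlap helper is assumed; this is not the literal
    __jump_label_mod_text_reserved() code.

        #include <linux/jump_label.h>
        #include <linux/module.h>

        /* Assumed helper: does [text_start, text_end) hit any jump entry? */
        int jump_entries_overlap(struct jump_entry *start, struct jump_entry *stop,
                                 void *text_start, void *text_end);

        static int mod_text_reserved(struct module *mod, void *start, void *end)
        {
                int ret;

                /* Without a reference the module (and mod->jump_entries)
                 * could be freed while we iterate; try_module_get() fails if
                 * the module is already going away, in which case there is
                 * nothing left to reserve. */
                if (!try_module_get(mod))
                        return 0;

                ret = jump_entries_overlap(mod->jump_entries,
                                           mod->jump_entries + mod->num_jump_entries,
                                           start, end);

                module_put(mod);
                return ret;
        }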
     
  • Now that notifiers are unbroken, use the proper interface to handle
    notifier errors and propagate them.

    There were already MODULE_STATE_COMING notifiers that failed; notably:

    - jump_label_module_notifier()
    - tracepoint_module_notify()
    - bpf_event_notify()

    By propagating this error, we fix those users.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Reviewed-by: Miroslav Benes
    Acked-by: Jessica Yu
    Acked-by: Josh Poimboeuf
    Link: https://lore.kernel.org/r/20200818135804.444372853@infradead.org

    Peter Zijlstra
     
  • While auditing all module notifiers I noticed a whole bunch of fail
    wrt the return value. Notifiers have 'special' return semantics.

    As is, NOTIFY_DONE vs NOTIFY_OK is a bit vague, but
    notifier_from_errno(0) results in NOTIFY_OK, and NOTIFY_DONE has a
    comment that says "Don't care".

    From this I've used NOTIFY_DONE when the function completely ignores
    the callback and notifier_to_errno() isn't used.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Reviewed-by: Mathieu Desnoyers
    Reviewed-by: Joel Fernandes (Google)
    Reviewed-by: Robert Richter
    Acked-by: Steven Rostedt (VMware)
    Link: https://lore.kernel.org/r/20200818135804.385360407@infradead.org

    Peter Zijlstra
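
    A hedged example of the convention described: NOTIFY_DONE for "this
    event is not for me", NOTIFY_OK for "handled successfully", and
    notifier_from_errno() for a real failure. The setup helper is assumed.

        #include <linux/errno.h>
        #include <linux/module.h>
        #include <linux/notifier.h>

        static int setup_for_module(struct module *mod) { return 0; } /* assumed */

        static int my_module_notify(struct notifier_block *nb,
                                    unsigned long action, void *data)
        {
                struct module *mod = data;
                int err;

                if (action != MODULE_STATE_COMING)
                        return NOTIFY_DONE;      /* not interested: "don't care" */

                err = setup_for_module(mod);
                if (err)
                        return notifier_from_errno(err); /* propagate failure */

                return NOTIFY_OK;                /* handled successfully */
        }

        static struct notifier_block my_module_nb = {
                .notifier_call = my_module_notify,
        };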
     
  • The current notifiers have the following error handling pattern all
    over the place:

    int err, nr;

    err = __foo_notifier_call_chain(&chain, val_up, v, -1, &nr);
    if (err & NOTIFIER_STOP_MASK)
            __foo_notifier_call_chain(&chain, val_down, v, nr-1, NULL)

    And aside from the endless repetition thereof, it is broken. Consider
    blocking notifiers; both calls take and drop the rwsem, this means
    that the notifier list can change in between the two calls, making @nr
    meaningless.

    Fix this by replacing all the __foo_notifier_call_chain() functions
    with foo_notifier_call_chain_robust() that embeds the above pattern,
    but ensures it is inside a single lock region.

    Note: I switched atomic_notifier_call_chain_robust() to use
    the spinlock, since RCU cannot provide the guarantee
    required for the recovery.

    Note: software_resume() error handling was broken afaict.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Acked-by: Rafael J. Wysocki
    Link: https://lore.kernel.org/r/20200818135804.325626653@infradead.org

    Peter Zijlstra
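
    A hedged before/after sketch of the replacement: the _robust() variant
    runs the 'up' calls and any 'down' rollback inside a single lock region,
    so the partial-progress count can never go stale between the two walks.

        #include <linux/notifier.h>

        static BLOCKING_NOTIFIER_HEAD(chain);

        /* Old, racy shape: two separate chain walks, each taking and dropping
         * the chain's rwsem, so the list can change in between and @nr is
         * meaningless by the time the rollback walk runs:
         *
         *      err = __blocking_notifier_call_chain(&chain, val_up, v, -1, &nr);
         *      if (err & NOTIFY_STOP_MASK)
         *              __blocking_notifier_call_chain(&chain, val_down, v,
         *                                             nr - 1, NULL);
         */

        /* New shape: one call, one lock region, rollback handled internally. */
        static int notify_up(unsigned long val_up, unsigned long val_down, void *v)
        {
                int err;

                err = blocking_notifier_call_chain_robust(&chain, val_up,
                                                          val_down, v);
                return notifier_to_errno(err);
        }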
     

31 Aug, 2020

12 commits

  • Linus Torvalds
     
  • Pull crypto fixes from Herbert Xu:

    - fix regression in af_alg that affects iwd

    - restore polling delay in qat

    - fix double free in ingenic on error path

    - fix potential build failure in sa2ul due to missing Kconfig dependency

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
    crypto: af_alg - Work around empty control messages without MSG_MORE
    crypto: sa2ul - add Kconfig selects to fix build error
    crypto: ingenic - Drop kfree for memory allocated with devm_kzalloc
    crypto: qat - add delay before polling mailbox

    Linus Torvalds
     
  • Pull x86 fixes from Thomas Gleixner:
    "Three interrupt related fixes for X86:

    - Move disabling of the local APIC after invoking fixup_irqs() to
    ensure that interrupts which are incoming are noted in the IRR and
    not ignored.

    - Unbreak affinity setting.

    The rework of the entry code reused the regular exception entry
    code for device interrupts. The vector number is pushed into the
    errorcode slot on the stack which is then lifted into an argument
    and set to -1 because that's regs->orig_ax which is used in quite
    some places to check whether the entry came from a syscall.

    But it was overlooked that orig_ax is used in the affinity cleanup
    code to validate whether the interrupt has arrived on the new
    target. It turned out that this vector check is pointless because
    interrupts are never moved from one vector to another on the same
    CPU. That check is a historical leftover from the time when x86
    supported multi-CPU affinities, but is no longer needed with the now
    strict single CPU affinity. Famous last words ...

    - Add a missing check for an empty cpumask into the matrix allocator.

    The affinity change added a warning to catch the case where an
    interrupt is moved on the same CPU to a different vector. This
    triggers because a request with an empty cpumask still gets an
    assignment from the allocator, as the allocator uses for_each_cpu()
    without checking the cpumask for being empty. The historical
    inconsistent for_each_cpu() behaviour of ignoring the cpumask and
    unconditionally claiming that CPU0 is in the mask struck again.
    Sigh.

    plus a new entry into the MAINTAINER file for the HPE/UV platform"

    * tag 'x86-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    genirq/matrix: Deal with the sillyness of for_each_cpu() on UP
    x86/irq: Unbreak interrupt affinity setting
    x86/hotplug: Silence APIC only after all interrupts are migrated
    MAINTAINERS: Add entry for HPE Superdome Flex (UV) maintainers

    Linus Torvalds
     
  • Pull irq fixes from Thomas Gleixner:
    "A set of fixes for interrupt chip drivers:

    - Revert the platform driver conversion of interrupt chip drivers as
    it turned out to create more problems than it solves.

    - Fix a trivial typo in the new module helpers which made probing
    reliably fail.

    - Small fixes in the STM32 and MIPS Ingenic drivers

    - The TI firmware rework which had badly managed dependencies and had
    to wait post rc1"

    * tag 'irq-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    irqchip/ingenic: Leave parent IRQ unmasked on suspend
    irqchip/stm32-exti: Avoid losing interrupts due to clearing pending bits by mistake
    irqchip: Revert modular support for drivers using IRQCHIP_PLATFORM_DRIVER helperse
    irqchip: Fix probing deferal when using IRQCHIP_PLATFORM_DRIVER helpers
    arm64: dts: k3-am65: Update the RM resource types
    arm64: dts: k3-am65: ti-sci-inta/intr: Update to latest bindings
    arm64: dts: k3-j721e: ti-sci-inta/intr: Update to latest bindings
    irqchip/ti-sci-inta: Add support for INTA directly connecting to GIC
    irqchip/ti-sci-inta: Do not store TISCI device id in platform device id field
    dt-bindings: irqchip: Convert ti,sci-inta bindings to yaml
    dt-bindings: irqchip: ti,sci-inta: Update docs to support different parent.
    irqchip/ti-sci-intr: Add support for INTR being a parent to INTR
    dt-bindings: irqchip: Convert ti,sci-intr bindings to yaml
    dt-bindings: irqchip: ti,sci-intr: Update bindings to drop the usage of gic as parent
    firmware: ti_sci: Add support for getting resource with subtype
    firmware: ti_sci: Drop unused structure ti_sci_rm_type_map
    firmware: ti_sci: Drop the device id to resource type translation

    Linus Torvalds
     
  • Pull scheduler fix from Thomas Gleixner:
    "A single fix for the scheduler:

    - Make is_idle_task() __always_inline to prevent the compiler from
    putting it out of line into the wrong section because it's used
    inside noinstr sections"

    * tag 'sched-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched: Use __always_inline on is_idle_task()

    Linus Torvalds
     
  • Pull locking fixes from Thomas Gleixner:
    "A set of fixes for lockdep, tracing and RCU:

    - Prevent recursion by using raw_cpu_* operations

    - Fixup the interrupt state in the cpu idle code to be consistent

    - Push rcu_idle_enter/exit() invocations deeper into the idle path so
    that the lock operations are inside the RCU watching sections

    - Move trace_cpu_idle() into generic code so it's called before RCU
    goes idle.

    - Handle raw_local_irq* vs. local_irq* operations correctly

    - Move the tracepoints out from under the lockdep recursion handling
    which turned out to be fragile and inconsistent"

    * tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    lockdep,trace: Expose tracepoints
    lockdep: Only trace IRQ edges
    mips: Implement arch_irqs_disabled()
    arm64: Implement arch_irqs_disabled()
    nds32: Implement arch_irqs_disabled()
    locking/lockdep: Cleanup
    x86/entry: Remove unused THUNKs
    cpuidle: Move trace_cpu_idle() into generic code
    cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic
    sched,idle,rcu: Push rcu_idle deeper into the idle path
    cpuidle: Fixup IRQ state
    lockdep: Use raw_cpu_*() for per-cpu variables

    Linus Torvalds
     
  • Pull cifs fix from Steve French:
    "DFS fix for referral problem when using SMB1"

    * tag '5.9-rc2-smb-fix' of git://git.samba.org/sfrench/cifs-2.6:
    cifs: fix check of tcon dfs in smb1

    Linus Torvalds
     
  • Pull powerpc fixes from Michael Ellerman:

    - Revert our removal of PROT_SAO; at least one user expressed an
    interest in using it on Power9. Instead don't allow it to be used in
    guests unless enabled explicitly at compile time.

    - A fix for a crash introduced by a recent change to FP handling.

    - Revert a change to our idle code that left Power10 with no idle
    support.

    - One minor fix for the new scv system call path to set PPR.

    - Fix a crash in our "generic" PMU if branch stack events were enabled.

    - A fix for the IMC PMU, to correctly identify host kernel samples.

    - The ADB_PMU powermac code was found to be incompatible with
    VMAP_STACK, so make them incompatible in Kconfig until the code can
    be fixed.

    - A build fix in drivers/video/fbdev/controlfb.c, and a documentation
    fix.

    Thanks to Alexey Kardashevskiy, Athira Rajeev, Christophe Leroy,
    Giuseppe Sacco, Madhavan Srinivasan, Milton Miller, Nicholas Piggin,
    Pratik Rajesh Sampat, Randy Dunlap, Shawn Anastasio, Vaidyanathan
    Srinivasan.

    * tag 'powerpc-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    powerpc/32s: Disable VMAP stack which CONFIG_ADB_PMU
    Revert "powerpc/powernv/idle: Replace CPU feature check with PVR check"
    powerpc/perf: Fix reading of MSR[HV/PR] bits in trace-imc
    powerpc/perf: Fix crashes with generic_compat_pmu & BHRB
    powerpc/64s: Fix crash in load_fp_state() due to fpexc_mode
    powerpc/64s: scv entry should set PPR
    Documentation/powerpc: fix malformed table in syscall64-abi
    video: fbdev: controlfb: Fix build for COMPILE_TEST=y && PPC_PMAC=n
    selftests/powerpc: Update PROT_SAO test to skip ISA 3.1
    powerpc/64s: Disallow PROT_SAO in LPARs by default
    Revert "powerpc/64s: Remove PROT_SAO support"

    Linus Torvalds
     
  • Pull USB fixes from Greg KH:
    "Let's try this again... Here are some USB fixes for 5.9-rc3.

    This differs from the previous pull request for this release in that
    the usb gadget patch now does not break some systems, and actually
    does what it was intended to do. Many thanks to Marek Szyprowski for
    quickly noticing and testing the patch from Andy Shevchenko to resolve
    this issue.

    Additionally, some more new USB quirks have been added to get some new
    devices to work properly based on user reports.

    Other than that, the patches are all here, and they contain:

    - usb gadget driver fixes

    - xhci driver fixes

    - typec fixes

    - new quirks and ids

    - fixes for USB patches that went into 5.9-rc1.

    All of these have been tested in linux-next with no reported issues"

    * tag 'usb-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (33 commits)
    usb: storage: Add unusual_uas entry for Sony PSZ drives
    USB: Ignore UAS for JMicron JMS567 ATA/ATAPI Bridge
    usb: host: ohci-exynos: Fix error handling in exynos_ohci_probe()
    USB: gadget: u_f: Unbreak offset calculation in VLAs
    USB: quirks: Ignore duplicate endpoint on Sound Devices MixPre-D
    usb: typec: tcpm: Fix Fix source hard reset response for TDA 2.3.1.1 and TDA 2.3.1.2 failures
    USB: PHY: JZ4770: Fix static checker warning.
    USB: gadget: f_ncm: add bounds checks to ncm_unwrap_ntb()
    USB: gadget: u_f: add overflow checks to VLA macros
    xhci: Always restore EP_SOFT_CLEAR_TOGGLE even if ep reset failed
    xhci: Do warm-reset when both CAS and XDEV_RESUME are set
    usb: host: xhci: fix ep context print mismatch in debugfs
    usb: uas: Add quirk for PNY Pro Elite
    tools: usb: move to tools buildsystem
    USB: Fix device driver race
    USB: Also match device drivers using the ->match vfunc
    usb: host: xhci-tegra: fix tegra_xusb_get_phy()
    usb: host: xhci-tegra: otg usb2/usb3 port init
    usb: hcd: Fix use after free in usb_hcd_pci_remove()
    usb: typec: ucsi: Hold con->lock for the entire duration of ucsi_register_port()
    ...

    Linus Torvalds
     
  • Pull EDAC fix from Borislav Petkov:
    "A fix to properly clear ghes_edac driver state on driver remove so
    that a subsequent load can probe the system properly (Shiju Jose)"

    * tag 'edac_urgent_for_v5.9_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
    EDAC/ghes: Fix NULL pointer dereference in ghes_edac_register()

    Linus Torvalds
     
  • Pull dma-mapping fix from Christoph Hellwig:
    "Fix a possibly uninitialized variable (Dan Carpenter)"

    * tag 'dma-mapping-5.9-2' of git://git.infradead.org/users/hch/dma-mapping:
    dma-pool: Fix an uninitialized variable bug in atomic_pool_expand()

    Linus Torvalds
     
  • Most of the CPU mask operations behave the same way, but for_each_cpu()
    and its variants ignore the cpumask argument and claim that CPU0 is
    always in the mask. This is historical, inconsistent and annoying
    behaviour.

    The matrix allocator uses for_each_cpu() and can be called on UP with an
    empty cpumask. The calling code does not expect that this succeeds but
    until commit e027fffff799 ("x86/irq: Unbreak interrupt affinity setting")
    this went unnoticed. That commit added a WARN_ON() to catch cases which
    move an interrupt from one vector to another on the same CPU. The warning
    triggers on UP.

    Add a check for the cpumask being empty to prevent this.

    Fixes: 2f75d9e1c905 ("genirq: Implement bitmap matrix allocator")
    Reported-by: kernel test robot
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org

    Thomas Gleixner
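
    A hedged sketch of the guard described above; the surrounding function
    is illustrative, not the matrix allocator's exact code.

        #include <linux/cpumask.h>
        #include <linux/kernel.h>

        /* Bail out before for_each_cpu() ever runs: on UP builds the iterator
         * historically ignores the mask and claims CPU0 is always set. */
        static unsigned int pick_best_cpu(const struct cpumask *msk)
        {
                unsigned int cpu, best = UINT_MAX;

                if (cpumask_empty(msk))
                        return UINT_MAX;        /* nothing to allocate from */

                for_each_cpu(cpu, msk) {
                        /* ...compare per-CPU load, remember the best one... */
                        best = cpu;
                }
                return best;
        }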
     

30 Aug, 2020

2 commits

  • …el/git/gustavoars/linux

    Pull fallthrough fixes from Gustavo A. R. Silva:
    "Fix some minor issues introduced by the recent treewide fallthrough
    conversions:

    - Fix identation issue

    - Fix erroneous fallthrough annotation

    - Remove unnecessary fallthrough annotation

    - Fix code comment changed by fallthrough conversion"

    * tag 'fallthrough-fixes-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
    arm64/cpuinfo: Remove unnecessary fallthrough annotation
    media: dib0700: Fix identation issue in dib8096_set_param_override()
    afs: Remove erroneous fallthough annotation
    iio: dpot-dac: fix code comment in dpot_dac_read_raw()

    Linus Torvalds
     
  • Commit ef91bb196b0d ("kernel.h: Silence sparse warning in
    lower_32_bits") caused new warnings to show in the fsldma driver, but
    that commit was not to blame: it only exposed some very incorrect code
    that tried to take the low 32 bits of an address.

    That made no sense for multiple reasons, the most notable one being that
    that code was intentionally limited to only 32-bit ppc builds, so "only
    low 32 bits of an address" was completely nonsensical. There were no
    high bits to mask off to begin with.

    But even more importantly from a correctness standpoint, turning the
    address into an integer then caused the subsequent address arithmetic to
    be completely wrong too, and the "+1" actually incremented the address
    by one, rather than by four.

    Which again was incorrect, since the code was reading two 32-bit values
    and trying to make a 64-bit end result of it all. Surprisingly, the
    iowrite64() did not suffer from the same odd and incorrect model.

    This code has never worked, but it's questionable whether anybody cared:
    of the two users that actually read the 64-bit value (by way of some C
    preprocessor hackery and eventually the 'get_cdar()' inline function),
    one of them explicitly ignored the value, and the other one might just
    happen to work despite the incorrect value being read.

    This patch at least makes it not fail the build any more, and makes the
    logic superficially sane. Whether it makes any difference to the code
    _working_ or not shall remain a mystery.

    Compile-tested-by: Guenter Roeck
    Reviewed-by: Guenter Roeck
    Signed-off-by: Linus Torvalds

    Linus Torvalds
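
    A hedged sketch of the corrected idea: keep the register address as a
    u32 __iomem pointer so that "+ 1" advances by four bytes, and combine
    the two 32-bit halves explicitly. Illustrative only, not the driver's
    literal code.

        #include <linux/io.h>
        #include <linux/types.h>

        /* Read a big-endian 64-bit register as two 32-bit MMIO accesses.
         * Because @addr is a u32 __iomem *, "addr + 1" points 4 bytes further
         * on; casting the address to an integer first (as the broken code
         * did) would make "+ 1" advance by a single byte. */
        static u64 fsl_read64be_sketch(u32 __iomem *addr)
        {
                u64 hi = ioread32be(addr);
                u64 lo = ioread32be(addr + 1);

                return (hi << 32) | lo;
        }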