04 Oct, 2016

1 commit

  • Pull RAS updates from Ingo Molnar:
    "The main changes were:

    - Lots of enhancements for AMD SMCA (Scalable MCA
    features/extensions) systems: extract, decode and print more
    hardware error information and add matching support on the
    injection/testing side as well. (Yazen Ghannam)

    - Various MCE handling improvements on modern Intel Xeons. (Tony
    Luck)

    - Plus misc fixes and enhancements"

    * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
    x86/RAS/mce_amd_inj: Remove debugfs dir recursively on exit
    x86/RAS/mce_amd_inj: Fix signed wrap around when decrementing index 'i'
    x86/RAS/mce_amd_inj: Fix some W= warnings
    x86/MCE/AMD, EDAC: Handle reserved bank 4 on Fam17h properly
    x86/mce/AMD: Extract the error address on SMCA systems
    x86/mce, EDAC/mce_amd: Print MCA_SYND and MCA_IPID during MCE on SMCA systems
    x86/mce/AMD: Save MCA_IPID in MCE struct on SMCA systems
    x86/mce/AMD: Ensure the deferred error interrupt is of type APIC on SMCA systems
    x86/mce/AMD: Update sysfs bank names for SMCA systems
    x86/mce/AMD, EDAC/mce_amd: Define and use tables for known SMCA IP types
    EDAC/mce_amd: Use SMCA prefix for error descriptions arrays
    EDAC/mce_amd: Add missing SMCA error descriptions
    x86/mce/AMD: Read MSRs on the CPU allocating the threshold blocks
    x86/RAS: Add syndrome support to mce_amd_inj
    EDAC/mce_amd: Print syndrome register value on SMCA systems
    x86/mce: Add support for new MCA_SYND register
    x86/mce/AMD: Use msr_ops.misc() in allocate_threshold_blocks()
    x86/mce: Drop X86_FEATURE_MCE_RECOVERY and the related model string test
    x86/mce: Improve memcpy_mcsafe()
    x86/mce: Add PCI quirks to identify Xeons with machine check recovery
    ...

    Linus Torvalds
     

07 Sep, 2016

1 commit

  • The static key API is currently designed around single variable
    definitions. There are cases where an array of static keys is desirable,
    so extend the API to allow this rather than using the internal static
    key implementation directly.
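
    As a rough sketch of the intended usage (the DEFINE_STATIC_KEY_ARRAY_*
    macro names are an assumption based on this description, not confirmed
    by the log):

        #include <linux/jump_label.h>

        /* an array of 4 keys, all initialised false */
        DEFINE_STATIC_KEY_ARRAY_FALSE(feature_keys, 4);

        static bool feature_enabled(int i)
        {
                /* each element behaves like an ordinary static key */
                return static_branch_unlikely(&feature_keys[i]);
        }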

    Cc: Jason Baron
    Cc: Jonathan Corbet
    Acked-by: Peter Zijlstra (Intel)
    Suggested-by: Dave P Martin
    Signed-off-by: Catalin Marinas
    Signed-off-by: Will Deacon

    Catalin Marinas
     

05 Sep, 2016

1 commit

  • We will need to provide declarations of static keys in header
    files. Provide DECLARE_STATIC_KEY_{TRUE,FALSE} macros.
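
    A minimal sketch of the declaration/definition split this enables (the
    key name is hypothetical):

        /* in a header, visible to all users */
        DECLARE_STATIC_KEY_FALSE(my_feature_key);

        /* in exactly one .c file */
        DEFINE_STATIC_KEY_FALSE(my_feature_key);

        /* at a use site */
        if (static_branch_unlikely(&my_feature_key))
                do_feature_work();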

    Signed-off-by: Tony Luck
    Acked-by: Borislav Petkov
    Cc: Peter Zijlstra
    Cc: Dan Williams
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/816881cf85bd3cf13385d212882618f38a3b5d33.1472754711.git.tony.luck@intel.com
    Signed-off-by: Thomas Gleixner

    Tony Luck
     

04 Aug, 2016

1 commit

  • The current jump_label.h includes bug.h for things such as WARN_ON().
    This makes the header problematic for inclusion by kernel.h or any
    headers that kernel.h includes, since bug.h includes kernel.h (circular
    dependency). The inclusion of atomic.h is similarly problematic. Thus,
    this should make jump_label.h 'includable' from most places.

    Link: http://lkml.kernel.org/r/7060ce35ddd0d20b33bf170685e6b0fab816bdf2.1467837322.git.jbaron@akamai.com
    Signed-off-by: Jason Baron
    Cc: "David S. Miller"
    Cc: Arnd Bergmann
    Cc: Benjamin Herrenschmidt
    Cc: Chris Metcalf
    Cc: Heiko Carstens
    Cc: Joe Perches
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jason Baron
     

24 Jun, 2016

1 commit

  • The following scenario is possible:

    CPU 1                                   CPU 2
    static_key_slow_inc()
     atomic_inc_not_zero()
      -> key.enabled == 0, no increment
     jump_label_lock()
     atomic_inc_return()
      -> key.enabled == 1 now
                                            static_key_slow_inc()
                                             atomic_inc_not_zero()
                                              -> key.enabled == 1, inc to 2
                                             return
                                            ** static key is wrong!
     jump_label_update()
     jump_label_unlock()

    Testing the static key at the point marked by (**) will follow the
    wrong path for jumps that have not been patched yet. This can
    actually happen when creating many KVM virtual machines with userspace
    LAPIC emulation; just run several copies of the following program:

    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/kvm.h>

    int main(void)
    {
            for (;;) {
                    int kvmfd = open("/dev/kvm", O_RDONLY);
                    int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);

                    close(ioctl(vmfd, KVM_CREATE_VCPU, 1));
                    close(vmfd);
                    close(kvmfd);
            }
            return 0;
    }

    Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc() call.
    The static key's purpose is to skip NULL pointer checks and indeed one
    of the processes eventually dereferences NULL.

    As explained in the commit that introduced the bug:

    706249c222f6 ("locking/static_keys: Rework update logic")

    jump_label_update() needs key.enabled to be true. The solution adopted
    here is to temporarily set key.enabled == -1, and to go down the slow
    path when key.enabled <= 0.
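
    A sketch of the resulting slow-path logic, reconstructed from this
    description (treat the details as approximate rather than the exact
    patch):

        void static_key_slow_inc(struct static_key *key)
        {
                int v, v1;

                /* fast path: bump the count only while the key is
                 * fully enabled (enabled > 0) */
                for (v = atomic_read(&key->enabled); v > 0; v = v1) {
                        v1 = atomic_cmpxchg(&key->enabled, v, v + 1);
                        if (likely(v1 == v))
                                return;
                }

                jump_label_lock();
                if (atomic_read(&key->enabled) == 0) {
                        /* -1 marks "patching in progress": concurrent
                         * callers fail the cmpxchg above and serialize
                         * on jump_label_lock() until the update is done */
                        atomic_set(&key->enabled, -1);
                        jump_label_update(key);
                        atomic_set(&key->enabled, 1);
                } else {
                        atomic_inc(&key->enabled);
                }
                jump_label_unlock();
        }
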
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: # v4.3+
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: 706249c222f6 ("locking/static_keys: Rework update logic")
    Link: http://lkml.kernel.org/r/1466527937-69798-1-git-send-email-pbonzini@redhat.com
    [ Small stylistic edits to the changelog and the code. ]
    Signed-off-by: Ingo Molnar

    Paolo Bonzini
     

23 Nov, 2015

1 commit

  • There were still a number of references to my old Red Hat email
    address in the kernel source. Remove these while keeping the
    Red Hat copyright notices intact.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

06 Nov, 2015

1 commit

  • Pull cgroup updates from Tejun Heo:
    "The cgroup core saw several significant updates this cycle:

    - percpu_rwsem for threadgroup locking is reinstated. This was
    temporarily dropped due to down_write latency issues. Oleg's
    rework of percpu_rwsem, which is scheduled to be merged in this
    merge window, resolves the issue.

    - On the v2 hierarchy, when controllers are enabled and disabled, all
    operations are atomic and can fail and revert cleanly. This allows
    ->can_attach() failure which is necessary for cpu RT slices.

    - Tasks now stay associated with the original cgroups after exit
    until released. This allows tracking resources held by zombies
    (e.g. pids) and makes it easy to find out where zombies came from
    on the v2 hierarchy. The pids controller was broken before these
    changes as zombies escaped the limits; unfortunately, updating this
    behavior required too many invasive changes and I don't think it's
    a good idea to backport them, so the pids controller on 4.3, the
    first version which included the pids controller, will stay broken
    at least until I'm sure about the cgroup core changes.

    - Optimization of a couple common tests using static_key"

    * 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (38 commits)
    cgroup: fix race condition around termination check in css_task_iter_next()
    blkcg: don't create "io.stat" on the root cgroup
    cgroup: drop cgroup__DEVEL__legacy_files_on_dfl
    cgroup: replace error handling in cgroup_init() with WARN_ON()s
    cgroup: add cgroup_subsys->free() method and use it to fix pids controller
    cgroup: keep zombies associated with their original cgroups
    cgroup: make css_set_rwsem a spinlock and rename it to css_set_lock
    cgroup: don't hold css_set_rwsem across css task iteration
    cgroup: reorganize css_task_iter functions
    cgroup: factor out css_set_move_task()
    cgroup: keep css_set and task lists in chronological order
    cgroup: make cgroup_destroy_locked() test cgroup_is_populated()
    cgroup: make css_sets pin the associated cgroups
    cgroup: relocate cgroup_[try]get/put()
    cgroup: move check_for_release() invocation
    cgroup: replace cgroup_has_tasks() with cgroup_is_populated()
    cgroup: make cgroup->nr_populated count the number of populated css_sets
    cgroup: remove an unused parameter from cgroup_task_migrate()
    cgroup: fix too early usage of static_branch_disable()
    cgroup: make cgroup_update_dfl_csses() migrate all target processes atomically
    ...

    Linus Torvalds
     

18 Sep, 2015

1 commit


15 Sep, 2015

1 commit


08 Sep, 2015

1 commit

  • Commit:

    412758cb2670 ("jump label, locking/static_keys: Update docs")

    introduced a typo that might as well get fixed.

    Signed-off-by: Jonathan Corbet
    Cc: Andrew Morton
    Cc: Jason Baron
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20150907131803.54c027e1@lwn.net
    Signed-off-by: Ingo Molnar

    Jonathan Corbet
     

03 Aug, 2015

5 commits

  • Signed-off-by: Jason Baron
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: benh@kernel.crashing.org
    Cc: bp@alien8.de
    Cc: davem@davemloft.net
    Cc: ddaney@caviumnetworks.com
    Cc: heiko.carstens@de.ibm.com
    Cc: linux-kernel@vger.kernel.org
    Cc: liuj97@gmail.com
    Cc: luto@amacapital.net
    Cc: michael@ellerman.id.au
    Cc: rabin@rab.in
    Cc: ralf@linux-mips.org
    Cc: rostedt@goodmis.org
    Cc: vbabka@suse.cz
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/6b50f2f6423a2244f37f4b1d2d6c211b9dcdf4f8.1438227999.git.jbaron@akamai.com
    Signed-off-by: Ingo Molnar

    Jason Baron
     
  • There are various problems and shortcomings with the current
    static_key interface:

    - static_key_{true,false}() read like a branch depending on the key
    value, instead of the actual likely/unlikely branch depending on the
    init value.

    - static_key_{true,false}() are, as stated above, tied to the
    static_key init values STATIC_KEY_INIT_{TRUE,FALSE}.

    - we're limited to the 2 (out of 4) possible options that compile to
    a default NOP because that's what our arch_static_branch() assembly
    emits.

    So provide a new static_key interface:

    DEFINE_STATIC_KEY_TRUE(name);
    DEFINE_STATIC_KEY_FALSE(name);

    These define keys of two different types with an initial true/false
    value.

    Then allow:

    static_branch_likely()
    static_branch_unlikely()

    to take a key of either type and emit the right instruction for the
    case.

    This means adding a second arch_static_branch_jump() assembly helper
    which emits a JMP per default.

    In order to determine the right instruction for the right state,
    encode the branch type in the LSB of jump_entry::key.

    This is the final step in removing the naming confusion that has led to
    a stream of avoidable bugs such as:

    a833581e372a ("x86, perf: Fix static_key bug in load_mm_cr4()")

    ... but it also allows new static key combinations that will give us
    performance enhancements in the subsequent patches.
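
    Putting the new interface together (an illustrative sketch using the
    names above):

        DEFINE_STATIC_KEY_TRUE(fast_path_key);
        DEFINE_STATIC_KEY_FALSE(debug_key);

        /* initially-true key: compiles to the straight-line likely path */
        if (static_branch_likely(&fast_path_key))
                do_fast_path();

        /* initially-false key: compiles to a NOP until the key is enabled */
        if (static_branch_unlikely(&debug_key))
                do_debug_checks();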

    Tested-by: Rabin Vincent # arm
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Michael Ellerman # ppc
    Acked-by: Heiko Carstens # s390
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Add two helpers to make it easier to treat the refcount as boolean.
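
    These are presumably the static_key_enable()/static_key_disable()
    helpers; a sketch of their likely shape:

        static inline void static_key_enable(struct static_key *key)
        {
                int count = static_key_count(key);

                /* a boolean key should only ever hold a count of 0 or 1 */
                WARN_ON_ONCE(count < 0 || count > 1);

                if (!count)
                        static_key_slow_inc(key);
        }

        static inline void static_key_disable(struct static_key *key)
        {
                int count = static_key_count(key);

                WARN_ON_ONCE(count < 0 || count > 1);

                if (count)
                        static_key_slow_dec(key);
        }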

    Suggested-by: Jason Baron
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • … the static_key* pattern

    Rename the JUMP_LABEL_TYPE_* macros to be JUMP_TYPE_* and move the
    inline helpers into kernel/jump_label.c, since that's the only place
    they're ever used.

    Also rename the helpers where it's all about static keys.

    This is the second step in removing the naming confusion that has led to
    a stream of avoidable bugs such as:

    a833581e372a ("x86, perf: Fix static_key bug in load_mm_cr4()")

    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Peter Zijlstra
     
  • Since we've already stepped away from 'ENABLE is a JMP and DISABLE is
    a NOP' with the branch_default bits, and are going to make it even
    worse, rename things to make it all clearer.

    This way we don't mix multiple levels of logic attributes, but have a
    plain 'physical' name for what the current instruction patching status
    of a jump label is.

    This is a first step in removing the naming confusion that has led to
    a stream of avoidable bugs such as:

    a833581e372a ("x86, perf: Fix static_key bug in load_mm_cr4()")
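
    The rename is presumably from the logical JUMP_LABEL_{EN,DIS}ABLE to
    'physical' names describing the patched instruction; a sketch:

        enum jump_label_type {
                JUMP_LABEL_NOP = 0,     /* the site currently holds a NOP */
                JUMP_LABEL_JMP,         /* the site currently holds a JMP */
        };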

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    [ Beefed up the changelog. ]
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

09 Apr, 2015

1 commit

  • To use jump labels in assembly we need the HAVE_JUMP_LABEL
    define, so we select a fallback version if the toolchain does
    not support them.

    Modify linux/jump_label.h so it can be included by assembly
    files. We also need to add -DCC_HAVE_ASM_GOTO to KBUILD_AFLAGS.
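
    A sketch of the shape of such a change (the exact guards are assumed):

        /* include/linux/jump_label.h */
        #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
        #define HAVE_JUMP_LABEL
        #endif

        #ifndef __ASSEMBLY__
        /* struct static_key, static_key_false(), ... : C-only parts */
        #endif /* __ASSEMBLY__ */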

    Signed-off-by: Anton Blanchard
    Acked-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: benh@kernel.crashing.org
    Cc: catalin.marinas@arm.com
    Cc: davem@davemloft.net
    Cc: heiko.carstens@de.ibm.com
    Cc: jbaron@akamai.com
    Cc: linux@arm.linux.org.uk
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: liuj97@gmail.com
    Cc: mgorman@suse.de
    Cc: mmarek@suse.cz
    Cc: mpe@ellerman.id.au
    Cc: paulus@samba.org
    Cc: ralf@linux-mips.org
    Cc: rostedt@goodmis.org
    Cc: schwidefsky@de.ibm.com
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/1428551492-21977-2-git-send-email-anton@samba.org
    Signed-off-by: Ingo Molnar

    Anton Blanchard
     

10 Aug, 2014

1 commit

  • Was reading through the documentation of this code and noticed
    a few typos, missing commas, etc.

    Cc: Jason Baron
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Borislav Petkov
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Mel Gorman
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

05 Jun, 2014

1 commit

  • This patch exposes the jump_label reference count in preparation for the
    next patch. cpusets cares about both the jump_label being enabled and how
    many users of the cpusets there currently are.
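
    The exposed helper is likely a one-line accessor along these lines:

        static inline int static_key_count(struct static_key *key)
        {
                return atomic_read(&key->enabled);
        }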

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mel Gorman
    Cc: Johannes Weiner
    Cc: Vlastimil Babka
    Cc: Jan Kara
    Cc: Michal Hocko
    Cc: Hugh Dickins
    Cc: Dave Hansen
    Cc: Theodore Ts'o
    Cc: "Paul E. McKenney"
    Cc: Oleg Nesterov
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

08 Jan, 2014

1 commit


13 Nov, 2013

2 commits

  • Pull networking updates from David Miller:

    1) The addition of nftables. No longer will we need protocol-aware
    firewall filtering modules; it can all live in userspace.

    At the core of nftables is a, for lack of a better term, virtual
    machine that executes byte codes to inspect packets or metadata
    (arriving interface index, etc.) and make verdict decisions.

    Besides support for loading packet contents and comparing them, the
    interpreter supports lookups in various data structures as
    fundamental operations. For example, sets are supported, and
    therefore one could create a set of whitelisted IP address entries
    which have ACCEPT verdicts attached to them, and use the appropriate
    byte codes to do such lookups.

    Since the interpreted code is composed in userspace, userspace can
    do things like optimize the ruleset before giving it to the kernel.

    Another major improvement is the capability of atomically updating
    portions of the ruleset. In the existing netfilter implementation,
    one has to update the entire rule set in order to make a change and
    this is very expensive.

    Userspace tools exist to create nftables rules using existing
    netfilter rule sets, but both kernel implementations will need to
    co-exist for quite some time as we transition from the old to the
    new stuff.

    Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have
    worked so hard on this.

    2) Daniel Borkmann and Hannes Frederic Sowa made several improvements
    to our pseudo-random number generator, mostly used for things like
    UDP port randomization and netfilter, amongst other things.

    In particular the taus88 generator is updated to taus113, and test
    cases are added.

    3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet
    and Yang Yingliang.

    4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin
    Sujir.

    5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet,
    Neal Cardwell, and Yuchung Cheng.

    6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary
    control message data, much like other socket option attributes.
    From Francesco Fusco.

    7) Allow applications to specify a cap on the rate computed
    automatically by the kernel for pacing flows, via a new
    SO_MAX_PACING_RATE socket option. From Eric Dumazet.

    8) Make the initial autotuned send buffer sizing in TCP more closely
    reflect actual needs, from Eric Dumazet.

    9) Currently early socket demux only happens for TCP sockets, but we
    can do it for connected UDP sockets too. Implementation from Shawn
    Bohrer.

    10) Refactor inet socket demux with the goal of improving hash demux
    performance for listening sockets, the main goals being the ability to
    use RCU lookups even on request sockets and the elimination of
    listening-lock contention. From Eric Dumazet.

    11) The bonding layer has many demuxes in its fast path, and an RCU
    conversion was started back in 3.11, several changes here extend the
    RCU usage to even more locations. From Ding Tianhong and Wang
    Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav
    Falico.

    12) Allow stacking of segmentation offloads, in particular to allow
    segmentation offloading over tunnels. From Eric Dumazet.

    13) Significantly improve the handling of secret keys we input into the
    various hash functions in the inet hashtables, TCP fast open, as
    well as syncookies. From Hannes Frederic Sowa. The key fundamental
    operation is "net_get_random_once()" which uses static keys.

    Hannes even extended this to ipv4/ipv6 fragmentation handling and
    our generic flow dissector.

    14) The generic driver layer takes care now to set the driver data to
    NULL on device removal, so it's no longer necessary for drivers to
    explicitly set it to NULL any more. Many drivers have been cleaned
    up in this way, from Jingoo Han.

    15) Add a BPF based packet scheduler classifier, from Daniel Borkmann.

    16) Improve CRC32 interfaces and generic SKB checksum iterators so that
    SCTP's checksumming can more cleanly be handled. Also from Daniel
    Borkmann.

    17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces
    using the interface MTU value. This helps avoid PMTU attacks,
    particularly on DNS servers. From Hannes Frederic Sowa.

    18) Use generic XPS for transmit queue steering rather than internal
    (re-)implementation in virtio-net. From Jason Wang.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
    random32: add test cases for taus113 implementation
    random32: upgrade taus88 generator to taus113 from errata paper
    random32: move rnd_state to linux/random.h
    random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized
    random32: add periodic reseeding
    random32: fix off-by-one in seeding requirement
    PHY: Add RTL8201CP phy_driver to realtek
    xtsonic: add missing platform_set_drvdata() in xtsonic_probe()
    macmace: add missing platform_set_drvdata() in mace_probe()
    ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe()
    ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh
    vlan: Implement vlan_dev_get_egress_qos_mask as an inline.
    ixgbe: add warning when max_vfs is out of range.
    igb: Update link modes display in ethtool
    netfilter: push reasm skb through instead of original frag skbs
    ip6_output: fragment outgoing reassembled skb properly
    MAINTAINERS: mv643xx_eth: take over maintainership from Lennart
    net_sched: tbf: support of 64bit rates
    ixgbe: deleting dfwd stations out of order can cause null ptr deref
    ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS
    ...

    Linus Torvalds
     
  • if (unlikely(x) > 0) doesn't seem to help branch prediction
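
    The fix is presumably moving the comparison inside the hint, along
    these lines (illustrative):

        /* before: unlikely() covers only the value, not the compare */
        if (unlikely(atomic_read(&key->enabled)) > 0)
                return true;

        /* after: the whole predicate carries the hint */
        if (unlikely(atomic_read(&key->enabled) > 0))
                return true;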

    Signed-off-by: Roel Kluin
    Cc: Raghavendra K T
    Cc: Konrad Rzeszutek Wilk
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roel Kluin
     

20 Oct, 2013

1 commit

  • The static key primitives for toggling a branch must not be used
    before jump_label_init() is called from init/main.c. jump_label_init()
    reorganizes and wires up the jump_entries, so usage before that could
    have unforeseen consequences.

    The following primitives are now checked for correct use:
    * static_key_slow_inc
    * static_key_slow_dec
    * static_key_slow_dec_deferred
    * jump_label_rate_limit

    The x86 architecture already checks this by testing whether the
    default_nop was already replaced with an optimal nop or with a branch
    instruction, and panics in that case. Other architectures don't check
    for this.

    Because the x86 check needs to be relaxed to allow code to transition
    from default_nop to the enabled state, and because other architectures
    did not check for this at all, this patch introduces the check on the
    static_key primitives in an arch-independent manner.

    All checked functions are considered slow-path so the additional check
    does no harm to performance.

    The warnings are best observed with earlyprintk.

    Based on a patch from Andi Kleen.
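
    The check presumably takes the shape of a WARN on a global
    initialization flag (names assumed):

        bool static_key_initialized;

        #define STATIC_KEY_CHECK_USE()                                  \
                WARN(!static_key_initialized,                           \
                     "%s(): used before call to jump_label_init()",     \
                     __func__)

        void static_key_slow_inc(struct static_key *key)
        {
                STATIC_KEY_CHECK_USE();
                /* ... */
        }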

    Cc: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Andi Kleen
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

09 Aug, 2013

1 commit

  • Commit b202952075f6 ("perf, core: Rate limit
    perf_sched_events jump_label patching") introduced rate limiting
    for jump label disabling. The changes were made in the jump label code
    in order to be more widely available and to keep things tidier. This is
    all fine, except now jump_label.h includes linux/workqueue.h, which
    makes it impossible to include jump_label.h from anything that
    workqueue.h needs. For example, it's now impossible to include
    jump_label.h from asm/spinlock.h, which is done in proposed
    pv-ticketlock patches. This patch splits out the rate limiting related
    changes from jump_label.h into a new file, jump_label_ratelimit.h, to
    resolve the issue.
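
    After the split, users of the deferred variants include the new header
    explicitly (sketch):

        #include <linux/jump_label_ratelimit.h>

        static struct static_key_deferred perf_sched_events;

        /* defer expensive re-patching by at least HZ jiffies */
        jump_label_rate_limit(&perf_sched_events, HZ);
        static_key_slow_dec_deferred(&perf_sched_events);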

    Signed-off-by: Andrew Jones
    Link: http://lkml.kernel.org/r/1376058122-8248-10-git-send-email-raghavendra.kt@linux.vnet.ibm.com
    Reviewed-by: Konrad Rzeszutek Wilk
    Signed-off-by: Raghavendra K T
    Acked-by: Ingo Molnar
    Signed-off-by: H. Peter Anvin

    Andrew Jones
     

06 Jul, 2012

1 commit

  • Remove the obsolete static_branch() interface, since the supported interface
    is now static_key_false()/true() - which is used by all in-tree code.

    See commit:

    c5905afb0e ("static keys: Introduce 'struct static_key', static_key_true()/false()
    and static_key_slow_[inc|dec]()").
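
    The conversion for any remaining users is mechanical (illustrative):

        /* old, removed interface */
        if (static_branch(&key))
                do_unlikely_work();

        /* supported replacement */
        if (static_key_false(&key))
                do_unlikely_work();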

    Signed-off-by: Jason Baron
    Cc: rostedt@goodmis.org
    Cc: mathieu.desnoyers@efficios.com
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/199332c47eef8005d5a5bf1018a80d25929a5746.1340909155.git.jbaron@redhat.com
    Signed-off-by: Ingo Molnar

    Jason Baron
     

29 Feb, 2012

1 commit

  • In the jump label enabled case, calling static_key_enabled()
    results in a function call. The function returns the results of
    a compare, so it really doesn't need the overhead of a full
    function call. Let's make it 'static inline' for both the jump
    label enabled and disabled cases.
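
    Presumably the resulting helper is a one-line inline:

        static inline bool static_key_enabled(struct static_key *key)
        {
                return atomic_read(&key->enabled) > 0;
        }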

    Signed-off-by: Jason Baron
    Cc: a.p.zijlstra@chello.nl
    Cc: rostedt@goodmis.org
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/201202281849.q1SIn1p2023270@int-mx10.intmail.prod.int.phx2.redhat.com
    Signed-off-by: Ingo Molnar

    Jason Baron
     

24 Feb, 2012

1 commit

  • …_key_slow_[inc|dec]()

    So here's a boot tested patch on top of Jason's series that does
    all the cleanups I talked about and turns jump labels into a
    more intuitive to use facility. It should also address the
    various misconceptions and confusions that surround jump labels.

    Typical usage scenarios:

        #include <linux/static_key.h>

        struct static_key key = STATIC_KEY_INIT_TRUE;

        if (static_key_false(&key))
                do unlikely code
        else
                do likely code

    Or:

        if (static_key_true(&key))
                do likely code
        else
                do unlikely code

    The static key is modified via:

        static_key_slow_inc(&key);
        ...
        static_key_slow_dec(&key);

    The 'slow' prefix makes it abundantly clear that this is an
    expensive operation.

    I've updated all in-kernel code to use this everywhere. Note
    that I (intentionally) have not pushed through the rename
    blindly through to the lowest levels: the actual jump-label
    patching arch facility should be named like that, so we want to
    decouple jump labels from the static-key facility a bit.

    On non-jump-label enabled architectures static keys default to
    likely()/unlikely() branches.

    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    Acked-by: Jason Baron <jbaron@redhat.com>
    Acked-by: Steven Rostedt <rostedt@goodmis.org>
    Cc: a.p.zijlstra@chello.nl
    Cc: mathieu.desnoyers@efficios.com
    Cc: davem@davemloft.net
    Cc: ddaney.cavm@gmail.com
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.hu
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

    Ingo Molnar
     

27 Jan, 2012

1 commit

  • akpm figured we could do with a blurb explaining what static_branch()
    is and why it lives...

    Grumpily-requested-by: Andrew Morton
    Cc: Jason Baron
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-h02wu6kabpoojxf03wke704k@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

07 Dec, 2011

1 commit

  • Provide two initializers for jump_label_key that initialize it enabled
    or disabled. Also modify all jump_label code to allow for jump_labels to be
    initialized enabled.
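
    A sketch of how the initializers would be used (the JUMP_LABEL_INIT_*
    names are an assumption from this description):

        struct jump_label_key on_key = JUMP_LABEL_INIT_TRUE;
        struct jump_label_key off_key = JUMP_LABEL_INIT_FALSE;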

    Signed-off-by: Peter Zijlstra
    Cc: Jason Baron
    Link: http://lkml.kernel.org/n/tip-p40e3yj21b68y03z1yv825e7@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

06 Dec, 2011

1 commit

  • jump_label patching is a very expensive operation that involves pausing
    all cpus. The patching of the perf_sched_events jump_label is easily
    controllable from userspace by an unprivileged user.

    When the user runs a loop like this:

    "while true; do perf stat -e cycles true; done"

    ... the performance of my test application that just increments a counter
    for one second drops by 4%.

    This is on a 16 cpu box with my test application using only one of
    them. An impact on a real server doing real work will be worse.

    Performance of the KVM PMU drops nearly 50% due to jump_label for "perf
    record", since the KVM PMU implementation creates and destroys perf
    events frequently.

    This patch introduces a way to rate limit jump_label patching and uses
    it to fix the above problem.

    I believe that as jump_label use spreads, the problem will become more
    common, and thus solving it in generic code is appropriate. Fixing it
    in the perf code instead would mean moving the jump_label accounting
    logic into perf, with all the ifdefs needed for the JUMP_LABEL=n case.
    With this patch, all the details are nicely hidden inside the
    jump_label code.
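
    The interface introduced is, as far as can be reconstructed, a deferred
    key plus a rate-limit setter (the pre-static_key-era names here are an
    assumption):

        struct jump_label_key_deferred perf_sched_events;

        /* patch at most once per HZ jiffies */
        jump_label_rate_limit(&perf_sched_events, HZ);
        jump_label_dec_deferred(&perf_sched_events);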

    Signed-off-by: Gleb Natapov
    Acked-by: Jason Baron
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20111127155909.GO2557@redhat.com
    Signed-off-by: Ingo Molnar

    Gleb Natapov
     

26 Oct, 2011

4 commits


27 Jul, 2011

1 commit

  • This allows us to move duplicated code in <asm/atomic.h>
    (atomic_inc_not_zero() for now) to <linux/atomic.h>.
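
    Callers then need only the generic header (sketch):

        #include <linux/atomic.h>

        static atomic_t refcount = ATOMIC_INIT(1);

        /* take a reference only if the object is still live */
        if (atomic_inc_not_zero(&refcount))
                use_object();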

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

05 Apr, 2011

1 commit

  • Introduce:

    static __always_inline bool static_branch(struct jump_label_key *key);

    instead of the old JUMP_LABEL(key, label) macro.

    In this way, jump labels become really easy to use:

    Define:

        struct jump_label_key jump_key;

    Can be used as:

        if (static_branch(&jump_key))
                do unlikely code

    enable/disable via:

        jump_label_inc(&jump_key);
        jump_label_dec(&jump_key);

    that's it!

    For the jump labels disabled case, the static_branch() becomes an
    atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(),
    atomic_dec() operations. We show testing results for this change below.

    Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct.

    Since we now require a 'struct jump_label_key *key', we can store a pointer into
    the jump table addresses. In this way, we can enable/disable jump labels, in
    basically constant time. This change allows us to completely remove the previous
    hashtable scheme. Thanks to Peter Zijlstra for this re-write.

    Testing:

    I ran a series of 'tbench 20' runs 5 times (with reboots) for 3
    configurations, where tracepoints were disabled.

    jump label configured in
    avg: 815.6

    jump label *not* configured in (using atomic reads)
    avg: 800.1

    jump label *not* configured in (regular reads)
    avg: 803.4

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Jason Baron
    Suggested-by: H. Peter Anvin
    Tested-by: David Daney
    Acked-by: Ralf Baechle
    Acked-by: David S. Miller
    Acked-by: Mathieu Desnoyers
    Signed-off-by: Steven Rostedt

    Jason Baron
     

30 Oct, 2010

1 commit

  • On i386 (not x86_64) early implementations of gcc would have a bug
    with asm goto causing it to produce code like the following:

    (This was noticed by Peter Zijlstra)

        56 pushl 0
        67 nopl         jmp 0x6f
           popl
           jmp 0x8c

        6f mov
           test
           je 0x8c

        8c mov
           call *(%esp)

    The jump added in the asm goto skipped over the popl that matched
    the pushl 0, which led to a quick crash of the system when
    the jump was enabled. The nopl is defined in the asm goto () statement
    and when tracepoints are enabled, the nop changes to a jump to the label
    that was specified by the asm goto. asm goto is supposed to tell gcc that
    the code in the asm might jump to an external label. Here gcc obviously
    fails to make that work.

    The bug report for gcc is here:

    http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46226

    The bug only appears on x86 when not compiled with
    -maccumulate-outgoing-args. This option is always set on x86_64 and it
    is also the workaround for a function graph tracer i386 bug.
    (See commit: 746357d6a526)
    This explains why the bug only showed up on i386 when function graph
    tracer was not enabled.

    This patch now adds a CONFIG_JUMP_LABEL option that is default
    off instead of using jump labels by default. When jump labels are
    enabled, the -maccumulate-outgoing-args will be used (causing a
    slightly larger kernel image on i386). This option will exist
    until we have a way to detect if the gcc compiler in use is safe
    to use on all configurations without the workaround.

    Note, there exists such a test, but for now we will keep the enabling
    of jump label as a manual option.

    Archs that know the compiler is safe with asm goto, may choose to
    select JUMP_LABEL and enable it by default.

    Reported-by: Ingo Molnar
    Cause-discovered-by: Peter Zijlstra
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Jason Baron
    Cc: H. Peter Anvin
    Cc: David Daney
    Cc: Mathieu Desnoyers
    Cc: Masami Hiramatsu
    Cc: David Miller
    Cc: Richard Henderson
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

28 Oct, 2010

1 commit

  • register_kprobe() downs the 'text_mutex' and then calls
    jump_label_text_reserved(), which downs the 'jump_label_mutex'.
    However, the jump label code takes those mutexes in the reverse
    order.

    Fix this by requiring the caller of jump_label_text_reserved() to do
    the jump label locking via the newly added jump_label_lock() and
    jump_label_unlock(). Currently, kprobes is the only user
    of jump_label_text_reserved().
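
    A sketch of the resulting lock ordering in register_kprobe()
    (reconstructed; details approximate):

        jump_label_lock();              /* jump_label_mutex first */
        mutex_lock(&text_mutex);        /* then text_mutex */

        if (jump_label_text_reserved(p->addr, p->addr)) {
                ret = -EINVAL;
                goto out;
        }
        /* ... actual registration ... */
out:
        mutex_unlock(&text_mutex);
        jump_label_unlock();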

    Reported-by: Ingo Molnar
    Acked-by: Masami Hiramatsu
    Signed-off-by: Jason Baron
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jason Baron
     

19 Oct, 2010

2 commits


23 Sep, 2010

1 commit