10 May, 2007

3 commits

  • Since nonboot CPUs are now disabled after tasks and devices have been
    frozen and the CPU hotplug infrastructure is used for this purpose, we need
    special CPU hotplug notifications that will help the CPU-hotplug-aware
    subsystems distinguish normal CPU hotplug events from CPU hotplug events
    related to a system-wide suspend or resume operation in progress. This
    patch introduces such notifications and causes them to be used during
    suspend and resume transitions. It also changes all of the
    CPU-hotplug-aware subsystems to take these notifications into consideration
    (for now they are handled in the same way as the corresponding "normal"
    ones).
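
The "handled in the same way" rule above can be illustrated with a small userspace sketch. The constant names and values (CPU_TASKS_FROZEN, CPU_ONLINE_FROZEN, CPU_DEAD_FROZEN) are recalled from include/linux/notifier.h of that era and are not quoted in this message, so treat them as assumptions. The `enum action` type and the function names are purely hypothetical.

```c
#include <assert.h>

/* Assumed values, modeled on include/linux/notifier.h of that era. */
#define CPU_ONLINE        0x0002
#define CPU_DEAD          0x0007
#define CPU_TASKS_FROZEN  0x0010  /* set when tasks are frozen for suspend/resume */
#define CPU_ONLINE_FROZEN (CPU_ONLINE | CPU_TASKS_FROZEN)
#define CPU_DEAD_FROZEN   (CPU_DEAD | CPU_TASKS_FROZEN)

/* Hypothetical actions a CPU-hotplug-aware subsystem might take. */
enum action { ACT_NONE, ACT_START_WORKER, ACT_STOP_WORKER };

/* A callback that treats each suspend-related event like its normal twin. */
enum action subsystem_cpu_notify(unsigned long event)
{
    switch (event) {
    case CPU_ONLINE:
    case CPU_ONLINE_FROZEN:
        return ACT_START_WORKER;
    case CPU_DEAD:
    case CPU_DEAD_FROZEN:
        return ACT_STOP_WORKER;
    default:
        return ACT_NONE;
    }
}

/* A subsystem that does care can test the flag bit. */
int event_is_suspend_related(unsigned long event)
{
    return (event & CPU_TASKS_FROZEN) != 0;
}

/* Returns 1 when every frozen event is handled like its normal twin. */
int frozen_events_handled_like_normal(void)
{
    assert(event_is_suspend_related(CPU_ONLINE_FROZEN));
    assert(!event_is_suspend_related(CPU_ONLINE));
    return subsystem_cpu_notify(CPU_ONLINE) == subsystem_cpu_notify(CPU_ONLINE_FROZEN)
        && subsystem_cpu_notify(CPU_DEAD) == subsystem_cpu_notify(CPU_DEAD_FROZEN);
}
```

Encoding the suspend case as a flag bit would let a subsystem opt in to special handling later by checking one bit, while the paired `case` labels keep today's behaviour unchanged.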

    [oleg@tv-sign.ru: cleanups]
    Signed-off-by: Rafael J. Wysocki
    Cc: Gautham R Shenoy
    Cc: Pavel Machek
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • This is an attempt to provide an alternate mechanism for postponing
    a hotplug event instead of using a global mechanism like lock_cpu_hotplug.

    The proposal is to add two new events, namely CPU_LOCK_ACQUIRE and
    CPU_LOCK_RELEASE. The notifications for these two events would be sent
    out before and after a CPU hotplug event, respectively.

    During the CPU_LOCK_ACQUIRE event, a cpu-hotplug-aware subsystem is
    supposed to acquire any per-subsystem hotcpu mutex (e.g. workqueue_mutex
    in kernel/workqueue.c).

    During the CPU_LOCK_RELEASE event, the cpu-hotplug-aware subsystem
    is supposed to release the per-subsystem hotcpu mutex.

    The reasons for defining new events, as opposed to reusing the existing events
    like CPU_UP_PREPARE/CPU_UP_FAILED/CPU_ONLINE for locking/unlocking of
    per-subsystem hotcpu mutexes, are as follows:

    - CPU_LOCK_ACQUIRE: All hotcpu mutexes are taken before subsystems
    start handling pre-hotplug events like CPU_UP_PREPARE/CPU_DOWN_PREPARE
    etc, thus ensuring a clean handling of these events.

    - CPU_LOCK_RELEASE: The hotcpu mutexes will be released only after
    all subsystems have handled post-hotplug events like CPU_DOWN_FAILED,
    CPU_DEAD, CPU_ONLINE etc., thereby ensuring that there are no subsequent
    clashes amongst the interdependent subsystems after a CPU hotplug.

    This patch also uses __raw_notifier_call_chain in _cpu_up to take care
    of the dependency between the two consecutive calls to
    raw_notifier_call_chain.
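
The acquire/release protocol described above can be sketched in userspace C. This is a simplified model rather than the patch's code: `struct fake_mutex` stands in for a per-subsystem hotcpu mutex such as workqueue_mutex, the enum values are arbitrary, `struct subsystem` and `demo_cpu_up` are hypothetical, and the __raw_notifier_call_chain rollback path is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Event names come from the commit message; the numeric values are arbitrary. */
enum cpu_event { CPU_LOCK_ACQUIRE, CPU_UP_PREPARE, CPU_ONLINE, CPU_LOCK_RELEASE };

/* Stand-in for a per-subsystem hotcpu mutex (hypothetical, not a real lock). */
struct fake_mutex { bool held; };

void fake_mutex_lock(struct fake_mutex *m)
{
    assert(!m->held);   /* double acquire would deadlock a real mutex */
    m->held = true;
}

void fake_mutex_unlock(struct fake_mutex *m)
{
    assert(m->held);
    m->held = false;
}

struct subsystem {
    struct fake_mutex hotcpu_mutex;
    int events_handled_locked;  /* hotplug events seen while holding the mutex */
};

/* Hotcpu callback: take the mutex first, drop it after the last event. */
void subsystem_cpu_callback(struct subsystem *s, enum cpu_event ev)
{
    switch (ev) {
    case CPU_LOCK_ACQUIRE:
        fake_mutex_lock(&s->hotcpu_mutex);
        break;
    case CPU_UP_PREPARE:
    case CPU_ONLINE:
        if (s->hotcpu_mutex.held)
            s->events_handled_locked++;
        break;
    case CPU_LOCK_RELEASE:
        fake_mutex_unlock(&s->hotcpu_mutex);
        break;
    }
}

/* Replays the event order of a successful CPU bring-up.
 * Returns the number of events handled under the mutex,
 * or -1 if the mutex is still held at the end. */
int demo_cpu_up(void)
{
    static const enum cpu_event seq[] = {
        CPU_LOCK_ACQUIRE, CPU_UP_PREPARE, CPU_ONLINE, CPU_LOCK_RELEASE
    };
    struct subsystem s = { { false }, 0 };
    size_t i;

    for (i = 0; i < sizeof(seq) / sizeof(seq[0]); i++)
        subsystem_cpu_callback(&s, seq[i]);
    return s.hotcpu_mutex.held ? -1 : s.events_handled_locked;
}
```

Because the lock and unlock are driven by dedicated events, the subsystem's handling of CPU_UP_PREPARE and CPU_ONLINE always runs with the mutex held, whichever of those events actually fire.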

    [akpm@linux-foundation.org: fix a bug]
    Signed-off-by: Gautham R Shenoy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gautham R Shenoy
     
  • Since 2.6.18-something, the community has struggled to provide a clean
    and stable mechanism for postponing a cpu-hotplug event, as
    lock_cpu_hotplug was badly broken.

    This is another proposal towards solving that problem. This one is along the
    lines of the solution provided in kernel/workqueue.c

    Instead of having a global mechanism like lock_cpu_hotplug, we allow the
    subsystems to define their own per-subsystem hotcpu mutexes. These would be
    taken (released) wherever we are currently calling
    lock_cpu_hotplug (unlock_cpu_hotplug).

    Also, in the per-subsystem hotcpu callback function, we take this mutex before
    we handle any pre-cpu-hotplug events and release it once we finish handling
    the post-cpu-hotplug events. A standard means for doing this has been
    provided in [PATCH 2/4] and demonstrated in [PATCH 3/4].

    The ordering of these per-subsystem mutexes might still prove to be a
    problem, but hopefully lockdep should help us get out of that muddle.

    The patch set to be applied against linux-2.6.19-rc5 is as follows:

    [PATCH 1/4] : Extend notifier_call_chain with an option to specify the
                  number of notifications to be sent and also count the
                  number of notifications actually sent.

    [PATCH 2/4] : Define events CPU_LOCK_ACQUIRE and CPU_LOCK_RELEASE
                  and send out notifications for these in _cpu_up and
                  _cpu_down. This would help us standardise the acquire and
                  release of the subsystem locks in the hotcpu
                  callback functions of these subsystems.

    [PATCH 3/4] : Eliminate lock_cpu_hotplug from kernel/sched.c.

    [PATCH 4/4] : In workqueue_cpu_callback function, acquire (release) the
                  workqueue_mutex while handling
                  CPU_LOCK_ACQUIRE (CPU_LOCK_RELEASE).

    If the per-subsystem-locking approach survives the test of time, we can expect
    a slow phasing out of lock_cpu_hotplug, which has not yet been eliminated in
    these patches :)

    This patch:

    Provide notifier_call_chain with an option to call only a specified number of
    notifiers and also record the number of calls to notifiers made.

    The need for this enhancement was identified in the post entitled
    "Slab - Eliminate lock_cpu_hotplug from slab"
    (http://lkml.org/lkml/2006/10/28/92) by Ravikiran G Thirumalai and
    Andrew Morton.

    This patch adds two additional parameters to the notifier_call_chain API,
    namely:
    - int nr_to_calls : Number of notifier_functions to be called.
                        The don't care value is -1.

    - unsigned int *nr_calls : Records the total number of notifier_functions
                               called by notifier_call_chain. The don't care
                               value is NULL.
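
A minimal userspace sketch of the extended call loop, using the parameter names given above. The struct layout and return codes are modeled on include/linux/notifier.h but simplified (no priorities, no locking). The callbacks `ok_cb` and `bad_cb` and the helper `demo_count_calls` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return codes modeled on include/linux/notifier.h. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000
#define NOTIFY_BAD       (NOTIFY_STOP_MASK | 0x0002)

struct notifier_block {
    int (*notifier_call)(struct notifier_block *nb, unsigned long event, void *data);
    struct notifier_block *next;
};

/* Call at most nr_to_calls notifiers (-1: no limit) and, if nr_calls is
 * non-NULL, add the number of notifiers actually invoked to *nr_calls. */
int notifier_call_chain(struct notifier_block *head, unsigned long event,
                        void *data, int nr_to_calls, unsigned int *nr_calls)
{
    int ret = NOTIFY_DONE;
    struct notifier_block *nb = head;

    /* Starting from -1 the counter never reaches zero on a realistic chain. */
    while (nb != NULL && nr_to_calls != 0) {
        struct notifier_block *next = nb->next;

        assert(nb->notifier_call != NULL);
        ret = nb->notifier_call(nb, event, data);
        if (nr_calls != NULL)
            (*nr_calls)++;
        if (ret & NOTIFY_STOP_MASK)
            break;
        nb = next;
        nr_to_calls--;
    }
    return ret;
}

/* Hypothetical callbacks used only for the demonstration. */
int ok_cb(struct notifier_block *nb, unsigned long event, void *data)
{
    (void)nb; (void)event; (void)data;
    return NOTIFY_OK;
}

int bad_cb(struct notifier_block *nb, unsigned long event, void *data)
{
    (void)nb; (void)event; (void)data;
    return NOTIFY_BAD;
}

/* Chain: ok -> ok -> bad -> ok. Returns how many callbacks actually ran. */
unsigned int demo_count_calls(int nr_to_calls)
{
    struct notifier_block d = { ok_cb, NULL };
    struct notifier_block c = { bad_cb, &d };
    struct notifier_block b = { ok_cb, &c };
    struct notifier_block a = { ok_cb, &b };
    unsigned int nr_calls = 0;

    notifier_call_chain(&a, 0, NULL, nr_to_calls, &nr_calls);
    return nr_calls;
}
```

The count is what makes a rollback possible: if a "prepare" notification is vetoed part-way down the chain, the caller can send the matching "cancel" notification to only the notifiers that saw the first one, by passing the recorded count back as the limit.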

    [michal.k.k.piotrowski@gmail.com: build fix]
    Credit: Andrew Morton
    Signed-off-by: Gautham R Shenoy
    Signed-off-by: Michal Piotrowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gautham R Shenoy
     

04 Oct, 2006

1 commit

  • This patch (as751) adds a new type of notifier chain, based on the SRCU
    (Sleepable Read-Copy Update) primitives recently added to the kernel. An
    SRCU notifier chain is much like a blocking notifier chain, in that it must
    be called in process context and its callout routines are allowed to sleep.
    The difference is that the chain's links are protected by the SRCU
    mechanism rather than by an rw-semaphore, so calling the chain has
    extremely low overhead: no memory barriers and no cache-line bouncing. On
    the other hand, unregistering from the chain is expensive and the chain
    head requires special runtime initialization (plus cleanup if it is to be
    deallocated).

    SRCU notifiers are appropriate for notifiers that will be called very
    frequently and for which unregistration occurs very seldom. The proposed
    "task notifier" scheme qualifies, as may some of the network notifiers.

    Signed-off-by: Alan Stern
    Acked-by: Paul E. McKenney
    Acked-by: Chandra Seetharaman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Stern
     

04 Jul, 2006

1 commit


28 Mar, 2006

1 commit

  • The kernel's implementation of notifier chains is unsafe. There is no
    protection against entries being added to or removed from a chain while the
    chain is in use. The issues were discussed in this thread:

    http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2

    We noticed that notifier chains in the kernel fall into two basic usage
    classes:

    "Blocking" chains are always called from a process context
    and the callout routines are allowed to sleep;

    "Atomic" chains can be called from an atomic context and
    the callout routines are not allowed to sleep.

    We decided to codify this distinction and make it part of the API. Therefore
    this set of patches introduces three new, parallel APIs: one for blocking
    notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
    really just the old API under a new name). New kinds of data structures are
    used for the heads of the chains, and new routines are defined for
    registration, unregistration, and calling a chain. The three APIs are
    explained in include/linux/notifier.h and their implementation is in
    kernel/sys.c.

    With atomic and blocking chains, the implementation guarantees that the chain
    links will not be corrupted and that chain callers will not get messed up by
    entries being added or removed. For raw chains the implementation provides no
    guarantees at all; users of this API must provide their own protections. (The
    idea was that situations may come up where the assumptions of the atomic and
    blocking APIs are not appropriate, so it should be possible for users to
    handle these things in their own way.)
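
To make the "raw" flavour concrete, here is a userspace sketch of an unprotected chain: registration, unregistration and calling with no locking at all, so any synchronization is the caller's job. The function names mirror the kernel's raw_notifier_* routines, but the bodies are a simplified illustration. Priority ordering is an assumption not stated in this message, and `struct tagged_block`, `struct call_log` and `demo_call_order` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

#define NOTIFY_DONE      0x0000
#define NOTIFY_STOP_MASK 0x8000

struct notifier_block {
    int (*notifier_call)(struct notifier_block *nb, unsigned long event, void *data);
    struct notifier_block *next;
    int priority;   /* assumption: higher priority entries are called first */
};

struct raw_notifier_head { struct notifier_block *head; };

/* Insert n so that the chain stays sorted by descending priority. */
int raw_notifier_chain_register(struct raw_notifier_head *nh, struct notifier_block *n)
{
    struct notifier_block **nl = &nh->head;

    while (*nl != NULL && n->priority <= (*nl)->priority)
        nl = &(*nl)->next;
    n->next = *nl;
    *nl = n;
    return 0;
}

/* Unlink n; returns -1 if it was not on the chain. */
int raw_notifier_chain_unregister(struct raw_notifier_head *nh, struct notifier_block *n)
{
    struct notifier_block **nl;

    for (nl = &nh->head; *nl != NULL; nl = &(*nl)->next) {
        if (*nl == n) {
            *nl = n->next;
            return 0;
        }
    }
    return -1;
}

int raw_notifier_call_chain(struct raw_notifier_head *nh, unsigned long event, void *data)
{
    int ret = NOTIFY_DONE;
    struct notifier_block *nb;

    for (nb = nh->head; nb != NULL; nb = nb->next) {
        ret = nb->notifier_call(nb, event, data);
        if (ret & NOTIFY_STOP_MASK)
            break;
    }
    return ret;
}

/* Hypothetical demo types: each block carries an id and logs when it runs. */
struct tagged_block { struct notifier_block nb; int id; };
struct call_log { int order[8]; int n; };

int log_cb(struct notifier_block *nb, unsigned long event, void *data)
{
    struct tagged_block *t = (struct tagged_block *)nb;  /* nb is the first member */
    struct call_log *log = data;

    (void)event;
    log->order[log->n++] = t->id;
    return NOTIFY_DONE;
}

/* Registers three blocks, removes one, and returns the call order as digits. */
int demo_call_order(void)
{
    struct raw_notifier_head nh = { NULL };
    struct tagged_block low  = { { log_cb, NULL, 0 },  1 };
    struct tagged_block high = { { log_cb, NULL, 10 }, 2 };
    struct tagged_block mid  = { { log_cb, NULL, 5 },  3 };
    struct call_log log = { { 0 }, 0 };

    raw_notifier_chain_register(&nh, &low.nb);
    raw_notifier_chain_register(&nh, &high.nb);
    raw_notifier_chain_register(&nh, &mid.nb);
    raw_notifier_chain_unregister(&nh, &mid.nb);
    raw_notifier_call_chain(&nh, 0, &log);

    assert(log.n == 2);
    return log.order[0] * 10 + log.order[1];
}
```

Nothing here stops another thread from unlinking an entry while the call loop is walking the list. The atomic and blocking variants close that gap with RCU and an rwsem respectively.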

    There are some limitations, which should not be too hard to live with. For
    atomic/blocking chains, registration and unregistration must always be done in
    a process context since the chain is protected by a mutex/rwsem. Also, a
    callout routine for a non-raw chain must not try to register or unregister
    entries on its own chain. (This did happen in a couple of places and the code
    had to be changed to avoid it.)

    Since atomic chains may be called from within an NMI handler, they cannot use
    spinlocks for synchronization. Instead we use RCU. The overhead falls almost
    entirely in the unregister routine, which is okay since unregistration is much
    less frequent than calling a chain.

    Here is the list of chains that we adjusted and their classifications. None
    of them use the raw API, so for the moment it is only a placeholder.

    ATOMIC CHAINS
    -------------
    arch/i386/kernel/traps.c: i386die_chain
    arch/ia64/kernel/traps.c: ia64die_chain
    arch/powerpc/kernel/traps.c: powerpc_die_chain
    arch/sparc64/kernel/traps.c: sparc64die_chain
    arch/x86_64/kernel/traps.c: die_chain
    drivers/char/ipmi/ipmi_si_intf.c: xaction_notifier_list
    kernel/panic.c: panic_notifier_list
    kernel/profile.c: task_free_notifier
    net/bluetooth/hci_core.c: hci_notifier
    net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_chain
    net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_expect_chain
    net/ipv6/addrconf.c: inet6addr_chain
    net/netfilter/nf_conntrack_core.c: nf_conntrack_chain
    net/netfilter/nf_conntrack_core.c: nf_conntrack_expect_chain
    net/netlink/af_netlink.c: netlink_chain

    BLOCKING CHAINS
    ---------------
    arch/powerpc/platforms/pseries/reconfig.c: pSeries_reconfig_chain
    arch/s390/kernel/process.c: idle_chain
    arch/x86_64/kernel/process.c: idle_notifier
    drivers/base/memory.c: memory_chain
    drivers/cpufreq/cpufreq.c: cpufreq_policy_notifier_list
    drivers/cpufreq/cpufreq.c: cpufreq_transition_notifier_list
    drivers/macintosh/adb.c: adb_client_list
    drivers/macintosh/via-pmu.c: sleep_notifier_list
    drivers/macintosh/via-pmu68k.c: sleep_notifier_list
    drivers/macintosh/windfarm_core.c: wf_client_list
    drivers/usb/core/notify.c: usb_notifier_list
    drivers/video/fbmem.c: fb_notifier_list
    kernel/cpu.c: cpu_chain
    kernel/module.c: module_notify_list
    kernel/profile.c: munmap_notifier
    kernel/profile.c: task_exit_notifier
    kernel/sys.c: reboot_notifier_list
    net/core/dev.c: netdev_chain
    net/decnet/dn_dev.c: dnaddr_chain
    net/ipv4/devinet.c: inetaddr_chain

    It's possible that some of these classifications are wrong. If they are,
    please let us know or submit a patch to fix them. Note that any chain that
    gets called very frequently should be atomic, because the rwsem read-locking
    used for blocking chains is very likely to incur cache misses on SMP systems.
    (However, if the chain's callout routines may sleep then the chain cannot be
    atomic.)

    The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
    material written by Keith Owens and suggestions from Paul McKenney and Andrew
    Morton.

    [jes@sgi.com: restructure the notifier chain initialization macros]
    Signed-off-by: Alan Stern
    Signed-off-by: Chandra Seetharaman
    Signed-off-by: Jes Sorensen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Stern
     

30 May, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds