21 May, 2011

1 commit

  • Commit e66eed651fd1 ("list: remove prefetching from regular list
    iterators") removed the include of prefetch.h from list.h, which
    uncovered several cases that had apparently relied on that rather
    obscure header file dependency.

    So this fixes things up a bit, using

    grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
    grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')

    to guide us in finding files that either need
    inclusion, or have it despite not needing it.

    There are more of them around (mostly network drivers), but this gets
    many core ones.

    Reported-by: Stephen Rothwell
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

20 May, 2011

10 commits

  • …/gregkh/driver-core-2.6

    * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (44 commits)
    debugfs: Silence DEBUG_STRICT_USER_COPY_CHECKS=y warning
    sysfs: remove "last sysfs file:" line from the oops messages
    drivers/base/memory.c: fix warning due to "memory hotplug: Speed up add/remove when blocks are larger than PAGES_PER_SECTION"
    memory hotplug: Speed up add/remove when blocks are larger than PAGES_PER_SECTION
    SYSFS: Fix erroneous comments for sysfs_update_group().
    driver core: remove the driver-model structures from the documentation
    driver core: Add the device driver-model structures to kerneldoc
    Translated Documentation/email-clients.txt
    RAW driver: Remove call to kobject_put().
    reboot: disable usermodehelper to prevent fs access
    efivars: prevent oops on unload when efi is not enabled
    Allow setting of number of raw devices as a module parameter
    Introduce CONFIG_GOOGLE_FIRMWARE
    driver: Google Memory Console
    driver: Google EFI SMI
    x86: Better comments for get_bios_ebda()
    x86: get_bios_ebda_length()
    misc: fix ti-st build issues
    params.c: Use new strtobool function to process boolean inputs
    debugfs: move to new strtobool
    ...

    Fix up trivial conflicts in fs/debugfs/file.c due to the same patch
    being applied twice, and an unrelated cleanup nearby.

    Linus Torvalds
     
  • * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits)
    Revert "rcu: Decrease memory-barrier usage based on semi-formal proof"
    net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree
    batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu
    batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree()
    batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu
    net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu()
    net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu()
    net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu()
    net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu()
    net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu()
    perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu()
    perf,rcu: convert call_rcu(free_ctx) to kfree_rcu()
    net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu()
    net,rcu: convert call_rcu(net_generic_release) to kfree_rcu()
    net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu()
    net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu()
    security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu()
    net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu()
    net,rcu: convert call_rcu(xps_map_release) to kfree_rcu()
    net,rcu: convert call_rcu(rps_map_release) to kfree_rcu()
    ...

    Linus Torvalds
     
  • …l/git/tip/linux-2.6-tip

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    hrtimer: Make lookup table const
    RTC: Disable CONFIG_RTC_CLASS from being built as a module
    timers: Fix alarmtimer build issues when CONFIG_RTC_CLASS=n
    timers: Remove delayed irqwork from alarmtimers implementation
    timers: Improve alarmtimer comments and minor fixes
    timers: Posix interface for alarm-timers
    timers: Introduce in-kernel alarm-timer interface
    timers: Add rb_init_node() to allow for stack allocated rb nodes
    time: Add timekeeping_inject_sleeptime

    Linus Torvalds
     
  • …x/kernel/git/tip/linux-2.6-tip

    * 'timers-clockevents-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86: hpet: Cleanup the clockevents init and register code
    x86: Convert PIT to clockevents_config_and_register()
    clockevents: Provide interface to reconfigure an active clock event device
    clockevents: Provide combined configure and register function
    clockevents: Restructure clock_event_device members
    clocksource: Get rid of the hardcoded 5 seconds sleep time limit
    clocksource: Restructure clocksource struct members

    Linus Torvalds
     
  • …kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
    sched: Fix and optimise calculation of the weight-inverse
    sched: Avoid going ahead if ->cpus_allowed is not changed
    sched, rt: Update rq clock when unthrottling of an otherwise idle CPU
    sched: Remove unused parameters from sched_fork() and wake_up_new_task()
    sched: Shorten the construction of the span cpu mask of sched domain
    sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG
    sched: Remove unused 'this_best_prio arg' from balance_tasks()
    sched: Remove noop in alloc_rt_sched_group()
    sched: Get rid of lock_depth
    sched: Remove obsolete comment from scheduler_tick()
    sched: Fix sched_domain iterations vs. RCU
    sched: Next buddy hint on sleep and preempt path
    sched: Make set_*_buddy() work on non-task entities
    sched: Remove need_migrate_task()
    sched: Move the second half of ttwu() to the remote cpu
    sched: Restructure ttwu() some more
    sched: Rename ttwu_post_activation() to ttwu_do_wakeup()
    sched: Remove rq argument from ttwu_stat()
    sched: Remove rq->lock from the first half of ttwu()
    sched: Drop rq->lock from sched_exec()
    ...

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: Fix rt_rq runtime leakage bug

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (107 commits)
    perf stat: Add more cache-miss percentage printouts
    perf stat: Add -d -d and -d -d -d options to show more CPU events
    ftrace/kbuild: Add recordmcount files to force full build
    ftrace: Add self-tests for multiple function trace users
    ftrace: Modify ftrace_set_filter/notrace to take ops
    ftrace: Allow dynamically allocated function tracers
    ftrace: Implement separate user function filtering
    ftrace: Free hash with call_rcu_sched()
    ftrace: Have global_ops store the functions that are to be traced
    ftrace: Add ops parameter to ftrace_startup/shutdown functions
    ftrace: Add enabled_functions file
    ftrace: Use counters to enable functions to trace
    ftrace: Separate hash allocation and assignment
    ftrace: Create a global_ops to hold the filter and notrace hashes
    ftrace: Use hash instead for FTRACE_FL_FILTER
    ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions
    perf bench, x86: Add alternatives-asm.h wrapper
    x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit
    x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB
    x86, mem: memmove_64.S: Optimize memmove by enhanced REP MOVSB/STOSB
    ...

    Linus Torvalds
     
  • * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    irq: Export functions to allow modular irq drivers
    genirq: Uninline and sanity check generic_handle_irq()
    genirq: Remove pointless ifdefs
    genirq: Make generic irq chip depend on CONFIG_GENERIC_IRQ_CHIP
    genirq: Add chip suspend and resume callbacks
    genirq: Implement a generic interrupt chip
    genirq: Support per-IRQ thread disabling.
    genirq: irq_desc: Document preflow_handler and affinity_hint
    genirq: Update DocBook comments
    genirq: Forgotten updates/deletions after removal of compat code

    Linus Torvalds
     
  • …el/git/tip/linux-2.6-tip

    * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    seqlock: Don't smp_rmb in seqlock reader spin loop
    watchdog, hung_task_timeout: Add Kconfig configurable default
    lockdep: Remove cmpxchg to update nr_chain_hlocks
    lockdep: Print a nicer description for simple irq lock inversions
    lockdep: Replace "Bad BFS generated tree" message with something less cryptic
    lockdep: Print a nicer description for irq inversion bugs
    lockdep: Print a nicer description for simple deadlocks
    lockdep: Print a nicer description for normal deadlocks
    lockdep: Print a nicer description for irq lock inversions

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (34 commits)
    PM: Introduce generic prepare and complete callbacks for subsystems
    PM: Allow drivers to allocate memory from .prepare() callbacks safely
    PM: Remove CONFIG_PM_VERBOSE
    Revert "PM / Hibernate: Reduce autotuned default image size"
    PM / Hibernate: Add sysfs knob to control size of memory for drivers
    PM / Wakeup: Remove useless synchronize_rcu() call
    kmod: always provide usermodehelper_disable()
    PM / ACPI: Remove acpi_sleep=s4_nonvs
    PM / Wakeup: Fix build warning related to the "wakeup" sysfs file
    PM: Print a warning if firmware is requested when tasks are frozen
    PM / Runtime: Rework runtime PM handling during driver removal
    Freezer: Use SMP barriers
    PM / Suspend: Do not ignore error codes returned by suspend_enter()
    PM: Fix build issue in clock_ops.c for CONFIG_PM_RUNTIME unset
    PM: Revert "driver core: platform_bus: allow runtime override of dev_pm_ops"
    OMAP1 / PM: Use generic clock manipulation routines for runtime PM
    PM: Remove sysdev suspend, resume and shutdown operations
    PM / PowerPC: Use struct syscore_ops instead of sysdevs for PM
    PM / UNICORE32: Use struct syscore_ops instead of sysdevs for PM
    PM / AVR32: Use struct syscore_ops instead of sysdevs for PM
    ...

    Linus Torvalds
     
  • This reverts commit e59fb3120becfb36b22ddb8bd27d065d3cdca499.

    This reversion was due to (extreme) boot-time slowdowns on SPARC seen by
    Yinghai Lu and on x86 by Ingo.
    This is a non-trivial reversion due to intervening commits.

    Conflicts:

    Documentation/RCU/trace.txt
    kernel/rcutree.c

    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

19 May, 2011

25 commits

  • Some ARM SoCs have clock event devices which have their frequency
    modified due to frequency scaling. Provide an interface which allows
    reconfiguring an active device. After reconfiguration, reprogram the
    currently pending event.

    Signed-off-by: Thomas Gleixner
    Cc: LAK
    Cc: John Stultz
    Acked-by: Linus Walleij
    Reviewed-by: Ingo Molnar
    Link: http://lkml.kernel.org/r/%3C20110518210136.437459958%40linutronix.de%3E

    Thomas Gleixner
     
  • All clockevent devices have the same open coded initialization
    functions. Provide an interface which does all necessary
    initialization in the core code.

    Signed-off-by: Thomas Gleixner
    Cc: John Stultz
    Reviewed-by: Ingo Molnar
    Link: http://lkml.kernel.org/r/%3C20110518210136.331975870%40linutronix.de%3E

    Thomas Gleixner
     
  • Slow clocksources can have a way longer sleep time than 5 seconds and
    even fast ones can easily cope with 600 seconds and still maintain
    proper accuracy.

    Signed-off-by: Thomas Gleixner
    Cc: John Stultz
    Reviewed-by: Ingo Molnar
    Link: http://lkml.kernel.org/r/%3C20110518210136.109811585%40linutronix.de%3E

    Thomas Gleixner
     
  • Signed-off-by: Jonathan Cameron
    Signed-off-by: Rusty Russell

    Jonathan Cameron
     
  • The function is_exported() and its helper function lookup_symbol() are used to
    verify whether a provided symbol is actually exported by the kernel or by the
    modules. Now that both have their symbols sorted, we can replace a linear search
    with a binary search, which provides a considerable speed-up.

    This work was supported by a hardware donation from the CE Linux Forum.

    Signed-off-by: Alessio Igor Bogani
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: Rusty Russell

    Alessio Igor Bogani
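
The switch from a linear scan to a binary search over a name-sorted table can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct sym_entry`, `find_sorted_symbol()` and `demo_syms` are hypothetical names, not the kernel's own structures, and the standard C `bsearch()` stands in for whatever search helper the kernel uses.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an exported-symbol table entry. */
struct sym_entry {
    const char *name;
    unsigned long value;
};

/* bsearch() callback: the key is the name being looked up. */
static int cmp_name(const void *key, const void *elem)
{
    const struct sym_entry *s = elem;

    return strcmp((const char *)key, s->name);
}

/*
 * Look up a symbol by name in a table sorted in strcmp() order.
 * Returns the matching entry, or NULL if the name is not present.
 * Cost is O(log n) comparisons instead of O(n) for a linear scan.
 */
static const struct sym_entry *find_sorted_symbol(const char *name,
                                                  const struct sym_entry *tab,
                                                  size_t n)
{
    return bsearch(name, tab, n, sizeof(*tab), cmp_name);
}

/* Small demo table, already sorted by name. */
static const struct sym_entry demo_syms[] = {
    { "kfree",    0x1000 },
    { "kmalloc",  0x2000 },
    { "printk",   0x3000 },
    { "schedule", 0x4000 },
};
```

The lookup is only correct if the table really is sorted with the same ordering the comparison callback uses, which is why the sorting had to land before this change.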
     
  • Takes advantage of the order and locates symbols using binary search.

    This work was supported by a hardware donation from the CE Linux Forum.

    Signed-off-by: Alessio Igor Bogani
    Signed-off-by: Rusty Russell
    Tested-by: Dirk Behme

    Alessio Igor Bogani
     
  • Instead of having a callback function for each symbol in the kernel,
    have a callback for each array of symbols.

    This eases the logic when we move to sorted symbols and binary search.

    Signed-off-by: Rusty Russell
    Signed-off-by: Alessio Igor Bogani

    Rusty Russell
     
  • Split the unprotect function into a function per section to make
    the code more readable and add the missing static declaration.

    Signed-off-by: Jan Glauber
    Signed-off-by: Rusty Russell

    Jan Glauber
     
  • While debugging I stumbled over two problems in the code that protects module
    pages.

    First issue is that disabling the protection before freeing init or unload of
    a module is not symmetric with the enablement. For instance, if pages are set
    to RO the page range from module_core to module_core + core_ro_size is
    protected. If a module is unloaded the page range from module_core to
    module_core + core_size is set back to RW.
    So pages that were not set to RO are also changed to RW.
    This is not critical but IMHO it should be symmetric.

    Second issue is that while set_memory_rw & set_memory_ro are used for
    RO/RW changes, only set_memory_nx is involved for NX/X. One would expect
    the inverse function to be called when the NX protection should be removed,
    which is not the case here, unless I'm missing something.

    Signed-off-by: Jan Glauber
    Signed-off-by: Rusty Russell

    Jan Glauber
     
  • Reset mod->init_ro_size to zero after the init part of a module is unloaded.
    Otherwise we need to check if module->init is NULL in the unprotect functions
    in the next patch.

    Signed-off-by: Jan Glauber
    Signed-off-by: Rusty Russell

    Jan Glauber
     
  • Fix function prototype to be ANSI-C compliant, consistent with other
    function prototypes, addressing a sparse warning.

    Signed-off-by: Daniel J Blueman
    Signed-off-by: Rusty Russell

    Daniel J Blueman
     
  • On m68k the natural alignment is a 2-byte boundary, but we are trying to
    align structures in the __modver section on a sizeof(void *) boundary.
    This causes trouble when we try to access elements in this section
    in array-like fashion when creating "version" attributes for built-in
    modules.

    Moreover, as DaveM said, we can't reliably put structures into
    independent objects, put them into a special section, and then expect
    array access over them (via the section boundaries) after linking the
    objects together to just "work" due to variable alignment choices in
    different situations. The only solution that seems to work reliably
    is to make an array of plain pointers to the objects in question and
    put those pointers in the special section.

    Reported-by: Geert Uytterhoeven
    Signed-off-by: Dmitry Torokhov
    Signed-off-by: Rusty Russell

    Dmitry Torokhov
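
The "array of plain pointers in a special section" approach can be sketched in user space as below. This is a minimal sketch, assuming a GNU toolchain on an ELF target, where the linker provides `__start_<section>` and `__stop_<section>` symbols for sections whose names are valid C identifiers. The section name `demo_modver`, the structure and the macro are all hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a per-module version attribute. */
struct demo_version_attr {
    const char *module;
    const char *version;
};

/*
 * Define the object anywhere, and place only a pointer to it in the
 * special section. Pointers all have the same size and alignment, so
 * the section can safely be walked as an array.
 */
#define DEMO_MODULE_VERSION(id, mod, ver)                           \
    static struct demo_version_attr demo_attr_##id = { mod, ver };  \
    static struct demo_version_attr *demo_attr_ptr_##id             \
        __attribute__((used, section("demo_modver"))) = &demo_attr_##id

DEMO_MODULE_VERSION(a, "alpha", "1.0");
DEMO_MODULE_VERSION(b, "beta", "2.3");

/* Section bounds, provided by the GNU linker (assumption, see above). */
extern struct demo_version_attr *__start_demo_modver[];
extern struct demo_version_attr *__stop_demo_modver[];

/* Number of pointers the linker collected into the section. */
static size_t demo_count_versions(void)
{
    return (size_t)(__stop_demo_modver - __start_demo_modver);
}

/* Walk the pointer array and return the version string, or NULL. */
static const char *demo_find_version(const char *module)
{
    struct demo_version_attr **p;

    for (p = __start_demo_modver; p < __stop_demo_modver; p++)
        if (strcmp((*p)->module, module) == 0)
            return (*p)->version;
    return NULL;
}
```

The walk never depends on the size or alignment the compiler picked for the structures themselves, which is the property the commit message is after.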
     
  • Add some basic sanity tests for multiple users of the function
    tracer at startup.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Since users of the function tracer can now pick and choose which
    functions they want to trace agnostically from other users of the
    function tracer, we need to pass the ops struct to the ftrace_set_filter()
    functions.

    The functions ftrace_set_global_filter() and ftrace_set_global_notrace()
    are added to keep the old filter functions, which are used to modify
    the generic function tracers.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Now that functions may be selected individually, it only makes sense
    that we should allow dynamically allocated trace structures to
    be traced. This will allow perf to allocate a ftrace_ops structure
    at runtime and use it to pick and choose which functions that
    structure will trace.

    Note, a dynamically allocated ftrace_ops will always be called
    indirectly instead of being called directly from the mcount in
    entry.S. This is because there's no safe way to prevent mcount
    from being preempted before calling the function, unless we
    modify every entry.S to do so (not likely). Thus, dynamically allocated
    functions will now be called by the ftrace_ops_list_func() that
    loops through the ops that are allocated if there is more than
    one op allocated at a time. This loop is protected with a
    preempt_disable.

    To determine if an ftrace_ops structure is allocated or not, a new
    util function was added to the kernel/extable.c called
    core_kernel_data(), which returns 1 if the address is between
    _sdata and _edata.

    Cc: Paul E. McKenney
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • ftrace_ops that are registered to trace functions can now be
    agnostic to each other with respect to what functions they trace.
    Each ops has its own hash of the functions it wants to trace
    and a hash of those it does not want to trace. An empty hash for
    the functions it wants to trace denotes that all functions not in
    the notrace hash should be traced.

    Cc: Paul E. McKenney
    Signed-off-by: Steven Rostedt

    Steven Rostedt
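
The per-ops rule described above (an empty filter means "everything", and notrace always wins) can be sketched as follows. This is a simplified user-space illustration, not the ftrace implementation: `struct addr_set` is a plain array standing in for a hash, and all names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for a hash of function addresses. */
struct addr_set {
    const unsigned long *ips;
    size_t count;
};

/* Hypothetical stand-in for an ops with its two hashes. */
struct trace_ops {
    struct addr_set filter;   /* functions to trace; empty means all */
    struct addr_set notrace;  /* functions never to trace */
};

/* Linear membership test; a real hash would bucket by address. */
static int set_contains(const struct addr_set *s, unsigned long ip)
{
    size_t i;

    for (i = 0; i < s->count; i++)
        if (s->ips[i] == ip)
            return 1;
    return 0;
}

/* Returns 1 if this ops wants the function at ip traced, else 0. */
static int ops_wants(const struct trace_ops *ops, unsigned long ip)
{
    if (set_contains(&ops->notrace, ip))
        return 0;
    return ops->filter.count == 0 || set_contains(&ops->filter, ip);
}
```

Because each ops carries its own pair of sets, two users can register with completely different selections without affecting one another.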
     
  • When a hash is modified and might be in use, we need to perform
    a schedule RCU operation on it, as the hashes will soon be used
    directly in the function tracer callback.

    Cc: Paul E. McKenney
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • This is a step towards each ops structure defining its own set
    of functions to trace. As the current code dealing with pids and such
    is specific to the global_ops, it is restructured to be used
    with the global ops.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • In order to allow different ops to enable different functions,
    the ftrace_startup() and ftrace_shutdown() functions need the
    ops parameter passed to them.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Add the enabled_functions file that is used to show all the
    functions that have been enabled for tracing as well as their
    ref counts. This helps to see whether any function has been registered
    and which functions are being traced.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Every function has its own record that stores the instruction
    pointer and flags for the function to be traced. There are only
    two flags: enabled and free. The enabled flag states that tracing
    for the function has been enabled (actively traced), and the free
    flag states that the record no longer points to a function and can
    be used by new functions (loaded modules).

    These flags are now moved to the MSB of the flags (actually just
    the top two of 32 bits). The rest of the bits (30 bits) are now used as
    a ref counter. Every time a tracer registers functions to trace,
    those functions will have their counters incremented.

    When tracing is enabled, to determine if a function should be traced,
    the counter is examined, and if it is non-zero it is set to trace.

    When a ftrace_ops is registered to trace functions, its hashes
    are examined. If the ftrace_ops filter_hash count is zero, then
    all functions are set to be traced, otherwise only the functions
    in the hash are to be traced. The exception to this is if a function
    is also in the ftrace_ops notrace_hash. Then that function's counter
    is not incremented for this ftrace_ops.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
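
The flag-plus-counter layout described above can be sketched like this. It is a user-space illustration under stated assumptions: the two flags are taken to sit in bits 31 and 30 of a 32-bit word, with the low 30 bits as the reference count, and every name (`REC_FL_*`, `struct func_rec`, `rec_get()`, `rec_put()`) is hypothetical rather than the kernel's.

```c
#include <stdint.h>

/* Assumed layout: two flag bits on top, 30-bit ref counter below. */
#define REC_FL_FREE     (UINT32_C(1) << 31)
#define REC_FL_ENABLED  (UINT32_C(1) << 30)
#define REC_FL_MASK     (UINT32_C(3) << 30)
#define REC_REF_MAX     ((UINT32_C(1) << 30) - 1)

/* Hypothetical per-function record: address plus packed flags. */
struct func_rec {
    unsigned long ip;
    uint32_t flags;
};

/* Extract the reference count from the low 30 bits. */
static uint32_t rec_refcount(const struct func_rec *rec)
{
    return rec->flags & ~REC_FL_MASK;
}

/* One more tracer wants this function; fails if the counter is full. */
static int rec_get(struct func_rec *rec)
{
    if (rec_refcount(rec) == REC_REF_MAX)
        return -1;
    rec->flags++;   /* cannot carry into the flag bits after the check */
    return 0;
}

/* A tracer no longer wants this function; fails if the count is zero. */
static int rec_put(struct func_rec *rec)
{
    if (rec_refcount(rec) == 0)
        return -1;
    rec->flags--;
    return 0;
}

/* A live record with a non-zero count is one that should be traced. */
static int rec_should_trace(const struct func_rec *rec)
{
    return !(rec->flags & REC_FL_FREE) && rec_refcount(rec) != 0;
}
```

Packing the counter into the same word keeps the record small, at the cost of having to guard against overflow into the flag bits.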
     
  • When filtering, allocate a hash to insert the function records.
    After the filtering is complete, assign it to the ftrace_ops structure.

    This allows the ftrace_ops structure to have a much smaller array of
    hash buckets instead of wasting a lot of memory.

    A read only empty_hash is created to be the minimum size that any ftrace_ops
    can point to.

    Creating a new hash involves the following steps:

    o Allocate a default hash.
    o Walk the function records, assigning the filtered records to the hash.
    o Allocate a new hash with appropriately sized buckets.
    o Move the entries from the default hash to the new hash.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Combine the filter and notrace hashes to be accessed by a single entity,
    the global_ops. The global_ops is a ftrace_ops structure that is passed
    to different functions that can read or modify the filtering of the
    function tracer.

    The ftrace_ops structure was modified to hold a filter and notrace
    hashes so that later patches may allow each ftrace_ops to have its own
    set of rules to what functions may be filtered.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • When multiple users are allowed to have their own set of functions
    to trace, having the FTRACE_FL_FILTER flag will not be enough to
    handle the accounting of those users. Each user will need their own
    set of functions.

    Replace the FTRACE_FL_FILTER with a filter_hash instead. This is
    temporary until the rest of the function filtering accounting
    gets in.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • To prepare for the accounting system that will allow multiple users of
    the function tracer, having the FTRACE_FL_NOTRACE as a flag in the
    dyn_trace record does not make sense.

    All ftrace_ops will soon have a hash of functions they should trace
    and not trace. Making a global hash of functions not to trace makes
    the transition easier.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

18 May, 2011

4 commits

  • Export handle_simple_irq, irq_modify_status, irq_alloc_descs,
    irq_free_descs and generic_handle_irq to allow their usage in
    modules. First user is IIO, which wants to be built modular, but needs
    to be able to create irq chips, allocate and configure interrupt
    descriptors and handle demultiplexing interrupts.

    [ tglx: Moved the uninlining of generic_handle_irq to a separate patch ]

    Signed-off-by: Jonathan Cameron
    Link: http://lkml.kernel.org/r/%3C1305711544-505-1-git-send-email-jic23%40cam.ac.uk%3E
    Signed-off-by: Thomas Gleixner

    Jonathan Cameron
     
  • generic_handle_irq() is missing a NULL pointer check for the result of
    irq_to_desc. This was not a big problem, but we want to expose it to
    drivers, so we better have sanity checks in place. Add a return value
    as well, which indicates that the irq number was valid and the handler
    was invoked.

    Based on the pure code move from Jonathan Cameron.

    Signed-off-by: Thomas Gleixner
    Cc: Jonathan Cameron

    Thomas Gleixner
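
The added sanity check can be sketched in user space as follows. This is an illustration only: the `demo_` names are hypothetical stand-ins for the kernel's descriptor lookup and handler dispatch, and the choice of `-EINVAL` for an invalid irq number and `0` for success is an assumption, since the message only says that a return value was added.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_NR_IRQS 4

/* Hypothetical, heavily reduced interrupt descriptor. */
struct demo_irq_desc {
    void (*handle)(unsigned int irq);
};

static unsigned int demo_handled;

static void demo_handler(unsigned int irq)
{
    (void)irq;
    demo_handled++;
}

/* Only irq 1 has a descriptor with a handler in this demo. */
static struct demo_irq_desc demo_descs[DEMO_NR_IRQS] = {
    [1] = { demo_handler },
};

/* Returns the descriptor, or NULL for unknown or unpopulated irqs. */
static struct demo_irq_desc *demo_irq_to_desc(unsigned int irq)
{
    if (irq >= DEMO_NR_IRQS || !demo_descs[irq].handle)
        return NULL;
    return &demo_descs[irq];
}

/*
 * Check the lookup result before dereferencing it, and report to the
 * caller whether the handler was actually invoked.
 */
static int demo_generic_handle_irq(unsigned int irq)
{
    struct demo_irq_desc *desc = demo_irq_to_desc(irq);

    if (!desc)
        return -EINVAL;   /* assumed error code */
    desc->handle(irq);
    return 0;
}
```

A demultiplexing driver can then treat a non-zero return as a spurious or misrouted child interrupt instead of crashing on a NULL descriptor.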
     
  • kernel/irq/ is only built when CONFIG_GENERIC_HARDIRQS=y. So making
    code inside of kernel/irq/ conditional on CONFIG_GENERIC_HARDIRQS is
    pointless.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • If device drivers allocate substantial amounts of memory (above 1 MB)
    in their hibernate .freeze() callbacks (or in their legacy suspend
    callbacks during hibernation), the subsequent creation of the hibernate
    image may fail due to the lack of memory. This is the case, because
    the drivers' .freeze() callbacks are executed after the hibernate
    memory preallocation has been carried out and the preallocated amount
    of memory may be too small to cover the new driver allocations.
    Unfortunately, the drivers' .prepare() callbacks also are executed
    after the hibernate memory preallocation has completed, so they are
    not suitable for allocating additional memory either. Thus the only
    way a driver can safely allocate memory during hibernation is to use
    a hibernate/suspend notifier. However, the notifiers are called
    before the freezing of user space and the drivers wanting to use them
    for allocating additional memory may not know how much memory needs
    to be allocated at that point.

    To let device drivers overcome this difficulty, rework the hibernation
    sequence so that the memory preallocation is carried out after the
    drivers' .prepare() callbacks have been executed, so that the
    .prepare() callbacks can be used for allocating additional memory
    to be used by the drivers' .freeze() callbacks. Update documentation
    to match the new behavior of the code.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki