16 Feb, 2011

2 commits


12 Feb, 2011

3 commits

  • In the continuing effort to avoid kernel addresses leaking to
    unprivileged users, this patch switches to %pK for
    /proc/timer_list reporting.

    Signed-off-by: Kees Cook
    Cc: John Stultz
    Cc: Dan Rosenberg
    Cc: Eugene Teo
    Cc: Linus Torvalds
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Kees Cook
     
  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
    pci: use security_capable() when checking capablities during config space read
    security: add cred argument to security_capable()
    tpm_tis: Use timeouts returned from TPM

    Linus Torvalds
     
  • The wake_up_process() call in ptrace_detach() is spurious and not
    interlocked with the tracee state. IOW, the tracee could be running or
    sleeping in any place in the kernel by the time wake_up_process() is
    called. This can lead to the tracee waking up unexpectedly which can be
    dangerous.

    The wake_up is spurious and should be removed but for now reduce its
    toxicity by only waking up if the tracee is in TRACED or STOPPED state.

    This bug can possibly be used as an attack vector. I don't think it
    will take too much effort to come up with an attack which triggers an oops
    somewhere. Most sleeps are wrapped in condition test loops and should
    be safe but we have quite a number of places where sleep and wakeup
    conditions are expected to be interlocked. Although the window of
    opportunity is tiny, ptrace can be used by non-privileged users and with
    some loading the window can definitely be extended and exploited.

    Signed-off-by: Tejun Heo
    Acked-by: Roland McGrath
    Acked-by: Oleg Nesterov
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
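
The state-restricted wakeup described in this commit can be modelled in user space. The following is a minimal sketch, not the kernel patch itself: the `demo_` types, the state values and the helper names are all hypothetical and only illustrate waking a task when its state matches a mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task states; the values are illustrative bit flags. */
enum demo_state {
    DEMO_RUNNING         = 0,
    DEMO_INTERRUPTIBLE   = 1,
    DEMO_UNINTERRUPTIBLE = 2,
    DEMO_STOPPED         = 4,
    DEMO_TRACED          = 8
};

/* Hypothetical, heavily simplified task descriptor. */
struct demo_task {
    unsigned int state;
};

/* Wake the task only if its state matches 'mask'; return true if woken. */
static bool demo_wake_up_state(struct demo_task *t, unsigned int mask)
{
    assert(t != NULL);
    if (!(t->state & mask))
        return false;        /* running or sleeping elsewhere: leave it alone */
    t->state = DEMO_RUNNING;
    return true;
}

/* Detach-time wakeup limited to tracees in TRACED or STOPPED state. */
static bool demo_detach_wake(struct demo_task *t)
{
    return demo_wake_up_state(t, DEMO_TRACED | DEMO_STOPPED);
}
```

In this model a task sleeping in an unrelated wait is never woken by the detach path, which is the property the commit message asks for.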
     

11 Feb, 2011

2 commits

  • Expand security_capable() to include cred, so that it is usable in a
    wider range of call sites.

    Signed-off-by: Chris Wright
    Acked-by: Serge Hallyn
    Signed-off-by: James Morris

    Chris Wright
     
  • In commit ce6ada35bdf7 ("security: Define CAP_SYSLOG") Serge Hallyn
    introduced CAP_SYSLOG, but broke backwards compatibility by no longer
    accepting CAP_SYS_ADMIN as an override (it would cause a warning and
    then reject the operation).

    Re-instate CAP_SYS_ADMIN - but keeping the warning - as an acceptable
    capability until any legacy applications have been updated. There are
    apparently applications out there that drop all capabilities except for
    CAP_SYS_ADMIN in order to access the syslog.

    (This is a re-implementation of a patch by Serge, cleaning the logic up
    and making the code more readable)

    Acked-by: Serge Hallyn
    Reviewed-by: James Morris
    Signed-off-by: Linus Torvalds

    Linus Torvalds
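
The compatibility rule in this commit (prefer CAP_SYSLOG, still accept CAP_SYS_ADMIN but warn) can be sketched as a small user-space model. This is an illustration under stated assumptions, not the kernel code: `struct demo_caps`, `demo_warnings` and `demo_syslog_permitted()` are hypothetical names, and plain booleans stand in for the kernel's capability checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a task's capability set. */
struct demo_caps {
    bool cap_syslog;
    bool cap_sys_admin;
};

/* Number of deprecation warnings emitted so far (for illustration). */
static int demo_warnings;

/*
 * Return 0 if syslog access is allowed, -1 otherwise.
 * CAP_SYSLOG is the preferred capability. CAP_SYS_ADMIN is still
 * accepted for legacy applications, but each use emits a warning.
 */
static int demo_syslog_permitted(const struct demo_caps *caps)
{
    assert(caps != NULL);
    if (caps->cap_syslog)
        return 0;
    if (caps->cap_sys_admin) {
        demo_warnings++;
        fprintf(stderr, "warning: CAP_SYS_ADMIN used for syslog access\n");
        return 0;
    }
    return -1;
}
```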
     

10 Feb, 2011

2 commits

  • During boot, if the hardlockup detector fails to initialize, it
    complains very loudly. Some failures should be expected under
    certain situations, i.e. no lapics or a resource in use. Tone
    those error messages down a bit. Keep the rest at a high level.

    Reported-by: Paul Bolle
    Tested-by: Paul Bolle
    Signed-off-by: Don Zickus
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Don Zickus
     
  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    cdrom: support devices that have check_events but not media_changed
    cfq-iosched: Don't wait if queue already has requests.
    blkio-throttle: Avoid calling blkiocg_lookup_group() for root group
    cfq: rename a function to give it more appropriate name
    cciss: make cciss_revalidate not loop through CISS_MAX_LUNS volumes unnecessarily.
    drivers/block/aoe/Makefile: replace the use of -objs with -y
    loop: queue_lock NULL pointer derefence in blk_throtl_exit
    drivers/block/Makefile: replace the use of -objs with -y
    blktrace: Don't output messages if NOTIFY isn't set.

    Linus Torvalds
     

08 Feb, 2011

3 commits

  • Both attempts at trying to allow softirq usage for
    del_timer_sync() failed (produced bogus warnings),
    so revert the commit for this release:

    f266a5110d45: lockdep, timer: Fix del_timer_sync() annotation

    and try again later.

    Reported-by: Borislav Petkov
    Signed-off-by: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Yong Zhang
    Cc: Thomas Gleixner
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • In prepare_kernel_cred() since 2.6.29, put_cred(new) is called without
    assigning new->usage when security_prepare_creds() returned an error. As a
    result, memory for new and refcount for new->{user,group_info,tgcred} are
    leaked because put_cred(new) won't call __put_cred() unless old->usage == 1.

    Fix these leaks by assigning new->usage (and new->subscribers which was added
    in 2.6.32) before calling security_prepare_creds().

    Signed-off-by: Tetsuo Handa
    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
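
The leak described in this commit comes down to the order of two steps: setting the reference count and running a hook that may fail. A minimal user-space sketch follows, assuming a plain `int` counter instead of the kernel's atomic type; `struct demo_cred`, `demo_put_cred()` and `demo_prepare_fails()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified credential object with a plain int refcount. */
struct demo_cred {
    int usage;      /* reference count */
    bool released;  /* set when the final reference is dropped */
};

/* Drop one reference; release the object only when the count reaches zero. */
static void demo_put_cred(struct demo_cred *cred)
{
    assert(cred != NULL);
    if (--cred->usage == 0)
        cred->released = true;
}

/*
 * Model of the error path. The new object starts as a copy of the old
 * one, so it inherits old_usage. 'init_first' selects whether the
 * refcount is reset to 1 before the (failing) security hook runs.
 * Returns true if the object was released, false if it leaked.
 */
static bool demo_prepare_fails(int old_usage, bool init_first)
{
    struct demo_cred new_cred = { old_usage, false };

    if (init_first)
        new_cred.usage = 1;   /* fixed ordering */

    /* ... the security hook reports an error here ... */
    demo_put_cred(&new_cred);
    return new_cred.released;
}
```

With the count inherited from the old object, the single put only releases the new object when the old count happened to be 1, which matches the condition stated in the commit message.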
     
  • In cred_alloc_blank() since 2.6.32, abort_creds(new) is called with
    new->security == NULL and new->magic == 0 when security_cred_alloc_blank()
    returns an error. As a result, BUG() will be triggered if SELinux is enabled
    or CONFIG_DEBUG_CREDENTIALS=y.

    If CONFIG_DEBUG_CREDENTIALS=y, BUG() is called from __invalid_creds() because
    cred->magic == 0. Failing that, BUG() is called from selinux_cred_free()
    because selinux_cred_free() is not expecting cred->security == NULL. This does
    not affect smack_cred_free(), tomoyo_cred_free() or apparmor_cred_free().

    Fix these bugs by:

    (1) Setting new->magic before calling security_cred_alloc_blank().

    (2) Handling a NULL cred->security in creds_are_invalid() and
        selinux_cred_free().

    Signed-off-by: Tetsuo Handa
    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     

07 Feb, 2011

1 commit


05 Feb, 2011

1 commit


04 Feb, 2011

4 commits


03 Feb, 2011

6 commits

  • Currently the syscall_meta structures for the syscall tracepoints are
    placed in the __syscall_metadata section, and at link time, the linker
    makes one large array of all these syscall metadata structures. On boot
    up, this array is read (much like the initcall sections) and the syscall
    data is processed.

    The problem is that there is no guarantee that gcc will place complex
    structures nicely together in an array format. Two structures in the
    same file may be placed awkwardly, because gcc has no clue that they
    are supposed to be in an array.

    A hack was used previously to force the alignment to 4, to pack the
    structures together. But this caused alignment issues with other
    architectures (sparc).

    Instead of packing the structures into an array, the structures' addresses
    are now put into the __syscall_metadata section. As pointers are always the
    natural alignment, gcc should always pack them tightly together
    (otherwise initcall, extable, etc would also fail).

    By having the pointers to the structures in the section, we can still
    iterate the trace_events without causing unnecessary alignment problems
    with other architectures, or depending on the current behaviour of
    gcc that will likely change in the future just to tick us kernel developers
    off a little more.

    The __syscall_metadata section is also moved into the .init.data section
    as it is now only needed at boot up.

    Suggested-by: David Miller
    Acked-by: David S. Miller
    Cc: Mathieu Desnoyers
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Make the tracepoints more robust, making them solid enough to handle compiler
    changes by not relying on anything based on compiler-specific behavior with
    respect to structure alignment. Implement an approach proposed by David Miller:
    use an array of const pointers to refer to the individual structures, and export
    this pointer array through the linker script rather than the structures per se.
    This will consume 32 extra bytes per tracepoint (24 for structure padding and 8
    for the pointers), but is less likely to break due to compiler changes.

    History:

    commit 7e066fb8 tracepoints: add DECLARE_TRACE() and DEFINE_TRACE()
    added the aligned(32) type and variable attribute to the tracepoint structures
    to deal with gcc happily aligning statically defined structures on 32-byte
    multiples.

    One attempt was to use a 8-byte alignment for tracepoint structures by applying
    both the variable and type attribute to tracepoint structures definitions and
    declarations. It worked fine with gcc 4.5.1, but broke with gcc 4.4.4 and 4.4.5.

    The reason is that the "aligned" attribute only specifies the _minimum_ alignment
    for a structure, leaving both the compiler and the linker free to align on
    larger multiples. Because tracepoint.c expects the structures to be placed as an
    array within each section, up-alignment causes NULL-pointer exceptions due to the
    extra unexpected padding.

    (this patch applies on top of -tip)

    Signed-off-by: Mathieu Desnoyers
    Acked-by: David S. Miller
    LKML-Reference:
    CC: Frederic Weisbecker
    CC: Ingo Molnar
    CC: Thomas Gleixner
    CC: Andrew Morton
    CC: Peter Zijlstra
    CC: Rusty Russell
    Signed-off-by: Steven Rostedt

    Mathieu Desnoyers
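
The pointer-array approach from this commit can be sketched outside the kernel. This is only an illustration, assuming GCC or Clang on an ELF target with a linker that provides `__start_<section>` and `__stop_<section>` symbols (GNU ld does for section names that are valid C identifiers). The section name `demo_tp_ptrs`, the `struct demo_tp` layout, the macro and the tracepoint names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tracepoint-like structure; its padding is up to the compiler. */
struct demo_tp {
    const char *name;
    int enabled;
};

/*
 * Define a structure and place only a pointer to it in the
 * "demo_tp_ptrs" section. Pointers have natural alignment, so the
 * section forms a gap-free array regardless of how the structures
 * themselves are aligned.
 */
#define DEMO_DEFINE_TP(n)                                          \
    static struct demo_tp demo_tp_##n = { #n, 0 };                 \
    static struct demo_tp *const demo_tp_ptr_##n                   \
        __attribute__((used, section("demo_tp_ptrs"))) = &demo_tp_##n

DEMO_DEFINE_TP(sched_switch);
DEMO_DEFINE_TP(sys_enter);
DEMO_DEFINE_TP(sys_exit);

/* Section bounds supplied by the linker. */
extern struct demo_tp *const __start_demo_tp_ptrs[];
extern struct demo_tp *const __stop_demo_tp_ptrs[];

/* Walk the pointer array, enable every entry, and return the count. */
static size_t demo_enable_all(void)
{
    size_t n = 0;
    struct demo_tp *const *p;

    for (p = __start_demo_tp_ptrs; p < __stop_demo_tp_ptrs; p++) {
        assert(*p != NULL);
        (*p)->enabled = 1;
        n++;
    }
    return n;
}
```

The iteration never depends on `sizeof(struct demo_tp)` or on the spacing between the structures, which is the robustness the commit is after.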
     
  • cpu_stopper_thread()
      migration_cpu_stop()
        __migrate_task()
          deactivate_task()
            dequeue_task()
              dequeue_task_rt()
                update_curr_rt()

    This call chain invokes update_curr_rt() on rq->curr, which at that
    time is rq->stop. The problem is that rq->stop.prio matches an RT prio,
    so update_curr_rt() falsely assumes it's an rt_sched_class task.

    Reported-Debuged-Tested-Acked-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Cc: stable@kernel.org # .37
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • It is quite possible for the event to have been disabled between
    perf_event_read() sending the IPI and the CPU servicing the IPI and
    calling __perf_event_read(), hence revalidate the state.

    Reported-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Currently the trace_event structures are placed in the _ftrace_events
    section, and at link time, the linker makes one large array of all
    the trace_event structures. On boot up, this array is read (much like
    the initcall sections) and the events are processed.

    The problem is that there is no guarantee that gcc will place complex
    structures nicely together in an array format. Two structures in the
    same file may be placed awkwardly, because gcc has no clue that they
    are supposed to be in an array.

    A hack was used previously to force the alignment to 4, to pack the
    structures together. But this caused alignment issues with other
    architectures (sparc).

    Instead of packing the structures into an array, the structures' addresses
    are now put into the _ftrace_events section. As pointers are always the
    natural alignment, gcc should always pack them tightly together
    (otherwise initcall, extable, etc would also fail).

    By having the pointers to the structures in the section, we can still
    iterate the trace_events without causing unnecessary alignment problems
    with other architectures, or depending on the current behaviour of
    gcc that will likely change in the future just to tick us kernel developers
    off a little more.

    The _ftrace_events section is also moved into the .init.data section
    as it is now only needed at boot up.

    Suggested-by: David Miller
    Cc: Mathieu Desnoyers
    Acked-by: David S. Miller
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • move_native_irq() masks and unmasks the interrupt line
    unconditionally, but the interrupt line might be masked due to a
    threaded oneshot handler in progress. Unmasking the line in that case
    can lead to interrupt storms. Observed on PREEMPT_RT.

    Originally-from: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: stable@kernel.org

    Thomas Gleixner
     

31 Jan, 2011

4 commits

  • Signed-off-by: Marcin Slusarz
    [ add {}'s to fix a warning ]
    Signed-off-by: Don Zickus
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Marcin Slusarz
     
  • If it was not possible to enable watchdog for any cpu, switch
    watchdog_enabled back to 0, because it's visible via
    kernel.watchdog sysctl.

    Signed-off-by: Marcin Slusarz
    Signed-off-by: Don Zickus
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Marcin Slusarz
     
  • Passing nowatchdog to the kernel disables two things: creation of
    watchdog threads AND initialization of percpu watchdog_hrtimer.
    As hrtimers are initialized only at boot, it's not possible to
    enable the watchdog later. For me, all watchdog threads started to
    eat 100% of CPU time, but they could just as well crash.

    Additionally, even if these threads would start properly,
    watchdog_disable_all_cpus was guarded by no_watchdog check, so
    you couldn't disable watchdog.

    To fix this, remove no_watchdog variable and use already
    existing watchdog_enabled variable.

    Signed-off-by: Marcin Slusarz
    [ removed another no_watchdog instance ]
    Signed-off-by: Don Zickus
    Cc: Stephane Eranian
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Marcin Slusarz
     
  • Since check_prlimit_permission always fails in the case of SUID/SGID
    processes, such processes are not able to read or set their own limits.
    This commit changes this by assuming that a process can always read/change
    its own limits.

    Signed-off-by: Kacper Kornet
    Acked-by: Jiri Slaby
    Signed-off-by: Linus Torvalds

    Kacper Kornet
     

28 Jan, 2011

2 commits

  • …l/git/tip/linux-2.6-tip

    * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: Use rq->clock_task instead of rq->clock for correctly maintaining load averages
    sched: Fix/remove redundant cfs_rq checks
    sched: Fix sign under-flows in wake_affine

    Linus Torvalds
     
  • Commit 927c7a9e92c4 ("perf: Fix race in callchains") introduced
    a mismatch in the sizing of struct callchain_cpus_entries.

    nr_cpu_ids must be used instead of num_possible_cpus(), or we
    might get out-of-bounds memory accesses on some machines.

    Signed-off-by: Eric Dumazet
    Cc: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: David Miller
    Cc: Stephane Eranian
    CC: stable@kernel.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Eric Dumazet
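
The difference between the two sizes matters when the set of possible CPU ids has holes. The sketch below is a user-space model with hypothetical helper names and a hypothetical 8-entry mask; it only shows why an array indexed by CPU id must be sized by the highest id plus one rather than by the number of possible CPUs.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_CPUS 8

/* Count the CPUs marked possible (analogous to num_possible_cpus()). */
static size_t demo_num_possible(const unsigned char *possible)
{
    size_t n = 0;

    assert(possible != NULL);
    for (size_t cpu = 0; cpu < DEMO_MAX_CPUS; cpu++)
        n += possible[cpu] ? 1 : 0;
    return n;
}

/* Highest possible CPU id plus one (analogous to nr_cpu_ids). */
static size_t demo_nr_cpu_ids(const unsigned char *possible)
{
    size_t ids = 0;

    assert(possible != NULL);
    for (size_t cpu = 0; cpu < DEMO_MAX_CPUS; cpu++)
        if (possible[cpu])
            ids = cpu + 1;
    return ids;
}
```

With CPUs 0 and 2 possible, the count is 2 but the largest valid index is 2, so an array of two entries indexed by CPU id would be overrun.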
     

26 Jan, 2011

5 commits

  • The delta in clock_task is a more fair attribution of how much time a tg has
    been contributing load to the current cpu.

    While not really important it also means we're more in sync (by magnitude)
    with respect to periodic updates (since __update_curr deltas are clock_task
    based).

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Turner
     
  • Since updates are against an entity's queuing cfs_rq, it's not possible to
    enter update_cfs_{shares,load} with a NULL cfs_rq. (Indeed, update_cfs_load
    would crash prior to the check if we did anyway, since load is examined
    during the initializers).

    Also, in the update_cfs_load case there's no point
    in maintaining averages for rq->cfs_rq since we don't perform shares
    distribution at that level -- NULL check is replaced accordingly.

    Thanks to Dan Carpenter for pointing out the dereference before NULL check.

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Turner
     
  • While care is taken around the zero-point in effective_load to not exceed
    the instantaneous rq->weight, it's still possible (e.g. using wake_idx != 0)
    for (load + effective_load) to underflow.

    In this case, comparing the unsigned values can result in incorrect balance
    decisions.

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Turner
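
The underflow can be reproduced with ordinary integer arithmetic. The following sketch is not the scheduler code: the function names and the simple "effective load versus previous load" comparison are hypothetical, and the example assumes the loads fit in a `long`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Buggy form: the sum is computed and compared as unsigned. */
static bool demo_balanced_unsigned(unsigned long this_load, long delta,
                                   unsigned long prev_load)
{
    /* A negative delta larger than this_load wraps to a huge value. */
    unsigned long this_eff = this_load + (unsigned long)delta;

    return this_eff <= prev_load;
}

/* Signed form: a negative sum stays negative and compares as small. */
static bool demo_balanced_signed(unsigned long this_load, long delta,
                                 unsigned long prev_load)
{
    assert(this_load <= (unsigned long)LONG_MAX);
    assert(prev_load <= (unsigned long)LONG_MAX);

    long this_eff = (long)this_load + delta;

    return this_eff <= (long)prev_load;
}
```

For a load of 100 and a delta of -150, the unsigned version sees an enormous effective load and reports "not balanced", while the signed version sees -50 and reports "balanced".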
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: wacom - pass touch resolution to clients through input_absinfo
    Input: wacom - add 2 Bamboo Pen and touch models
    Input: sysrq - ensure sysrq_enabled and __sysrq_enabled are consistent
    Input: sparse-keymap - fix KEY_VSW handling in sparse_keymap_setup
    Input: tegra-kbc - add tegra keyboard driver
    Input: gpio_keys - switch to using request_any_context_irq
    Input: serio - allow registered drivers to get status flag
    Input: ct82710c - return proper error code for ct82c710_open
    Input: bu21013_ts - added regulator support
    Input: bu21013_ts - remove duplicate resolution parameters
    Input: tnetv107x-ts - don't treat NULL clk as an error
    Input: tnetv107x-keypad - don't treat NULL clk as an error

    Fix up trivial conflicts in drivers/input/keyboard/Makefile due to
    additions of tc3589x/Tegra drivers

    Linus Torvalds
     
  • The -rt patches change the console_semaphore to console_mutex. As a
    result, a quite large chunk of the patches changes all
    acquire/release_console_sem() to acquire/release_console_mutex().

    This commit makes things use more neutral function names which don't make
    implications about the underlying lock.

    The only real change is the return value of console_trylock, which is
    inverted from try_acquire_console_sem().

    This patch also paves the way to switching console_sem from a semaphore to
    a mutex.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: make console_trylock return 1 on success, per Geert]
    Signed-off-by: Torben Hohn
    Cc: Thomas Gleixner
    Cc: Greg KH
    Cc: Ingo Molnar
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Torben Hohn
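
The inverted return value is easy to get wrong when converting callers, so a small model may help. This is a user-space sketch with hypothetical `demo_` names and a boolean flag in place of the real lock; it assumes the old function returned 0 on success and nonzero on failure, and that the new one returns 1 on success as the note in the commit message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the console lock: true while held. */
static bool demo_console_locked;

/* Old convention: return 0 on success, nonzero if the lock is already held. */
static int demo_try_acquire_console_sem(void)
{
    if (demo_console_locked)
        return 1;
    demo_console_locked = true;
    return 0;
}

/* New convention: return 1 on success, 0 on failure. */
static int demo_console_trylock(void)
{
    return !demo_try_acquire_console_sem();
}

/* Release the lock; it must currently be held. */
static void demo_console_unlock(void)
{
    assert(demo_console_locked);
    demo_console_locked = false;
}
```

A caller written as `if (try_acquire_console_sem()) return;` therefore becomes `if (!console_trylock()) return;` under the new naming.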
     

25 Jan, 2011

4 commits

  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf tools: Fix time function double declaration with glibc
    perf tools: Fix build by checking if extra warnings are supported
    perf tools: Fix build when using gcc 3.4.6
    perf tools: Add missing header, fixes build
    perf tools: Fix 64 bit integer format strings
    perf test: Fix build on older glibcs
    perf: perf_event_exit_task_context: s/rcu_dereference/rcu_dereference_raw/
    perf test: Use cpu_map->[cpu] when setting affinity
    perf symbols: Fix annotation of thumb code
    perf: Annotate cpuctx->ctx.mutex to avoid a lockdep splat
    powerpc, perf: Fix frequency calculation for overflowing counters (FSL version)
    perf: Fix perf_event_init_task()/perf_event_free_task() interaction
    perf: Fix find_get_context() vs perf_event_exit_task() race

    Linus Torvalds
     
  • …el/git/tip/linux-2.6-tip

    * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    RTC: Remove Kconfig symbol for UIE emulation
    RTC: Properly handle rtc_read_alarm error propagation and fix bug
    RTC: Propagate error handling via rtc_timer_enqueue properly
    acpi_pm: Clear pmtmr_ioport if acpi_pm initialization fails
    rtc: Cleanup removed UIE emulation declaration
    hrtimers: Notify hrtimer users of switches to NOHZ mode

    Linus Torvalds
     
  • …l/git/tip/linux-2.6-tip

    * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: Fix poor interactivity on UP systems due to group scheduler nice tune bug

    Linus Torvalds
     
  • Currently sysrq_enabled and __sysrq_enabled are initialised separately
    and inconsistently, leading to sysrq being actually enabled but reported
    as not enabled in sysfs. The first change to the sysfs configurable
    synchronises these two:

    static int __read_mostly sysrq_enabled = 1;
    static int __sysrq_enabled;

    Add a common define to carry the default for these preventing them becoming
    out of sync again. Default this to 1 to mirror previous behaviour.

    Signed-off-by: Andy Whitcroft
    Cc: stable@kernel.org
    Signed-off-by: Dmitry Torokhov

    Andy Whitcroft
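
The fix pattern here is a single define that feeds both initialisers. A minimal sketch follows, with hypothetical `DEMO_`/`demo_` names rather than the identifiers used in the kernel source.

```c
#include <assert.h>

/* Single source of truth for the default; 1 mirrors the previous behaviour. */
#define DEMO_SYSRQ_DEFAULT_ENABLE 1

/* Both variables take their initial value from the same define. */
static int demo_sysrq_enabled = DEMO_SYSRQ_DEFAULT_ENABLE;
static int demo_sysrq_enabled_internal = DEMO_SYSRQ_DEFAULT_ENABLE;

/* Update both copies together so they cannot drift apart. */
static void demo_sysrq_set(int value)
{
    assert(value == 0 || value == 1);
    demo_sysrq_enabled = value;
    demo_sysrq_enabled_internal = value;
}

/* Return nonzero when the reported and the effective state agree. */
static int demo_sysrq_consistent(void)
{
    return demo_sysrq_enabled == demo_sysrq_enabled_internal;
}
```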
     

24 Jan, 2011

1 commit

  • Michael Witten and Christian Kujau reported that the autogroup
    scheduling feature hurts interactivity on their UP systems.

    It turns out that this is an older bug in the group scheduling code,
    and the wider appeal provided by the autogroup feature exposed it
    more prominently.

    When on UP with FAIR_GROUP_SCHED enabled, tuning shares
    only affects tg->shares and is not reflected in
    tg->se->load. The reason is that update_cfs_shares()
    does nothing on UP.

    So introduce update_cfs_shares() for UP && FAIR_GROUP_SCHED.

    This issue was found when autogroup scheduling was enabled,
    but it is an older bug that also exists on cgroup.cpu on UP.

    Reported-and-Tested-by: Michael Witten
    Reported-and-Tested-by: Christian Kujau
    Signed-off-by: Yong Zhang
    Acked-by: Pekka Enberg
    Acked-by: Mike Galbraith
    Acked-by: Peter Zijlstra
    Cc: Linus Torvalds
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Yong Zhang