06 May, 2005

6 commits

  • As per http://www.nist.gov/dads/HTML/shellsort.html, this should be
    referred to as a Shell sort. Shell-Metzner is a misnomer.

    Signed-off-by: Daniel Dickman
    Signed-off-by: Domen Puncer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Domen Puncer
     
  • It seems that the code responsible for this is in kernel/itimer.c:126:

    p->signal->real_timer.expires = jiffies + interval;
    add_timer(&p->signal->real_timer);

    If you request an interval of, let's say, 900 usecs, the interval given by
    timeval_to_jiffies will be 1.

    If you request this when we are half-way between two timer ticks, the
    interval will only give 400 usecs.

    If we want to guarantee that we never ever give intervals less than
    requested, the simple solution would be to change that to:

    p->signal->real_timer.expires = jiffies + interval + 1;

    This, however, produces pathological cases: an idle system that is asked
    for 1 ms timeouts will systematically deliver 2 ms timeouts, whereas
    currently it delivers a few usecs less than 1 ms.

    The complex (and more computationally expensive) solution would be to
    check the gettimeofday time, and compute the correct number of jiffies.
    This way, if we request a 300 usecs timer 200 usecs inside the timer
    tick, we can wait just one tick, but not if we are 800 usecs inside the
    tick. This would also mean that we would have to lock preemption during
    these computations to avoid races, etc.

    I've searched the archives but couldn't find this particular issue being
    discussed before.

    Attached is a patch to do the simple solution, in case anybody thinks
    that it should be used.

    Signed-Off-By: Paulo Marques
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paulo Marques
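
    The "complex" option described in the entry above comes down to rounding
    the requested interval plus the time already spent in the current tick up
    to a whole number of ticks. A minimal user-space sketch of that
    arithmetic, assuming a fixed tick length in microseconds; the helper names
    ticks_needed and actual_wait_us are hypothetical and are not kernel
    functions:

```c
#include <assert.h>

/*
 * Hypothetical helper, not kernel code: number of whole timer ticks to wait
 * so that at least requested_us elapse, given that we are already offset_us
 * into the current tick. All values are in microseconds.
 */
static unsigned long ticks_needed(unsigned long requested_us,
                                  unsigned long offset_us,
                                  unsigned long tick_us)
{
    assert(tick_us > 0 && offset_us < tick_us);
    /* Round (requested + offset) up to a whole number of ticks. */
    return (requested_us + offset_us + tick_us - 1) / tick_us;
}

/*
 * Time actually waited when the timer fires 'ticks' tick boundaries after
 * a request made offset_us into the current tick.
 */
static unsigned long actual_wait_us(unsigned long ticks,
                                    unsigned long offset_us,
                                    unsigned long tick_us)
{
    return ticks * tick_us - offset_us;
}
```

    With a 1000 usec tick, a 300 usec request made 200 usecs into the tick
    needs one tick, while the same request made 800 usecs into the tick needs
    two. A 1000 usec request made exactly on a tick boundary needs one tick,
    where the unconditional "+ 1" would wait two.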
     
  • Allow registration of multiple kprobes at an address in an
    architecture-agnostic way. The corresponding handlers will be invoked in
    sequence. But a kprobe and a jprobe can't (yet) co-exist at the same address.

    Signed-off-by: Ananth N Mavinakayanahalli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ananth N Mavinakayanahalli
     
  • The kernel oopses when unregister_kprobe() is called on a kprobe that was
    never registered. This patch fixes the problem by checking that the probe
    exists before unregistering it.

    Signed-off-by: Prasanna S Panchamukhi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Prasanna S Panchamukhi
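
    The check-before-unregister idea in the entry above can be sketched in
    user space with a singly linked list. Everything here (struct probe,
    register_probe, unregister_probe, the int return value) is a hypothetical
    illustration, not the kprobes implementation:

```c
#include <stddef.h>

/* Hypothetical probe record; the real struct kprobe looks different. */
struct probe {
    struct probe *next;
    unsigned long addr;
};

static struct probe *probe_list;

static void register_probe(struct probe *p)
{
    p->next = probe_list;
    probe_list = p;
}

/*
 * Unlink p only if it is actually on the list.
 * Returns 0 on success, -1 if p was never registered.
 */
static int unregister_probe(struct probe *p)
{
    struct probe **link;

    for (link = &probe_list; *link != NULL; link = &(*link)->next) {
        if (*link == p) {
            *link = p->next;
            p->next = NULL;
            return 0;
        }
    }
    return -1; /* not registered: nothing is touched */
}
```

    Searching the list first means an unknown probe is rejected without any
    of its fields being followed or modified.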
     
  • While looking at code generated by gcc 4.0 I noticed some functions still
    had frame pointers, even after we stopped ppc64 from defining
    CONFIG_FRAME_POINTER. It turns out kernel/Makefile hardwires
    -fno-omit-frame-pointer on when compiling sched.c.

    Create CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER and define it on architectures
    that don't require frame pointers in sched.c code.

    (akpm: blame me for the name)

    Signed-off-by: Anton Blanchard
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Blanchard
     
  • The PPC32 kernel puts platform-specific functions into separate sections so
    that unneeded parts of it can be freed when we've booted and actually
    worked out what we're running on today.

    This makes kallsyms ignore those functions, because they're not between
    _[se]text or _[se]inittext. Rather than teaching kallsyms about the
    various pmac/chrp/etc sections, this patch adds '_[se]extratext' markers
    for kallsyms.

    Signed-off-by: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Woodhouse
     

05 May, 2005

2 commits


04 May, 2005

2 commits

  • Let's recap the problem. The current asynchronous netlink kernel
    message processing is vulnerable to these attacks:

    1) Hit and run: Attacker sends one or more messages and then exits
    before they're processed. This may confuse/disable the next netlink
    user that gets the netlink address of the attacker since it may
    receive the responses to the attacker's messages.

    Proposed solutions:

    a) Synchronous processing.
    b) Stream mode socket.
    c) Restrict/prohibit binding.

    2) Starvation: Because various netlink rcv functions were written
    to not return until all messages have been processed on a socket,
    it is possible for these functions to execute for an arbitrarily
    long period of time. If this is successfully exploited it could
    also be used to hold rtnl forever.

    Proposed solutions:

    a) Synchronous processing.
    b) Stream mode socket.

    Firstly, let's cross off solution c). It only solves the first
    problem and it has user-visible impacts. In particular, it'll
    break user-space applications that expect to bind or communicate
    with specific netlink addresses (pids).

    So we're left with a choice of synchronous processing versus
    SOCK_STREAM for netlink.

    For the moment I'm sticking with the synchronous approach as
    suggested by Alexey since it's simpler and I'd rather spend
    my time working on other things.

    However, it does have a number of deficiencies compared to the
    stream mode solution:

    1) User-space to user-space netlink communication is still vulnerable.

    2) Inefficient use of resources. This is especially true for rtnetlink
    since the lock is shared with other users such as networking drivers.
    The latter could hold the rtnl while communicating with hardware which
    causes the rtnetlink user to wait when it could be doing other things.

    3) It is still possible to DoS all netlink users by flooding the kernel
    netlink receive queue. The attacker simply fills the receive socket
    with a single netlink message that fills up the entire queue. The
    attacker then continues to call sendmsg with the same message in a loop.

    Point 3) can be countered by retransmissions in user-space code, although
    that is pretty messy.

    In light of these problems (in particular, point 3), we should implement
    stream mode netlink at some point. In the meantime, here is a patch
    that implements synchronous processing.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • The patch "MCA recovery improvements" added do_exit to mca_drv.c.
    That's fine when the mca recovery code is built in the kernel
    (CONFIG_IA64_MCA_RECOVERY=y) but breaks building the mca recovery
    code as a module (CONFIG_IA64_MCA_RECOVERY=m).

    Most users are currently building this as a module, as loading
    and unloading the module provides a very convenient way to turn
    on/off error recovery.

    This patch exports do_exit, so mca_drv.c can build as a module.

    Signed-off-by: Russ Anderson (rja@sgi.com)
    Signed-off-by: Tony Luck

    Russ Anderson
     

03 May, 2005

2 commits


01 May, 2005

11 commits

  • Another large rollup of various patches from Adrian which make things static
    where they were needlessly exported.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Some KernelDoc descriptions are updated to match the current code.
    No code changes.

    Signed-off-by: Martin Waitz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Waitz
     
  • I have rebuilt the Linux kernel 2.6.11.5 documentation for myself and our
    university students again. The documentation could be extended to cover
    more sources that already carry structured comments in recent 2.6 kernels,
    and I have tried to make progress on that task. I have done this several
    times since 2.6.0, and it gets boring to make the same changes again and
    again. The kernel still compiles for i386 and ARM targets after the
    changes. I have added references to some more files to the kernel-api
    book, and I have added some section names as well. So please check that
    the changes do not break anything and that the categories are not too
    skewed.

    I have changed kernel-doc to accept the words "fastcall" and "asmlinkage",
    which are reserved by kernel convention. Most of the other changes are
    modifications to comments so that kernel-doc is happy, accepts some
    parameter descriptions and does not bail out on errors. Changed to @pid in
    the description, moved some #ifdefs before comments to correct the binding
    of comments to functions, etc.

    You can see the result of the modified documentation build at
    http://cmp.felk.cvut.cz/~pisa/linux/lkdb-2.6.11.tar.gz

    Some more sources are ready to be included in the kernel-doc generated
    documentation. The sources have been added to kernel-api for now. Some more
    section names were added, and probably some more chaos was introduced as a
    result of the quick cleanup work.

    Signed-off-by: Pavel Pisa
    Signed-off-by: Martin Waitz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Pisa
     
  • Convert most of the current code that uses _NSIG directly to instead use
    valid_signal(). This avoids gcc -W warnings and off-by-one errors.

    Signed-off-by: Jesper Juhl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesper Juhl
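
    A minimal sketch of the kind of helper the entry above refers to. The
    name sketch_valid_signal and the value of SKETCH_NSIG are assumptions
    made for illustration; the real _NSIG is architecture-dependent:

```c
/* Assumed number of signals, standing in for the kernel's _NSIG. */
#define SKETCH_NSIG 64

/*
 * Signals are numbered 1..SKETCH_NSIG, and 0 is accepted here as the
 * "null signal". Taking an unsigned long means a negative int argument
 * converts to a huge value and is rejected by the single comparison.
 */
static inline int sketch_valid_signal(unsigned long sig)
{
    return sig <= SKETCH_NSIG ? 1 : 0;
}
```

    Keeping the comparison in one place avoids call sites that disagree on
    whether the bound is "> _NSIG" or ">= _NSIG", and avoids comparing a
    narrow or signed variable directly against the constant.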
     
  • Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
     
  • This patch changes calls to synchronize_kernel(), deprecated in the earlier
    "Deprecate synchronize_kernel, GPL replacement" patch, to instead call the new
    synchronize_rcu() and synchronize_sched() APIs.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     
  • The synchronize_kernel() primitive is used for quite a few different purposes:
    waiting for RCU readers, waiting for NMIs, waiting for interrupts, and so on.
    This makes RCU code harder to read, since synchronize_kernel() might or might
    not have matching rcu_read_lock()s. This patch creates a new
    synchronize_rcu() that is to be used for RCU readers and a new
    synchronize_sched() that is used for the rest. These two new primitives
    currently have the same implementation, but this might well change with
    additional real-time support. Both new primitives are GPL-only; the old
    primitive is deprecated.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     
  • The gpl exports need to be put back. Moving them to GPL -- but in a
    measured manner, as I proposed on this list some months ago -- is fine.
    Changing these particular exports precipitously is most definitely -not-
    fine. Here is my earlier proposal:

    http://marc.theaimsgroup.com/?l=linux-kernel&m=110520930301813&w=2

    See below for a patch that puts the exports back, along with an updated
    version of my earlier patch that starts the process of moving them to GPL.
    I will also be following this message with RFC patches that introduce two
    (EXPORT_SYMBOL_GPL) interfaces to replace synchronize_kernel(), which then
    becomes deprecated.

    Signed-off-by:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     
  • Arrange for all kernel printks to be no-ops. Only available if
    CONFIG_EMBEDDED.

    This patch saves about 375k on my laptop config and nearly 100k on minimal
    configs.

    Signed-off-by: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matt Mackall
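
    One common way to turn a logging call into a no-op at build time is
    sketched below. SKETCH_PRINTK and sketch_printk are hypothetical names
    standing in for the real configuration option and function; this is an
    illustration of the technique, not the patch itself:

```c
#include <stdio.h>

#ifdef SKETCH_PRINTK
/* Logging enabled: forward to the real formatter. */
#define sketch_printk(...) printf(__VA_ARGS__)
#else
/*
 * Logging disabled: an empty inline stub keeps every call site compiling,
 * and gives the compiler the opportunity to drop the call and the format
 * strings that only it referenced.
 */
static inline int sketch_printk(const char *fmt, ...)
{
    (void)fmt;
    return 0;
}
#endif
```

    Arguments with side effects are still evaluated at each call site, so
    the stub does not change program behavior beyond removing the output.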
     
  • Add a pair of rlimits for allowing non-root tasks to raise nice and rt
    priorities. Defaults to traditional behavior. Originally written by
    Chris Wright.

    The patch implements a simple rlimit ceiling for the RT (and nice) priorities
    a task can set. The rlimit defaults to 0, meaning no change in behavior by
    default. A value of 50 means RT priority levels 1-50 are allowed. A value of
    100 means all 99 priority levels from 1 to 99 are allowed. CAP_SYS_NICE is
    blanket permission.

    (akpm: see http://www.uwsg.iu.edu/hypermail/linux/kernel/0503.1/1921.html for
    tips on integrating this with PAM).

    Signed-off-by: Matt Mackall
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matt Mackall
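
    The ceiling rule described in the entry above can be sketched as a small
    predicate. The function name and parameters are hypothetical, and the
    sketch leaves out everything else the scheduler checks (for example,
    whether a task is merely lowering its own priority):

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not the kernel's code: may a task request RT
 * priority 'requested' given its rlimit ceiling and capabilities?
 */
static bool may_set_rt_priority(unsigned int requested,
                                unsigned long rlimit_rtprio,
                                bool has_cap_sys_nice)
{
    /* Only RT priority levels 1..99 exist. */
    if (requested < 1 || requested > 99)
        return false;
    /* CAP_SYS_NICE is blanket permission. */
    if (has_cap_sys_nice)
        return true;
    /* Otherwise the rlimit is a simple ceiling; 0 allows nothing. */
    return requested <= rlimit_rtprio;
}
```

    With a ceiling of 0 an unprivileged task is refused every RT priority,
    which matches the traditional behavior; a ceiling of 50 admits levels 1
    through 50.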
     
  • Replace a number of memory barriers with smp_ variants. This means we won't
    take the unnecessary hit on UP machines.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    akpm@osdl.org
     

30 Apr, 2005

3 commits

  • It's old sanity checking that may have been useful for debugging, but
    is just bogus these days.

    Noticed by Mattia Belletti.

    Linus Torvalds
     
  • Attached is a new patch that solves the issue of getting valid credentials
    into the LOGIN message. The current code was assuming that the audit context
    had already been copied. This is not always the case for LOGIN messages.

    To solve the problem, the patch passes the task struct to the function that
    emits the message where it can get valid credentials.

    Signed-off-by: Steve Grubb
    Signed-off-by: David Woodhouse

    Steve Grubb
     
  • If netlink_unicast() fails, requeue the skb back at the head of the queue
    it just came from, instead of the tail. Do so unless we've exceeded the
    audit_backlog limit, rather than according to some other arbitrary limit.

    From: Chris Wright
    Signed-off-by: David Woodhouse

    Chris Wright
     

29 Apr, 2005

7 commits

  • Most audit control messages are sent over netlink. In order to properly
    log the identity of the sender of audit control messages, we would like
    to add the loginuid to the netlink_creds structure, as per the attached
    patch.

    Signed-off-by: Serge Hallyn
    Signed-off-by: David Woodhouse

    Serge Hallyn
     
  •  
  • They don't seem to work correctly (investigation ongoing), but we don't
    actually need to do it anyway.

    Patch from Peter Martuccelli
    Signed-off-by: David Woodhouse

    Peter Martuccelli
     
  • Attached is a patch that corrects a signed/unsigned warning. I also noticed
    that we needlessly init serial to 0. That only needs to occur if the kernel
    was compiled without the audit system.

    -Steve Grubb

    Signed-off-by: David Woodhouse

    Steve Grubb
     
  • We were calling ptrace_notify() after auditing the syscall and arguments,
    but the debugger could have _changed_ them before the syscall was actually
    invoked. Reorder the calls to fix that.

    While we're touching every call to audit_syscall_entry(), we also make it
    take an extra argument: the architecture of the syscall which was made,
    because some architectures allow more than one type of syscall.

    Also add an explicit success/failure flag to audit_syscall_exit(), for
    the benefit of architectures which return that in a condition register
    rather than only returning a single register.

    Change type of syscall return value to 'long' not 'int'.

    Signed-off-by: David Woodhouse

     
  • kernel/audit.c: In function `audit_log_untrustedstring':
    kernel/audit.c:736: warning: comparison is always false due to limited range of data type

    Signed-off-by: Andrew Morton
    Signed-off-by: David Woodhouse

    Andrew Morton
     
  • We log strings from userspace, such as arguments to open(). These could
    be formatted to contain \n followed by fake audit log entries. Provide
    a function for logging such strings, which gives a hex dump when the
    string contains anything but basic printable ASCII characters. Use it
    for logging filenames.

    Signed-off-by: David Woodhouse
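
    The quoting rule in the entry above can be sketched in user space as
    follows. The function name format_untrusted is hypothetical, and treating
    spaces and double quotes as "unsafe" is an assumption of this sketch, made
    so that the quoted form stays unambiguous:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not kernel code. Writes s into out either as a
 * double-quoted string (when every byte is a plain printable character)
 * or as an unquoted uppercase hex dump of all its bytes.
 * Returns 0 on success, -1 if out is too small.
 */
static int format_untrusted(const char *s, char *out, size_t out_len)
{
    const unsigned char *p;
    size_t len = strlen(s);
    size_t i;
    int needs_hex = 0;

    for (p = (const unsigned char *)s; *p; p++) {
        /* Control bytes, space, '"', DEL and non-ASCII force a hex dump. */
        if (*p == '"' || *p < 0x21 || *p > 0x7e) {
            needs_hex = 1;
            break;
        }
    }

    if (!needs_hex) {
        if (len + 3 > out_len)          /* two quotes plus the NUL */
            return -1;
        snprintf(out, out_len, "\"%s\"", s);
        return 0;
    }

    if (2 * len + 1 > out_len)          /* two hex digits per byte plus NUL */
        return -1;
    for (i = 0; i < len; i++)
        snprintf(out + 2 * i, 3, "%02X", (unsigned char)s[i]);
    return 0;
}
```

    A filename containing an embedded newline then comes out as hex digits
    on a single line, so it cannot start what looks like a new log record.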

     

28 Apr, 2005

1 commit

  • settimeofday will set the time a little bit too early on systems using
    time interpolation since it subtracts the current interpolator offset
    from the time. This used to be necessary with the code in 2.6.9 and earlier
    but the new code resets the time interpolator after setting the time.
    Thus the time is set too early and gettimeofday will return a time slightly
    before the time specified with settimeofday if invoked immediately after
    settimeofday.

    This removes the obsolete subtraction of the time interpolator offset
    and makes settimeofday set the time accurately.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

25 Apr, 2005

1 commit

  • This patch is incredibly trivial, but it does resolve some of the user
    confusion as to what "L1-A" actually is.

    Clarify printk message to refer to Stop-A (L1-A).

    Gentoo has a virtually identical patch in their kernel sources.

    Signed-off-by: Tom 'spot' Callaway
    Signed-off-by: David S. Miller

    Tom 'spot' Callaway
     

19 Apr, 2005

2 commits

  • Signed-off-by: Jesper Juhl
    Signed-off-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • This fixes a deadlock on the dcache lock detected during testing at IBM
    by moving the logging of the current executable information from the
    SELinux avc_audit function to audit_log_exit (via an audit_log_task_info
    helper) for processing upon syscall exit.

    For consistency, the patch also removes the logging of other
    task-related information from avc_audit, deferring handling to
    audit_log_exit instead.

    This allows simplification of the avc_audit code, allows the exe
    information to be obtained more reliably, always includes the comm
    information (useful for scripts), and avoids including bogus task
    information for checks performed from irq or softirq.

    Signed-off-by: Stephen Smalley
    Signed-off-by: James Morris
    Signed-off-by: Linus Torvalds

    Stephen Smalley
     

17 Apr, 2005

3 commits

  • This patch hides reparent_to_init(). reparent_to_init() should only be
    called by daemonize().

    Signed-off-by: Coywolf Qi Hunt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Coywolf Qi Hunt
     
  • gcc-4 warns with
    include/linux/cpuset.h:21: warning: type qualifiers ignored on function
    return type

    cpuset_cpus_allowed is declared with const
    extern const cpumask_t cpuset_cpus_allowed(const struct task_struct *p);

    First const should be __attribute__((const)), but the gcc manual
    explains that:

    "Note that a function that has pointer arguments and examines the data
    pointed to must not be declared const. Likewise, a function that calls a
    non-const function usually must not be const. It does not make sense for
    a const function to return void."

    The following patch removes const from the function declaration.

    Signed-off-by: Benoit Boissinot
    Acked-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benoit Boissinot
     
  • IXP2000 (ARM-based) platforms use a separate 'struct resource' for PCI MEM
    space. Resource allocation for PCI BARs always fails because the 'root'
    resource (the IXP2000 PCI MEM resource) always has the entire address space
    (00000000-ffffffff) free, and find_resource() calculates the size of that
    range as ffffffff-00000000+1=0, so all allocations fail because it thinks
    there is no space.

    (akpm: pls. double-check)

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lennert Buytenhek
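
    The wrap-around described in the entry above is easy to reproduce with
    32-bit unsigned arithmetic. The sketch below shows the failing
    computation and one possible way to test whether a request fits without
    ever forming "end - start + 1". The function names are hypothetical, and
    this is not necessarily how the patch itself fixes find_resource():

```c
#include <stdint.h>

/* Size of [start, end] computed naively; wraps to 0 for 0..0xffffffff. */
static uint32_t range_size_wrapping(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start + 1);
}

/*
 * Does a block of 'size' bytes fit in [start, end]? Comparing size - 1
 * against end - start avoids the "+ 1" that overflows for a full range.
 */
static int range_fits(uint32_t start, uint32_t end, uint32_t size)
{
    if (size == 0 || end < start)
        return 0;
    return (uint32_t)(size - 1) <= (uint32_t)(end - start);
}
```

    For the full range 00000000-ffffffff the naive size is 0, so every
    allocation appears not to fit, while range_fits() accepts any non-zero
    request.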