24 Mar, 2011

1 commit


15 Oct, 2010

1 commit

  • akiphie points out that a.out core-dumps have that odd task struct
    dumping that was never used and was never really a good idea (it goes
    back into the mists of history, probably the original core-dumping
    code). Just remove it.

    Also do the access_ok() check on dump_write(). It probably doesn't
    matter (since normal filesystems all seem to do it anyway), but he
    points out that it's normally done by the VFS layer, so ...

    [ I suspect that we should possibly do "vfs_write()" instead of
    calling ->write directly. That also does the whole fsnotify and write
    statistics thing, which may or may not be a good idea. ]

    And just to be anal, do this all for the x86-64 32-bit a.out emulation
    code too, even though it's not enabled (and won't currently even
    compile)

    Reported-by: akiphie
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

04 Mar, 2010

1 commit

  • * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits)
    x86: Fix out of order of gsi
    x86: apic: Fix mismerge, add arch_probe_nr_irqs() again
    x86, irq: Keep chip_data in create_irq_nr and destroy_irq
    xen: Remove unnecessary arch specific xen irq functions.
    smp: Use nr_cpus= to set nr_cpu_ids early
    x86, irq: Remove arch_probe_nr_irqs
    sparseirq: Use radix_tree instead of ptrs array
    sparseirq: Change irq_desc_ptrs to static
    init: Move radix_tree_init() early
    irq: Remove unnecessary bootmem code
    x86: Add iMac9,1 to pci_reboot_dmi_table
    x86: Convert i8259_lock to raw_spinlock
    x86: Convert nmi_lock to raw_spinlock
    x86: Convert ioapic_lock and vector_lock to raw_spinlock
    x86: Avoid race condition in pci_enable_msix()
    x86: Fix SCI on IOAPIC != 0
    x86, ia32_aout: do not kill argument mapping
    x86, irq: Move __setup_vector_irq() before the first irq enable in cpu online path
    x86, irq: Update the vector domain for legacy irqs handled by io-apic
    x86, irq: Don't block IRQ0_VECTOR..IRQ15_VECTOR's on all cpu's
    ...

    Linus Torvalds
     

01 Mar, 2010

1 commit

  • …r-linus', 'x86-debug-for-linus', 'x86-doc-for-linus', 'x86-gpu-for-linus' and 'x86-rlimit-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'core-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    generic-ipi: Optimize accesses by using DEFINE_PER_CPU_SHARED_ALIGNED for IPI data

    * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    plist: Fix grammar mistake, and c-style mistake

    * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    kprobes: Add mcount to the kprobes blacklist

    * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86_64: Print modules like i386 does

    * 'x86-doc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86: Put 'nopat' in kernel-parameters

    * 'x86-gpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86-64: Allow fbdev primary video code

    * 'x86-rlimit-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86: Use helpers for rlimits

    Linus Torvalds
     

11 Feb, 2010

1 commit

  • Do not set current->mm->mmap to NULL in 32-bit emulation on 64-bit
    load_aout_binary after flush_old_exec as it would destroy already
    set brpm mapping with arguments.

    Introduced by b6a2fea39318e43fee84fa7b0b90d68bed92d2ba
    mm: variable length argument support
    where the argument mapping in bprm was added.

    [ hpa: this is a regression from 2.6.22... time to kill a.out? ]

    Signed-off-by: Jiri Slaby
    LKML-Reference:
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Ollie Wild
    Cc: x86@kernel.org
    Cc:
    Signed-off-by: H. Peter Anvin

    Jiri Slaby
     

30 Jan, 2010

2 commits

  • Now that the previous commit made it possible to do the personality
    setting at the point of no return, we do just that for ELF binaries.
    And suddenly all the reasons for that insane TIF_ABI_PENDING bit go
    away, and we can just make SET_PERSONALITY() just do the obvious thing
    for a 32-bit compat process.

    Everything becomes much more straightforward this way.

    Signed-off-by: H. Peter Anvin
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    H. Peter Anvin
     
  • 'flush_old_exec()' is the point of no return when doing an execve(), and
    it is pretty badly misnamed. It doesn't just flush the old executable
    environment, it also starts up the new one.

    Which is very inconvenient for things like setting up the new
    personality, because we want the new personality to affect the starting
    of the new environment, but at the same time we do _not_ want the new
    personality to take effect if flushing the old one fails.

    As a result, the x86-64 '32-bit' personality is actually done using this
    insane "I'm going to change the ABI, but I haven't done it yet" bit
    (TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
    personality, but just the "pending" bit, so that "flush_thread()" can do
    the actual personality magic.

    This patch in no way changes any of that insanity, but it does split the
    'flush_old_exec()' function up into a preparatory part that can fail
    (still called flush_old_exec()), and a new part that will actually set
    up the new exec environment (setup_new_exec()). All callers are changed
    to trivially comply with the new world order.

    Signed-off-by: H. Peter Anvin
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

28 Jan, 2010

1 commit

  • Make sure compiler won't do weird things with limits. Fetching them
    twice may return 2 different values after writable limits are
    implemented.

    We can either use rlimit helpers added in
    3e10e716abf3c71bdb5d86b8f507f9e72236c9cd or ACCESS_ONCE if not
    applicable; this patch uses the helpers.

    Signed-off-by: Jiri Slaby
    LKML-Reference:
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: H. Peter Anvin

    Jiri Slaby
     

14 Nov, 2008

1 commit

  • Make execve() take advantage of copy-on-write credentials, allowing it to set
    up the credentials in advance, and then commit the whole lot after the point
    of no return.

    This patch and the preceding patches have been tested with the LTP SELinux
    testsuite.

    This patch makes several logical sets of alteration:

    (1) execve().

    The credential bits from struct linux_binprm are, for the most part,
    replaced with a single credentials pointer (bprm->cred). This means that
    all the creds can be calculated in advance and then applied at the point
    of no return with no possibility of failure.

    I would like to replace bprm->cap_effective with:

    cap_isclear(bprm->cap_effective)

    but this seems impossible due to special behaviour for processes of pid 1
    (they always retain their parent's capability masks where normally they'd
    be changed - see cap_bprm_set_creds()).

    The following sequence of events now happens:

    (a) At the start of do_execve, the current task's cred_exec_mutex is
    locked to prevent PTRACE_ATTACH from obsoleting the calculation of
    creds that we make.

    (a) prepare_exec_creds() is then called to make a copy of the current
    task's credentials and prepare it. This copy is then assigned to
    bprm->cred.

    This renders security_bprm_alloc() and security_bprm_free()
    unnecessary, and so they've been removed.

    (b) The determination of unsafe execution is now performed immediately
    after (a) rather than later on in the code. The result is stored in
    bprm->unsafe for future reference.

    (c) prepare_binprm() is called, possibly multiple times.

    (i) This applies the result of set[ug]id binaries to the new creds
    attached to bprm->cred. Personality bit clearance is recorded,
    but now deferred on the basis that the exec procedure may yet
    fail.

    (ii) This then calls the new security_bprm_set_creds(). This should
    calculate the new LSM and capability credentials into *bprm->cred.

    This folds together security_bprm_set() and parts of
    security_bprm_apply_creds() (these two have been removed).
    Anything that might fail must be done at this point.

    (iii) bprm->cred_prepared is set to 1.

    bprm->cred_prepared is 0 on the first pass of the security
    calculations, and 1 on all subsequent passes. This allows SELinux
    in (ii) to base its calculations only on the initial script and
    not on the interpreter.

    (d) flush_old_exec() is called to commit the task to execution. This
    performs the following steps with regard to credentials:

    (i) Clear pdeath_signal and set dumpable on certain circumstances that
    may not be covered by commit_creds().

    (ii) Clear any bits in current->personality that were deferred from
    (c.i).

    (e) install_exec_creds() [compute_creds() as was] is called to install the
    new credentials. This performs the following steps with regard to
    credentials:

    (i) Calls security_bprm_committing_creds() to apply any security
    requirements, such as flushing unauthorised files in SELinux, that
    must be done before the credentials are changed.

    This is made up of bits of security_bprm_apply_creds() and
    security_bprm_post_apply_creds(), both of which have been removed.
    This function is not allowed to fail; anything that might fail
    must have been done in (c.ii).

    (ii) Calls commit_creds() to apply the new credentials in a single
    assignment (more or less). Possibly pdeath_signal and dumpable
    should be part of struct creds.

    (iii) Unlocks the task's cred_replace_mutex, thus allowing
    PTRACE_ATTACH to take place.

    (iv) Clears The bprm->cred pointer as the credentials it was holding
    are now immutable.

    (v) Calls security_bprm_committed_creds() to apply any security
    alterations that must be done after the creds have been changed.
    SELinux uses this to flush signals and signal handlers.

    (f) If an error occurs before (d.i), bprm_free() will call abort_creds()
    to destroy the proposed new credentials and will then unlock
    cred_replace_mutex. No changes to the credentials will have been
    made.

    (2) LSM interface.

    A number of functions have been changed, added or removed:

    (*) security_bprm_alloc(), ->bprm_alloc_security()
    (*) security_bprm_free(), ->bprm_free_security()

    Removed in favour of preparing new credentials and modifying those.

    (*) security_bprm_apply_creds(), ->bprm_apply_creds()
    (*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()

    Removed; split between security_bprm_set_creds(),
    security_bprm_committing_creds() and security_bprm_committed_creds().

    (*) security_bprm_set(), ->bprm_set_security()

    Removed; folded into security_bprm_set_creds().

    (*) security_bprm_set_creds(), ->bprm_set_creds()

    New. The new credentials in bprm->creds should be checked and set up
    as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the
    second and subsequent calls.

    (*) security_bprm_committing_creds(), ->bprm_committing_creds()
    (*) security_bprm_committed_creds(), ->bprm_committed_creds()

    New. Apply the security effects of the new credentials. This
    includes closing unauthorised files in SELinux. This function may not
    fail. When the former is called, the creds haven't yet been applied
    to the process; when the latter is called, they have.

    The former may access bprm->cred, the latter may not.

    (3) SELinux.

    SELinux has a number of changes, in addition to those to support the LSM
    interface changes mentioned above:

    (a) The bprm_security_struct struct has been removed in favour of using
    the credentials-under-construction approach.

    (c) flush_unauthorized_files() now takes a cred pointer and passes it on
    to inode_has_perm(), file_has_perm() and dentry_open().

    Signed-off-by: David Howells
    Acked-by: James Morris
    Acked-by: Serge Hallyn
    Signed-off-by: James Morris

    David Howells
     

20 Aug, 2008

1 commit


27 Jul, 2008

1 commit

  • This moves all the ptrace hooks related to exec into tracehook.h inlines.

    This also lifts the calls for tracing out of the binfmt load_binary hooks
    into search_binary_handler() after it calls into the binfmt module. This
    change has no effect, since all the binfmt modules' load_binary functions
    did the call at the end on success, and now search_binary_handler() does
    it immediately after return if successful. We consolidate the repeated
    code, and binfmt modules no longer need to import ptrace_notify().

    Signed-off-by: Roland McGrath
    Cc: Oleg Nesterov
    Reviewed-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
     

08 Feb, 2008

1 commit

  • struct user.u_ar0 is defined to contain a pointer offset on all
    architectures in which it is defined (all architectures which define an
    a.out format except SPARC.) However, it has a pointer type in the headers,
    which is pointless -- is not exported to userspace, and it
    just makes the code messy.

    Redefine the field as "unsigned long" (which is the same size as a pointer
    on all Linux architectures) and change the setting code to user offsetof()
    instead of hand-coded arithmetic.

    Cc: Linux Arch Mailing List
    Cc: Bryan Wu
    Cc: Roman Zippel
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Russell King
    Cc: Lennert Buytenhek
    Cc: Håvard Skinnemoen
    Cc: Mikael Starvik
    Cc: Yoshinori Sato
    Cc: Tony Luck
    Cc: Hirokazu Takata
    Cc: Ralf Baechle
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Paul Mundt
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    H. Peter Anvin
     

30 Jan, 2008

3 commits

  • The functions time_before, time_before_eq, time_after, and time_after_eq
    are more robust for comparing jiffies against other values.

    A simplified version of the semantic patch making this change is as follows:
    (http://www.emn.fr/x-info/coccinelle/)

    //
    @ change_compare_np @
    expression E;
    @@

    (
    - jiffies = E
    + time_after_eq(jiffies,E)
    |
    - jiffies < E
    + time_before(jiffies,E)
    |
    - jiffies > E
    + time_after(jiffies,E)
    )

    @ include depends on change_compare_np @
    @@

    #include

    @ no_include depends on !include && change_compare_np @
    @@

    #include
    + #include
    //

    [ mingo@elte.hu: merge to x86.git ]

    Signed-off-by: Julia Lawall
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Julia Lawall
     
  • We have a lot of code which differs only by the naming of specific
    members of structures that contain registers. In order to enable
    additional unifications, this patch drops the e- or r- size prefix
    from the register names in struct pt_regs, and drops the x- prefixes
    for segment registers on the 32-bit side.

    This patch also performs the equivalent renames in some additional
    places that might be candidates for unification in the future.

    Signed-off-by: H. Peter Anvin
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    H. Peter Anvin
     
  • White space and coding style clenaup.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

18 Oct, 2007

1 commit


17 Oct, 2007

1 commit

  • For some time /proc/sys/kernel/core_pattern has been able to set its output
    destination as a pipe, allowing a user space helper to receive and
    intellegently process a core. This infrastructure however has some
    shortcommings which can be enhanced. Specifically:

    1) The coredump code in the kernel should ignore RLIMIT_CORE limitation
    when core_pattern is a pipe, since file system resources are not being
    consumed in this case, unless the user application wishes to save the core,
    at which point the app is restricted by usual file system limits and
    restrictions.

    2) The core_pattern code should be able to parse and pass options to the
    user space helper as an argv array. The real core limit of the uid of the
    crashing proces should also be passable to the user space helper (since it
    is overridden to zero when called).

    3) Some miscellaneous bugs need to be cleaned up (specifically the
    recognition of a recursive core dump, should the user mode helper itself
    crash. Also, the core dump code in the kernel should not wait for the user
    mode helper to exit, since the same context is responsible for writing to
    the pipe, and a read of the pipe by the user mode helper will result in a
    deadlock.

    This patch:

    Remove the check of RLIMIT_CORE if core_pattern is a pipe. In the event that
    core_pattern is a pipe, the entire core will be fed to the user mode helper.

    Signed-off-by: Neil Horman
    Cc:
    Cc:
    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Horman
     

11 Oct, 2007

1 commit