17 May, 2007

1 commit

  • SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.

    Signed-off-by: Christoph Lameter
    Cc: David Howells
    Cc: Jens Axboe
    Cc: Steven French
    Cc: Michael Halcrow
    Cc: OGAWA Hirofumi
    Cc: Miklos Szeredi
    Cc: Steven Whitehouse
    Cc: Roman Zippel
    Cc: David Woodhouse
    Cc: Dave Kleikamp
    Cc: Trond Myklebust
    Cc: "J. Bruce Fields"
    Cc: Anton Altaparmakov
    Cc: Mark Fasheh
    Cc: Paul Mackerras
    Cc: Christoph Hellwig
    Cc: Jan Kara
    Cc: David Chinner
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

09 May, 2007

18 commits

  • /proc/pid/clear_refs is only defined in the CONFIG_MMU case, so make sure we
    don't have any references to clear_refs_smap() in generic procfs code.

    Signed-off-by: David Rientjes
    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Cleanup using simple_read_from_buffer() in procfs.

    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • notify_change() already calls security_inode_setattr() before
    calling iop->setattr.

    Alan sayeth

    This is a behaviour change on all of these and limits some behaviour of
    existing established security modules

    When inode_change_ok is called it has side effects. This includes
    clearing the SGID bit on attribute changes caused by chmod. If you make
    this change the results of some rulesets may be different before or after
    the change is made.

    I'm not saying the change is wrong but it does change behaviour so that
    needs looking at closely (ditto all other attribute twiddles)

    Signed-off-by: Steve Beattie
    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: John Johansen
    Acked-by: Stephen Smalley
    Cc: James Morris
    Cc: Chris Wright
    Cc: Alan Cox
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Johansen
     
  • notify_change() already calls security_inode_setattr() before
    calling iop->setattr.

    Signed-off-by: Tony Jones
    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: John Johansen
    Acked-by: Stephen Smalley
    Cc: James Morris
    Cc: Chris Wright
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Johansen
     
  • We can save some lines of code by using seq_release_private().

    Signed-off-by: Martin Peschke
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Peschke
     
  • kallsyms_lookup() can go iterating over modules list unprotected which is OK
    for emergency situations (oops), but not OK for regular stuff like
    /proc/*/wchan.

    Introduce lookup_symbol_name()/lookup_module_symbol_name() which copy symbol
    name into caller-supplied buffer or return -ERANGE. All copying is done with
    module_mutex held, so...

    Signed-off-by: Alexey Dobriyan
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
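
The copy-under-lock idea described above can be sketched in userspace C. This is only an illustration, not the kernel code: the table, `table_lock`, and `lookup_name()` are hypothetical names, a pthread mutex stands in for module_mutex, and returning -ERANGE for a too-small buffer is a choice made for this sketch.

```c
#include <errno.h>
#include <pthread.h>
#include <string.h>

struct sym {
    unsigned long addr;
    const char *name;
};

/* Hypothetical symbol table standing in for a module's symbols. */
static const struct sym table[] = {
    { 0x1000, "alpha_init" },
    { 0x2000, "beta_exit" },
};

/* Stand-in for the lock that protects the table. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Copy the name for addr into the caller-supplied buffer while the lock
 * is held, so the caller never keeps a pointer into the table.
 * Returns 0 on success, or -ERANGE if no symbol matches or (an assumption
 * of this sketch) the name does not fit in the buffer.
 */
int lookup_name(unsigned long addr, char *buf, size_t len)
{
    int ret = -ERANGE;
    size_t i;

    pthread_mutex_lock(&table_lock);
    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (table[i].addr == addr && strlen(table[i].name) < len) {
            strcpy(buf, table[i].name);
            ret = 0;
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);
    return ret;
}
```

Because the caller only ever sees its own buffer, the table entry can go away as soon as the lock is dropped without leaving a dangling pointer behind.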
     
  • Several kallsyms_lookup() callers pass dummy arguments but only need, say, the
    module's name. Make kallsyms_lookup() accept NULLs where possible.

    This also makes the picture clearer about which interfaces are needed for all
    the symbol-resolving business.

    Signed-off-by: Alexey Dobriyan
    Cc: Rusty Russell
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Remove includes of where it is not used/needed.
    Suggested by Al Viro.

    Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
    sparc64, and arm (all 59 defconfigs).

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Additions to and removals from the tty_drivers list, as well as iteration over
    it for /proc/tty/drivers generation, were done without any locking.

    testing: modprobe/rmmod loop of simple module which does nothing but
    tty_register_driver() vs cat /proc/tty/drivers loop

    BUG: unable to handle kernel paging request at virtual address 6b6b6b6b
    printing eip:
    c01cefa7
    *pde = 00000000
    Oops: 0000 [#1]
    PREEMPT
    last sysfs file: devices/pci0000:00/0000:00:1d.7/usb5/5-0:1.0/bInterfaceProtocol
    Modules linked in: ohci_hcd af_packet e1000 ehci_hcd uhci_hcd usbcore xfs
    CPU: 0
    EIP: 0060:[] Not tainted VLI
    EFLAGS: 00010297 (2.6.21-rc4-mm1 #4)
    EIP is at vsnprintf+0x3a4/0x5fc
    eax: 6b6b6b6b ebx: f6cb50f2 ecx: 6b6b6b6b edx: fffffffe
    esi: c0354700 edi: f6cb6000 ebp: 6b6b6b6b esp: f31f5e68
    ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
    Process cat (pid: 31864, ti=f31f4000 task=c1998030 task.ti=f31f4000)
    Stack: 00000000 c0103f20 c013003a c0103f20 00000000 f6cb50da 0000000a 00000f0e
    f6cb50f2 00000010 00000014 ffffffff ffffffff 00000007 c0354753 f6cb50f2
    f73e39dc f73e39dc 00000001 c0175416 f31f5ed8 f31f5ed4 0ee00000 f32090bc
    Call Trace:
    [] restore_nocheck+0x12/0x15
    [] mark_held_locks+0x6d/0x86
    [] restore_nocheck+0x12/0x15
    [] seq_printf+0x2e/0x52
    [] show_tty_range+0x35/0x1f3
    [] seq_printf+0x2e/0x52
    [] show_tty_driver+0x8a/0x1d9
    [] seq_read+0x70/0x2ba
    [] seq_read+0x0/0x2ba
    [] proc_reg_read+0x63/0x9f
    [] vfs_read+0x7d/0xb5
    [] proc_reg_read+0x0/0x9f
    [] sys_read+0x41/0x6a
    [] sysenter_past_esp+0x5f/0x99
    =======================
    Code: 00 8b 4d 04 e9 44 ff ff ff 8d 4d 04 89 4c 24 50 8b 6d 00 81 fd ff 0f 00 00 b8 a4 c1 35 c0 0f 46 e8 8b 54 24 2c 89 e9 89 c8 eb 06 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 89 c6 8b 44 24 28 89
    EIP: [] vsnprintf+0x3a4/0x5fc SS:ESP 0068:f31f5e68

    Signed-off-by: Alexey Dobriyan
    Cc: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Eternal quest to make

    while true; do cat /proc/fs/xfs/stat >/dev/null 2>/dev/null; done
    while true; do find /proc -type f 2>/dev/null | xargs cat >/dev/null 2>/dev/null; done
    while true; do modprobe xfs; rmmod xfs; done

    work reliably continues and now kernel oopses in the following way:

    BUG: unable to handle ... at virtual address 6b6b6b6b
    EIP is at badness
    process: cat
    proc_oom_score
    proc_info_read
    sys_fstat64
    vfs_read
    proc_info_read
    sys_read

    Failing code is prefetch hidden in list_for_each_entry() in badness().
    badness() is reachable from two points. One is proc_oom_score, another
    is out_of_memory() => select_bad_process() => badness().

    Second path grabs tasklist_lock, while first doesn't.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Add support for finding out the current file position, open flags and
    possibly other info in the future.

    These new entries are added:

    /proc/PID/fdinfo/FD
    /proc/PID/task/TID/fdinfo/FD

    For each fd the information is provided in the following format:

    pos: 1234
    flags: 0100002

    [bunk@stusta.de: make struct proc_fdinfo_file_operations static]
    Signed-off-by: Miklos Szeredi
    Cc: Alexey Dobriyan
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
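
As a rough userspace illustration of the two values reported above, the sketch below reads the same information for one of the caller's own descriptors with lseek() and fcntl(F_GETFL). The function name `format_fdinfo()` is hypothetical, and the layout simply mirrors the "pos:/flags:" example quoted in the commit message; it is not the kernel implementation.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Format the current offset and the open flags (in octal) of fd in the
 * "pos:/flags:" layout quoted above. Returns the snprintf() result, or
 * -1 if the descriptor cannot be queried (for example, it is not open
 * or is not seekable).
 */
int format_fdinfo(int fd, char *buf, size_t len)
{
    off_t pos = lseek(fd, 0, SEEK_CUR);  /* current file position */
    int flags = fcntl(fd, F_GETFL);      /* file status flags */

    if (pos == (off_t)-1 || flags == -1)
        return -1;

    return snprintf(buf, len, "pos: %lld\nflags: 0%o\n",
                    (long long)pos, (unsigned int)flags);
}
```

The point of the new procfs entries is that the same values become visible for other processes' descriptors as well, which this in-process sketch cannot do.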
     
  • Change the order of fields of struct pid_entry (file fs/proc/base.c) in order
    to avoid a hole on 64bit archs. (8 bytes saved per object)

    Also change all pid_entry arrays to be const qualified, to make clear they
    must not be modified.

    Before (on x86_64):

    # size fs/proc/base.o
       text    data     bss     dec     hex filename
      15549    2192       0   17741    454d fs/proc/base.o

    After:

    # size fs/proc/base.o
       text    data     bss     dec     hex filename
      17229     176       0   17405    43fd fs/proc/base.o

    That's 336 bytes saved on kernel size on x86_64.

    Signed-off-by: Eric Dumazet
    Acked-by: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
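
The effect of field order on padding can be shown with a small, self-contained sketch. The two structs below are hypothetical and are not the real struct pid_entry; they only assume the usual alignment rules, under which a 4-byte field followed by an 8-byte pointer leaves a 4-byte hole on 64-bit targets.

```c
#include <stddef.h>

typedef void (*op_fn)(void);

/* Hypothetical layout: each 4-byte field is followed by a pointer. */
struct entry_before {
    int len;            /* 4 bytes, then a 4-byte hole on LP64 */
    const char *name;
    unsigned int mode;  /* 4 bytes, then another 4-byte hole on LP64 */
    op_fn op;
};

/* Same fields, with the two 4-byte members placed next to each other. */
struct entry_after {
    const char *name;
    int len;
    unsigned int mode;  /* shares one 8-byte slot with len */
    op_fn op;
};

/* Bytes saved per object by the reordering on the build target. */
size_t bytes_saved(void)
{
    return sizeof(struct entry_before) - sizeof(struct entry_after);
}
```

On a typical LP64 target this reports 8 bytes per object, and 0 on a typical 32-bit target where pointers and ints have the same size.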
     
  • The /proc/pid/ "maps", "smaps", and "numa_maps" files contain sensitive
    information about the memory location and usage of processes. Issues:

    - maps should not be world-readable, especially if programs expect any
    kind of ASLR protection from local attackers.
    - maps cannot just be 0400 because "-D_FORTIFY_SOURCE=2 -O2" makes glibc
    check the maps when %n is in a *printf call, and a setuid(getuid())
    process wouldn't be able to read its own maps file. (For reference
    see http://lkml.org/lkml/2006/1/22/150)
    - a system-wide toggle is needed to allow prior behavior in the case of
    non-root applications that depend on access to the maps contents.

    This change implements a check using "ptrace_may_attach" before allowing
    access to read the maps contents. To control this protection, the new knob
    /proc/sys/kernel/maps_protect has been added, with corresponding updates to
    the procfs documentation.

    [akpm@linux-foundation.org: build fixes]
    [akpm@linux-foundation.org: New sysctl numbers are old hat]
    Signed-off-by: Kees Cook
    Cc: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • WARN_ON(de && de->deleted); is sooo unreliable. Why?

    proc_lookup                         remove_proc_entry
    ===========                         =================
    lock_kernel();
    spin_lock(&proc_subdir_lock);
    [find proc entry]
    spin_unlock(&proc_subdir_lock);
                                        spin_lock(&proc_subdir_lock);
                                        [find proc entry]

    proc_get_inode
    ==============
    WARN_ON(de && de->deleted);         ...

                                        if (!atomic_read(&de->count))
                                                free_proc_entry(de);
                                        else
                                                de->deleted = 1;

    So, if you have some strange oops [1], and don't see this WARN_ON, it means
    nothing.

    [1] try_module_get() of module which doesn't exist, two lines below
    should suffice, or not?

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Fix the following race:

    proc_readdir                        remove_proc_entry
    ============                        =================

    spin_lock(&proc_subdir_lock);
    [choose PDE to start filldir from]
    spin_unlock(&proc_subdir_lock);
                                        spin_lock(&proc_subdir_lock);
                                        [find PDE]
                                        [free PDE, refcount is 0]
                                        spin_unlock(&proc_subdir_lock);
    /* boom */
    if (filldir(dirent, de->name, ...

    [de_put on error path --adobriyan]
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Alexey Dobriyan
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Darrick J. Wong
     
  • proc_lookup                         remove_proc_entry
    ===========                         =================

    lock_kernel();
    spin_lock(&proc_subdir_lock);
    [find PDE with refcount 0]
    spin_unlock(&proc_subdir_lock);
                                        spin_lock(&proc_subdir_lock);
                                        [find PDE with refcount 0]
                                        [check refcount and free PDE]
                                        spin_unlock(&proc_subdir_lock);
    proc_get_inode:
            de_get(de); /* boom */

    Signed-off-by: Alexey Dobriyan
    Cc: "Eric W. Biederman"
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
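
One common way to close a window like the one in the diagram above is to take the reference while the lookup lock is still held. The userspace sketch below illustrates that pattern under stated assumptions: `struct entry`, `dir_lock`, and the three helpers are hypothetical names, and a pthread mutex stands in for proc_subdir_lock. It is not claimed to be what the patch itself does.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

struct entry {
    const char *name;
    int count;              /* references held by users; under dir_lock */
    struct entry *next;
};

static pthread_mutex_t dir_lock = PTHREAD_MUTEX_INITIALIZER;
static struct entry *dir_head;

/* Link an entry into the directory list. */
void entry_add(struct entry *e)
{
    pthread_mutex_lock(&dir_lock);
    e->next = dir_head;
    dir_head = e;
    pthread_mutex_unlock(&dir_lock);
}

/* Find an entry and pin it before the lock is dropped. */
struct entry *entry_lookup_get(const char *name)
{
    struct entry *e;

    pthread_mutex_lock(&dir_lock);
    for (e = dir_head; e; e = e->next) {
        if (strcmp(e->name, name) == 0) {
            e->count++;     /* the remover now sees a non-zero count */
            break;
        }
    }
    pthread_mutex_unlock(&dir_lock);
    return e;
}

/*
 * Unlink an entry. Returns 1 if nobody holds a reference and the caller
 * may free it now, 0 if it is still pinned (or was not found) and freeing
 * must be left to the last user.
 */
int entry_remove(struct entry *victim)
{
    struct entry **p;
    int can_free = 0;

    pthread_mutex_lock(&dir_lock);
    for (p = &dir_head; *p; p = &(*p)->next) {
        if (*p == victim) {
            *p = victim->next;
            can_free = (victim->count == 0);
            break;
        }
    }
    pthread_mutex_unlock(&dir_lock);
    return can_free;
}
```

Since the count is raised before the lock is released, the remover can no longer observe a zero count for an entry that a lookup has already found.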
     
  • This past week I was playing around with that pahole tool
    (http://oops.ghostprotocols.net:81/acme/dwarves/) and looking at the size
    of various structs in the kernel. I was surprised by the size of the
    task_struct on x86_64, approaching 4K. I looked through the fields in
    task_struct and found that a number of them were declared as "unsigned
    long" rather than "unsigned int" despite them appearing okay as 32-bit
    sized fields. On x86_64 "unsigned long" ends up being 8 bytes in size and
    forces 8 byte alignment. Is there a reason they are
    "unsigned long"?

    The patch below drops the size of the struct from 3808 bytes (60 64-byte
    cachelines) to 3760 bytes (59 64-byte cachelines). A couple of other fields
    in the task struct take a significant amount of space:

    struct thread_struct thread;        688
    struct held_lock held_locks[30];    1680

    CONFIG_LOCKDEP is turned on in the .config

    [akpm@linux-foundation.org: fix printk warnings]
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    William Cohen
     
  • /proc/$PID/fd has r-x------ permissions, so if process does setuid(), it
    will not be able to access /proc/*/fd/. This breaks fstatat() emulation
    in glibc.

    open("foo", O_RDONLY|O_DIRECTORY) = 4
    setuid32(65534) = 0
    stat64("/proc/self/fd/4/bar", 0xbfafb298) = -1 EACCES (Permission denied)

    Signed-off-by: Alexey Dobriyan
    Cc: "Eric W. Biederman"
    Cc: James Morris
    Cc: Chris Wright
    Cc: Ulrich Drepper
    Cc: Oleg Nesterov
    Acked-By: Kirill Korotaev
    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

08 May, 2007

5 commits

  • I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by
    SLAB.

    I think its purpose was to have a callback after an object has been freed
    to verify that the state is the constructor state again? The callback is
    performed before each freeing of an object.

    I would think that it is much easier to check the object state manually
    before the free. That also places the check near the code that manipulates
    the object.

    Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
    compiled with SLAB debugging on. If there would be code in a constructor
    handling SLAB_DEBUG_INITIAL then it would have to be conditional on
    SLAB_DEBUG otherwise it would just be dead code. But there is no such code
    in the kernel. I think SLAB_DEBUG_INITIAL is too problematic to make real
    use of, difficult to understand and there are easier ways to accomplish the
    same effect (i.e. add debug code before kfree).

    There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
    clear in fs inode caches. Remove the pointless checks (they would even be
    pointless without removal of SLAB_DEBUG_INITIAL) from the fs constructors.

    This is the last slab flag that SLUB did not support. Remove the check for
    unimplemented flags from SLUB.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
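
The alternative suggested above, checking the object state manually right before the free, might look like the following userspace sketch. The object type and helper names are hypothetical, and malloc()/free() stand in for the slab allocator and kfree().

```c
#include <stdlib.h>

/* Hypothetical object whose constructor state is "no refs, no data". */
struct obj {
    int refs;
    void *data;
};

/* Return non-zero if the object is back in its constructor state. */
static int obj_is_pristine(const struct obj *o)
{
    return o->refs == 0 && o->data == NULL;
}

/* Allocate an object and put it in its constructor state. */
struct obj *obj_alloc(void)
{
    struct obj *o = malloc(sizeof(*o));

    if (o) {
        o->refs = 0;
        o->data = NULL;
    }
    return o;
}

/*
 * Debug check placed directly before the free: refuse to free an object
 * that is not back in its constructor state. Returns 0 if the object was
 * freed, -1 if it was left alone.
 */
int obj_free_checked(struct obj *o)
{
    if (!obj_is_pristine(o))
        return -1;
    free(o);
    return 0;
}
```

The check sits next to the code that manipulates the object, and it can be compiled out or kept unconditionally without any allocator support.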
     
  • Adds /proc/pid/clear_refs. When any non-zero number is written to this file,
    pte_mkold() and ClearPageReferenced() are called for each pte and its
    corresponding page, respectively, in that task's VMAs. This file is only
    writable by the user who owns the task.

    It is now possible to measure _approximately_ how much memory a task is using
    by clearing the reference bits with

    echo 1 > /proc/pid/clear_refs

    and checking the reference count for each VMA from the /proc/pid/smaps output
    at a measured time interval. For example, to observe the approximate change
    in memory footprint for a task, write a script that clears the references
    (echo 1 > /proc/pid/clear_refs), sleeps, and then greps for Pgs_Referenced and
    extracts the size in kB. Add the sizes for each VMA together for the total
    referenced footprint. Moments later, repeat the process and observe the
    difference.

    For example, using an efficient Mozilla:

    accumulated time        referenced memory
    ----------------        -----------------
             0 s                   408 kB
             1 s                   408 kB
             2 s                   556 kB
             3 s                  1028 kB
             4 s                   872 kB
             5 s                  1956 kB
             6 s                   416 kB
             7 s                  1560 kB
             8 s                  2336 kB
             9 s                  1044 kB
            10 s                   416 kB

    This is a valuable tool to get an approximate measurement of the memory
    footprint for a task.

    Cc: Hugh Dickins
    Cc: Paul Mundt
    Cc: Christoph Lameter
    Signed-off-by: David Rientjes
    [akpm@linux-foundation.org: build fixes]
    [mpm@selenic.com: rename for_each_pmd]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
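
The summing step of the measurement script described above can be sketched in C. This is an illustration only: `sum_field_kb()` is a hypothetical helper, the field name is passed in by the caller (the text above mentions Pgs_Referenced), and the input is assumed to consist of lines of the form "<label>: <n> kB".

```c
#include <stdio.h>
#include <string.h>

/*
 * Sum the kB values of every line in smaps-style text whose label is
 * exactly field. Lines that do not match are ignored.
 */
unsigned long sum_field_kb(FILE *in, const char *field)
{
    char line[256];
    unsigned long total = 0, kb;
    size_t flen = strlen(field);

    while (fgets(line, sizeof(line), in)) {
        /* The label must be followed directly by ':' to count. */
        if (strncmp(line, field, flen) == 0 && line[flen] == ':' &&
            sscanf(line + flen + 1, "%lu", &kb) == 1)
            total += kb;
    }
    return total;
}
```

A caller would write to clear_refs, sleep, open the task's smaps file, and pass the stream to this helper; repeating that at intervals gives a series like the one tabulated in the commit message.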
     
  • Adds an additional unsigned long field to struct mem_size_stats called
    'referenced'. For each pte walked in the smaps code, this field is
    incremented by PAGE_SIZE if it has pte-reference bits.

    An additional line was added to the /proc/pid/smaps output for each VMA to
    indicate how many pages within it are currently marked as referenced or
    accessed.

    Cc: Hugh Dickins
    Cc: Paul Mundt
    Cc: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Extracts the pmd walker from smaps-specific code in fs/proc/task_mmu.c.

    The new struct pmd_walker includes the struct vm_area_struct of the memory to
    walk over. Iteration begins at the vma->vm_start and completes at
    vma->vm_end. A pointer to another data structure may be stored in the private
    field such as struct mem_size_stats, which acts as the smaps accumulator. For
    each pmd in the VMA, the action function is called with a pointer to its
    struct vm_area_struct, a pointer to the pmd_t, its start and end addresses,
    and the private field.

    The interface for walking pmd's in a VMA for fs/proc/task_mmu.c is now:

    void for_each_pmd(struct vm_area_struct *vma,
    void (*action)(struct vm_area_struct *vma,
    pmd_t *pmd, unsigned long addr,
    unsigned long end,
    void *private),
    void *private);

    Since the pmd walker is now extracted from the smaps code, smaps_one_pmd() is
    invoked for each pmd in the VMA. Its behavior and efficiency are identical to
    the existing implementation.

    Cc: Hugh Dickins
    Cc: Paul Mundt
    Cc: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
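
The shape of the interface above (a walker that calls an action for each sub-range and threads a private pointer through to it) can be mimicked in userspace. The sketch below is an analogy, not the kernel walker: `struct range`, `CHUNK_SIZE`, `for_each_chunk()`, and `count_bytes()` are hypothetical names, and fixed-size chunks stand in for pmds.

```c
#include <stddef.h>

#define CHUNK_SIZE 0x1000UL     /* hypothetical step; must be a power of two */

struct range {
    unsigned long start, end;   /* half-open interval [start, end) */
};

typedef void (*chunk_action)(const struct range *r, unsigned long addr,
                             unsigned long end, void *private);

/* Call action once per chunk-aligned piece of the range. */
void for_each_chunk(const struct range *r, chunk_action action, void *private)
{
    unsigned long addr = r->start, next;

    while (addr < r->end) {
        /* Next chunk boundary above addr, clamped to the range end. */
        next = (addr + CHUNK_SIZE) & ~(CHUNK_SIZE - 1);
        if (next > r->end || next <= addr)
            next = r->end;
        action(r, addr, next, private);
        addr = next;
    }
}

/* Accumulator passed through the private pointer. */
struct size_stats {
    unsigned long bytes;
    unsigned long calls;
};

/* Example action: add up the bytes covered and count the invocations. */
void count_bytes(const struct range *r, unsigned long addr,
                 unsigned long end, void *private)
{
    struct size_stats *stats = private;

    (void)r;
    stats->bytes += end - addr;
    stats->calls++;
}
```

Keeping the accumulator behind a void pointer is what lets one walker serve several users, each with its own action and its own statistics structure.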
     
  • Add proper prototypes in include/linux/slab.h.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     

03 May, 2007

1 commit

  • The specific case I am encountering is kdump under Xen with a 64 bit
    hypervisor and 32 bit kernel/userspace. The dump created is 64 bit due to
    the hypervisor but the dump kernel is 32 bit for maximum compatibility.

    It's possibly less likely to be useful in a purely native scenario but I
    see no reason to disallow it.

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Ian Campbell
    Signed-off-by: Andi Kleen
    Acked-by: Vivek Goyal
    Cc: Horms
    Cc: Magnus Damm
    Cc: "Eric W. Biederman"
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton

    Ian Campbell
     

13 Apr, 2007

1 commit


03 Apr, 2007

1 commit


28 Mar, 2007

1 commit

  • Without the attached patch against current -git I get the following with
    !PROC_SYSCTL (with EMBEDDED and PROC_FS set):

    CC init/version.o
    LD init/built-in.o
    LD vmlinux
    fs/built-in.o: In function `do_proc_sys_lookup':
    proc_sysctl.c:(.text+0x26583): undefined reference to `sysctl_head_next'
    fs/built-in.o: In function `proc_sys_revalidate':
    proc_sysctl.c:(.text+0x265bb): undefined reference to `sysctl_head_finish'
    fs/built-in.o: In function `proc_sys_readdir':
    proc_sysctl.c:(.text+0x26720): undefined reference to `sysctl_head_next'
    proc_sysctl.c:(.text+0x267d8): undefined reference to `sysctl_head_finish'
    proc_sysctl.c:(.text+0x268e7): undefined reference to `sysctl_head_next'
    proc_sysctl.c:(.text+0x26910): undefined reference to `sysctl_head_finish'
    fs/built-in.o: In function `proc_sys_write':
    proc_sysctl.c:(.text+0x2695d): undefined reference to `sysctl_perm'
    proc_sysctl.c:(.text+0x2699c): undefined reference to `sysctl_head_finish'
    fs/built-in.o: In function `proc_sys_read':
    proc_sysctl.c:(.text+0x269e9): undefined reference to `sysctl_perm'
    proc_sysctl.c:(.text+0x26a25): undefined reference to `sysctl_head_finish'
    fs/built-in.o: In function `proc_sys_permission':
    proc_sysctl.c:(.text+0x26ad1): undefined reference to `sysctl_perm'
    proc_sysctl.c:(.text+0x26adb): undefined reference to `sysctl_head_finish'
    fs/built-in.o: In function `proc_sys_lookup':
    proc_sysctl.c:(.text+0x26b39): undefined reference to `sysctl_head_finish'
    make: *** [vmlinux] Error 1

    All those functions are in fs/proc/proc_sysctl.c, which has no CONFIG_
    #define's in it, so the patch makes compilation of that file depend
    on CONFIG_PROC_SYSCTL (the simplest choice).

    Acked-by: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mika Kukkonen
     

15 Mar, 2007

1 commit


21 Feb, 2007

1 commit


15 Feb, 2007

3 commits

  • Since the security checks are applied on each read and write of a sysctl file,
    just like they are applied when calling sys_sysctl, they are redundant on the
    standard VFS constructs. Since it is difficult to compute the security labels
    on the standard VFS constructs we just mark the sysctl inodes in proc private
    so selinux won't even bother with them.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • With this change the sysctl inodes can be cached and nothing needs to be done
    when removing a sysctl table.

    For a cost of 2K code we will save about 4K of static tables (when we remove
    de from ctl_table) and 70K in proc_dir_entries that we will not allocate, or
    about half that on a 32bit arch.

    The speed feels about the same, even though we can now cache the sysctl
    dentries :(

    We get the core advantage that we don't need to have a 1 to 1 mapping between
    ctl table entries and proc files, making it possible to have /proc/sys vary
    depending on the namespace you are in. The currently merged namespaces don't
    have an issue here but the network namespace under /proc/sys/net needs to have
    different directories depending on which network adapters are visible. Because
    /proc/sys is now simply a cache, making different directories visible depending
    on who you are is trivial to implement.

    [akpm@osdl.org: fix uninitialised var]
    [akpm@osdl.org: fix ARM build]
    [bunk@stusta.de: make things static]
    Signed-off-by: Eric W. Biederman
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • binfmt_misc has a mount point in the middle of the sysctl and that mount point
    is created as a proc_generic directory.

    Doing it that way gets in the way of cleaning up the sysctl proc support as it
    continues the existence of a horrible hack. So instead simply create the
    directory as an ordinary sysctl directory. At least that removes the magic
    special case.

    [akpm@osdl.org: warning fix]
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

13 Feb, 2007

4 commits

  • This patch is inspired by Arjan's "Patch series to mark struct
    file_operations and struct inode_operations const".

    Compile tested with gcc & sparse.

    Signed-off-by: Josef 'Jeff' Sipek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef 'Jeff' Sipek
     
  • Many struct inode_operations in the kernel can be "const". Marking them const
    moves these to the .rodata section, which avoids false sharing with potential
    dirty data. In addition it'll catch accidental writes at compile time to
    these shared resources.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     
  • Many struct file_operations in the kernel can be "const". Marking them const
    moves these to the .rodata section, which avoids false sharing with potential
    dirty data. In addition it'll catch accidental writes at compile time to
    these shared resources.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
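
A minimal sketch of the const-qualified operations table pattern described above, in plain C. The struct and function names are hypothetical and much simpler than the kernel's struct file_operations; placement in .rodata is what toolchains typically do with const objects of static storage duration, not something this sketch verifies.

```c
#include <stddef.h>

/* Hypothetical, much-reduced operations table. */
struct file_ops {
    long (*read)(char *buf, size_t len);
};

/* Example operation: fill the buffer with zero bytes. */
static long zero_read(char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = 0;
    return (long)len;
}

/*
 * Declared const: the table is shared and never modified after build
 * time. An assignment such as "zero_ops.read = NULL;" is rejected by
 * the compiler.
 */
static const struct file_ops zero_ops = {
    .read = zero_read,
};

/* Accessor so other code can use the shared, read-only table. */
const struct file_ops *get_zero_ops(void)
{
    return &zero_ops;
}

/* Callers take a pointer to const, so they cannot modify the table. */
long do_read(const struct file_ops *ops, char *buf, size_t len)
{
    return ops->read ? ops->read(buf, len) : -1;
}
```

Users of the table only need their pointer types changed to pointer-to-const, which is why this kind of conversion can be applied mechanically across many files.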
     
  • Of kernel subsystems that work with pids the tty layer is probably the largest
    consumer. But it has the nice virtue that the association with a session only
    lasts until the session leader exits. Which means that no reference counting
    is required. So using struct pid winds up being a simple optimization to
    avoid hash table lookups.

    In the long term the use of pid_nr also ensures that when we have multiple pid
    spaces mixed everything will work correctly.

    Signed-off-by: Eric W. Biederman
    Cc: Alan Cox
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

12 Feb, 2007

3 commits