09 May, 2007

1 commit

  • The /proc/pid/ "maps", "smaps", and "numa_maps" files contain sensitive
    information about the memory location and usage of processes. Issues:

    - maps should not be world-readable, especially if programs expect any
    kind of ASLR protection from local attackers.
    - maps cannot just be 0400 because "-D_FORTIFY_SOURCE=2 -O2" makes glibc
    check the maps when %n is in a *printf call, and a setuid(getuid())
    process wouldn't be able to read its own maps file. (For reference
    see http://lkml.org/lkml/2006/1/22/150)
    - a system-wide toggle is needed to allow prior behavior in the case of
    non-root applications that depend on access to the maps contents.

    This change implements a check using "ptrace_may_attach" before allowing
    access to read the maps contents. To control this protection, the new knob
    /proc/sys/kernel/maps_protect has been added, with corresponding updates to
    the procfs documentation.

    [akpm@linux-foundation.org: build fixes]
    [akpm@linux-foundation.org: New sysctl numbers are old hat]
    Signed-off-by: Kees Cook
    Cc: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

03 Apr, 2007

1 commit


15 Feb, 2007

1 commit

  • With this change the sysctl inodes can be cached and nothing needs to be done
    when removing a sysctl table.

    For a cost of 2K code we will save about 4K of static tables (when we remove
    de from ctl_table) and 70K in proc_dir_entries that we will not allocate, or
    about half that on a 32bit arch.

    The speed feels about the same, even though we can now cache the sysctl
    dentries :(

    We get the core advantage that we don't need to have a 1 to 1 mapping between
    ctl table entries and proc files. Making it possible to have /proc/sys vary
    depending on the namespace you are in. The currently merged namespaces don't
    have an issue here but the network namespace under /proc/sys/net needs to have
    different directories depending on which network adapters are visible. By
    simply being a cache different directories being visible depending on who you
    are is trivial to implement.

    [akpm@osdl.org: fix uninitialised var]
    [akpm@osdl.org: fix ARM build]
    [bunk@stusta.de: make things static]
    Signed-off-by: Eric W. Biederman
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

13 Feb, 2007

1 commit

  • Many struct file_operations in the kernel can be "const". Marking them const
    moves these to the .rodata section, which avoids false sharing with potential
    dirty data. In addition it'll catch accidental writes at compile time to
    these shared resources.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

27 Sep, 2006

1 commit


27 Jun, 2006

4 commits

  • Incrementally update my proc-dont-lock-task_structs-indefinitely patches so
    that they work with struct pid instead of struct task_ref.

    Mostly this is a straight 1-1 substitution.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Every inode in /proc holds a reference to a struct task_struct. If a
    directory or file is opened and remains open after the the task exits this
    pinning continues. With 8K stacks on a 32bit machine the amount pinned per
    file descriptor is about 10K.

    Normally I would figure a reasonable per user process limit is about 100
    processes. With 80 processes, with a 1000 file descriptors each I can trigger
    the 00M killer on a 32bit kernel, because I have pinned about 800MB of useless
    data.

    This patch replaces the struct task_struct pointer with a pointer to a struct
    task_ref which has a struct task_struct pointer. The so the pinning of dead
    tasks does not happen.

    The code now has to contend with the fact that the task may now exit at any
    time. Which is a little but not muh more complicated.

    With this change it takes about 1000 processes each opening up 1000 file
    descriptors before I can trigger the OOM killer. Much better.

    [mlp@google.com: task_mmu small fixes]
    Signed-off-by: Eric W. Biederman
    Cc: Trond Myklebust
    Cc: Paul Jackson
    Cc: Oleg Nesterov
    Cc: Albert Cahalan
    Signed-off-by: Prasanna Meda
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • All of the functions for proc_maps_operations are already defined in
    task_mmu.c so move the operations structure to keep the functionality
    together.

    Since task_nommu.c implements a dummy version of /proc//maps give it a
    simplified version of proc_maps_operations that it can modify to best suit its
    needs.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The sole renaming use of proc_inode.type is to discover the file descriptor
    number, so just store the file descriptor number and don't wory about
    processing this field. This removes any /proc limits on the maximum number of
    file descriptors, and clears the path to make the hard coded /proc inode
    numbers go away.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

29 Mar, 2006

1 commit

  • Mark the f_ops members of inodes as const, as well as fix the
    ripple-through this causes by places that copy this f_ops and then "do
    stuff" with it.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

09 Jan, 2006

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds