26 Oct, 2012

4 commits

  • Before changing a group's default behavior to ALLOW, we must check if
    its parent's behavior is also ALLOW.

    Signed-off-by: Aristeu Rozanski
    Cc: Tejun Heo
    Cc: Li Zefan
    Cc: James Morris
    Cc: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Cc: Jiri Slaby
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aristeu Rozanski
     
  • Convert the code to use kstrtou32() instead of simple_strtoul() which is
    deprecated. The real size of the variables are u32, so use kstrtou32
    instead of kstrtoul

    Signed-off-by: Aristeu Rozanski
    Cc: Dave Jones
    Cc: Tejun Heo
    Cc: Li Zefan
    Cc: James Morris
    Cc: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Cc: Jiri Slaby
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aristeu Rozanski
     
  • This was done in a v2 patch but v1 ended up being committed. The
    variable name is less confusing and stores the default behavior when no
    matching exception exists.

    Signed-off-by: Aristeu Rozanski
    Cc: Dave Jones
    Cc: Tejun Heo
    Cc: Li Zefan
    Cc: James Morris
    Cc: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Cc: Jiri Slaby
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aristeu Rozanski
     
  • Commit ad676077a2ae ("device_cgroup: convert device_cgroup internally to
    policy + exceptions") removed rcu locks which are needed in
    task_devcgroup called in this chain:

    devcgroup_inode_mknod OR __devcgroup_inode_permission ->
    __devcgroup_inode_permission ->
    task_devcgroup ->
    task_subsys_state ->
    task_subsys_state_check.

    Change the code so that task_devcgroup is safely called with rcu read
    lock held.

    ===============================
    [ INFO: suspicious RCU usage. ]
    3.6.0-rc5-next-20120913+ #42 Not tainted
    -------------------------------
    include/linux/cgroup.h:553 suspicious rcu_dereference_check() usage!

    other info that might help us debug this:

    rcu_scheduler_active = 1, debug_locks = 0
    2 locks held by kdevtmpfs/23:
    #0: (sb_writers){.+.+.+}, at: []
    mnt_want_write+0x1f/0x50
    #1: (&sb->s_type->i_mutex_key#3/1){+.+.+.}, at: []
    kern_path_create+0x7f/0x170

    stack backtrace:
    Pid: 23, comm: kdevtmpfs Not tainted 3.6.0-rc5-next-20120913+ #42
    Call Trace:
    lockdep_rcu_suspicious+0xfd/0x130
    devcgroup_inode_mknod+0x19d/0x240
    vfs_mknod+0x71/0xf0
    handle_create.isra.2+0x72/0x200
    devtmpfsd+0x114/0x140
    ? handle_create.isra.2+0x200/0x200
    kthread+0xd6/0xe0
    kernel_thread_helper+0x4/0x10

    Signed-off-by: Jiri Slaby
    Cc: Dave Jones
    Cc: Tejun Heo
    Cc: Li Zefan
    Cc: James Morris
    Cc: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Slaby
     

24 Oct, 2012

1 commit

  • BugLink: http://bugs.launchpad.net/bugs/1056078

    Profile replacement can cause long chains of profiles to build up when
    the profile being replaced is pinned. When the pinned profile is finally
    freed, it puts the reference to its replacement, which may in turn nest
    another call to free_profile on the stack. Because this may happen for
    each profile in the replacedby chain this can result in a recusion that
    causes the stack to overflow.

    Break this nesting by directly walking the chain of replacedby profiles
    (ie. use iteration instead of recursion to free the list). This results
    in at most 2 levels of free_profile being called, while freeing a
    replacedby chain.

    Signed-off-by: John Johansen
    Signed-off-by: James Morris

    John Johansen
     

18 Oct, 2012

1 commit

  • The capability defines have moved causing the auto generated names
    of capabilities that apparmor uses in logging to be incorrect.

    Fix the autogenerated table source to uapi/linux/capability.h

    Reported-by: YanHong
    Reported-by: Krzysztof Kolasa
    Analyzed-by: Al Viro
    Signed-off-by: John Johansen
    Acked-by: David Howells
    Acked-by: James Morris
    Signed-off-by: Linus Torvalds

    John Johansen
     

17 Oct, 2012

1 commit

  • replace_fd() began with "eats a reference, tries to insert into
    descriptor table" semantics; at some point I'd switched it to
    much saner current behaviour ("try to insert into descriptor
    table, grabbing a new reference if inserted; caller should do
    fput() in any case"), but forgot to update the callers.
    Mea culpa...

    [Spotted by Pavel Roskin, who has really weird system with pipe-fed
    coredumps as part of what he considers a normal boot ;-)]

    Signed-off-by: Al Viro

    Al Viro
     

15 Oct, 2012

1 commit

  • Pull module signing support from Rusty Russell:
    "module signing is the highlight, but it's an all-over David Howells frenzy..."

    Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.

    * 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
    X.509: Fix indefinite length element skip error handling
    X.509: Convert some printk calls to pr_devel
    asymmetric keys: fix printk format warning
    MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
    MODSIGN: Make mrproper should remove generated files.
    MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
    MODSIGN: Use the same digest for the autogen key sig as for the module sig
    MODSIGN: Sign modules during the build process
    MODSIGN: Provide a script for generating a key ID from an X.509 cert
    MODSIGN: Implement module signature checking
    MODSIGN: Provide module signing public keys to the kernel
    MODSIGN: Automatically generate module signing keys if missing
    MODSIGN: Provide Kconfig options
    MODSIGN: Provide gitignore and make clean rules for extra files
    MODSIGN: Add FIPS policy
    module: signature checking hook
    X.509: Add a crypto key parser for binary (DER) X.509 certificates
    MPILIB: Provide a function to read raw data into an MPI
    X.509: Add an ASN.1 decoder
    X.509: Add simple ASN.1 grammar compiler
    ...

    Linus Torvalds
     

12 Oct, 2012

1 commit


09 Oct, 2012

4 commits

  • Merge patches from Andrew Morton:
    "A few misc things and very nearly all of the MM tree. A tremendous
    amount of stuff (again), including a significant rbtree library
    rework."

    * emailed patches from Andrew Morton : (160 commits)
    sparc64: Support transparent huge pages.
    mm: thp: Use more portable PMD clearing sequenece in zap_huge_pmd().
    mm: Add and use update_mmu_cache_pmd() in transparent huge page code.
    sparc64: Document PGD and PMD layout.
    sparc64: Eliminate PTE table memory wastage.
    sparc64: Halve the size of PTE tables
    sparc64: Only support 4MB huge pages and 8KB base pages.
    memory-hotplug: suppress "Trying to free nonexistent resource " warning
    mm: memcg: clean up mm_match_cgroup() signature
    mm: document PageHuge somewhat
    mm: use %pK for /proc/vmallocinfo
    mm, thp: fix mlock statistics
    mm, thp: fix mapped pages avoiding unevictable list on mlock
    memory-hotplug: update memory block's state and notify userspace
    memory-hotplug: preparation to notify memory block's state at memory hot remove
    mm: avoid section mismatch warning for memblock_type_name
    make GFP_NOTRACK definition unconditional
    cma: decrease cc.nr_migratepages after reclaiming pagelist
    CMA: migrate mlocked pages
    kpageflags: fix wrong KPF_THP on non-huge compound pages
    ...

    Linus Torvalds
     
  • A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA,
    currently it lost original meaning but still has some effects:

    | effect | alternative flags
    -+------------------------+---------------------------------------------
    1| account as reserved_vm | VM_IO
    2| skip in core dump | VM_IO, VM_DONTDUMP
    3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
    4| do not mlock | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP

    This patch removes reserved_vm counter from mm_struct. Seems like nobody
    cares about it, it does not exported into userspace directly, it only
    reduces total_vm showed in proc.

    Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND | VM_DONTDUMP.

    remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP.
    remap_vmalloc_range() set VM_DONTEXPAND | VM_DONTDUMP.

    [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
    Signed-off-by: Konstantin Khlebnikov
    Cc: Alexander Viro
    Cc: Carsten Otte
    Cc: Chris Metcalf
    Cc: Cyrill Gorcunov
    Cc: Eric Paris
    Cc: H. Peter Anvin
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Cc: James Morris
    Cc: Jason Baron
    Cc: Kentaro Takeda
    Cc: Matt Helsley
    Cc: Nick Piggin
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Suresh Siddha
    Cc: Tetsuo Handa
    Cc: Venkatesh Pallipadi
    Acked-by: Linus Torvalds
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • Some security modules and oprofile still uses VM_EXECUTABLE for retrieving
    a task's executable file. After this patch they will use mm->exe_file
    directly. mm->exe_file is protected with mm->mmap_sem, so locking stays
    the same.

    Signed-off-by: Konstantin Khlebnikov
    Acked-by: Chris Metcalf [arch/tile]
    Acked-by: Tetsuo Handa [tomoyo]
    Cc: Alexander Viro
    Cc: Carsten Otte
    Cc: Cyrill Gorcunov
    Cc: Eric Paris
    Cc: H. Peter Anvin
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Acked-by: James Morris
    Cc: Jason Baron
    Cc: Kentaro Takeda
    Cc: Matt Helsley
    Cc: Nick Piggin
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Suresh Siddha
    Cc: Venkatesh Pallipadi
    Acked-by: Linus Torvalds
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • Pull asm-generic updates from Arnd Bergmann:
    "This has three changes for asm-generic that did not really fit into
    any other branch as normal asm-generic changes do. One is a fix for a
    build warning, the other two are more interesting:

    * A patch from Mark Brown to allow using the common clock
    infrastructure on all architectures, so we can use the clock API in
    architecture independent device drivers.

    * The UAPI split patches from David Howells for the asm-generic
    files. There are other architecture specific series that are going
    through the arch maintainer tree and that depend on this one.

    There may be a few small merge conflicts between Mark's patch and the
    following arch header file split patches. In each case the solution
    will be to keep the new "generic-y += clkdev.h" line, even if it ends
    up being the only line in the Kbuild file."

    * tag 'asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
    UAPI: (Scripted) Disintegrate include/asm-generic
    asm-generic: Add default clkdev.h
    asm-generic: xor: mark static functions as __maybe_unused

    Linus Torvalds
     

08 Oct, 2012

1 commit

  • Give the key type the opportunity to preparse the payload prior to the
    instantiation and update routines being called. This is done with the
    provision of two new key type operations:

    int (*preparse)(struct key_preparsed_payload *prep);
    void (*free_preparse)(struct key_preparsed_payload *prep);

    If the first operation is present, then it is called before key creation (in
    the add/update case) or before the key semaphore is taken (in the update and
    instantiate cases). The second operation is called to clean up if the first
    was called.

    preparse() is given the opportunity to fill in the following structure:

    struct key_preparsed_payload {
    char *description;
    void *type_data[2];
    void *payload;
    const void *data;
    size_t datalen;
    size_t quotalen;
    };

    Before the preparser is called, the first three fields will have been cleared,
    the payload pointer and size will be stored in data and datalen and the default
    quota size from the key_type struct will be stored into quotalen.

    The preparser may parse the payload in any way it likes and may store data in
    the type_data[] and payload fields for use by the instantiate() and update()
    ops.

    The preparser may also propose a description for the key by attaching it as a
    string to the description field. This can be used by passing a NULL or ""
    description to the add_key() system call or the key_create_or_update()
    function. This cannot work with request_key() as that required the description
    to tell the upcall about the key to be created.

    This, for example permits keys that store PGP public keys to generate their own
    name from the user ID and public key fingerprint in the key.

    The instantiate() and update() operations are then modified to look like this:

    int (*instantiate)(struct key *key, struct key_preparsed_payload *prep);
    int (*update)(struct key *key, struct key_preparsed_payload *prep);

    and the new payload data is passed in *prep, whether or not it was preparsed.

    Signed-off-by: David Howells
    Signed-off-by: Rusty Russell

    David Howells
     

07 Oct, 2012

1 commit


06 Oct, 2012

4 commits

  • This patch replaces the "whitelist" usage in the code and comments and replace
    them by exception list related information.

    Signed-off-by: Aristeu Rozanski
    Cc: Tejun Heo
    Cc: Li Zefan
    Cc: James Morris
    Cc: Pavel Emelyanov
    Acked-by: Serge E. Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aristeu Rozanski
     
  • The original model of device_cgroup is having a whitelist where all the
    allowed devices are listed. The problem with this approach is that is
    impossible to have the case of allowing everything but few devices.

    The reason for that lies in the way the whitelist is handled internally:
    since there's only a whitelist, the "all devices" entry would have to be
    removed and replaced by the entire list of possible devices but the ones
    that are being denied. Since dev_t is 32 bits long, representing the allowed
    devices as a bitfield is not memory efficient.

    This patch replaces the "whitelist" by a "exceptions" list and the default
    policy is kept as "deny_all" variable in dev_cgroup structure.

    The current interface determines that whenever "a" is written to devices.allow
    or devices.deny, the entry masking all devices will be added or removed,
    respectively. This behavior is kept and it's what will determine the default
    policy:

    # cat devices.list
    a *:* rwm
    # echo a >devices.deny
    # cat devices.list
    # echo a >devices.allow
    # cat devices.list
    a *:* rwm

    The interface is also preserved. For example, if one wants to block only access
    to /dev/null:
    # ls -l /dev/null
    crw-rw-rw- 1 root root 1, 3 Jul 24 16:17 /dev/null
    # echo a >devices.allow
    # echo "c 1:3 rwm" >devices.deny
    # cat /dev/null
    cat: /dev/null: Operation not permitted
    # echo >/dev/null
    bash: /dev/null: Operation not permitted
    mknod /tmp/null c 1 3
    mknod: `/tmp/null': Operation not permitted
    # echo "c 1:3 r" >devices.allow
    # cat /dev/null
    # echo >/dev/null
    bash: /dev/null: Operation not permitted
    mknod /tmp/null c 1 3
    mknod: `/tmp/null': Operation not permitted
    # echo "c 1:3 rw" >devices.allow
    # echo >/dev/null
    # cat /dev/null
    # mknod /tmp/null c 1 3
    mknod: `/tmp/null': Operation not permitted
    # echo "c 1:3 rwm" >devices.allow
    # echo >/dev/null
    # cat /dev/null
    # mknod /tmp/null c 1 3
    #

    Note that I didn't rename the functions/variables in this patch, but in the
    next one to make reviewing easier.

    Signed-off-by: Aristeu Rozanski
    Cc: Tejun Heo
    Cc: Li Zefan
    Cc: James Morris
    Cc: Pavel Emelyanov
    Acked-by: Serge E. Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aristeu Rozanski
     
  • This function cleans all the items in a whitelist and will be used by the next
    patches.

    Signed-off-by: Aristeu Rozanski
    Cc: Tejun Heo
    Cc: Li Zefan
    Cc: James Morris
    Cc: Pavel Emelyanov
    Acked-by: Serge E. Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aristeu Rozanski
     
  • deny_all will determine if the default policy is to deny all device access
    unless for the ones in the exception list.

    This variable will be used in the next patches to convert device_cgroup
    internally into a default policy + rules.

    Signed-off-by: Aristeu Rozanski
    Cc: Tejun Heo
    Cc: Li Zefan
    Cc: James Morris
    Cc: Pavel Emelyanov
    Acked-by: Serge E. Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aristeu Rozanski
     

05 Oct, 2012

2 commits


03 Oct, 2012

6 commits

  • Pull security subsystem updates from James Morris:
    "Highlights:

    - Integrity: add local fs integrity verification to detect offline
    attacks
    - Integrity: add digital signature verification
    - Simple stacking of Yama with other LSMs (per LSS discussions)
    - IBM vTPM support on ppc64
    - Add new driver for Infineon I2C TIS TPM
    - Smack: add rule revocation for subject labels"

    Fixed conflicts with the user namespace support in kernel/auditsc.c and
    security/integrity/ima/ima_policy.c.

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits)
    Documentation: Update git repository URL for Smack userland tools
    ima: change flags container data type
    Smack: setprocattr memory leak fix
    Smack: implement revoking all rules for a subject label
    Smack: remove task_wait() hook.
    ima: audit log hashes
    ima: generic IMA action flag handling
    ima: rename ima_must_appraise_or_measure
    audit: export audit_log_task_info
    tpm: fix tpm_acpi sparse warning on different address spaces
    samples/seccomp: fix 31 bit build on s390
    ima: digital signature verification support
    ima: add support for different security.ima data types
    ima: add ima_inode_setxattr/removexattr function and calls
    ima: add inode_post_setattr call
    ima: replace iint spinblock with rwlock/read_lock
    ima: allocating iint improvements
    ima: add appraise action keywords and default rules
    ima: integrity appraisal extension
    vfs: move ima_file_free before releasing the file
    ...

    Linus Torvalds
     
  • Pull vfs update from Al Viro:

    - big one - consolidation of descriptor-related logics; almost all of
    that is moved to fs/file.c

    (BTW, I'm seriously tempted to rename the result to fd.c. As it is,
    we have a situation when file_table.c is about handling of struct
    file and file.c is about handling of descriptor tables; the reasons
    are historical - file_table.c used to be about a static array of
    struct file we used to have way back).

    A lot of stray ends got cleaned up and converted to saner primitives,
    disgusting mess in android/binder.c is still disgusting, but at least
    doesn't poke so much in descriptor table guts anymore. A bunch of
    relatively minor races got fixed in process, plus an ext4 struct file
    leak.

    - related thing - fget_light() partially unuglified; see fdget() in
    there (and yes, it generates the code as good as we used to have).

    - also related - bits of Cyrill's procfs stuff that got entangled into
    that work; _not_ all of it, just the initial move to fs/proc/fd.c and
    switch of fdinfo to seq_file.

    - Alex's fs/coredump.c spiltoff - the same story, had been easier to
    take that commit than mess with conflicts. The rest is a separate
    pile, this was just a mechanical code movement.

    - a few misc patches all over the place. Not all for this cycle,
    there'll be more (and quite a few currently sit in akpm's tree)."

    Fix up trivial conflicts in the android binder driver, and some fairly
    simple conflicts due to two different changes to the sock_alloc_file()
    interface ("take descriptor handling from sock_alloc_file() to callers"
    vs "net: Providing protocol type via system.sockprotoname xattr of
    /proc/PID/fd entries" adding a dentry name to the socket)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
    MAX_LFS_FILESIZE should be a loff_t
    compat: fs: Generic compat_sys_sendfile implementation
    fs: push rcu_barrier() from deactivate_locked_super() to filesystems
    btrfs: reada_extent doesn't need kref for refcount
    coredump: move core dump functionality into its own file
    coredump: prevent double-free on an error path in core dumper
    usb/gadget: fix misannotations
    fcntl: fix misannotations
    ceph: don't abuse d_delete() on failure exits
    hypfs: ->d_parent is never NULL or negative
    vfs: delete surplus inode NULL check
    switch simple cases of fget_light to fdget
    new helpers: fdget()/fdput()
    switch o2hb_region_dev_write() to fget_light()
    proc_map_files_readdir(): don't bother with grabbing files
    make get_file() return its argument
    vhost_set_vring(): turn pollstart/pollstop into bool
    switch prctl_set_mm_exe_file() to fget_light()
    switch xfs_find_handle() to fget_light()
    switch xfs_swapext() to fget_light()
    ...

    Linus Torvalds
     
  • Pull networking changes from David Miller:

    1) GRE now works over ipv6, from Dmitry Kozlov.

    2) Make SCTP more network namespace aware, from Eric Biederman.

    3) TEAM driver now works with non-ethernet devices, from Jiri Pirko.

    4) Make openvswitch network namespace aware, from Pravin B Shelar.

    5) IPV6 NAT implementation, from Patrick McHardy.

    6) Server side support for TCP Fast Open, from Jerry Chu and others.

    7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel
    Borkmann.

    8) Increate the loopback default MTU to 64K, from Eric Dumazet.

    9) Use a per-task rather than per-socket page fragment allocator for
    outgoing networking traffic. This benefits processes that have very
    many mostly idle sockets, which is quite common.

    From Eric Dumazet.

    10) Use up to 32K for page fragment allocations, with fallbacks to
    smaller sizes when higher order page allocations fail. Benefits are
    a) less segments for driver to process b) less calls to page
    allocator c) less waste of space.

    From Eric Dumazet.

    11) Allow GRO to be used on GRE tunnels, from Eric Dumazet.

    12) VXLAN device driver, one way to handle VLAN issues such as the
    limitation of 4096 VLAN IDs yet still have some level of isolation.
    From Stephen Hemminger.

    13) As usual there is a large boatload of driver changes, with the scale
    perhaps tilted towards the wireless side this time around.

    Fix up various fairly trivial conflicts, mostly caused by the user
    namespace changes.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits)
    hyperv: Add buffer for extended info after the RNDIS response message.
    hyperv: Report actual status in receive completion packet
    hyperv: Remove extra allocated space for recv_pkt_list elements
    hyperv: Fix page buffer handling in rndis_filter_send_request()
    hyperv: Fix the missing return value in rndis_filter_set_packet_filter()
    hyperv: Fix the max_xfer_size in RNDIS initialization
    vxlan: put UDP socket in correct namespace
    vxlan: Depend on CONFIG_INET
    sfc: Fix the reported priorities of different filter types
    sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP
    sfc: Fix loopback self-test with separate_tx_channels=1
    sfc: Fix MCDI structure field lookup
    sfc: Add parentheses around use of bitfield macro arguments
    sfc: Fix null function pointer in efx_sriov_channel_type
    vxlan: virtual extensible lan
    igmp: export symbol ip_mc_leave_group
    netlink: add attributes to fdb interface
    tg3: unconditionally select HWMON support when tg3 is enabled.
    Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT"
    gre: fix sparse warning
    ...

    Linus Torvalds
     
  • Pull user namespace changes from Eric Biederman:
    "This is a mostly modest set of changes to enable basic user namespace
    support. This allows the code to code to compile with user namespaces
    enabled and removes the assumption there is only the initial user
    namespace. Everything is converted except for the most complex of the
    filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
    nfs, ocfs2 and xfs as those patches need a bit more review.

    The strategy is to push kuid_t and kgid_t values are far down into
    subsystems and filesystems as reasonable. Leaving the make_kuid and
    from_kuid operations to happen at the edge of userspace, as the values
    come off the disk, and as the values come in from the network.
    Letting compile type incompatible compile errors (present when user
    namespaces are enabled) guide me to find the issues.

    The most tricky areas have been the places where we had an implicit
    union of uid and gid values and were storing them in an unsigned int.
    Those places were converted into explicit unions. I made certain to
    handle those places with simple trivial patches.

    Out of that work I discovered we have generic interfaces for storing
    quota by projid. I had never heard of the project identifiers before.
    Adding full user namespace support for project identifiers accounts
    for most of the code size growth in my git tree.

    Ultimately there will be work to relax privlige checks from
    "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing
    root in a user names to do those things that today we only forbid to
    non-root users because it will confuse suid root applications.

    While I was pushing kuid_t and kgid_t changes deep into the audit code
    I made a few other cleanups. I capitalized on the fact we process
    netlink messages in the context of the message sender. I removed
    usage of NETLINK_CRED, and started directly using current->tty.

    Some of these patches have also made it into maintainer trees, with no
    problems from identical code from different trees showing up in
    linux-next.

    After reading through all of this code I feel like I might be able to
    win a game of kernel trivial pursuit."

    Fix up some fairly trivial conflicts in netfilter uid/git logging code.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
    userns: Convert the ufs filesystem to use kuid/kgid where appropriate
    userns: Convert the udf filesystem to use kuid/kgid where appropriate
    userns: Convert ubifs to use kuid/kgid
    userns: Convert squashfs to use kuid/kgid where appropriate
    userns: Convert reiserfs to use kuid and kgid where appropriate
    userns: Convert jfs to use kuid/kgid where appropriate
    userns: Convert jffs2 to use kuid and kgid where appropriate
    userns: Convert hpfs to use kuid and kgid where appropriate
    userns: Convert btrfs to use kuid/kgid where appropriate
    userns: Convert bfs to use kuid/kgid where appropriate
    userns: Convert affs to use kuid/kgid wherwe appropriate
    userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
    userns: On ia64 deal with current_uid and current_gid being kuid and kgid
    userns: On ppc convert current_uid from a kuid before printing.
    userns: Convert s390 getting uid and gid system calls to use kuid and kgid
    userns: Convert s390 hypfs to use kuid and kgid where appropriate
    userns: Convert binder ipc to use kuids
    userns: Teach security_path_chown to take kuids and kgids
    userns: Add user namespace support to IMA
    userns: Convert EVM to deal with kuids and kgids in it's hmac computation
    ...

    Linus Torvalds
     
  • Pull cgroup hierarchy update from Tejun Heo:
    "Currently, different cgroup subsystems handle nested cgroups
    completely differently. There's no consistency among subsystems and
    the behaviors often are outright broken.

    People at least seem to agree that the broken hierarhcy behaviors need
    to be weeded out if any progress is gonna be made on this front and
    that the fallouts from deprecating the broken behaviors should be
    acceptable especially given that the current behaviors don't make much
    sense when nested.

    This patch makes cgroup emit warning messages if cgroups for
    subsystems with broken hierarchy behavior are nested to prepare for
    fixing them in the future. This was put in a separate branch because
    more related changes were expected (didn't make it this round) and the
    memory cgroup wanted to pull in this and make changes on top."

    * 'for-3.7-hierarchy' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
    cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them

    Linus Torvalds
     
  • Pull workqueue changes from Tejun Heo:
    "This is workqueue updates for v3.7-rc1. A lot of activities this
    round including considerable API and behavior cleanups.

    * delayed_work combines a timer and a work item. The handling of the
    timer part has always been a bit clunky leading to confusing
    cancelation API with weird corner-case behaviors. delayed_work is
    updated to use new IRQ safe timer and cancelation now works as
    expected.

    * Another deficiency of delayed_work was lack of the counterpart of
    mod_timer() which led to cancel+queue combinations or open-coded
    timer+work usages. mod_delayed_work[_on]() are added.

    These two delayed_work changes make delayed_work provide interface
    and behave like timer which is executed with process context.

    * A work item could be executed concurrently on multiple CPUs, which
    is rather unintuitive and made flush_work() behavior confusing and
    half-broken under certain circumstances. This problem doesn't
    exist for non-reentrant workqueues. While non-reentrancy check
    isn't free, the overhead is incurred only when a work item bounces
    across different CPUs and even in simulated pathological scenario
    the overhead isn't too high.

    All workqueues are made non-reentrant. This removes the
    distinction between flush_[delayed_]work() and
    flush_[delayed_]_work_sync(). The former is now as strong as the
    latter and the specified work item is guaranteed to have finished
    execution of any previous queueing on return.

    * In addition to the various bug fixes, Lai redid and simplified CPU
    hotplug handling significantly.

    * Joonsoo introduced system_highpri_wq and used it during CPU
    hotplug.

    There are two merge commits - one to pull in IRQ safe timer from
    tip/timers/core and the other to pull in CPU hotplug fixes from
    wq/for-3.6-fixes as Lai's hotplug restructuring depended on them."

    Fixed a number of trivial conflicts, but the more interesting conflicts
    were silent ones where the deprecated interfaces had been used by new
    code in the merge window, and thus didn't cause any real data conflicts.

    Tejun pointed out a few of them, I fixed a couple more.

    * 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits)
    workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending()
    workqueue: use cwq_set_max_active() helper for workqueue_set_max_active()
    workqueue: introduce cwq_set_max_active() helper for thaw_workqueues()
    workqueue: remove @delayed from cwq_dec_nr_in_flight()
    workqueue: fix possible stall on try_to_grab_pending() of a delayed work item
    workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback()
    workqueue: use __cpuinit instead of __devinit for cpu callbacks
    workqueue: rename manager_mutex to assoc_mutex
    workqueue: WORKER_REBIND is no longer necessary for idle rebinding
    workqueue: WORKER_REBIND is no longer necessary for busy rebinding
    workqueue: reimplement idle worker rebinding
    workqueue: deprecate __cancel_delayed_work()
    workqueue: reimplement cancel_delayed_work() using try_to_grab_pending()
    workqueue: use mod_delayed_work() instead of __cancel + queue
    workqueue: use irqsafe timer for delayed_work
    workqueue: clean up delayed_work initializers and add missing one
    workqueue: make deferrable delayed_work initializer names consistent
    workqueue: cosmetic whitespace updates for macro definitions
    workqueue: deprecate system_nrt[_freezable]_wq
    workqueue: deprecate flush[_delayed]_work_sync()
    ...

    Linus Torvalds
     

02 Oct, 2012

2 commits

  • Pull core kernel fixes from Ingo Molnar:
    "This is a complex task_work series from Oleg that fixes the bug that
    this VFS commit tried to fix:

    d35abdb28824 hold task_lock around checks in keyctl

    but solves the problem without the lockup regression that d35abdb28824
    introduced in v3.6.

    This series came late in v3.6 and I did not feel confident about it so
    late in the cycle. Might be worth backporting to -stable if it proves
    itself upstream."

    * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    task_work: Simplify the usage in ptrace_notify() and get_signal_to_deliver()
    task_work: Revert "hold task_lock around checks in keyctl"
    task_work: task_work_add() should not succeed after exit_task_work()
    task_work: Make task_work_add() lockless

    Linus Torvalds
     
  • Pull the trivial tree from Jiri Kosina:
    "Tiny usual fixes all over the place"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
    doc: fix old config name of kprobetrace
    fs/fs-writeback.c: cleanup riteback_sb_inodes kerneldoc
    btrfs: fix the commment for the action flags in delayed-ref.h
    btrfs: fix trivial typo for the comment of BTRFS_FREE_INO_OBJECTID
    vfs: fix kerneldoc for generic_fh_to_parent()
    treewide: fix comment/printk/variable typos
    ipr: fix small coding style issues
    doc: fix broken utf8 encoding
    nfs: comment fix
    platform/x86: fix asus_laptop.wled_type module parameter
    mfd: printk/comment fixes
    doc: getdelays.c: remember to close() socket on error in create_nl_socket()
    doc: aliasing-test: close fd on write error
    mmc: fix comment typos
    dma: fix comments
    spi: fix comment/printk typos in spi
    Coccinelle: fix typo in memdup_user.cocci
    tmiofb: missing NULL pointer checks
    tools: perf: Fix typo in tools/perf
    tools/testing: fix comment / output typos
    ...

    Linus Torvalds
     

29 Sep, 2012

1 commit

  • Conflicts:
    drivers/net/team/team.c
    drivers/net/usb/qmi_wwan.c
    net/batman-adv/bat_iv_ogm.c
    net/ipv4/fib_frontend.c
    net/ipv4/route.c
    net/l2tp/l2tp_netlink.c

    The team, fib_frontend, route, and l2tp_netlink conflicts were simply
    overlapping changes.

    qmi_wwan and bat_iv_ogm were of the "use HEAD" variety.

    With help from Antonio Quartulli.

    Signed-off-by: David S. Miller

    David S. Miller
     

28 Sep, 2012

1 commit


27 Sep, 2012

3 commits


21 Sep, 2012

6 commits