08 May, 2013

1 commit

  • Faster kernel compiles by way of fewer unnecessary includes.

    [akpm@linux-foundation.org: fix fallout]
    [akpm@linux-foundation.org: fix build]
    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Overstreet
     

02 May, 2013

2 commits

  • Pull VFS updates from Al Viro,

    Misc cleanups all over the place, mainly wrt /proc interfaces (switch
    create_proc_entry to proc_create(), get rid of the deprecated
    create_proc_read_entry() in favor of using proc_create_data() and
    seq_file etc).

    7kloc removed.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
    don't bother with deferred freeing of fdtables
    proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
    proc: Make the PROC_I() and PDE() macros internal to procfs
    proc: Supply a function to remove a proc entry by PDE
    take cgroup_open() and cpuset_open() to fs/proc/base.c
    ppc: Clean up scanlog
    ppc: Clean up rtas_flash driver somewhat
    hostap: proc: Use remove_proc_subtree()
    drm: proc: Use remove_proc_subtree()
    drm: proc: Use minor->index to label things, not PDE->name
    drm: Constify drm_proc_list[]
    zoran: Don't print proc_dir_entry data in debug
    reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
    proc: Supply an accessor for getting the data from a PDE's parent
    airo: Use remove_proc_subtree()
    rtl8192u: Don't need to save device proc dir PDE
    rtl8187se: Use a dir under /proc/net/r8180/
    proc: Add proc_mkdir_data()
    proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
    proc: Move PDE_NET() to fs/proc/proc_net.c
    ...

    Linus Torvalds
     
  • Pull networking updates from David Miller:
    "Highlights (1721 non-merge commits, this has to be a record of some
    sort):

    1) Add 'random' mode to team driver, from Jiri Pirko and Eric
    Dumazet.

    2) Make it so that any driver that supports configuration of multiple
    MAC addresses can provide the forwarding database add and del
    calls by providing a default implementation and hooking that up if
    the driver doesn't have an explicit set of handlers. From Vlad
    Yasevich.

    3) Support GSO segmentation over tunnels and other encapsulating
    devices such as VXLAN, from Pravin B Shelar.

    4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton.

    5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita
    Dukkipati.

    6) In the PHY layer, allow supporting wake-on-lan in situations where
    the PHY registers have to be written for it to be configured.

    Use it to support wake-on-lan in mv643xx_eth.

    From Michael Stapelberg.

    7) Significantly improve firewire IPV6 support, from YOSHIFUJI
    Hideaki.

    8) Allow multiple packets to be sent in a single transmission using
    network coding in batman-adv, from Martin Hundebøll.

    9) Add support for T5 cxgb4 chips, from Santosh Rastapur.

    10) Generalize the VXLAN forwarding tables so that there is more
    flexibility in configurating various aspects of the endpoints.
    From David Stevens.

    11) Support RSS and TSO in hardware over GRE tunnels in bxn2x driver,
    from Dmitry Kravkov.

    12) Zero copy support in nfnelink_queue, from Eric Dumazet and Pablo
    Neira Ayuso.

    13) Start adding networking selftests.

    14) In situations of overload on the same AF_PACKET fanout socket, or
    per-cpu packet receive queue, minimize drop by distributing the
    load to other cpus/fanouts. From Willem de Bruijn and Eric
    Dumazet.

    15) Add support for new payload offset BPF instruction, from Daniel
    Borkmann.

    16) Convert several drivers over to mdoule_platform_driver(), from
    Sachin Kamat.

    17) Provide a minimal BPF JIT image disassembler userspace tool, from
    Daniel Borkmann.

    18) Rewrite F-RTO implementation in TCP to match the final
    specification of it in RFC4138 and RFC5682. From Yuchung Cheng.

    19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear
    you like netlink, so I implemented netlink dumping of netlink
    sockets.") From Andrey Vagin.

    20) Remove ugly passing of rtnetlink attributes into rtnl_doit
    functions, from Thomas Graf.

    21) Allow userspace to be able to see if a configuration change occurs
    in the middle of an address or device list dump, from Nicolas
    Dichtel.

    22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes
    Frederic Sowa.

    23) Increase accuracy of packet length used by packet scheduler, from
    Jason Wang.

    24) Beginning set of changes to make ipv4/ipv6 fragment handling more
    scalable and less susceptible to overload and locking contention,
    from Jesper Dangaard Brouer.

    25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*()
    instead. From Hong Zhiguo.

    26) Optimize route usage in IPVS by avoiding reference counting where
    possible, from Julian Anastasov.

    27) Convert IPVS schedulers to RCU, also from Julian Anastasov.

    28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger
    Eitzenberger.

    29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG,
    nfnetlink_log, and nfnetlink_queue. From Gao feng.

    30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa.

    31) Support several new r8169 chips, from Hayes Wang.

    32) Support tokenized interface identifiers in ipv6, from Daniel
    Borkmann.

    33) Use usbnet_link_change() helper in USB net driver, from Ming Lei.

    34) Add 802.1ad vlan offload support, from Patrick McHardy.

    35) Support mmap() based netlink communication, also from Patrick
    McHardy.

    36) Support HW timestamping in mlx4 driver, from Amir Vadai.

    37) Rationalize AF_PACKET packet timestamping when transmitting, from
    Willem de Bruijn and Daniel Borkmann.

    38) Bring parity to what's provided by /proc/net/packet socket dumping
    and the info provided by netlink socket dumping of AF_PACKET
    sockets. From Nicolas Dichtel.

    39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin
    Poirier"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
    filter: fix va_list build error
    af_unix: fix a fatal race with bit fields
    bnx2x: Prevent memory leak when cnic is absent
    bnx2x: correct reading of speed capabilities
    net: sctp: attribute printl with __printf for gcc fmt checks
    netlink: kconfig: move mmap i/o into netlink kconfig
    netpoll: convert mutex into a semaphore
    netlink: Fix skb ref counting.
    net_sched: act_ipt forward compat with xtables
    mlx4_en: fix a build error on 32bit arches
    Revert "bnx2x: allow nvram test to run when device is down"
    bridge: avoid OOPS if root port not found
    drivers: net: cpsw: fix kernel warn on cpsw irq enable
    sh_eth: use random MAC address if no valid one supplied
    3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA)
    tg3: fix to append hardware time stamping flags
    unix/stream: fix peeking with an offset larger than data in queue
    unix/dgram: fix peeking with an offset larger than data in queue
    unix/dgram: peek beyond 0-sized skbs
    openvswitch: Remove unneeded ovs_netdev_get_ifindex()
    ...

    Linus Torvalds
     

01 May, 2013

3 commits

  • Merge third batch of fixes from Andrew Morton:
    "Most of the rest. I still have two large patchsets against AIO and
    IPC, but they're a bit stuck behind other trees and I'm about to
    vanish for six days.

    - random fixlets
    - inotify
    - more of the MM queue
    - show_stack() cleanups
    - DMI update
    - kthread/workqueue things
    - compat cleanups
    - epoll udpates
    - binfmt updates
    - nilfs2
    - hfs
    - hfsplus
    - ptrace
    - kmod
    - coredump
    - kexec
    - rbtree
    - pids
    - pidns
    - pps
    - semaphore tweaks
    - some w1 patches
    - relay updates
    - core Kconfig changes
    - sysrq tweaks"

    * emailed patches from Andrew Morton : (109 commits)
    Documentation/sysrq: fix inconstistent help message of sysrq key
    ethernet/emac/sysrq: fix inconstistent help message of sysrq key
    sparc/sysrq: fix inconstistent help message of sysrq key
    powerpc/xmon/sysrq: fix inconstistent help message of sysrq key
    ARM/etm/sysrq: fix inconstistent help message of sysrq key
    power/sysrq: fix inconstistent help message of sysrq key
    kgdb/sysrq: fix inconstistent help message of sysrq key
    lib/decompress.c: fix initconst
    notifier-error-inject: fix module names in Kconfig
    kernel/sys.c: make prctl(PR_SET_MM) generally available
    UAPI: remove empty Kbuild files
    menuconfig: print more info for symbol without prompts
    init/Kconfig: re-order CONFIG_EXPERT options to fix menuconfig display
    kconfig menu: move Virtualization drivers near other virtualization options
    Kconfig: consolidate CONFIG_DEBUG_STRICT_USER_COPY_CHECKS
    relay: use macro PAGE_ALIGN instead of FIX_SIZE
    kernel/relay.c: move FIX_SIZE macro into relay.c
    kernel/relay.c: remove unused function argument actor
    drivers/w1/slaves/w1_ds2760.c: fix the error handling in w1_ds2760_add_slave()
    drivers/w1/slaves/w1_ds2781.c: fix the error handling in w1_ds2781_add_slave()
    ...

    Linus Torvalds
     
  • Use call_usermodehelper_setup() + call_usermodehelper_exec() instead of
    calling call_usermodehelper_fns(). In case there's an OOM in this last
    function the cleanup function may not be called - in this case we would
    miss a call to key_put().

    Signed-off-by: Lucas De Marchi
    Cc: Oleg Nesterov
    Acked-by: David Howells
    Acked-by: James Morris
    Cc: Al Viro
    Cc: Tejun Heo
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lucas De Marchi
     
  • Pull security subsystem update from James Morris:
    "Just some minor updates across the subsystem"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    ima: eliminate passing d_name.name to process_measurement()
    TPM: Retry SaveState command in suspend path
    tpm/tpm_i2c_infineon: Add small comment about return value of __i2c_transfer
    tpm/tpm_i2c_infineon.c: Add OF attributes type and name to the of_device_id table entries
    tpm_i2c_stm_st33: Remove duplicate inclusion of header files
    tpm: Add support for new Infineon I2C TPM (SLB 9645 TT 1.2 I2C)
    char/tpm: Convert struct i2c_msg initialization to C99 format
    drivers/char/tpm/tpm_ppi: use strlcpy instead of strncpy
    tpm/tpm_i2c_stm_st33: formatting and white space changes
    Smack: include magic.h in smackfs.c
    selinux: make security_sb_clone_mnt_opts return an error on context mismatch
    seccomp: allow BPF_XOR based ALU instructions.
    Fix NULL pointer dereference in smack_inode_unlink() and smack_inode_rmdir()
    Smack: add support for modification of existing rules
    smack: SMACK_MAGIC to include/uapi/linux/magic.h
    Smack: add missing support for transmute bit in smack_str_from_perm()
    Smack: prevent revoke-subject from failing when unseen label is written to it
    tomoyo: use DEFINE_SRCU() to define tomoyo_ss
    tomoyo: use DEFINE_SRCU() to define tomoyo_ss

    Linus Torvalds
     

30 Apr, 2013

2 commits

  • Pull cgroup updates from Tejun Heo:

    - Fixes and a lot of cleanups. Locking cleanup is finally complete.
    cgroup_mutex is no longer exposed to individual controlelrs which
    used to cause nasty deadlock issues. Li fixed and cleaned up quite a
    bit including long standing ones like racy cgroup_path().

    - device cgroup now supports proper hierarchy thanks to Aristeu.

    - perf_event cgroup now supports proper hierarchy.

    - A new mount option "__DEVEL__sane_behavior" is added. As indicated
    by the name, this option is to be used for development only at this
    point and generates a warning message when used. Unfortunately,
    cgroup interface currently has too many brekages and inconsistencies
    to implement a consistent and unified hierarchy on top. The new flag
    is used to collect the behavior changes which are necessary to
    implement consistent unified hierarchy. It's likely that this flag
    won't be used verbatim when it becomes ready but will be enabled
    implicitly along with unified hierarchy.

    The option currently disables some of broken behaviors in cgroup core
    and also .use_hierarchy switch in memcg (will be routed through -mm),
    which can be used to make very unusual hierarchy where nesting is
    partially honored. It will also be used to implement hierarchy
    support for blk-throttle which would be impossible otherwise without
    introducing a full separate set of control knobs.

    This is essentially versioning of interface which isn't very nice but
    at this point I can't see any other options which would allow keeping
    the interface the same while moving towards hierarchy behavior which
    is at least somewhat sane. The planned unified hierarchy is likely
    to require some level of adaptation from userland anyway, so I think
    it'd be best to take the chance and update the interface such that
    it's supportable in the long term.

    Maintaining the existing interface does complicate cgroup core but
    shouldn't put too much strain on individual controllers and I think
    it'd be manageable for the foreseeable future. Maybe we'll be able
    to drop it in a decade.

    Fix up conflicts (including a semantic one adding a new #include to ppc
    that was uncovered by header the file changes) as per Tejun.

    * 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (45 commits)
    cpuset: fix compile warning when CONFIG_SMP=n
    cpuset: fix cpu hotplug vs rebuild_sched_domains() race
    cpuset: use rebuild_sched_domains() in cpuset_hotplug_workfn()
    cgroup: restore the call to eventfd->poll()
    cgroup: fix use-after-free when umounting cgroupfs
    cgroup: fix broken file xattrs
    devcg: remove parent_cgroup.
    memcg: force use_hierarchy if sane_behavior
    cgroup: remove cgrp->top_cgroup
    cgroup: introduce sane_behavior mount option
    move cgroupfs_root to include/linux/cgroup.h
    cgroup: convert cgroupfs_root flag bits to masks and add CGRP_ prefix
    cgroup: make cgroup_path() not print double slashes
    Revert "cgroup: remove bind() method from cgroup_subsys."
    perf: make perf_event cgroup hierarchical
    cgroup: implement cgroup_is_descendant()
    cgroup: make sure parent won't be destroyed before its children
    cgroup: remove bind() method from cgroup_subsys.
    devcg: remove broken_hierarchy tag
    cgroup: remove cgroup_lock_is_held()
    ...

    Linus Torvalds
     
  • Signed-off-by: Al Viro

    Al Viro
     

23 Apr, 2013

1 commit

  • Conflicts:
    drivers/net/ethernet/emulex/benet/be_main.c
    drivers/net/ethernet/intel/igb/igb_main.c
    drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
    include/net/scm.h
    net/batman-adv/routing.c
    net/ipv4/tcp_input.c

    The e{uid,gid} --> {uid,gid} credentials fix conflicted with the
    cleanup in net-next to now pass cred structs around.

    The be2net driver had a bug fix in 'net' that overlapped with the VLAN
    interface changes by Patrick McHardy in net-next.

    An IGB conflict existed because in 'net' the build_skb() support was
    reverted, and in 'net-next' there was a comment style fix within that
    code.

    Several batman-adv conflicts were resolved by making sure that all
    calls to batadv_is_my_mac() are changed to have a new bat_priv first
    argument.

    Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO
    rewrite in 'net-next', mostly overlapping changes.

    Thanks to Stephen Rothwell and Antonio Quartulli for help with several
    of these merge resolutions.

    Signed-off-by: David S. Miller

    David S. Miller
     

19 Apr, 2013

1 commit

  • In devcgroup_css_alloc(), there is no longer need for parent_cgroup.
    bd2953ebbb("devcg: propagate local changes down the hierarchy") made
    the variable parent_cgroup redundant. This patch removes parent_cgroup
    from devcgroup_css_alloc().

    Signed-off-by: Rami Rosen
    Acked-by: Aristeu Rozanski
    Signed-off-by: Tejun Heo

    Rami Rosen
     

18 Apr, 2013

1 commit

  • Passing a pointer to the dentry name, as a parameter to
    process_measurement(), causes a race condition with rename() and
    is unnecessary, as the dentry name is already accessible via the
    file parameter.

    In the normal case, we use the full pathname as provided by
    brpm->filename, bprm->interp, or ima_d_path(). Only on ima_d_path()
    failure, do we fallback to using the d_name.name, which points
    either to external memory or d_iname.

    Reported-by: Al Viro
    Signed-off-by: Mimi Zohar
    Signed-off-by: James Morris

    Mimi Zohar
     

10 Apr, 2013

1 commit

  • Commit 90ba9b1986b5ac (tcp: tcp_make_synack() can use alloc_skb())
    broke certain SELinux/NetLabel configurations by no longer correctly
    assigning the sock to the outgoing SYNACK packet.

    Cost of atomic operations on the LISTEN socket is quite big,
    and we would like it to happen only if really needed.

    This patch introduces a new security_ops->skb_owned_by() method,
    that is a void operation unless selinux is active.

    Reported-by: Miroslav Vadkerti
    Diagnosed-by: Paul Moore
    Signed-off-by: Eric Dumazet
    Cc: "David S. Miller"
    Cc: linux-security-module@vger.kernel.org
    Acked-by: James Morris
    Tested-by: Paul Moore
    Acked-by: Paul Moore
    Signed-off-by: David S. Miller

    Eric Dumazet
     

08 Apr, 2013

1 commit


03 Apr, 2013

1 commit


02 Apr, 2013

2 commits

  • I had the following problem reported a while back. If you mount the
    same filesystem twice using NFSv4 with different contexts, then the
    second context= option is ignored. For instance:

    # mount server:/export /mnt/test1
    # mount server:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0
    # ls -dZ /mnt/test1
    drwxrwxrwt. root root system_u:object_r:nfs_t:s0 /mnt/test1
    # ls -dZ /mnt/test2
    drwxrwxrwt. root root system_u:object_r:nfs_t:s0 /mnt/test2

    When we call into SELinux to set the context of a "cloned" superblock,
    it will currently just bail out when it notices that we're reusing an
    existing superblock. Since the existing superblock is already set up and
    presumably in use, we can't go overwriting its context with the one from
    the "original" sb. Because of this, the second context= option in this
    case cannot take effect.

    This patch fixes this by turning security_sb_clone_mnt_opts into an int
    return operation. When it finds that the "new" superblock that it has
    been handed is already set up, it checks to see whether the contexts on
    the old superblock match it. If it does, then it will just return
    success, otherwise it'll return -EBUSY and emit a printk to tell the
    admin why the second mount failed.

    Note that this patch may cause casualties. The NFSv4 code relies on
    being able to walk down to an export from the pseudoroot. If you mount
    filesystems that are nested within one another with different contexts,
    then this patch will make those mounts fail in new and "exciting" ways.

    For instance, suppose that /export is a separate filesystem on the
    server:

    # mount server:/ /mnt/test1
    # mount salusa:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0
    mount.nfs: an incorrect mount option was specified

    ...with the printk in the ring buffer. Because we *might* eventually
    walk down to /mnt/test1/export, the mount is denied due to this patch.
    The second mount needs the pseudoroot superblock, but that's already
    present with the wrong context.

    OTOH, if we mount these in the reverse order, then both mounts work,
    because the pseudoroot superblock created when mounting /export is
    discarded once that mount is done. If we then however try to walk into
    that directory, the automount fails for the similar reasons:

    # cd /mnt/test1/scratch/
    -bash: cd: /mnt/test1/scratch: Device or resource busy

    The story I've gotten from the SELinux folks that I've talked to is that
    this is desirable behavior. In SELinux-land, mounting the same data
    under different contexts is wrong -- there can be only one.

    Cc: Steve Dickson
    Cc: Stephen Smalley
    Signed-off-by: Jeff Layton
    Acked-by: Eric Paris
    Signed-off-by: James Morris

    Jeff Layton
     
  • Conflicts:
    net/mac80211/sta_info.c
    net/wireless/core.h

    Two minor conflicts in wireless. Overlapping additions of extern
    declarations in net/wireless/core.h and a bug fix overlapping with
    the addition of a boolean parameter to __ieee80211_key_free().

    Signed-off-by: David S. Miller

    David S. Miller
     

29 Mar, 2013

2 commits

  • Pull userns fixes from Eric W Biederman:
    "The bulk of the changes are fixing the worst consequences of the user
    namespace design oversight in not considering what happens when one
    namespace starts off as a clone of another namespace, as happens with
    the mount namespace.

    The rest of the changes are just plain bug fixes.

    Many thanks to Andy Lutomirski for pointing out many of these issues."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    userns: Restrict when proc and sysfs can be mounted
    ipc: Restrict mounting the mqueue filesystem
    vfs: Carefully propogate mounts across user namespaces
    vfs: Add a mount flag to lock read only bind mounts
    userns: Don't allow creation if the user is chrooted
    yama: Better permission check for ptraceme
    pid: Handle the exit of a multi-threaded init.
    scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids.

    Linus Torvalds
     
  • Signed-off-by: Hong Zhiguo
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Hong zhi guo
     

27 Mar, 2013

1 commit


20 Mar, 2013

9 commits

  • This patch makes exception changes to propagate down in hierarchy respecting
    when possible local exceptions.

    New exceptions allowing additional access to devices won't be propagated, but
    it'll be possible to add an exception to access all of part of the newly
    allowed device(s).

    New exceptions disallowing access to devices will be propagated down and the
    local group's exceptions will be revalidated for the new situation.
    Example:
    A
    / \
    B

    group behavior exceptions
    A allow "b 8:* rwm", "c 116:1 rw"
    B deny "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"

    If a new exception is added to group A:
    # echo "c 116:* r" > A/devices.deny
    it'll propagate down and after revalidating B's local exceptions, the exception
    "c 116:2 rwm" will be removed.

    In case parent's exceptions change and local exceptions are not allowed anymore,
    they'll be deleted.

    v7:
    - do not allow behavior change when the cgroup has children
    - update documentation

    v6: fixed issues pointed by Serge Hallyn
    - only copy parent's exceptions while propagating behavior if the local
    behavior is different
    - while propagating exceptions, do not clear and copy parent's: it'd be against
    the premise we don't propagate access to more devices

    v5: fixed issues pointed by Serge Hallyn
    - updated documentation
    - not propagating when an exception is written to devices.allow
    - when propagating a new behavior, clean the local exceptions list if they're
    for a different behavior

    v4: fixed issues pointed by Tejun Heo
    - separated function to walk the tree and collect valid propagation targets

    v3: fixed issues pointed by Tejun Heo
    - update documentation
    - move css_online/css_offline changes to a new patch
    - use cgroup_for_each_descendant_pre() instead of own descendant walk
    - move exception_copy rework to a separared patch
    - move exception_clean rework to a separated patch

    v2: fixed issues pointed by Tejun Heo
    - instead of keeping the local settings that won't apply anymore, remove them

    Cc: Tejun Heo
    Cc: Serge Hallyn
    Signed-off-by: Aristeu Rozanski
    Signed-off-by: Tejun Heo

    Aristeu Rozanski
     
  • Allocate resources and change behavior only when online. This is needed in
    order to determine if a node is suitable for hierarchy propagation or if it's
    being removed.

    Locking:
    Both functions take devcgroup_mutex to make changes to device_cgroup structure.
    Hierarchy propagation will also take devcgroup_mutex before walking the
    tree while walking the tree itself is protected by rcu lock.

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Cc: Tejun Heo
    Cc: Serge Hallyn
    Signed-off-by: Aristeu Rozanski
    Signed-off-by: Tejun Heo

    Aristeu Rozanski
     
  • Currently may_access() is only able to verify if an exception is valid for the
    current cgroup, which has the same behavior. With hierarchy, it'll be also used
    to verify if a cgroup local exception is valid towards its cgroup parent, which
    might have different behavior.

    v2:
    - updated patch description
    - rebased on top of a new patch to expand the may_access() logic to make it
    more clear
    - fixed argument description order in may_access()

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Cc: Tejun Heo
    Cc: Serge Hallyn
    Signed-off-by: Aristeu Rozanski
    Signed-off-by: Tejun Heo

    Aristeu Rozanski
     
  • In order to make the next patch more clear, expand may_access() logic.

    v2: may_access() returns bool now

    Acked-by: Tejun Heo
    Acked-by: Serge Hallyn
    Cc: Tejun Heo
    Cc: Serge Hallyn
    Signed-off-by: Aristeu Rozanski
    Signed-off-by: Tejun Heo

    Aristeu Rozanski
     
  • This patch fixes kernel Oops because of wrong common_audit_data type
    in smack_inode_unlink() and smack_inode_rmdir().

    When SMACK security module is enabled and SMACK logging is on (/smack/logging
    is not zero) and you try to delete the file which
    1) you cannot delete due to SMACK rules and logging of failures is on
    or
    2) you can delete and logging of success is on,

    you will see following:

    Unable to handle kernel NULL pointer dereference at virtual address 000002d7

    [] (strlen+0x0/0x28)
    [] (audit_log_untrustedstring+0x14/0x28)
    [] (common_lsm_audit+0x108/0x6ac)
    [] (smack_log+0xc4/0xe4)
    [] (smk_curacc+0x80/0x10c)
    [] (smack_inode_unlink+0x74/0x80)
    [] (security_inode_unlink+0x2c/0x30)
    [] (vfs_unlink+0x7c/0x100)
    [] (do_unlinkat+0x144/0x16c)

    The function smack_inode_unlink() (and smack_inode_rmdir()) need
    to log two structures of different types. First of all it does:

    smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_DENTRY);
    smk_ad_setfield_u_fs_path_dentry(&ad, dentry);

    This will set common audit data type to LSM_AUDIT_DATA_DENTRY
    and store dentry for auditing (by function smk_curacc(), which in turn calls
    dump_common_audit_data(), which is actually uses provided data and logs it).

    /*
    * You need write access to the thing you're unlinking
    */
    rc = smk_curacc(smk_of_inode(ip), MAY_WRITE, &ad);
    if (rc == 0) {
    /*
    * You also need write access to the containing directory
    */

    Then this function wants to log anoter data:

    smk_ad_setfield_u_fs_path_dentry(&ad, NULL);
    smk_ad_setfield_u_fs_inode(&ad, dir);

    The function sets inode field, but don't change common_audit_data type.

    rc = smk_curacc(smk_of_inode(dir), MAY_WRITE, &ad);
    }

    So the dump_common_audit() function incorrectly interprets inode structure
    as dentry, and Oops will happen.

    This patch reinitializes common_audit_data structures with correct type.
    Also I removed unneeded
    smk_ad_setfield_u_fs_path_dentry(&ad, NULL);
    initialization, because both dentry and inode pointers are stored
    in the same union.

    Signed-off-by: Igor Zhbanov
    Signed-off-by: Kyungmin Park

    Igor Zhbanov
     
  • Rule modifications are enabled via /smack/change-rule. Format is as follows:
    "Subject Object rwaxt rwaxt"

    First two strings are subject and object labels up to 255 characters.
    Third string contains permissions to enable.
    Fourth string contains permissions to disable.

    All unmentioned permissions will be left unchanged.
    If no rule previously existed, it will be created.

    Targeted for git://git.gitorious.org/smack-next/kernel.git

    Signed-off-by: Rafal Krypa

    Rafal Krypa
     
  • SMACK_MAGIC moved to a proper place for easy user space access
    (i.e. libsmack).

    Signed-off-by: Jarkko Sakkinen

    Jarkko Sakkinen
     
  • This fixes audit logs for granting or denial of permissions to show
    information about transmute bit.

    Targeted for git://git.gitorious.org/smack-next/kernel.git

    Signed-off-by: Rafal Krypa

    Rafal Krypa
     
  • Special file /smack/revoke-subject will silently accept labels that are not
    present on the subject label list. Nothing has to be done for such labels,
    as there are no rules for them to revoke.

    Targeted for git://git.gitorious.org/smack-next/kernel.git

    Signed-off-by: Rafal Krypa

    Rafal Krypa
     

18 Mar, 2013

2 commits


13 Mar, 2013

1 commit

  • Looking at mm/process_vm_access.c:process_vm_rw() and comparing it to
    compat_process_vm_rw() shows that the compatibility code requires an
    explicit "access_ok()" check before calling
    compat_rw_copy_check_uvector(). The same difference seems to appear when
    we compare fs/read_write.c:do_readv_writev() to
    fs/compat.c:compat_do_readv_writev().

    This subtle difference between the compat and non-compat requirements
    should probably be debated, as it seems to be error-prone. In fact,
    there are two others sites that use this function in the Linux kernel,
    and they both seem to get it wrong:

    Now shifting our attention to fs/aio.c, we see that aio_setup_iocb()
    also ends up calling compat_rw_copy_check_uvector() through
    aio_setup_vectored_rw(). Unfortunately, the access_ok() check appears to
    be missing. Same situation for
    security/keys/compat.c:compat_keyctl_instantiate_key_iov().

    I propose that we add the access_ok() check directly into
    compat_rw_copy_check_uvector(), so callers don't have to worry about it,
    and it therefore makes the compat call code similar to its non-compat
    counterpart. Place the access_ok() check in the same location where
    copy_from_user() can trigger a -EFAULT error in the non-compat code, so
    the ABI behaviors are alike on both compat and non-compat.

    While we are here, fix compat_do_readv_writev() so it checks for
    compat_rw_copy_check_uvector() negative return values.

    And also, fix a memory leak in compat_keyctl_instantiate_key_iov() error
    handling.

    Acked-by: Linus Torvalds
    Acked-by: Al Viro
    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     

12 Mar, 2013

1 commit

  • This fixes CVE-2013-1792.

    There is a race in install_user_keyrings() that can cause a NULL pointer
    dereference when called concurrently for the same user if the uid and
    uid-session keyrings are not yet created. It might be possible for an
    unprivileged user to trigger this by calling keyctl() from userspace in
    parallel immediately after logging in.

    Assume that we have two threads both executing lookup_user_key(), both
    looking for KEY_SPEC_USER_SESSION_KEYRING.

    THREAD A THREAD B
    =============================== ===============================
    ==>call install_user_keyrings();
    if (!cred->user->session_keyring)
    ==>call install_user_keyrings()
    ...
    user->uid_keyring = uid_keyring;
    if (user->uid_keyring)
    return 0;
    user->session_keyring [== NULL]
    user->session_keyring = session_keyring;
    atomic_inc(&key->usage); [oops]

    At the point thread A dereferences cred->user->session_keyring, thread B
    hasn't updated user->session_keyring yet, but thread A assumes it is
    populated because install_user_keyrings() returned ok.

    The race window is really small but can be exploited if, for example,
    thread B is interrupted or preempted after initializing uid_keyring, but
    before doing setting session_keyring.

    This couldn't be reproduced on a stock kernel. However, after placing
    systemtap probe on 'user->session_keyring = session_keyring;' that
    introduced some delay, the kernel could be crashed reliably.

    Fix this by checking both pointers before deciding whether to return.
    Alternatively, the test could be done away with entirely as it is checked
    inside the mutex - but since the mutex is global, that may not be the best
    way.

    Signed-off-by: David Howells
    Reported-by: Mateusz Guzik
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: James Morris

    David Howells
     

04 Mar, 2013

2 commits

  • Dave Jones writes:
    > Just hit this on Linus' current tree.
    >
    > [ 89.621770] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
    > [ 89.623111] IP: [] commit_creds+0x250/0x2f0
    > [ 89.624062] PGD 122bfd067 PUD 122bfe067 PMD 0
    > [ 89.624901] Oops: 0000 [#1] PREEMPT SMP
    > [ 89.625678] Modules linked in: caif_socket caif netrom bridge hidp 8021q garp stp mrp rose llc2 af_rxrpc phonet af_key binfmt_misc bnep l2tp_ppp can_bcm l2tp_core pppoe pppox can_raw scsi_transport_iscsi ppp_generic slhc nfnetlink can ipt_ULOG ax25 decnet irda nfc rds x25 crc_ccitt appletalk atm ipx p8023 psnap p8022 llc lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm vhost_net snd_page_alloc snd_timer tun macvtap usb_debug snd rfkill microcode macvlan edac_core pcspkr serio_raw kvm_amd soundcore kvm r8169 mii
    > [ 89.637846] CPU 2
    > [ 89.638175] Pid: 782, comm: trinity-main Not tainted 3.8.0+ #63 Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H
    > [ 89.639850] RIP: 0010:[] [] commit_creds+0x250/0x2f0
    > [ 89.641161] RSP: 0018:ffff880115657eb8 EFLAGS: 00010207
    > [ 89.641984] RAX: 00000000000003e8 RBX: ffff88012688b000 RCX: 0000000000000000
    > [ 89.643069] RDX: 0000000000000000 RSI: ffffffff81c32960 RDI: ffff880105839600
    > [ 89.644167] RBP: ffff880115657ed8 R08: 0000000000000000 R09: 0000000000000000
    > [ 89.645254] R10: 0000000000000001 R11: 0000000000000246 R12: ffff880105839600
    > [ 89.646340] R13: ffff88011beea490 R14: ffff88011beea490 R15: 0000000000000000
    > [ 89.647431] FS: 00007f3ac063b740(0000) GS:ffff88012b200000(0000) knlGS:0000000000000000
    > [ 89.648660] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    > [ 89.649548] CR2: 00000000000000c8 CR3: 0000000122bfc000 CR4: 00000000000007e0
    > [ 89.650635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    > [ 89.651723] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    > [ 89.652812] Process trinity-main (pid: 782, threadinfo ffff880115656000, task ffff88011beea490)
    > [ 89.654128] Stack:
    > [ 89.654433] 0000000000000000 ffff8801058396a0 ffff880105839600 ffff88011beeaa78
    > [ 89.655769] ffff880115657ef8 ffffffff812c7d9b ffffffff82079be0 0000000000000000
    > [ 89.657073] ffff880115657f28 ffffffff8106c665 0000000000000002 ffff880115657f58
    > [ 89.658399] Call Trace:
    > [ 89.658822] [] key_change_session_keyring+0xfb/0x140
    > [ 89.659845] [] task_work_run+0xa5/0xd0
    > [ 89.660698] [] do_notify_resume+0x71/0xb0
    > [ 89.661581] [] int_signal+0x12/0x17
    > [ 89.662385] Code: 24 90 00 00 00 48 8b b3 90 00 00 00 49 8b 4c 24 40 48 39 f2 75 08 e9 83 00 00 00 48 89 ca 48 81 fa 60 29 c3 81 0f 84 41 fe ff ff 8b 8a c8 00 00 00 48 39 ce 75 e4 3b 82 d0 00 00 00 0f 84 4b
    > [ 89.667778] RIP [] commit_creds+0x250/0x2f0
    > [ 89.668733] RSP
    > [ 89.669301] CR2: 00000000000000c8
    >
    > My fastest trinity induced oops yet!
    >
    >
    > Appears to be..
    >
    > if ((set_ns == subset_ns->parent) &&
    > 850: 48 8b 8a c8 00 00 00 mov 0xc8(%rdx),%rcx
    >
    > from the inlined cred_cap_issubset

    By historical accident we have been reading trying to set new->user_ns
    from new->user_ns. Which is totally silly as new->user_ns is NULL (as
    is every other field in new except session_keyring at that point).

    The intent is clearly to copy all of the fields from old to new so copy
    old->user_ns into into new->user_ns.

    Cc: stable@vger.kernel.org
    Reported-by: Dave Jones
    Tested-by: Dave Jones
    Acked-by: Serge Hallyn
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • Pull more VFS bits from Al Viro:
    "Unfortunately, it looks like xattr series will have to wait until the
    next cycle ;-/

    This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
    etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
    more file_inode() work"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    constify path_get/path_put and fs_struct.c stuff
    fix nommu breakage in shmem.c
    cache the value of file_inode() in struct file
    9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
    9p: make sure ->lookup() adds fid to the right dentry
    9p: untangle ->lookup() a bit
    9p: double iput() in ->lookup() if d_materialise_unique() fails
    9p: v9fs_fid_add() can't fail now
    v9fs: get rid of v9fs_dentry
    9p: turn fid->dlist into hlist
    9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
    more file_inode() open-coded instances
    selinux: opened file can't have NULL or negative ->f_path.dentry

    (In the meantime, the hlist traversal macros have changed, so this
    required a semantic conflict fixup for the newly hlistified fid->dlist)

    Linus Torvalds
     

28 Feb, 2013

2 commits

  • I'm not sure why, but the hlist for each entry iterators were conceived

    list_for_each_entry(pos, head, member)

    The hlist ones were greedy and wanted an extra parameter:

    hlist_for_each_entry(tpos, pos, head, member)

    Why did they need an extra pos parameter? I'm not quite sure. Not only
    they don't really need it, it also prevents the iterator from looking
    exactly like the list iterator, which is unfortunate.

    Besides the semantic patch, there was some manual work required:

    - Fix up the actual hlist iterators in linux/list.h
    - Fix up the declaration of other iterators based on the hlist ones.
    - A very small amount of places were using the 'node' parameter, this
    was modified to use 'obj->member' instead.
    - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
    properly, so those had to be fixed up manually.

    The semantic patch which is mostly the work of Peter Senna Tschudin is here:

    @@
    iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

    type T;
    expression a,c,d,e;
    identifier b;
    statement S;
    @@

    -T b;

    [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
    [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
    [akpm@linux-foundation.org: checkpatch fixes]
    [akpm@linux-foundation.org: fix warnings]
    [akpm@linux-foudnation.org: redo intrusive kvm changes]
    Tested-by: Peter Senna Tschudin
    Acked-by: Paul E. McKenney
    Signed-off-by: Sasha Levin
    Cc: Wu Fengguang
    Cc: Marcelo Tosatti
    Cc: Gleb Natapov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
     
  • Signed-off-by: Al Viro

    Al Viro
     

27 Feb, 2013

1 commit

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     

26 Feb, 2013

2 commits

  • very few users left...

    Signed-off-by: Al Viro

    Al Viro
     
  • Commit "85865c1 ima: add policy support for file system uuid"
    introduced a CONFIG_BLOCK dependency. This patch defines a
    wrapper called blk_part_pack_uuid(), which returns -EINVAL,
    when CONFIG_BLOCK is not defined.

    security/integrity/ima/ima_policy.c:538:4: error: implicit declaration
    of function 'part_pack_uuid' [-Werror=implicit-function-declaration]

    Changelog v2:
    - Reference commit number in patch description
    Changelog v1:
    - rename ima_part_pack_uuid() to blk_part_pack_uuid()
    - resolve scripts/checkpatch.pl warnings
    Changelog v0:
    - fix UUID scripts/Lindent msgs

    Reported-by: Randy Dunlap
    Reported-by: David Rientjes
    Signed-off-by: Mimi Zohar
    Acked-by: David Rientjes
    Acked-by: Randy Dunlap
    Cc: Jens Axboe
    Signed-off-by: James Morris

    Mimi Zohar
     

25 Feb, 2013

1 commit

  • Commit "750943a ima: remove enforce checking duplication" combined
    the 'in IMA policy' and 'enforcing file integrity' checks. For
    the non-file, kernel module verification, a specific check for
    'enforcing file integrity' was not added. This patch adds the
    check.

    Signed-off-by: Mimi Zohar
    Signed-off-by: James Morris

    Mimi Zohar