13 Nov, 2013

2 commits

  • Pull networking updates from David Miller:

    1) The addition of nftables. No longer will we need protocol aware
    firewall filtering modules; it can all live in userspace.

    At the core of nftables is a, for lack of a better term, virtual
    machine that executes byte codes to inspect packet or metadata
    (arriving interface index, etc.) and make verdict decisions.

    Besides support for loading packet contents and comparing them, the
    interpreter supports lookups in various datastructures as
    fundamental operations. For example sets are supported, and
    therefore one could create a set of whitelist IP address entries
    which have ACCEPT verdicts attached to them, and use the appropriate
    byte codes to do such lookups.

    Since the interpreted code is composed in userspace, userspace can
    do things like optimize it before giving it to the kernel.

    Another major improvement is the capability of atomically updating
    portions of the ruleset. In the existing netfilter implementation,
    one has to update the entire rule set in order to make a change and
    this is very expensive.

    Userspace tools exist to create nftables rules using existing
    netfilter rule sets, but both kernel implementations will need to
    co-exist for quite some time as we transition from the old to the
    new stuff.

    Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have
    worked so hard on this.

    2) Daniel Borkmann and Hannes Frederic Sowa made several improvements
    to our pseudo-random number generator, mostly used for things like
    UDP port randomization and netfilter, amongst other things.

    In particular the taus88 generator is updated to taus113, and test
    cases are added.

    3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet
    and Yang Yingliang.

    4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin
    Sujir.

    5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet,
    Neal Cardwell, and Yuchung Cheng.

    6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary
    control message data, much like other socket option attributes.
    From Francesco Fusco.

    7) Allow applications to specify a cap on the rate computed
    automatically by the kernel for pacing flows, via a new
    SO_MAX_PACING_RATE socket option. From Eric Dumazet.

    8) Make the initial autotuned send buffer sizing in TCP more closely
    reflect actual needs, from Eric Dumazet.

    9) Currently early socket demux only happens for TCP sockets, but we
    can do it for connected UDP sockets too. Implementation from Shawn
    Bohrer.

    10) Refactor inet socket demux with the goal of improving hash demux
    performance for listening sockets. The main goals are to be able
    to use RCU lookups even on request sockets, and to eliminate the
    listening lock contention. From Eric Dumazet.

    11) The bonding layer has many demuxes in its fast path, and an RCU
    conversion was started back in 3.11. Several changes here extend the
    RCU usage to even more locations. From Ding Tianhong and Wang
    Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav
    Falico.

    12) Allow stackability of segmentation offloads to, in particular, allow
    segmentation offloading over tunnels. From Eric Dumazet.

    13) Significantly improve the handling of secret keys we input into the
    various hash functions in the inet hashtables, TCP fast open, as
    well as syncookies. From Hannes Frederic Sowa. The key fundamental
    operation is "net_get_random_once()" which uses static keys.

    Hannes even extended this to ipv4/ipv6 fragmentation handling and
    our generic flow dissector.

    14) The generic driver layer takes care now to set the driver data to
    NULL on device removal, so it's no longer necessary for drivers to
    explicitly set it to NULL any more. Many drivers have been cleaned
    up in this way, from Jingoo Han.

    15) Add a BPF based packet scheduler classifier, from Daniel Borkmann.

    16) Improve CRC32 interfaces and generic SKB checksum iterators so that
    SCTP's checksumming can more cleanly be handled. Also from Daniel
    Borkmann.

    17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces
    using the interface MTU value. This helps avoid PMTU attacks,
    particularly on DNS servers. From Hannes Frederic Sowa.

    18) Use generic XPS for transmit queue steering rather than internal
    (re-)implementation in virtio-net. From Jason Wang.
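
    To make the "virtual machine" description in item 1 concrete, here is
    a minimal userspace sketch of a byte code interpreter that loads a
    field from a packet, looks it up in a set, and returns a verdict.
    Every name in it (toy_op, toy_insn, toy_run, and so on) is
    hypothetical; it is not the nftables instruction set, encoding, or
    API, only an illustration of the general idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical opcodes for a toy packet-inspection VM. These are NOT the
 * real nftables expressions or their encoding. */
enum toy_op { TOY_LOAD32, TOY_CMP_EQ, TOY_LOOKUP, TOY_VERDICT };
enum toy_verdict { TOY_DROP = 0, TOY_ACCEPT = 1 };

struct toy_insn {
    enum toy_op op;
    uint32_t arg;   /* packet offset, immediate, or verdict, depending on op */
};

/* A set of 32-bit keys, e.g. whitelisted addresses in host byte order. */
struct toy_set {
    const uint32_t *keys;
    size_t len;
};

static int toy_set_contains(const struct toy_set *set, uint32_t key)
{
    for (size_t i = 0; i < set->len; i++)
        if (set->keys[i] == key)
            return 1;
    return 0;
}

/* Run the program against one packet. A failed load, compare, or lookup
 * ends evaluation with the default verdict, as does running off the end. */
static enum toy_verdict toy_run(const struct toy_insn *prog, size_t n,
                                const uint8_t *pkt, size_t pkt_len,
                                const struct toy_set *set,
                                enum toy_verdict dflt)
{
    uint32_t reg = 0;   /* single working register */

    assert(prog != NULL || n == 0);

    for (size_t pc = 0; pc < n; pc++) {
        const struct toy_insn *in = &prog[pc];

        switch (in->op) {
        case TOY_LOAD32:
            /* Refuse loads that would read past the end of the packet. */
            if (in->arg > pkt_len || pkt_len - in->arg < sizeof(reg))
                return dflt;
            memcpy(&reg, pkt + in->arg, sizeof(reg));
            break;
        case TOY_CMP_EQ:
            if (reg != in->arg)
                return dflt;
            break;
        case TOY_LOOKUP:
            if (set == NULL || !toy_set_contains(set, reg))
                return dflt;
            break;
        case TOY_VERDICT:
            return (enum toy_verdict)in->arg;
        }
    }
    return dflt;
}
```

    A three-instruction program (TOY_LOAD32 at the address offset,
    TOY_LOOKUP, TOY_VERDICT with TOY_ACCEPT) then models the whitelist
    example: packets whose address is in the set are accepted, everything
    else gets the default verdict.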

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
    random32: add test cases for taus113 implementation
    random32: upgrade taus88 generator to taus113 from errata paper
    random32: move rnd_state to linux/random.h
    random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized
    random32: add periodic reseeding
    random32: fix off-by-one in seeding requirement
    PHY: Add RTL8201CP phy_driver to realtek
    xtsonic: add missing platform_set_drvdata() in xtsonic_probe()
    macmace: add missing platform_set_drvdata() in mace_probe()
    ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe()
    ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh
    vlan: Implement vlan_dev_get_egress_qos_mask as an inline.
    ixgbe: add warning when max_vfs is out of range.
    igb: Update link modes display in ethtool
    netfilter: push reasm skb through instead of original frag skbs
    ip6_output: fragment outgoing reassembled skb properly
    MAINTAINERS: mv643xx_eth: take over maintainership from Lennart
    net_sched: tbf: support of 64bit rates
    ixgbe: deleting dfwd stations out of order can cause null ptr deref
    ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS
    ...
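
    For readers unfamiliar with the generator named in the random32
    commits above, the following is a userspace sketch of a
    four-component combined Tausworthe generator using the shift and
    mask constants published for taus113. The struct and function names
    are made up for this example, and the seeding routine is only an
    assumption (a simple multiplicative spread with per-component
    minimums); it should not be read as the kernel's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* State of the four-component generator (names are illustrative). */
struct taus113_state {
    uint32_t s1, s2, s3, s4;
};

/* One step of a single Tausworthe component. */
#define TAUSWORTHE(s, a, b, c, d) \
    ((((s) & (c)) << (d)) ^ ((((s) << (a)) ^ (s)) >> (b)))

/* Each component needs a minimum seed value, otherwise it can get stuck. */
static uint32_t seed_min(uint32_t x, uint32_t m)
{
    return (x < m) ? x + m : x;
}

/* Assumed seeding scheme: spread one 32-bit seed over the four words. */
static void taus113_seed(struct taus113_state *st, uint32_t seed)
{
    assert(st != NULL);
    st->s1 = seed_min(seed * 69069u, 2u);
    st->s2 = seed_min(st->s1 * 69069u, 8u);
    st->s3 = seed_min(st->s2 * 69069u, 16u);
    st->s4 = seed_min(st->s3 * 69069u, 128u);
}

/* Advance all four components and combine them into one 32-bit output. */
static uint32_t taus113_next(struct taus113_state *st)
{
    assert(st != NULL);
    st->s1 = TAUSWORTHE(st->s1,  6u, 13u, 4294967294u, 18u);
    st->s2 = TAUSWORTHE(st->s2,  2u, 27u, 4294967288u,  2u);
    st->s3 = TAUSWORTHE(st->s3, 13u, 21u, 4294967280u,  7u);
    st->s4 = TAUSWORTHE(st->s4,  3u, 12u, 4294967168u, 13u);
    return st->s1 ^ st->s2 ^ st->s3 ^ st->s4;
}
```

    Two states seeded with the same value produce the same sequence,
    which is what makes test cases for such a generator possible.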

    Linus Torvalds
     
  • Pull cgroup changes from Tejun Heo:
    "Not too much activity this time around. css_id is finally killed and
    a minor update to device_cgroup"

    * 'for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
    device_cgroup: remove can_attach
    cgroup: kill css_id
    memcg: stop using css id
    memcg: fail to create cgroup if the cgroup id is too big
    memcg: convert to use cgroup id
    memcg: convert to use cgroup_is_descendant()

    Linus Torvalds
     

24 Oct, 2013

2 commits


16 Oct, 2013

2 commits

  • BugLink: http://bugs.launchpad.net/bugs/1235977

    The profile introspection seq file has a locking bug when policy is viewed
    from a virtual root (task in a policy namespace), introspection from the
    real root is not affected.

    The test for root
    while (parent) {
    is correct for the real root, but incorrect for tasks in a policy namespace.
    This allows the task to walk back up the policy tree past its virtual
    root, unlocking the virtual root before p_stop, which is where it
    should be unlocked.
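
    The difference between the two loop conditions can be shown with a
    toy model. The structure and function names below are hypothetical
    and only count "unlock" operations; this is a sketch of the
    reasoning, not the AppArmor code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a policy namespace. */
struct toy_ns {
    struct toy_ns *parent;  /* NULL only for the real root */
    int unlocks;            /* how often the walk "unlocked" this ns */
};

/* Walk that only stops at the real root, i.e. when there is no parent. */
static void walk_until_real_root(struct toy_ns *ns)
{
    assert(ns != NULL);
    while (ns->parent) {
        ns->unlocks++;
        ns = ns->parent;
    }
}

/* Walk that stops at the viewer's root, which may be a virtual root. */
static void walk_until_view_root(struct toy_ns *ns, const struct toy_ns *root)
{
    assert(ns != NULL && root != NULL);
    while (ns != root && ns->parent) {
        ns->unlocks++;
        ns = ns->parent;
    }
}
```

    With a chain real root <- A <- B and a viewer whose virtual root is
    A, the first walk started at B also unlocks A, so a later unlock of
    the viewer's root has nothing left to release. The second walk
    leaves A alone.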

    This results in the following lockdep back trace:
    [ 78.479744] [ BUG: bad unlock balance detected! ]
    [ 78.479792] 3.11.0-11-generic #17 Not tainted
    [ 78.479838] -------------------------------------
    [ 78.479885] grep/2223 is trying to release lock (&ns->lock) at:
    [ 78.479952] [] mutex_unlock+0xe/0x10
    [ 78.480002] but there are no more locks to release!
    [ 78.480037]
    [ 78.480037] other info that might help us debug this:
    [ 78.480037] 1 lock held by grep/2223:
    [ 78.480037] #0: (&p->lock){+.+.+.}, at: [] seq_read+0x3d/0x3d0
    [ 78.480037]
    [ 78.480037] stack backtrace:
    [ 78.480037] CPU: 0 PID: 2223 Comm: grep Not tainted 3.11.0-11-generic #17
    [ 78.480037] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    [ 78.480037] ffffffff817bf3be ffff880007763d60 ffffffff817b97ef ffff8800189d2190
    [ 78.480037] ffff880007763d88 ffffffff810e1c6e ffff88001f044730 ffff8800189d2190
    [ 78.480037] ffffffff817bf3be ffff880007763e00 ffffffff810e5bd6 0000000724fe56b7
    [ 78.480037] Call Trace:
    [ 78.480037] [] ? mutex_unlock+0xe/0x10
    [ 78.480037] [] dump_stack+0x54/0x74
    [ 78.480037] [] print_unlock_imbalance_bug+0xee/0x100
    [ 78.480037] [] ? mutex_unlock+0xe/0x10
    [ 78.480037] [] lock_release_non_nested+0x226/0x300
    [ 78.480037] [] ? __mutex_unlock_slowpath+0xce/0x180
    [ 78.480037] [] ? mutex_unlock+0xe/0x10
    [ 78.480037] [] lock_release+0xac/0x310
    [ 78.480037] [] __mutex_unlock_slowpath+0x83/0x180
    [ 78.480037] [] mutex_unlock+0xe/0x10
    [ 78.480037] [] p_stop+0x51/0x90
    [ 78.480037] [] seq_read+0x288/0x3d0
    [ 78.480037] [] vfs_read+0x9e/0x170
    [ 78.480037] [] SyS_read+0x4c/0xa0
    [ 78.480037] [] system_call_fastpath+0x1a/0x1f

    Signed-off-by: John Johansen
    Signed-off-by: James Morris

    John Johansen
     
  • BugLink: http://bugs.launchpad.net/bugs/1235523

    This fixes the following kmemleak trace:
    unreferenced object 0xffff8801e8c35680 (size 32):
    comm "apparmor_parser", pid 691, jiffies 4294895667 (age 13230.876s)
    hex dump (first 32 bytes):
    e0 d3 4e b5 ac 6d f4 ed 3f cb ee 48 1c fd 40 cf ..N..m..?..H..@.
    5b cc e9 93 00 00 00 00 00 00 00 00 00 00 00 00 [...............
    backtrace:
    [] kmemleak_alloc+0x4e/0xb0
    [] __kmalloc+0x103/0x290
    [] aa_calc_profile_hash+0x6c/0x150
    [] aa_unpack+0x39d/0xd50
    [] aa_replace_profiles+0x3d/0xd80
    [] profile_replace+0x37/0x50
    [] vfs_write+0xbd/0x1e0
    [] SyS_write+0x4c/0xa0
    [] system_call_fastpath+0x1a/0x1f
    [] 0xffffffffffffffff

    Signed-off-by: John Johansen
    Signed-off-by: James Morris

    John Johansen
     

14 Oct, 2013

1 commit


10 Oct, 2013

1 commit


09 Oct, 2013

1 commit

  • TCP listener refactoring, part 4 :

    To speed up inet lookups, we moved IPv4 addresses from inet to struct
    sock_common.

    Now it is time to do the same for IPv6, because it permits us to have
    fast lookups for all kinds of sockets, including upcoming SYN_RECV.

    Getting IPv6 addresses in TCP lookups currently requires two extra cache
    lines, plus a dereference (and memory stall).

    inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6

    This patch is way bigger than its IPv4 counterpart, because for IPv4,
    we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
    it's not doable easily.

    inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
    inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr

    And timewait sockets also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
    at the same offset.

    We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
    macro.
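
    A rough userspace sketch of the layout change follows. All toy_*
    types and functions are invented for illustration and are far
    simpler than the kernel structures; the point is only that an
    address embedded in a shared common header can be matched without
    the extra pointer dereference, and by the same code for full and
    timewait sockets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_in6_addr {
    uint8_t bytes[16];
};

/* Before: the address lives in a separate IPv6 block behind a pointer. */
struct toy_ipv6_pinfo {
    struct toy_in6_addr daddr;
};

struct toy_sock_old {
    int family;
    struct toy_ipv6_pinfo *pinet6;  /* extra dereference on every lookup */
};

/* After: the address is embedded in the header shared by all socket types. */
struct toy_sock_common {
    int family;
    struct toy_in6_addr v6_daddr;
};

struct toy_sock_new {
    struct toy_sock_common common;  /* must stay the first member */
    int state;
};

struct toy_tw_sock {
    struct toy_sock_common common;  /* same offset as in toy_sock_new */
    int timeout;
};

/* Old-style match: follows the pointer to reach the address. */
static int match_old(const struct toy_sock_old *sk,
                     const struct toy_in6_addr *addr)
{
    assert(sk != NULL && sk->pinet6 != NULL);
    return memcmp(&sk->pinet6->daddr, addr, sizeof(*addr)) == 0;
}

/* New-style match: one function serves full and timewait sockets. */
static int match_common(const struct toy_sock_common *c,
                        const struct toy_in6_addr *addr)
{
    assert(c != NULL);
    return memcmp(&c->v6_daddr, addr, sizeof(*addr)) == 0;
}
```

    Because both toy socket types start with the same common header,
    match_common() can be handed either one, which mirrors why a
    separate timewait-only match macro is no longer needed.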

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

05 Oct, 2013

3 commits


02 Oct, 2013

1 commit

  • Conflicts:
    drivers/net/ethernet/emulex/benet/be.h
    drivers/net/usb/qmi_wwan.c
    drivers/net/wireless/brcm80211/brcmfmac/dhd_bus.h
    include/net/netfilter/nf_conntrack_synproxy.h
    include/net/secure_seq.h

    The conflicts are of two varieties:

    1) Conflicts with Joe Perches's 'extern' removal from header file
    function declarations. Usually it's an argument signature change
    or a function being added/removed. The resolutions are trivial.

    2) Some overlapping changes in qmi_wwan.c and be.h, one commit adds
    a new value, another changes an existing value. That sort of
    thing.

    Signed-off-by: David S. Miller

    David S. Miller
     

01 Oct, 2013

1 commit

  • - Move sysctl_local_ports from a global variable into struct netns_ipv4.
    - Modify inet_get_local_port_range to take a struct net, and update all
    of the callers.
    - Move the initialization of sysctl_local_ports into
    sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c
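
    The shape of the change can be sketched in userspace as follows. The
    toy_* names and the example range values are assumptions made for
    illustration; the sketch only shows a setting moving from a global
    into a per-namespace structure, with the getter taking the namespace
    as an argument.

```c
#include <assert.h>
#include <stddef.h>

/* Toy per-namespace IPv4 settings; field names are illustrative only. */
struct toy_netns_ipv4 {
    int local_port_lo;
    int local_port_hi;
};

struct toy_net {
    struct toy_netns_ipv4 ipv4;
};

/* Per-namespace initialization, done once for each namespace. */
static void toy_net_init(struct toy_net *net)
{
    assert(net != NULL);
    net->ipv4.local_port_lo = 32768;    /* example values */
    net->ipv4.local_port_hi = 61000;
}

/* The getter now needs to be told which namespace to read from. */
static void toy_get_local_port_range(const struct toy_net *net,
                                     int *low, int *high)
{
    assert(net != NULL && low != NULL && high != NULL);
    *low = net->ipv4.local_port_lo;
    *high = net->ipv4.local_port_hi;
}
```

    Changing the range in one toy_net leaves every other toy_net
    untouched, which is the behavior a global variable cannot provide.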

    v2:
    - Ensure indentation used tabs
    - Fixed ip.h so it applies cleanly to today's net-next

    v3:
    - Compile fixes of strange callers of inet_get_local_port_range.
    This patch now successfully passes an allmodconfig build.
    Removed manual inlining of inet_get_local_port_range in ipv4_local_port_range

    Originally-by: Samya
    Acked-by: Nicolas Dichtel
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

30 Sep, 2013

2 commits

  • The recent 3.12 pull request for apparmor was missing a couple of rcu
    _protected access modifiers, resulting in the following suspicious RCU
    usage:

    [ 29.804534] [ INFO: suspicious RCU usage. ]
    [ 29.804539] 3.11.0+ #5 Not tainted
    [ 29.804541] -------------------------------
    [ 29.804545] security/apparmor/include/policy.h:363 suspicious rcu_dereference_check() usage!
    [ 29.804548]
    [ 29.804548] other info that might help us debug this:
    [ 29.804548]
    [ 29.804553]
    [ 29.804553] rcu_scheduler_active = 1, debug_locks = 1
    [ 29.804558] 2 locks held by apparmor_parser/1268:
    [ 29.804560] #0: (sb_writers#9){.+.+.+}, at: [] file_start_write+0x27/0x29
    [ 29.804576] #1: (&ns->lock){+.+.+.}, at: [] aa_replace_profiles+0x166/0x57c
    [ 29.804589]
    [ 29.804589] stack backtrace:
    [ 29.804595] CPU: 0 PID: 1268 Comm: apparmor_parser Not tainted 3.11.0+ #5
    [ 29.804599] Hardware name: ASUSTeK Computer Inc. UL50VT /UL50VT , BIOS 217 03/01/2010
    [ 29.804602] 0000000000000000 ffff8800b95a1d90 ffffffff8144eb9b ffff8800b94db540
    [ 29.804611] ffff8800b95a1dc0 ffffffff81087439 ffff880138cc3a18 ffff880138cc3a18
    [ 29.804619] ffff8800b9464a90 ffff880138cc3a38 ffff8800b95a1df0 ffffffff811f5084
    [ 29.804628] Call Trace:
    [ 29.804636] [] dump_stack+0x4e/0x82
    [ 29.804642] [] lockdep_rcu_suspicious+0xfc/0x105
    [ 29.804649] [] __aa_update_replacedby+0x53/0x7f
    [ 29.804655] [] __replace_profile+0x11f/0x1ed
    [ 29.804661] [] aa_replace_profiles+0x410/0x57c
    [ 29.804668] [] profile_replace+0x35/0x4c
    [ 29.804674] [] vfs_write+0xad/0x113
    [ 29.804680] [] SyS_write+0x44/0x7a
    [ 29.804687] [] system_call_fastpath+0x16/0x1b
    [ 29.804691]
    [ 29.804694] ===============================
    [ 29.804697] [ INFO: suspicious RCU usage. ]
    [ 29.804700] 3.11.0+ #5 Not tainted
    [ 29.804703] -------------------------------
    [ 29.804706] security/apparmor/policy.c:566 suspicious rcu_dereference_check() usage!
    [ 29.804709]
    [ 29.804709] other info that might help us debug this:
    [ 29.804709]
    [ 29.804714]
    [ 29.804714] rcu_scheduler_active = 1, debug_locks = 1
    [ 29.804718] 2 locks held by apparmor_parser/1268:
    [ 29.804721] #0: (sb_writers#9){.+.+.+}, at: [] file_start_write+0x27/0x29
    [ 29.804733] #1: (&ns->lock){+.+.+.}, at: [] aa_replace_profiles+0x166/0x57c
    [ 29.804744]
    [ 29.804744] stack backtrace:
    [ 29.804750] CPU: 0 PID: 1268 Comm: apparmor_parser Not tainted 3.11.0+ #5
    [ 29.804753] Hardware name: ASUSTeK Computer Inc. UL50VT /UL50VT , BIOS 217 03/01/2010
    [ 29.804756] 0000000000000000 ffff8800b95a1d80 ffffffff8144eb9b ffff8800b94db540
    [ 29.804764] ffff8800b95a1db0 ffffffff81087439 ffff8800b95b02b0 0000000000000000
    [ 29.804772] ffff8800b9efba08 ffff880138cc3a38 ffff8800b95a1dd0 ffffffff811f4f94
    [ 29.804779] Call Trace:
    [ 29.804786] [] dump_stack+0x4e/0x82
    [ 29.804791] [] lockdep_rcu_suspicious+0xfc/0x105
    [ 29.804798] [] aa_free_replacedby_kref+0x4d/0x62
    [ 29.804804] [] ? aa_put_namespace+0x17/0x17
    [ 29.804810] [] kref_put+0x36/0x40
    [ 29.804816] [] __replace_profile+0x13a/0x1ed
    [ 29.804822] [] aa_replace_profiles+0x410/0x57c
    [ 29.804829] [] profile_replace+0x35/0x4c
    [ 29.804835] [] vfs_write+0xad/0x113
    [ 29.804840] [] SyS_write+0x44/0x7a
    [ 29.804847] [] system_call_fastpath+0x16/0x1b

    Reported-by: miles.lane@gmail.com
    CC: paulmck@linux.vnet.ibm.com
    Signed-off-by: John Johansen
    Signed-off-by: James Morris

    John Johansen
     
  • Use the shash interface, rather than the hash interface, when hashing
    AppArmor profiles. The shash interface does not use scatterlists and it
    is a better fit for what AppArmor needs.

    This fixes a kernel paging BUG when aa_calc_profile_hash() is passed a
    buffer from vmalloc(). The hash interface requires callers to handle
    vmalloc() buffers differently than what AppArmor was doing. Due to
    vmalloc() memory not being physically contiguous, each individual page
    behind the buffer must be assigned to a scatterlist with sg_set_page()
    and then the scatterlist passed to crypto_hash_update().

    The shash interface does not have that limitation and allows vmalloc()
    and kmalloc() buffers to be handled in the same manner.

    BugLink: https://launchpad.net/bugs/1216294/
    BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=62261

    Signed-off-by: Tyler Hicks
    Acked-by: Seth Arnold
    Signed-off-by: John Johansen
    Signed-off-by: James Morris

    Tyler Hicks
     

08 Sep, 2013

2 commits

  • Pull namespace changes from Eric Biederman:
    "This is an assorted mishmash of small cleanups, enhancements and bug
    fixes.

    The major theme is user namespace mount restrictions. nsown_capable
    is killed as it encourages not thinking about details that need to be
    considered. A very hard to hit pid namespace exiting bug was finally
    tracked and fixed. A couple of cleanups to the basic namespace
    infrastructure.

    Finally there is an enhancement that makes per user namespace
    capabilities usable as capabilities, and an enhancement that allows
    the per userns root to nice other processes in the user namespace"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    userns: Kill nsown_capable it makes the wrong thing easy
    capabilities: allow nice if we are privileged
    pidns: Don't have unshare(CLONE_NEWPID) imply CLONE_THREAD
    userns: Allow PR_CAPBSET_DROP in a user namespace.
    namespaces: Simplify copy_namespaces so it is clear what is going on.
    pidns: Fix hang in zap_pid_ns_processes by sending a potentially extra wakeup
    sysfs: Restrict mounting sysfs
    userns: Better restrictions on when proc and sysfs can be mounted
    vfs: Don't copy mount bind mounts of /proc//ns/mnt between namespaces
    kernel/nsproxy.c: Improving a snippet of code.
    proc: Restrict mounting the proc filesystem
    vfs: Lock in place mounts from more privileged users

    Linus Torvalds
     
  • Pull security subsystem updates from James Morris:
    "Nothing major for this kernel, just maintenance updates"

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (21 commits)
    apparmor: add the ability to report a sha1 hash of loaded policy
    apparmor: export set of capabilities supported by the apparmor module
    apparmor: add the profile introspection file to interface
    apparmor: add an optional profile attachment string for profiles
    apparmor: add interface files for profiles and namespaces
    apparmor: allow setting any profile into the unconfined state
    apparmor: make free_profile available outside of policy.c
    apparmor: rework namespace free path
    apparmor: update how unconfined is handled
    apparmor: change how profile replacement update is done
    apparmor: convert profile lists to RCU based locking
    apparmor: provide base for multiple profiles to be replaced at once
    apparmor: add a features/policy dir to interface
    apparmor: enable users to query whether apparmor is enabled
    apparmor: remove minimum size check for vmalloc()
    Smack: parse multiple rules per write to load2, up to PAGE_SIZE-1 bytes
    Smack: network label match fix
    security: smack: add a hash table to quicken smk_find_entry()
    security: smack: fix memleak in smk_write_rules_list()
    xattr: Constify ->name member of "struct xattr".
    ...

    Linus Torvalds
     

06 Sep, 2013

1 commit

  • Pull networking changes from David Miller:
    "Noteworthy changes this time around:

    1) Multicast rejoin support for team driver, from Jiri Pirko.

    2) Centralize and simplify TCP RTT measurement handling in order to
    reduce the impact of bad RTO seeding from SYN/ACKs. Also, when
    both timestamps and local RTT measurements are available prefer
    the latter because there are broken middleware devices which
    scramble the timestamp.

    From Yuchung Cheng.

    3) Add TCP_NOTSENT_LOWAT socket option to limit the amount of kernel
    memory consumed to queue up unsent user data. From Eric Dumazet.

    4) Add a "physical port ID" abstraction for network devices, from
    Jiri Pirko.

    5) Add a "suppress" operation to influence fib_rules lookups, from
    Stefan Tomanek.

    6) Add a networking development FAQ, from Paul Gortmaker.

    7) Extend the information provided by tcp_probe and add ipv6 support,
    from Daniel Borkmann.

    8) Use RCU locking more extensively in openvswitch data paths, from
    Pravin B Shelar.

    9) Add SCTP support to openvswitch, from Joe Stringer.

    10) Add EF10 chip support to SFC driver, from Ben Hutchings.

    11) Add new SYNPROXY netfilter target, from Patrick McHardy.

    12) Compute a rate approximation for sending in TCP sockets, and use
    this to more intelligently coalesce TSO frames. Furthermore, add
    a new packet scheduler which takes advantage of this estimate when
    available. From Eric Dumazet.

    13) Allow AF_PACKET fanouts with random selection, from Daniel
    Borkmann.

    14) Add ipv6 support to vxlan driver, from Cong Wang"

    Resolved conflicts as per discussion.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1218 commits)
    openvswitch: Fix alignment of struct sw_flow_key.
    netfilter: Fix build errors with xt_socket.c
    tcp: Add missing braces to do_tcp_setsockopt
    caif: Add missing braces to multiline if in cfctrl_linkup_request
    bnx2x: Add missing braces in bnx2x:bnx2x_link_initialize
    vxlan: Fix kernel panic on device delete.
    net: mvneta: implement ->ndo_do_ioctl() to support PHY ioctls
    net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
    icplus: Use netif_running to determine device state
    ethernet/arc/arc_emac: Fix huge delays in large file copies
    tuntap: orphan frags before trying to set tx timestamp
    tuntap: purge socket error queue on detach
    qlcnic: use standard NAPI weights
    ipv6:introduce function to find route for redirect
    bnx2x: VF RSS support - VF side
    bnx2x: VF RSS support - PF side
    vxlan: Notify drivers for listening UDP port changes
    net: usbnet: update addr_assign_type if appropriate
    driver/net: enic: update enic maintainers and driver
    driver/net: enic: Exposing symbols for Cisco's low latency driver
    ...

    Linus Torvalds
     

05 Sep, 2013

1 commit

  • Pull module updates from Rusty Russell:
    "Minor fixes mainly, including a potential use-after-free on remove
    found by CONFIG_DEBUG_KOBJECT_RELEASE which may be theoretical"

    * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    module: Fix mod->mkobj.kobj potentially freed too early
    kernel/params.c: use scnprintf() instead of sprintf()
    kernel/module.c: use scnprintf() instead of sprintf()
    module/lsm: Have apparmor module parameters work with no args
    module: Add NOARG flag for ops with param_set_bool_enable_only() set function
    module: Add flag to allow mod params to have no arguments
    modules: add support for soft module dependencies
    scripts/mod/modpost.c: permit '.cranges' secton for sh64 architecture.
    module: fix sprintf format specifier in param_get_byte()

    Linus Torvalds
     

04 Sep, 2013

1 commit

  • Pull cgroup updates from Tejun Heo:
    "A lot of activity on the cgroup front. Most changes aren't visible
    to userland at all at this point and are laying foundation for the
    planned unified hierarchy.

    - The biggest change is decoupling the lifetime management of css
    (cgroup_subsys_state) from that of cgroup's. Because controllers
    (cpu, memory, block and so on) will need to be dynamically enabled
    and disabled, css which is the association point between a cgroup
    and a controller may come and go dynamically across the lifetime of
    a cgroup. Till now, css's were created when the associated cgroup
    was created and stayed till the cgroup got destroyed.

    Assumptions around this tight coupling permeated through cgroup
    core and controllers. These assumptions are gradually removed,
    which makes up the bulk of the patches, and css destruction path is
    completely decoupled from cgroup destruction path. Note that
    decoupling of creation path is relatively easy on top of these
    changes and the patchset is pending for the next window.

    - cgroup has its own event mechanism cgroup.event_control, which is
    only used by memcg. It is overly complex trying to achieve high
    flexibility whose benefits seem dubious at best. Going forward,
    new events will simply generate a file modified event and the
    existing mechanism is being made specific to memcg. This pull
    request contains preparatory patches for such change.

    - Various fixes and cleanups"

    Fixed up conflict in kernel/cgroup.c as per Tejun.

    * 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (69 commits)
    cgroup: fix cgroup_css() invocation in css_from_id()
    cgroup: make cgroup_write_event_control() use css_from_dir() instead of __d_cgrp()
    cgroup: make cgroup_event hold onto cgroup_subsys_state instead of cgroup
    cgroup: implement CFTYPE_NO_PREFIX
    cgroup: make cgroup_css() take cgroup_subsys * instead and allow NULL subsys
    cgroup: rename cgroup_css_from_dir() to css_from_dir() and update its syntax
    cgroup: fix cgroup_write_event_control()
    cgroup: fix subsystem file accesses on the root cgroup
    cgroup: change cgroup_from_id() to css_from_id()
    cgroup: use css_get() in cgroup_create() to check CSS_ROOT
    cpuset: remove an unncessary forward declaration
    cgroup: RCU protect each cgroup_subsys_state release
    cgroup: move subsys file removal to kill_css()
    cgroup: factor out kill_css()
    cgroup: decouple cgroup_subsys_state destruction from cgroup destruction
    cgroup: replace cgroup->css_kill_cnt with ->nr_css
    cgroup: bounce cgroup_subsys_state ref kill confirmation to a work item
    cgroup: move cgroup->subsys[] assignment to online_css()
    cgroup: reorganize css init / exit paths
    cgroup: add __rcu modifier to cgroup->subsys[]
    ...

    Linus Torvalds
     

31 Aug, 2013

2 commits

  • We allow task A to change B's nice level if it has a superset of
    B's privileges, or if it has CAP_SYS_NICE. Also allow it if A has
    CAP_SYS_NICE with respect to B - meaning it is root in the same
    namespace, or it created B's namespace.

    Signed-off-by: Serge Hallyn
    Reviewed-by: "Eric W. Biederman"
    Signed-off-by: Eric W. Biederman

    Serge Hallyn
     
  • As the capabilities and capability bounding set are per user namespace
    properties it is safe to allow changing them with just CAP_SETPCAP
    permission in the user namespace.

    Acked-by: Serge Hallyn
    Tested-by: Richard Weinberger
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

23 Aug, 2013

1 commit


20 Aug, 2013

1 commit

  • The apparmor module parameters for param_ops_aabool and
    param_ops_aalockpolicy are both based off of the param_ops_bool,
    and can handle a NULL value passed in as val. Have it enable the
    new KERNEL_PARAM_FL_NOARGS flag to allow the parameters to be set
    without having to state "=y" or "=1".

    Cc: John Johansen
    Signed-off-by: Steven Rostedt
    Signed-off-by: Rusty Russell

    Steven Rostedt
     

17 Aug, 2013

1 commit


15 Aug, 2013

14 commits