15 Jan, 2012

1 commit

  • * 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
    capabilities: remove __cap_full_set definition
    security: remove the security_netlink_recv hook as it is equivalent to capable()
    ptrace: do not audit capability check when outputing /proc/pid/stat
    capabilities: remove task_ns_* functions
    capabitlies: ns_capable can use the cap helpers rather than lsm call
    capabilities: style only - move capable below ns_capable
    capabilites: introduce new has_ns_capabilities_noaudit
    capabilities: call has_ns_capability from has_capability
    capabilities: remove all _real_ interfaces
    capabilities: introduce security_capable_noaudit
    capabilities: reverse arguments to security_capable
    capabilities: remove the task from capable LSM hook entirely
    selinux: sparse fix: fix several warnings in the security server cod
    selinux: sparse fix: fix warnings in netlink code
    selinux: sparse fix: eliminate warnings for selinuxfs
    selinux: sparse fix: declare selinux_disable() in security.h
    selinux: sparse fix: move selinux_complete_init
    selinux: sparse fix: make selinux_secmark_refcount static
    SELinux: Fix RCU deref check warning in sel_netport_insert()

    Manually fix up a semantic mis-merge wrt security_netlink_recv():

    - the interface was removed in commit fd7784615248 ("security: remove
    the security_netlink_recv hook as it is equivalent to capable()")

    - a new user of it appeared in commit a38f7907b926 ("crypto: Add
    userspace configuration API")

    causing no automatic merge conflict, but Eric Paris pointed out the
    issue.

    Linus Torvalds
     

13 Jan, 2012

2 commits

  • The logic of the current code is that whenever we destroy
    a cgroup that had its limit set (set meaning different than
    maximum), we should decrement the jump_label counter.
    Otherwise we assume it was never incremented.

    But what the code actually does is test for RES_USAGE
    instead of RES_LIMIT. Usage being different than maximum
    is likely to be true most of the time.

    The effect of this is that the key must become negative,
    and since the jump_label test says:

    !!atomic_read(&key->enabled);

    we'll have jump_labels still on when no one else is
    using this functionality.

    Signed-off-by: Glauber Costa
    CC: David S. Miller
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to
    RCU_INIT_POINTER) did a lot of incorrect changes, since it did a
    complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x,
    y).

    We miss needed barriers, even on x86, when y is not NULL.

    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    CC: Paul E. McKenney
    Signed-off-by: David S. Miller

    Eric Dumazet
     

12 Jan, 2012

2 commits


10 Jan, 2012

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    igmp: Avoid zero delay when receiving odd mixture of IGMP queries
    netdev: make net_device_ops const
    bcm63xx: make ethtool_ops const
    usbnet: make ethtool_ops const
    net: Fix build with INET disabled.
    net: introduce netif_addr_lock_nested() and call if when appropriate
    net: correct lock name in dev_[uc/mc]_sync documentations.
    net: sk_update_clone is only used in net/core/sock.c
    8139cp: fix missing napi_gro_flush.
    pktgen: set correct max and min in pktgen_setup_inject()
    smsc911x: Unconditionally include linux/smscphy.h in smsc911x.h
    asix: fix infinite loop in rx_fixup()
    net: Default UDP and UNIX diag to 'n'.
    r6040: fix typo in use of MCR0 register bits
    net: fix sock_clone reference mismatch with tcp memcontrol

    Linus Torvalds
     
  • Commit 5b7c84066733c5dfb0e4016d939757b38de189e4 ('ipv4: correct IGMP
    behavior on v3 query during v2-compatibility mode') added yet another
    case for query parsing, which can result in max_delay = 0. Substitute
    a value of 1, as in the usual v3 case.

    Reported-by: Simon McVittie
    References: http://bugs.debian.org/654876
    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Ben Hutchings
     

09 Jan, 2012

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
    Kconfig: acpi: Fix typo in comment.
    misc latin1 to utf8 conversions
    devres: Fix a typo in devm_kfree comment
    btrfs: free-space-cache.c: remove extra semicolon.
    fat: Spelling s/obsolate/obsolete/g
    SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
    tools/power turbostat: update fields in manpage
    mac80211: drop spelling fix
    types.h: fix comment spelling for 'architectures'
    typo fixes: aera -> area, exntension -> extension
    devices.txt: Fix typo of 'VMware'.
    sis900: Fix enum typo 'sis900_rx_bufer_status'
    decompress_bunzip2: remove invalid vi modeline
    treewide: Fix comment and string typo 'bufer'
    hyper-v: Update MAINTAINERS
    treewide: Fix typos in various parts of the kernel, and fix some comments.
    clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
    gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
    leds: Kconfig: Fix typo 'D2NET_V2'
    sound: Kconfig: drop unknown symbol ARCH_CLPS7500
    ...

    Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
    kconfig additions, close to removed commented-out old ones)

    Linus Torvalds
     

08 Jan, 2012

1 commit


06 Jan, 2012

1 commit


31 Dec, 2011

1 commit


29 Dec, 2011

1 commit


28 Dec, 2011

2 commits


25 Dec, 2011

1 commit


24 Dec, 2011

1 commit


23 Dec, 2011

9 commits

  • The NAT range to nlattr conversation callbacks and helpers are entirely
    dead code and are also useless since there are no NAT ranges in conntrack
    context, they are only used for initially selecting a tuple. The final NAT
    information is contained in the selected tuples of the conntrack entry.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • The packet size check originates from a time when UDP helpers could
    accidentally mangle incorrect packets (NEWNAT) and is unnecessary
    nowadays since the conntrack helpers invoke the NAT helpers for the
    proper packet directly.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • The inner tuple that is extracted from the packet is unused. The code also
    doesn't have any useful side-effects like verifying the packet does contain
    enough data to extract the inner tuple since conntrack already does the
    same, so remove it.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • The only remaining user of NAT protocol module reference counting is NAT
    ctnetlink support. Since this is a fairly short sequence of code, convert
    over to use RCU and remove module reference counting.

    Module unregistration is already protected by RCU using synchronize_rcu(),
    so no further changes are necessary.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Use nf_conntrack_hash_rnd in NAT bysource hash to avoid hash chain attacks.

    Signed-off-by: Patrick McHardy
    Acked-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Export the NAT definitions to userspace. So far userspace (specifically,
    iptables) has been copying the headers files from include/net. Also
    rename some structures and definitions in preparation for IPv6 NAT.
    Since these have never been officially exported, this doesn't affect
    existing userspace code.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Chris Boot reported crashes occurring in ipv6_select_ident().

    [ 461.457562] RIP: 0010:[] []
    ipv6_select_ident+0x31/0xa7

    [ 461.578229] Call Trace:
    [ 461.580742]
    [ 461.582870] [] ? udp6_ufo_fragment+0x124/0x1a2
    [ 461.589054] [] ? ipv6_gso_segment+0xc0/0x155
    [ 461.595140] [] ? skb_gso_segment+0x208/0x28b
    [ 461.601198] [] ? ipv6_confirm+0x146/0x15e
    [nf_conntrack_ipv6]
    [ 461.608786] [] ? nf_iterate+0x41/0x77
    [ 461.614227] [] ? dev_hard_start_xmit+0x357/0x543
    [ 461.620659] [] ? nf_hook_slow+0x73/0x111
    [ 461.626440] [] ? br_parse_ip_options+0x19a/0x19a
    [bridge]
    [ 461.633581] [] ? dev_queue_xmit+0x3af/0x459
    [ 461.639577] [] ? br_dev_queue_push_xmit+0x72/0x76
    [bridge]
    [ 461.646887] [] ? br_nf_post_routing+0x17d/0x18f
    [bridge]
    [ 461.653997] [] ? nf_iterate+0x41/0x77
    [ 461.659473] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.665485] [] ? nf_hook_slow+0x73/0x111
    [ 461.671234] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.677299] [] ?
    nf_bridge_update_protocol+0x20/0x20 [bridge]
    [ 461.684891] [] ? nf_ct_zone+0xa/0x17 [nf_conntrack]
    [ 461.691520] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.697572] [] ? NF_HOOK.constprop.8+0x3c/0x56
    [bridge]
    [ 461.704616] [] ?
    nf_bridge_push_encap_header+0x1c/0x26 [bridge]
    [ 461.712329] [] ? br_nf_forward_finish+0x8a/0x95
    [bridge]
    [ 461.719490] [] ?
    nf_bridge_pull_encap_header+0x1c/0x27 [bridge]
    [ 461.727223] [] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge]
    [ 461.734292] [] ? nf_iterate+0x41/0x77
    [ 461.739758] [] ? __br_deliver+0xa0/0xa0 [bridge]
    [ 461.746203] [] ? nf_hook_slow+0x73/0x111
    [ 461.751950] [] ? __br_deliver+0xa0/0xa0 [bridge]
    [ 461.758378] [] ? NF_HOOK.constprop.4+0x56/0x56
    [bridge]

    This is caused by bridge netfilter special dst_entry (fake_rtable), a
    special shared entry, where attaching an inetpeer makes no sense.

    Problem is present since commit 87c48fa3b46 (ipv6: make fragment
    identifications less predictable)

    Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and
    __ip_select_ident() fallback to the 'no peer attached' handling.

    Reported-by: Chris Boot
    Tested-by: Chris Boot
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Signed-off-by: Stephen Rothwell
    Acked-by: Eric Dumazet
    Acked-by: David Miller
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
     

22 Dec, 2011

1 commit

  • Commit 2c8cec5c10b (ipv4: Cache learned PMTU information in inetpeer)
    removed IP route cache garbage collector a bit too soon, as this gc was
    responsible for expired routes cleanup, releasing their neighbour
    reference.

    As pointed out by Robert Gladewitz, recent kernels can fill and exhaust
    their neighbour cache.

    Reintroduce the garbage collection, since we'll have to wait our
    neighbour lookups become refcount-less to not depend on this stuff.

    Reported-by: Robert Gladewitz
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

21 Dec, 2011

2 commits

  • to record the state of SACK/FACK and DSACK for better readability and maintenance.

    Signed-off-by: Vijay Subramanian
    Signed-off-by: David S. Miller

    Vijay Subramanian
     
  • previous commit 3fb72f1e6e6165c5f495e8dc11c5bbd14c73385c
    makes IP-Config wait for carrier on at least one network device.

    Before waiting (predefined value 120s), check that at least one device
    was successfully brought up. Otherwise (e.g. buggy bootloader
    which does not set the MAC address) there is no point in waiting
    for carrier.

    Cc: Micha Nelissen
    Cc: Holger Brunck
    Signed-off-by: Gerlando Falauto
    Signed-off-by: David S. Miller

    Gerlando Falauto
     

20 Dec, 2011

2 commits

  • module_param(bool) used to counter-intuitively take an int. In
    fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
    trick.

    It's time to remove the int/unsigned int option. For this version
    it'll simply give a warning, but it'll break next kernel version.

    (Thanks to Joe Perches for suggesting coccinelle for 0/1 -> true/false).

    Cc: "David S. Miller"
    Cc: netdev@vger.kernel.org
    Signed-off-by: Rusty Russell
    Signed-off-by: David S. Miller

    Rusty Russell
     
  • DaveM said:
    Please, this kind of stuff rots forever and not using bool properly
    drives me crazy.

    Joe Perches gave me the spatch script:

    @@
    bool b;
    @@
    -b = 0
    +b = false
    @@
    bool b;
    @@
    -b = 1
    +b = true

    I merely installed coccinelle, read the documentation and took credit.

    Signed-off-by: Rusty Russell
    Signed-off-by: David S. Miller

    Rusty Russell
     

17 Dec, 2011

2 commits


16 Dec, 2011

2 commits


15 Dec, 2011

1 commit

  • net/ipv4/sysctl_net_ipv4.c:78:6: warning: symbol 'inet_get_ping_group_range_table'
    was not declared. Should it be static?

    net/ipv4/sysctl_net_ipv4.c:119:31: warning: incorrect type in argument 2
    (different signedness)
    net/ipv4/sysctl_net_ipv4.c:119:31: expected int *range
    net/ipv4/sysctl_net_ipv4.c:119:31: got unsigned int *

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

13 Dec, 2011

5 commits

  • This patch introduces kmem.tcp.max_usage_in_bytes file, living in the
    kmem_cgroup filesystem. The root cgroup will display a value equal
    to RESOURCE_MAX. This is to avoid introducing any locking schemes in
    the network paths when cgroups are not being actively used.

    All others, will see the maximum memory ever used by this cgroup.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch introduces kmem.tcp.failcnt file, living in the
    kmem_cgroup filesystem. Following the pattern in the other
    memcg resources, this files keeps a counter of how many times
    allocation failed due to limits being hit in this cgroup.
    The root cgroup will always show a failcnt of 0.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch introduces kmem.tcp.usage_in_bytes file, living in the
    kmem_cgroup filesystem. It is a simple read-only file that displays the
    amount of kernel memory currently consumed by the cgroup.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch uses the "tcp.limit_in_bytes" field of the kmem_cgroup to
    effectively control the amount of kernel memory pinned by a cgroup.

    This value is ignored in the root cgroup, and in all others,
    caps the value specified by the admin in the net namespaces'
    view of tcp_sysctl_mem.

    If namespaces are being used, the admin is allowed to set a
    value bigger than cgroup's maximum, the same way it is allowed
    to set pretty much unlimited values in a real box.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch allows each namespace to independently set up
    its levels for tcp memory pressure thresholds. This patch
    alone does not buy much: we need to make this values
    per group of process somehow. This is achieved in the
    patches that follows in this patchset.

    Signed-off-by: Glauber Costa
    Reviewed-by: KAMEZAWA Hiroyuki
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa