10 Jan, 2012

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    igmp: Avoid zero delay when receiving odd mixture of IGMP queries
    netdev: make net_device_ops const
    bcm63xx: make ethtool_ops const
    usbnet: make ethtool_ops const
    net: Fix build with INET disabled.
    net: introduce netif_addr_lock_nested() and call if when appropriate
    net: correct lock name in dev_[uc/mc]_sync documentations.
    net: sk_update_clone is only used in net/core/sock.c
    8139cp: fix missing napi_gro_flush.
    pktgen: set correct max and min in pktgen_setup_inject()
    smsc911x: Unconditionally include linux/smscphy.h in smsc911x.h
    asix: fix infinite loop in rx_fixup()
    net: Default UDP and UNIX diag to 'n'.
    r6040: fix typo in use of MCR0 register bits
    net: fix sock_clone reference mismatch with tcp memcontrol

    Linus Torvalds
     
  • Commit 5b7c84066733c5dfb0e4016d939757b38de189e4 ('ipv4: correct IGMP
    behavior on v3 query during v2-compatibility mode') added yet another
    case for query parsing, which can result in max_delay = 0. Substitute
    a value of 1, as in the usual v3 case.

    Reported-by: Simon McVittie
    References: http://bugs.debian.org/654876
    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Ben Hutchings
     

09 Jan, 2012

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
    Kconfig: acpi: Fix typo in comment.
    misc latin1 to utf8 conversions
    devres: Fix a typo in devm_kfree comment
    btrfs: free-space-cache.c: remove extra semicolon.
    fat: Spelling s/obsolate/obsolete/g
    SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
    tools/power turbostat: update fields in manpage
    mac80211: drop spelling fix
    types.h: fix comment spelling for 'architectures'
    typo fixes: aera -> area, exntension -> extension
    devices.txt: Fix typo of 'VMware'.
    sis900: Fix enum typo 'sis900_rx_bufer_status'
    decompress_bunzip2: remove invalid vi modeline
    treewide: Fix comment and string typo 'bufer'
    hyper-v: Update MAINTAINERS
    treewide: Fix typos in various parts of the kernel, and fix some comments.
    clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
    gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
    leds: Kconfig: Fix typo 'D2NET_V2'
    sound: Kconfig: drop unknown symbol ARCH_CLPS7500
    ...

    Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
    kconfig additions, close to removed commented-out old ones)

    Linus Torvalds
     

08 Jan, 2012

1 commit


31 Dec, 2011

1 commit


29 Dec, 2011

1 commit


28 Dec, 2011

2 commits


25 Dec, 2011

1 commit


24 Dec, 2011

1 commit


23 Dec, 2011

9 commits

  • The NAT range to nlattr conversation callbacks and helpers are entirely
    dead code and are also useless since there are no NAT ranges in conntrack
    context, they are only used for initially selecting a tuple. The final NAT
    information is contained in the selected tuples of the conntrack entry.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • The packet size check originates from a time when UDP helpers could
    accidentally mangle incorrect packets (NEWNAT) and is unnecessary
    nowadays since the conntrack helpers invoke the NAT helpers for the
    proper packet directly.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • The inner tuple that is extracted from the packet is unused. The code also
    doesn't have any useful side-effects like verifying the packet does contain
    enough data to extract the inner tuple since conntrack already does the
    same, so remove it.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • The only remaining user of NAT protocol module reference counting is NAT
    ctnetlink support. Since this is a fairly short sequence of code, convert
    over to use RCU and remove module reference counting.

    Module unregistration is already protected by RCU using synchronize_rcu(),
    so no further changes are necessary.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Use nf_conntrack_hash_rnd in NAT bysource hash to avoid hash chain attacks.

    Signed-off-by: Patrick McHardy
    Acked-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Export the NAT definitions to userspace. So far userspace (specifically,
    iptables) has been copying the headers files from include/net. Also
    rename some structures and definitions in preparation for IPv6 NAT.
    Since these have never been officially exported, this doesn't affect
    existing userspace code.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Chris Boot reported crashes occurring in ipv6_select_ident().

    [ 461.457562] RIP: 0010:[] []
    ipv6_select_ident+0x31/0xa7

    [ 461.578229] Call Trace:
    [ 461.580742]
    [ 461.582870] [] ? udp6_ufo_fragment+0x124/0x1a2
    [ 461.589054] [] ? ipv6_gso_segment+0xc0/0x155
    [ 461.595140] [] ? skb_gso_segment+0x208/0x28b
    [ 461.601198] [] ? ipv6_confirm+0x146/0x15e
    [nf_conntrack_ipv6]
    [ 461.608786] [] ? nf_iterate+0x41/0x77
    [ 461.614227] [] ? dev_hard_start_xmit+0x357/0x543
    [ 461.620659] [] ? nf_hook_slow+0x73/0x111
    [ 461.626440] [] ? br_parse_ip_options+0x19a/0x19a
    [bridge]
    [ 461.633581] [] ? dev_queue_xmit+0x3af/0x459
    [ 461.639577] [] ? br_dev_queue_push_xmit+0x72/0x76
    [bridge]
    [ 461.646887] [] ? br_nf_post_routing+0x17d/0x18f
    [bridge]
    [ 461.653997] [] ? nf_iterate+0x41/0x77
    [ 461.659473] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.665485] [] ? nf_hook_slow+0x73/0x111
    [ 461.671234] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.677299] [] ?
    nf_bridge_update_protocol+0x20/0x20 [bridge]
    [ 461.684891] [] ? nf_ct_zone+0xa/0x17 [nf_conntrack]
    [ 461.691520] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.697572] [] ? NF_HOOK.constprop.8+0x3c/0x56
    [bridge]
    [ 461.704616] [] ?
    nf_bridge_push_encap_header+0x1c/0x26 [bridge]
    [ 461.712329] [] ? br_nf_forward_finish+0x8a/0x95
    [bridge]
    [ 461.719490] [] ?
    nf_bridge_pull_encap_header+0x1c/0x27 [bridge]
    [ 461.727223] [] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge]
    [ 461.734292] [] ? nf_iterate+0x41/0x77
    [ 461.739758] [] ? __br_deliver+0xa0/0xa0 [bridge]
    [ 461.746203] [] ? nf_hook_slow+0x73/0x111
    [ 461.751950] [] ? __br_deliver+0xa0/0xa0 [bridge]
    [ 461.758378] [] ? NF_HOOK.constprop.4+0x56/0x56
    [bridge]

    This is caused by bridge netfilter special dst_entry (fake_rtable), a
    special shared entry, where attaching an inetpeer makes no sense.

    Problem is present since commit 87c48fa3b46 (ipv6: make fragment
    identifications less predictable)

    Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and
    __ip_select_ident() fallback to the 'no peer attached' handling.

    Reported-by: Chris Boot
    Tested-by: Chris Boot
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Signed-off-by: Stephen Rothwell
    Acked-by: Eric Dumazet
    Acked-by: David Miller
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
     

22 Dec, 2011

1 commit

  • Commit 2c8cec5c10b (ipv4: Cache learned PMTU information in inetpeer)
    removed IP route cache garbage collector a bit too soon, as this gc was
    responsible for expired routes cleanup, releasing their neighbour
    reference.

    As pointed out by Robert Gladewitz, recent kernels can fill and exhaust
    their neighbour cache.

    Reintroduce the garbage collection, since we'll have to wait our
    neighbour lookups become refcount-less to not depend on this stuff.

    Reported-by: Robert Gladewitz
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

21 Dec, 2011

2 commits

  • to record the state of SACK/FACK and DSACK for better readability and maintenance.

    Signed-off-by: Vijay Subramanian
    Signed-off-by: David S. Miller

    Vijay Subramanian
     
  • previous commit 3fb72f1e6e6165c5f495e8dc11c5bbd14c73385c
    makes IP-Config wait for carrier on at least one network device.

    Before waiting (predefined value 120s), check that at least one device
    was successfully brought up. Otherwise (e.g. buggy bootloader
    which does not set the MAC address) there is no point in waiting
    for carrier.

    Cc: Micha Nelissen
    Cc: Holger Brunck
    Signed-off-by: Gerlando Falauto
    Signed-off-by: David S. Miller

    Gerlando Falauto
     

20 Dec, 2011

2 commits

  • module_param(bool) used to counter-intuitively take an int. In
    fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
    trick.

    It's time to remove the int/unsigned int option. For this version
    it'll simply give a warning, but it'll break next kernel version.

    (Thanks to Joe Perches for suggesting coccinelle for 0/1 -> true/false).

    Cc: "David S. Miller"
    Cc: netdev@vger.kernel.org
    Signed-off-by: Rusty Russell
    Signed-off-by: David S. Miller

    Rusty Russell
     
  • DaveM said:
    Please, this kind of stuff rots forever and not using bool properly
    drives me crazy.

    Joe Perches gave me the spatch script:

    @@
    bool b;
    @@
    -b = 0
    +b = false
    @@
    bool b;
    @@
    -b = 1
    +b = true

    I merely installed coccinelle, read the documentation and took credit.

    Signed-off-by: Rusty Russell
    Signed-off-by: David S. Miller

    Rusty Russell
     

17 Dec, 2011

2 commits


16 Dec, 2011

2 commits


15 Dec, 2011

1 commit

  • net/ipv4/sysctl_net_ipv4.c:78:6: warning: symbol 'inet_get_ping_group_range_table'
    was not declared. Should it be static?

    net/ipv4/sysctl_net_ipv4.c:119:31: warning: incorrect type in argument 2
    (different signedness)
    net/ipv4/sysctl_net_ipv4.c:119:31: expected int *range
    net/ipv4/sysctl_net_ipv4.c:119:31: got unsigned int *

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

13 Dec, 2011

8 commits

  • This patch introduces kmem.tcp.max_usage_in_bytes file, living in the
    kmem_cgroup filesystem. The root cgroup will display a value equal
    to RESOURCE_MAX. This is to avoid introducing any locking schemes in
    the network paths when cgroups are not being actively used.

    All others, will see the maximum memory ever used by this cgroup.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch introduces kmem.tcp.failcnt file, living in the
    kmem_cgroup filesystem. Following the pattern in the other
    memcg resources, this files keeps a counter of how many times
    allocation failed due to limits being hit in this cgroup.
    The root cgroup will always show a failcnt of 0.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch introduces kmem.tcp.usage_in_bytes file, living in the
    kmem_cgroup filesystem. It is a simple read-only file that displays the
    amount of kernel memory currently consumed by the cgroup.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch uses the "tcp.limit_in_bytes" field of the kmem_cgroup to
    effectively control the amount of kernel memory pinned by a cgroup.

    This value is ignored in the root cgroup, and in all others,
    caps the value specified by the admin in the net namespaces'
    view of tcp_sysctl_mem.

    If namespaces are being used, the admin is allowed to set a
    value bigger than cgroup's maximum, the same way it is allowed
    to set pretty much unlimited values in a real box.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch allows each namespace to independently set up
    its levels for tcp memory pressure thresholds. This patch
    alone does not buy much: we need to make this values
    per group of process somehow. This is achieved in the
    patches that follows in this patchset.

    Signed-off-by: Glauber Costa
    Reviewed-by: KAMEZAWA Hiroyuki
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch introduces memory pressure controls for the tcp
    protocol. It uses the generic socket memory pressure code
    introduced in earlier patches, and fills in the
    necessary data in cg_proto struct.

    Signed-off-by: Glauber Costa
    Reviewed-by: KAMEZAWA Hiroyuki
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch replaces all uses of struct sock fields' memory_pressure,
    memory_allocated, sockets_allocated, and sysctl_mem to acessor
    macros. Those macros can either receive a socket argument, or a mem_cgroup
    argument, depending on the context they live in.

    Since we're only doing a macro wrapping here, no performance impact at all is
    expected in the case where we don't have cgroups disabled.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    CC: Eric Dumazet
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • Same fix as 731abb9cb2 for ipip and sit tunnel.
    Commit 1c5cae815d removed an explicit call to dev_alloc_name in
    ipip_tunnel_locate and ipip6_tunnel_locate, because register_netdevice
    will now create a valid name, however the tunnel keeps a copy of the
    name in the private parms structure. Fix this by copying the name back
    after register_netdevice has successfully returned.

    This shows up if you do a simple tunnel add, followed by a tunnel show:

    $ sudo ip tunnel add mode ipip remote 10.2.20.211
    $ ip tunnel
    tunl0: ip/ip remote any local any ttl inherit nopmtudisc
    tunl%d: ip/ip remote 10.2.20.211 local any ttl inherit
    $ sudo ip tunnel add mode sit remote 10.2.20.212
    $ ip tunnel
    sit0: ipv6/ip remote any local any ttl 64 nopmtudisc 6rd-prefix 2002::/16
    sit%d: ioctl 89f8 failed: No such device
    sit%d: ipv6/ip remote 10.2.20.212 local any ttl inherit

    Cc: stable@vger.kernel.org
    Signed-off-by: Ted Feng
    Signed-off-by: David S. Miller

    Ted Feng
     

12 Dec, 2011

1 commit


11 Dec, 2011

2 commits

  • Wrap the udp6 lookup into the proper ifdef-s.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Eric Dumazet reported, that when inet_diag is built-in the udp_diag also goes
    built-in and when ipv6 is a module the udp6 lookup symbol is not found.

    LD .tmp_vmlinux1
    net/built-in.o: In function `udp_dump_one':
    udp_diag.c:(.text+0xa2b40): undefined reference to `__udp6_lib_lookup'
    make: *** [.tmp_vmlinux1] Erreur 1

    Fix this by making udp diag build mode depend on both -- inet diag and ipv6.

    Reported-by: Eric Dumazet
    Signed-off-by: Pavel Emelyanov
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Pavel Emelyanov