13 Dec, 2011

12 commits

  • David S. Miller
     
  • This patch introduces kmem.tcp.max_usage_in_bytes file, living in the
    kmem_cgroup filesystem. The root cgroup will display a value equal
    to RESOURCE_MAX. This is to avoid introducing any locking schemes in
    the network paths when cgroups are not being actively used.

    All others, will see the maximum memory ever used by this cgroup.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch introduces kmem.tcp.failcnt file, living in the
    kmem_cgroup filesystem. Following the pattern in the other
    memcg resources, this files keeps a counter of how many times
    allocation failed due to limits being hit in this cgroup.
    The root cgroup will always show a failcnt of 0.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch introduces kmem.tcp.usage_in_bytes file, living in the
    kmem_cgroup filesystem. It is a simple read-only file that displays the
    amount of kernel memory currently consumed by the cgroup.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch uses the "tcp.limit_in_bytes" field of the kmem_cgroup to
    effectively control the amount of kernel memory pinned by a cgroup.

    This value is ignored in the root cgroup, and in all others,
    caps the value specified by the admin in the net namespaces'
    view of tcp_sysctl_mem.

    If namespaces are being used, the admin is allowed to set a
    value bigger than cgroup's maximum, the same way it is allowed
    to set pretty much unlimited values in a real box.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch allows each namespace to independently set up
    its levels for tcp memory pressure thresholds. This patch
    alone does not buy much: we need to make this values
    per group of process somehow. This is achieved in the
    patches that follows in this patchset.

    Signed-off-by: Glauber Costa
    Reviewed-by: KAMEZAWA Hiroyuki
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch introduces memory pressure controls for the tcp
    protocol. It uses the generic socket memory pressure code
    introduced in earlier patches, and fills in the
    necessary data in cg_proto struct.

    Signed-off-by: Glauber Costa
    Reviewed-by: KAMEZAWA Hiroyuki
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • The goal of this work is to move the memory pressure tcp
    controls to a cgroup, instead of just relying on global
    conditions.

    To avoid excessive overhead in the network fast paths,
    the code that accounts allocated memory to a cgroup is
    hidden inside a static_branch(). This branch is patched out
    until the first non-root cgroup is created. So when nobody
    is using cgroups, even if it is mounted, no significant performance
    penalty should be seen.

    This patch handles the generic part of the code, and has nothing
    tcp-specific.

    Signed-off-by: Glauber Costa
    Reviewed-by: KAMEZAWA Hiroyuki
    CC: Kirill A. Shutemov
    CC: David S. Miller
    CC: Eric W. Biederman
    CC: Eric Dumazet
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch replaces all uses of struct sock fields' memory_pressure,
    memory_allocated, sockets_allocated, and sysctl_mem to acessor
    macros. Those macros can either receive a socket argument, or a mem_cgroup
    argument, depending on the context they live in.

    Since we're only doing a macro wrapping here, no performance impact at all is
    expected in the case where we don't have cgroups disabled.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    CC: Eric Dumazet
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch lays down the foundation for the kernel memory component
    of the Memory Controller.

    As of today, I am only laying down the following files:

    * memory.independent_kmem_limit
    * memory.kmem.limit_in_bytes (currently ignored)
    * memory.kmem.usage_in_bytes (always zero)

    Signed-off-by: Glauber Costa
    CC: Kirill A. Shutemov
    CC: Paul Menage
    CC: Greg Thelen
    CC: Johannes Weiner
    CC: Michal Hocko
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • After a guest is live migrated, the xen-netfront driver emits a gratuitous
    ARP message, so that networking hardware on the target host's subnet can
    take notice, and public routing to the guest is re-established. However,
    if the packet appears on the backend interface before the backend is added
    to the target host's bridge, the packet is lost, and the migrated guest's
    peers become unable to talk to the guest.

    A sufficient two-parts condition to prevent the above is:

    (1) ensure that the backend only moves to Connected xenbus state after its
    hotplug scripts completed, ie. the netback interface got added to the
    bridge; and

    (2) ensure the frontend only queues the gARP when it sees the backend move
    to Connected.

    These two together provide complete ordering. Sub-condition (1) is already
    satisfied by commit f942dc2552b8 in Linus' tree, based on commit
    6b0b80ca7165 from [1].

    In general, the full condition is sufficient, not necessary, because,
    according to [2], live migration has been working for a long time without
    satisfying sub-condition (2). However, after 6b0b80ca7165 was backported
    to the RHEL-5 host to ensure (1), (2) still proved necessary in the RHEL-6
    guest. This patch intends to provide (2) for upstream.

    The Reviewed-by line comes from [3].

    [1] git://xenbits.xen.org/people/ianc/linux-2.6.git#upstream/dom0/backend/netback-history
    [2] http://old-list-archives.xen.org/xen-devel/2011-06/msg01969.html
    [3] http://old-list-archives.xen.org/xen-devel/2011-07/msg00484.html

    Signed-off-by: Laszlo Ersek
    Reviewed-by: Ian Campbell
    Signed-off-by: David S. Miller

    Laszlo Ersek
     
  • …wireless-next into for-davem

    John W. Linville
     

12 Dec, 2011

6 commits


11 Dec, 2011

2 commits

  • Wrap the udp6 lookup into the proper ifdef-s.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Eric Dumazet reported, that when inet_diag is built-in the udp_diag also goes
    built-in and when ipv6 is a module the udp6 lookup symbol is not found.

    LD .tmp_vmlinux1
    net/built-in.o: In function `udp_dump_one':
    udp_diag.c:(.text+0xa2b40): undefined reference to `__udp6_lib_lookup'
    make: *** [.tmp_vmlinux1] Erreur 1

    Fix this by making udp diag build mode depend on both -- inet diag and ipv6.

    Reported-by: Eric Dumazet
    Signed-off-by: Pavel Emelyanov
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

10 Dec, 2011

17 commits


09 Dec, 2011

3 commits

  • These tests are off by one because sock_diag_handlers[] only has AF_MAX
    elements.

    Signed-off-by: Dan Carpenter
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Dan Carpenter
     
  • net_prio_subsys can be made static this removes the sparse
    warning it was throwing.

    Signed-off-by: John Fastabend
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    John Fastabend
     
  • The code is missing initialization of NO_FCOE_FLAG and NO_ISCSI*FLAGS
    when CONFIG_CNIC is not selected.
    This causes panic during driver load since commit
    1d187b34daaecbb87aa523ba46b92930a388cb21 where NO_FCOE tested
    unconditionally (outside #ifdef BCM_CNIC structure) and
    accessed fp[FCOE_IDX] which is not allocated.

    Reported-by: Eric Dumazet
    Signed-off-by: Dmitry Kravkov
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Dmitry Kravkov