04 Sep, 2013

1 commit

  • This config option is superfluous in that it only guards a call
    to neigh_app_ns(). Enabling CONFIG_ARPD by default causes no
    change in behavior. There will now be a call to __neigh_notify()
    for each ARP resolution, which has no impact unless there is a
    user space daemon waiting to receive the notification, i.e.,
    the case for which CONFIG_ARPD was designed anyway.

    Suggested-by: Eric W. Biederman
    Cc: "David S. Miller"
    Cc: Alexey Kuznetsov
    Cc: James Morris
    Cc: Hideaki YOSHIFUJI
    Cc: Patrick McHardy
    Cc: "Eric W. Biederman"
    Cc: Gao feng
    Cc: Joe Perches
    Cc: Veaceslav Falico
    Signed-off-by: Tim Gardner
    Reviewed-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Tim Gardner
     

04 Aug, 2013

1 commit


03 Aug, 2013

1 commit


27 Jul, 2013

1 commit


02 Jul, 2013

1 commit

  • There is a race in the neighbour code, because neigh_destroy() uses
    skb_queue_purge(&neigh->arp_queue) without holding the neighbour lock,
    while other parts of the code assume the neighbour rwlock is what
    protects arp_queue.

    Convert all skb_queue_purge() calls to the __skb_queue_purge() variant.

    Use __skb_queue_head_init() instead of skb_queue_head_init()
    to make clear we do not use arp_queue.lock.

    And hold neigh->lock in neigh_destroy() to close the race.

    Reported-by: Joe Jin
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

20 Jun, 2013

3 commits


13 Jun, 2013

1 commit

  • Reduce the uses of this unnecessary typedef.

    Done via perl script:

    $ git grep --name-only -w ctl_table net | \
    xargs perl -p -i -e '\
    sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \
    s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge'

    Reflow the modified lines that now exceed 80 columns.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

02 May, 2013

1 commit

  • Pull VFS updates from Al Viro,

    Misc cleanups all over the place, mainly wrt /proc interfaces (switch
    create_proc_entry to proc_create(), get rid of the deprecated
    create_proc_read_entry() in favor of using proc_create_data() and
    seq_file etc).

    7kloc removed.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
    don't bother with deferred freeing of fdtables
    proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
    proc: Make the PROC_I() and PDE() macros internal to procfs
    proc: Supply a function to remove a proc entry by PDE
    take cgroup_open() and cpuset_open() to fs/proc/base.c
    ppc: Clean up scanlog
    ppc: Clean up rtas_flash driver somewhat
    hostap: proc: Use remove_proc_subtree()
    drm: proc: Use remove_proc_subtree()
    drm: proc: Use minor->index to label things, not PDE->name
    drm: Constify drm_proc_list[]
    zoran: Don't print proc_dir_entry data in debug
    reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
    proc: Supply an accessor for getting the data from a PDE's parent
    airo: Use remove_proc_subtree()
    rtl8192u: Don't need to save device proc dir PDE
    rtl8187se: Use a dir under /proc/net/r8180/
    proc: Add proc_mkdir_data()
    proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
    proc: Move PDE_NET() to fs/proc/proc_net.c
    ...

    Linus Torvalds
     

17 Apr, 2013

1 commit

  • Update debugging messages to a more current style.

    Emit these debugging messages at KERN_DEBUG instead
    of KERN_DEFAULT.

    Add and use neigh_dbg(level, fmt, ...) macro
    Add dynamic_debug capability via pr_debug
    Convert embedded function names to "%s: ", __func__

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

10 Apr, 2013

1 commit

  • The only part of proc_dir_entry the code outside of fs/proc
    really cares about is PDE(inode)->data. Provide a helper
    for that; static inline for now, eventually will be moved
    to fs/proc, along with the knowledge of struct proc_dir_entry
    layout.

    Signed-off-by: Al Viro

    Al Viro
     

22 Mar, 2013

1 commit


29 Jan, 2013

1 commit

  • When allocating memory for a neighbour cache entry, if
    tbl->entry_size is not set, we always calculate
    sizeof(struct neighbour) + tbl->key_len, which is the same
    for every entry in a given table.

    With this change, set tbl->entry_size during the table
    initialization phase, if it was not set, and use it in
    neigh_alloc() and neighbour_priv().

    This change also allows us to have both protocol private
    data and device private data at the same time.

    Note that the only user of protocol private data is DECnet
    and the only user of device private data is ATM CLIP.
    Since those are mutually exclusive, we have not been facing
    issues here.

    Signed-off-by: YOSHIFUJI Hideaki
    Signed-off-by: David S. Miller

    YOSHIFUJI Hideaki / 吉藤英明
     

23 Jan, 2013

1 commit

  • Since we have removed NCE (Neighbour Cache Entry) reference from
    routing entries, the only refcnt holders of an NCE are its timer
    (if running) and its owner table, in usual cases. As a result,
    neigh_periodic_work() purges NCEs over and over again even for
    gateways.

    It does not make sense to purge entries if the number of them is
    very small, so keep them. The minimum number of entries to keep
    is specified by gc_thresh1.
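
    The threshold check described above can be modelled in a few lines.
    This is only a sketch in Python, not the kernel code: ToyTable,
    periodic_gc() and the is_stale callback are hypothetical names, and
    the strict "less than" comparison is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical stand-in for a neighbour table."""
    gc_thresh1: int
    entries: list = field(default_factory=list)


def periodic_gc(table, is_stale):
    """Purge stale entries, but leave small tables alone.

    Returns the number of entries that were removed.
    """
    # Below the threshold nothing is purged, so a handful of
    # entries (e.g. gateways) is not evicted over and over.
    if len(table.entries) < table.gc_thresh1:
        return 0
    before = len(table.entries)
    table.entries = [e for e in table.entries if not is_stale(e)]
    return before - len(table.entries)
```

    With gc_thresh1 set to 4, a table holding two entries is left
    untouched even if both are stale, while a table holding six entries
    is scanned and its stale entries are dropped.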

    Signed-off-by: YOSHIFUJI Hideaki
    Signed-off-by: David S. Miller

    YOSHIFUJI Hideaki / 吉藤英明
     

06 Dec, 2012

2 commits

  • net/core/neighbour.c:65:12: warning: 'zero' defined but not used [-Wunused-variable]
    net/core/neighbour.c:66:12: warning: 'unres_qlen_max' defined but not used [-Wunused-variable]

    These variables are only used when CONFIG_SYSCTL is defined,
    so move them under #ifdef CONFIG_SYSCTL.

    Reported-by: Fengguang Wu
    Signed-off-by: Cong Wang
    Acked-by: Shan Wei
    Signed-off-by: David S. Miller

    Cong Wang
     
  • unres_qlen_bytes and unres_qlen are of type int.
    But the multiplicative relation (unres_qlen_bytes = unres_qlen * SKB_TRUESIZE(ETH_FRAME_LEN))
    causes integer overflow when setting unres_qlen, e.g.

    $ echo 1027506 > /proc/sys/net/ipv4/neigh/eth1/unres_qlen
    $ cat /proc/sys/net/ipv4/neigh/eth1/unres_qlen
    1182657265
    $ cat /proc/sys/net/ipv4/neigh/eth1/unres_qlen_bytes
    -2147479756

    The value read back is not the one that was set,
    but the user/administrator does not know this is caused by int overflow.

    What's more, it is meaningless and even dangerous for unres_qlen_bytes to be set
    to a negative number, because for an unresolved neighbour address the kernel will then cache packets
    without limit in __neigh_event_send() (e.g. (u32)-1 = 2GB).
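
    The wrap-around can be reproduced outside the kernel. The sketch
    below, in Python, emulates storing the product in a signed 32-bit
    int. The per-packet size of 2090 bytes is an assumption inferred
    from the numbers in the example above (it is the value
    SKB_TRUESIZE(ETH_FRAME_LEN) would need to have on that machine),
    and wrap_int32() and unres_qlen_to_bytes() are hypothetical helpers,
    not kernel functions.

```python
def wrap_int32(value: int) -> int:
    """Reduce an integer to a signed 32-bit value (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


# Assumed per-packet truesize, inferred from the example output above.
ASSUMED_TRUESIZE = 2090


def unres_qlen_to_bytes(unres_qlen: int) -> int:
    """Model storing unres_qlen * truesize in a 32-bit signed int."""
    return wrap_int32(unres_qlen * ASSUMED_TRUESIZE)


print(unres_qlen_to_bytes(1027506))  # -2147479756, as in the example
print(unres_qlen_to_bytes(1027504))  # 2147483360, still fits in an int
```

    Under that assumption, 1027504 is the largest unres_qlen whose byte
    equivalent still fits; anything larger wraps to a negative value.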

    Signed-off-by: Shan Wei
    Signed-off-by: David S. Miller

    Shan Wei
     

19 Nov, 2012

3 commits

  • - Only allow moving network devices to network namespaces you have
    CAP_NET_ADMIN privileges over.

    - Enable creating/deleting/modifying interfaces
    - Enable adding/deleting addresses
    - Enable adding/setting/deleting neighbour entries
    - Enable adding/removing routes
    - Enable adding/removing fib rules
    - Enable setting the forwarding state
    - Enable adding/removing ipv6 address labels
    - Enable setting bridge parameter

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • - In rtnetlink_rcv_msg convert the capable(CAP_NET_ADMIN) check
    to ns_capable(net->user_ns, CAP_NET_ADMIN), allowing unprivileged
    users to make netlink calls to modify their local network
    namespace.

    - In the rtnetlink doit methods add capable(CAP_NET_ADMIN) so
    that calls that are not safe for unprivileged users are still
    protected.

    Later patches will remove the extra capable calls from methods
    that are safe for unprivileged users.

    Acked-by: Serge Hallyn
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • In preparation for supporting the creation of network namespaces
    by unprivileged users, modify all of the per net sysctl exports
    and refuse to allow them to unprivileged users.

    This makes it safe for unprivileged users in general to access
    per net sysctls, and allows sysctls to be exported to unprivileged
    users on an individual basis as they are deemed safe.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

08 Oct, 2012

1 commit

  • The retry loops in neigh_resolve_output() and neigh_connected_output()
    call dev_hard_header() without resetting the skb to network_header.
    This causes the retry to fail with skb_under_panic. The fix is to
    reset the network_header within the retry loop.
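
    Why the retry fails can be shown with a toy model. The Python sketch
    below is not kernel code: ToySkb, SkbUnderPanic and build_header()
    are hypothetical names, and the buffer is reduced to two offsets.
    Each attempt pushes a link-layer header in front of the data; if the
    data pointer is not moved back to the network header first, a second
    attempt pushes again and runs out of headroom.

```python
ETH_HLEN = 14  # Ethernet header length in bytes


class SkbUnderPanic(Exception):
    """Raised when a push would move the data pointer before the buffer."""


class ToySkb:
    def __init__(self, headroom):
        self.data = headroom            # offset of the first used byte
        self.network_header = headroom  # offset where the L3 header starts

    def push(self, n):
        # Prepend n bytes; fails if there is not enough headroom left.
        if n > self.data:
            raise SkbUnderPanic(f"need {n} bytes, only {self.data} left")
        self.data -= n

    def reset_to_network_header(self):
        # Discard any link-layer header pushed by an earlier attempt.
        self.data = self.network_header


def build_header(skb, attempts, reset_each_time):
    """Push an Ethernet header once per attempt, as a retry loop would."""
    for _ in range(attempts):
        if reset_each_time:
            skb.reset_to_network_header()
        skb.push(ETH_HLEN)
    return skb.data
```

    With 16 bytes of headroom, two attempts succeed when the reset is
    done inside the loop and raise SkbUnderPanic when it is not.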

    Signed-off-by: Ramesh Nagappa
    Reviewed-by: Shawn Lu
    Reviewed-by: Robert Coulson
    Reviewed-by: Billie Alsup
    Signed-off-by: David S. Miller

    ramesh.nagappa@gmail.com
     

03 Oct, 2012

1 commit

  • Pull networking changes from David Miller:

    1) GRE now works over ipv6, from Dmitry Kozlov.

    2) Make SCTP more network namespace aware, from Eric Biederman.

    3) TEAM driver now works with non-ethernet devices, from Jiri Pirko.

    4) Make openvswitch network namespace aware, from Pravin B Shelar.

    5) IPV6 NAT implementation, from Patrick McHardy.

    6) Server side support for TCP Fast Open, from Jerry Chu and others.

    7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel
    Borkmann.

    8) Increase the loopback default MTU to 64K, from Eric Dumazet.

    9) Use a per-task rather than per-socket page fragment allocator for
    outgoing networking traffic. This benefits processes that have very
    many mostly idle sockets, which is quite common.

    From Eric Dumazet.

    10) Use up to 32K for page fragment allocations, with fallbacks to
    smaller sizes when higher order page allocations fail. Benefits are
    a) less segments for driver to process b) less calls to page
    allocator c) less waste of space.

    From Eric Dumazet.

    11) Allow GRO to be used on GRE tunnels, from Eric Dumazet.

    12) VXLAN device driver, one way to handle VLAN issues such as the
    limitation of 4096 VLAN IDs yet still have some level of isolation.
    From Stephen Hemminger.

    13) As usual there is a large boatload of driver changes, with the scale
    perhaps tilted towards the wireless side this time around.

    Fix up various fairly trivial conflicts, mostly caused by the user
    namespace changes.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits)
    hyperv: Add buffer for extended info after the RNDIS response message.
    hyperv: Report actual status in receive completion packet
    hyperv: Remove extra allocated space for recv_pkt_list elements
    hyperv: Fix page buffer handling in rndis_filter_send_request()
    hyperv: Fix the missing return value in rndis_filter_set_packet_filter()
    hyperv: Fix the max_xfer_size in RNDIS initialization
    vxlan: put UDP socket in correct namespace
    vxlan: Depend on CONFIG_INET
    sfc: Fix the reported priorities of different filter types
    sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP
    sfc: Fix loopback self-test with separate_tx_channels=1
    sfc: Fix MCDI structure field lookup
    sfc: Add parentheses around use of bitfield macro arguments
    sfc: Fix null function pointer in efx_sriov_channel_type
    vxlan: virtual extensible lan
    igmp: export symbol ip_mc_leave_group
    netlink: add attributes to fdb interface
    tg3: unconditionally select HWMON support when tg3 is enabled.
    Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT"
    gre: fix sparse warning
    ...

    Linus Torvalds
     

11 Sep, 2012

1 commit

  • It is a frequent mistake to confuse the netlink port identifier with a
    process identifier. Try to reduce this confusion by renaming fields
    that hold port identifiers portid instead of pid.

    I have carefully avoided changing the structures exported to
    userspace to avoid changing the userspace API.

    I have successfully built an allyesconfig kernel with this change.

    Signed-off-by: "Eric W. Biederman"
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

22 Aug, 2012

1 commit


05 Jul, 2012

2 commits


08 Jun, 2012

1 commit


17 May, 2012

1 commit

  • Use the current logging style.

    This enables use of dynamic debugging as well.

    Convert printk(KERN_ to pr_.
    Add pr_fmt. Remove embedded prefixes, use
    %s, __func__ instead.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

21 Apr, 2012

2 commits

  • Using an ascii path to register_net_sysctl as opposed to the slightly
    awkward ctl_path allows for much simpler code.

    We no longer need to malloc dev_name to keep it alive for the lifetime
    of our sysctl registration; instead we can use a small temporary buffer
    on the stack.

    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This makes it clearer which sysctls are relative to your current network
    namespace.

    This makes it a little less error prone by not exposing sysctls for the
    initial network namespace in other namespaces.

    This is the same way we handle all of our other network interfaces to
    userspace and I can't honestly remember why we didn't do this for
    sysctls right from the start.

    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

16 Apr, 2012

1 commit


14 Apr, 2012

1 commit


02 Apr, 2012

1 commit


27 Feb, 2012

1 commit


22 Feb, 2012

1 commit

  • When the fixed race condition happens:

    1. While function neigh_periodic_work scans the neighbor hash table
    pointed to by field tbl->nht, it unlocks and locks tbl->lock between
    buckets in order to call cond_resched.

    2. Assume that function neigh_periodic_work calls cond_resched, that is,
    the lock tbl->lock is available, and function neigh_hash_grow runs.

    3. Once function neigh_hash_grow finishes, and RCU calls
    neigh_hash_free_rcu, the original struct neigh_hash_table that function
    neigh_periodic_work was using doesn't exist anymore.

    4. Once back at neigh_periodic_work, whenever the old struct
    neigh_hash_table is accessed, things can go badly.

    Signed-off-by: Michel Machado
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Michel Machado
     

31 Jan, 2012

1 commit

  • Add ability to return neighbour proxies list to caller if
    it sent full ndmsg structure and has NTF_PROXY flag set.

    Before this patch (and before iproute2 patches):
    $ ip neigh add proxy 2001::1 dev eth0
    $ ip -6 neigh show
    $

    After it and with applied iproute2 patches:
    $ ip neigh add proxy 2001::1 dev eth0
    $ ip -6 neigh show
    2001::1 dev eth0 proxy
    $

    Compatibility with old versions of iproute2 is not broken:
    the kernel checks the incoming structure size and works
    properly if the old structure is received.

    [v2]
    * changed comments style.
    * removed useless line with continue and curly bracket.
    * changed incoming message size check from equal to greater than
    or equal.
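
    The size-and-flag check described above can be sketched in user
    space. This Python example is illustrative only: wants_proxy_dump()
    is a hypothetical helper, the 12-byte layout follows struct ndmsg
    from linux/neighbour.h, NTF_PROXY is 0x08 there, and the one-byte
    "old" request is an assumption about what older tools send.

```python
import struct

# struct ndmsg: family, pad1, pad2, ifindex, state, flags, type
NDMSG_FORMAT = "=BBHiHBB"
NDMSG_SIZE = struct.calcsize(NDMSG_FORMAT)  # 12 bytes
NTF_PROXY = 0x08


def wants_proxy_dump(payload: bytes) -> bool:
    """Return True if the request carries a full ndmsg with NTF_PROXY set.

    Shorter payloads (assumed to come from older tools) are treated as
    ordinary neighbour dump requests.
    """
    # "Greater than or equal" so that trailing attributes are allowed.
    if len(payload) < NDMSG_SIZE:
        return False
    *_, flags, _type = struct.unpack_from(NDMSG_FORMAT, payload)
    return flags == NTF_PROXY
```

    A request built with struct.pack(NDMSG_FORMAT, 10, 0, 0, 0, 0,
    NTF_PROXY, 0) is recognised as a proxy dump, while a one-byte header
    falls back to the regular listing.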

    CC: davem@davemloft.net
    CC: kuznet@ms2.inr.ac.ru
    CC: netdev@vger.kernel.org
    CC: xemul@parallels.com
    Signed-off-by: Tony Zelenoff
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Tony Zelenoff
     

29 Dec, 2011

1 commit


20 Dec, 2011

1 commit


14 Dec, 2011

1 commit


06 Dec, 2011

1 commit