14 Oct, 2013

1 commit

  • [ Upstream commit 703133de331a7a7df47f31fb9de51dc6f68a9de8 ]

    If local fragmentation is allowed, then ip_select_ident() and
    ip_select_ident_more() need to generate unique IDs to ensure
    correct defragmentation on the peer.

    For example, if IPsec (tunnel mode) has to encrypt large skbs
    that have local_df bit set, then all IP fragments that belonged
    to different ESP datagrams would have used the same identificator.
    If one of these IP fragments would get lost or reordered, then
    peer could possibly stitch together wrong IP fragments that did
    not belong to the same datagram. This would lead to a packet loss
    or data corruption.

    Signed-off-by: Ansis Atteka
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Ansis Atteka
     

02 Oct, 2013

1 commit

  • commit 2cf55125c64d64cc106e204d53b107094762dfdf upstream.

    This fixes a serious bug affecting all hash types with a net element -
    specifically, if a CIDR value is deleted such that none of the same size
    exist any more, all larger (less-specific) values will then fail to
    match. Adding back any prefix with a CIDR equal to or more specific than
    the one deleted will fix it.

    Steps to reproduce:
    ipset -N test hash:net
    ipset -A test 1.1.0.0/16
    ipset -A test 2.2.2.0/24
    ipset -T test 1.1.1.1 #1.1.1.1 IS in set
    ipset -D test 2.2.2.0/24
    ipset -T test 1.1.1.1 #1.1.1.1 IS NOT in set

    This is due to the fact that the nets counter was unconditionally
    decremented prior to the iteration that shifts up the entries. Now, we
    first check if there is a proceeding entry and if not, decrement it and
    return. Otherwise, we proceed to iterate and then zero the last element,
    which, in most cases, will already be zero.

    Signed-off-by: Oliver Smith
    Signed-off-by: Jozsef Kadlecsik
    Signed-off-by: Greg Kroah-Hartman

    Oliver Smith
     

24 Jun, 2013

2 commits

  • commit 0ceabd83875b72a29f33db4ab703d6ba40ea4c58
    (netfilter: ctnetlink: deliver labels to userspace) sets the event bit
    when we raced with another packet, instead of raising the event bit
    when the label bit is set for the first time.

    commit 9b21f6a90924dfe8e5e686c314ddb441fb06501e
    (netfilter: ctnetlink: allow userspace to modify labels) forgot to update
    the event mask in the "conntrack already exists" case.

    Both issues result in CTA_LABELS attribute not getting included in the
    conntrack event.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • In (b20ab9c netfilter: nf_ct_helper: better logging for dropped packets)
    there were some missing brackets around the logging information, thus
    always returning drop.

    Closes https://bugzilla.kernel.org/show_bug.cgi?id=60061

    Signed-off-by: Balazs Peter Odor
    Signed-off-by: Pablo Neira Ayuso

    Balazs Peter Odor
     

19 Jun, 2013

1 commit


12 Jun, 2013

2 commits

  • Similar to commit bc6bcb59 ("netfilter: xt_TCPOPTSTRIP: fix
    possible mangling beyond packet boundary"), add safe fragment
    handling to xt_TCPMSS.

    Signed-off-by: Phil Oester
    Signed-off-by: Pablo Neira Ayuso

    Phil Oester
     
  • As a followup to commit 409b545a ("netfilter: xt_TCPMSS: Fix violation
    of RFC879 in absence of MSS option"), John Heffner points out that IPv6
    has a higher MTU than IPv4, and thus a higher minimum MSS. Update TCPMSS
    target to account for this, and update RFC comment.

    While at it, point to more recent reference RFC1122 instead of RFC879.

    Signed-off-by: Phil Oester
    Signed-off-by: Pablo Neira Ayuso

    Phil Oester
     

11 Jun, 2013

1 commit


10 Jun, 2013

1 commit

  • The entry struct has a 2 byte hole after ->port and another 4 byte
    hole after ->stats.outpkts. You must have CAP_NET_ADMIN in your
    namespace to hit this information leak.

    Signed-off-by: Dan Carpenter
    Acked-by: Julian Anastasov
    Signed-off-by: Simon Horman
    Signed-off-by: Pablo Neira Ayuso

    Dan Carpenter
     

08 Jun, 2013

1 commit


05 Jun, 2013

3 commits


29 May, 2013

2 commits

  • kfree_rcu() requires offsetof(..., rcu_head) < 4096, which can
    get violated with a sufficiently high CONFIG_IP_VS_SH_TAB_BITS.

    Signed-off-by: Jan Beulich
    Signed-off-by: Simon Horman
    Signed-off-by: Pablo Neira Ayuso

    Jan Beulich
     
  • In dump_ipv6_packet(), the "recurse" parameter is zero only if
    dumping contents of a packet embedded into an ICMPv6 error
    message. Therefore we want to log packet mark if recurse is
    non-zero, not when it is zero.

    Signed-off-by: Michal Kubecek
    Signed-off-by: Pablo Neira Ayuso

    Michal Kubeček
     

27 May, 2013

1 commit

  • Expire cached connection for new TCP/SCTP connection if real
    server is down. Otherwise, IPVS uses the dead server for the
    reused connection, instead of a new working one.

    Signed-off-by: Grzegorz Lyczba
    Acked-by: Hans Schillstrom
    Acked-by: Julian Anastasov
    Signed-off-by: Simon Horman
    Signed-off-by: Pablo Neira Ayuso

    Grzegorz Lyczba
     

23 May, 2013

1 commit

  • Quoting https://bugzilla.netfilter.org/show_bug.cgi?id=812:

    [ ip6tables -m addrtype ]
    When I tried to use in the nat/PREROUTING it messes up the
    routing cache even if the rule didn't matched at all.
    [..]
    If I remove the --limit-iface-in from the non-working scenario, so just
    use the -m addrtype --dst-type LOCAL it works!

    This happens when LOCAL type matching is requested with --limit-iface-in,
    and the default ipv6 route is via the interface the packet we test
    arrived on.

    Because xt_addrtype uses ip6_route_output, the ipv6 routing implementation
    creates an unwanted cached entry, and the packet won't make it to the
    real/expected destination.

    Silently ignoring --limit-iface-in makes the routing work but it breaks
    rule matching (--dst-type LOCAL with limit-iface-in is supposed to only
    match if the dst address is configured on the incoming interface;
    without --limit-iface-in it will match if the address is reachable
    via lo).

    The test should call ipv6_chk_addr() instead. However, this would add
    a link-time dependency on ipv6.

    There are two possible solutions:

    1) Revert the commit that moved ipt_addrtype to xt_addrtype,
    and put ipv6 specific code into ip6t_addrtype.
    2) add new "nf_ipv6_ops" struct to register pointers to ipv6 functions.

    While the former might seem preferable, Pablo pointed out that there
    are more xt modules with link-time dependeny issues regarding ipv6,
    so lets go for 2).

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

16 May, 2013

1 commit


15 May, 2013

1 commit

  • Since (69b34fb netfilter: xt_LOG: add net namespace support
    for xt_LOG), we hit this:

    [ 4224.708977] BUG: unable to handle kernel NULL pointer dereference at 0000000000000388
    [ 4224.709074] IP: [] ipt_log_packet+0x29/0x270

    when callling log functions from conntrack both in and out
    are NULL i.e. the net pointer is invalid.

    Adding struct net *net in call to nf_logfn() will secure that
    there always is a vaild net ptr.

    Reported as netfilter's bugzilla bug 818:
    https://bugzilla.netfilter.org/show_bug.cgi?id=818

    Reported-by: Ronald
    Signed-off-by: Hans Schillstrom
    Signed-off-by: Pablo Neira Ayuso

    Hans Schillstrom
     

06 May, 2013

1 commit

  • This patch fixes the following compilation error:

    net/netfilter/nf_log.c:373:38: error: 'struct netns_nf' has no member named 'proc_netfilter'

    if procfs is not set.

    The netns support for nf_log, nfnetlink_log and nfnetlink_queue_core
    requires CONFIG_PROC_FS in the removal path of their respective
    /proc interface since net->nf.proc_netfilter is undefined in that
    case.

    Reported-by: Fengguang Wu
    Signed-off-by: Pablo Neira Ayuso
    Acked-by: Gao feng

    Pablo Neira Ayuso
     

02 May, 2013

3 commits

  • Pull VFS updates from Al Viro,

    Misc cleanups all over the place, mainly wrt /proc interfaces (switch
    create_proc_entry to proc_create(), get rid of the deprecated
    create_proc_read_entry() in favor of using proc_create_data() and
    seq_file etc).

    7kloc removed.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
    don't bother with deferred freeing of fdtables
    proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
    proc: Make the PROC_I() and PDE() macros internal to procfs
    proc: Supply a function to remove a proc entry by PDE
    take cgroup_open() and cpuset_open() to fs/proc/base.c
    ppc: Clean up scanlog
    ppc: Clean up rtas_flash driver somewhat
    hostap: proc: Use remove_proc_subtree()
    drm: proc: Use remove_proc_subtree()
    drm: proc: Use minor->index to label things, not PDE->name
    drm: Constify drm_proc_list[]
    zoran: Don't print proc_dir_entry data in debug
    reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
    proc: Supply an accessor for getting the data from a PDE's parent
    airo: Use remove_proc_subtree()
    rtl8192u: Don't need to save device proc dir PDE
    rtl8187se: Use a dir under /proc/net/r8180/
    proc: Add proc_mkdir_data()
    proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
    proc: Move PDE_NET() to fs/proc/proc_net.c
    ...

    Linus Torvalds
     
  • Supply accessor functions to set attributes in proc_dir_entry structs.

    The following are supplied: proc_set_size() and proc_set_user().

    Signed-off-by: David Howells
    Acked-by: Mauro Carvalho Chehab
    cc: linuxppc-dev@lists.ozlabs.org
    cc: linux-media@vger.kernel.org
    cc: netdev@vger.kernel.org
    cc: linux-wireless@vger.kernel.org
    cc: linux-pci@vger.kernel.org
    cc: netfilter-devel@vger.kernel.org
    cc: alsa-devel@alsa-project.org
    Signed-off-by: Al Viro

    David Howells
     
  • Pull networking updates from David Miller:
    "Highlights (1721 non-merge commits, this has to be a record of some
    sort):

    1) Add 'random' mode to team driver, from Jiri Pirko and Eric
    Dumazet.

    2) Make it so that any driver that supports configuration of multiple
    MAC addresses can provide the forwarding database add and del
    calls by providing a default implementation and hooking that up if
    the driver doesn't have an explicit set of handlers. From Vlad
    Yasevich.

    3) Support GSO segmentation over tunnels and other encapsulating
    devices such as VXLAN, from Pravin B Shelar.

    4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton.

    5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita
    Dukkipati.

    6) In the PHY layer, allow supporting wake-on-lan in situations where
    the PHY registers have to be written for it to be configured.

    Use it to support wake-on-lan in mv643xx_eth.

    From Michael Stapelberg.

    7) Significantly improve firewire IPV6 support, from YOSHIFUJI
    Hideaki.

    8) Allow multiple packets to be sent in a single transmission using
    network coding in batman-adv, from Martin Hundebøll.

    9) Add support for T5 cxgb4 chips, from Santosh Rastapur.

    10) Generalize the VXLAN forwarding tables so that there is more
    flexibility in configurating various aspects of the endpoints.
    From David Stevens.

    11) Support RSS and TSO in hardware over GRE tunnels in bxn2x driver,
    from Dmitry Kravkov.

    12) Zero copy support in nfnelink_queue, from Eric Dumazet and Pablo
    Neira Ayuso.

    13) Start adding networking selftests.

    14) In situations of overload on the same AF_PACKET fanout socket, or
    per-cpu packet receive queue, minimize drop by distributing the
    load to other cpus/fanouts. From Willem de Bruijn and Eric
    Dumazet.

    15) Add support for new payload offset BPF instruction, from Daniel
    Borkmann.

    16) Convert several drivers over to mdoule_platform_driver(), from
    Sachin Kamat.

    17) Provide a minimal BPF JIT image disassembler userspace tool, from
    Daniel Borkmann.

    18) Rewrite F-RTO implementation in TCP to match the final
    specification of it in RFC4138 and RFC5682. From Yuchung Cheng.

    19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear
    you like netlink, so I implemented netlink dumping of netlink
    sockets.") From Andrey Vagin.

    20) Remove ugly passing of rtnetlink attributes into rtnl_doit
    functions, from Thomas Graf.

    21) Allow userspace to be able to see if a configuration change occurs
    in the middle of an address or device list dump, from Nicolas
    Dichtel.

    22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes
    Frederic Sowa.

    23) Increase accuracy of packet length used by packet scheduler, from
    Jason Wang.

    24) Beginning set of changes to make ipv4/ipv6 fragment handling more
    scalable and less susceptible to overload and locking contention,
    from Jesper Dangaard Brouer.

    25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*()
    instead. From Hong Zhiguo.

    26) Optimize route usage in IPVS by avoiding reference counting where
    possible, from Julian Anastasov.

    27) Convert IPVS schedulers to RCU, also from Julian Anastasov.

    28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger
    Eitzenberger.

    29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG,
    nfnetlink_log, and nfnetlink_queue. From Gao feng.

    30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa.

    31) Support several new r8169 chips, from Hayes Wang.

    32) Support tokenized interface identifiers in ipv6, from Daniel
    Borkmann.

    33) Use usbnet_link_change() helper in USB net driver, from Ming Lei.

    34) Add 802.1ad vlan offload support, from Patrick McHardy.

    35) Support mmap() based netlink communication, also from Patrick
    McHardy.

    36) Support HW timestamping in mlx4 driver, from Amir Vadai.

    37) Rationalize AF_PACKET packet timestamping when transmitting, from
    Willem de Bruijn and Daniel Borkmann.

    38) Bring parity to what's provided by /proc/net/packet socket dumping
    and the info provided by netlink socket dumping of AF_PACKET
    sockets. From Nicolas Dichtel.

    39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin
    Poirier"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
    filter: fix va_list build error
    af_unix: fix a fatal race with bit fields
    bnx2x: Prevent memory leak when cnic is absent
    bnx2x: correct reading of speed capabilities
    net: sctp: attribute printl with __printf for gcc fmt checks
    netlink: kconfig: move mmap i/o into netlink kconfig
    netpoll: convert mutex into a semaphore
    netlink: Fix skb ref counting.
    net_sched: act_ipt forward compat with xtables
    mlx4_en: fix a build error on 32bit arches
    Revert "bnx2x: allow nvram test to run when device is down"
    bridge: avoid OOPS if root port not found
    drivers: net: cpsw: fix kernel warn on cpsw irq enable
    sh_eth: use random MAC address if no valid one supplied
    3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA)
    tg3: fix to append hardware time stamping flags
    unix/stream: fix peeking with an offset larger than data in queue
    unix/dgram: fix peeking with an offset larger than data in queue
    unix/dgram: peek beyond 0-sized skbs
    openvswitch: Remove unneeded ovs_netdev_get_ifindex()
    ...

    Linus Torvalds
     

30 Apr, 2013

17 commits