16 Apr, 2014

1 commit

  • ip_queue_xmit() assumes the skb it has to transmit is attached to an
    inet socket. Commit 31c70d5956fc ("l2tp: keep original skb ownership")
    changed l2tp to not change skb ownership and thus broke this assumption.

    One fix is to add a new 'struct sock *sk' parameter to ip_queue_xmit(),
    so that we do not assume skb->sk points to the socket used by l2tp
    tunnel.

    Fixes: 31c70d5956fc ("l2tp: keep original skb ownership")
    Reported-by: Zhan Jianyu
    Tested-by: Zhan Jianyu
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
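
    The change described above can be pictured with a small userspace
    sketch. The struct layouts, field names and function names below are
    hypothetical stand-ins, not the kernel definitions; they only show why
    passing the socket explicitly removes the dependency on skb->sk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields needed here. */
struct sock {
    int is_inet;   /* 1 if this is an inet socket (e.g. the tunnel's UDP socket) */
    int tos;       /* an inet-only setting the transmit path wants to read */
};

struct sk_buff {
    struct sock *sk;   /* owner of the skb; may be a non-inet socket */
};

/* Old shape: the transmit path trusts skb->sk to be an inet socket. */
int xmit_implicit(const struct sk_buff *skb)
{
    if (skb->sk == NULL || !skb->sk->is_inet)
        return -1;   /* the sketch reports the broken assumption as an error */
    return skb->sk->tos;
}

/* New shape: the caller names the socket to transmit on. */
int xmit_explicit(const struct sock *sk, const struct sk_buff *skb)
{
    (void)skb;       /* who owns the skb no longer matters here */
    return sk->tos;
}
```

    With an skb still owned by a PPP socket, xmit_implicit() has nothing
    valid to read, while xmit_explicit() works as long as the caller hands
    in the tunnel socket.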
     

10 Apr, 2014

1 commit

  • When l2tp driver tries to get PMTU for the tunnel destination, it uses
    the pointer to struct sock that represents PPPoX socket, while it
    should use the pointer that represents UDP socket of the tunnel.

    Signed-off-by: Dmitry Petukhov
    Signed-off-by: David S. Miller

    Dmitry Petukhov
     

03 Apr, 2014

1 commit

  • Pull networking updates from David Miller:
    "Here is my initial pull request for the networking subsystem during
    this merge window:

    1) Support for ESN in AH (RFC 4302) from Fan Du.

    2) Add full kernel doc for ethtool command structures, from Ben
    Hutchings.

    3) Add BCM7xxx PHY driver, from Florian Fainelli.

    4) Export computed TCP rate information in netlink socket dumps, from
    Eric Dumazet.

    5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas
    Dichtel.

    6) Convert many drivers to pci_enable_msix_range(), from Alexander
    Gordeev.

    7) Record SKB timestamps more efficiently, from Eric Dumazet.

    8) Switch to microsecond resolution for TCP round trip times, also
    from Eric Dumazet.

    9) Clean up and fix 6lowpan fragmentation handling by making use of
    the existing inet_frag api for its implementation.

    10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss.

    11) Auto size SKB lengths when composing netlink messages based upon
    past message sizes used, from Eric Dumazet.

    12) qdisc dumps can take a long time, add a cond_resched(), from Eric
    Dumazet.

    13) Sanitize netpoll core and drivers wrt. SKB handling semantics.
    Get rid of never-used-in-tree netpoll RX handling. From Eric W
    Biederman.

    14) Support inter-address-family and namespace changing in VTI tunnel
    driver(s). From Steffen Klassert.

    15) Add Altera TSE driver, from Vince Bridgers.

    16) Optimizing csum_replace2() so that it doesn't adjust the checksum
    by checksumming the entire header, from Eric Dumazet.

    17) Expand BPF internal implementation for faster interpreting, more
    direct translations into JIT'd code, and much cleaner uses of BPF
    filtering in non-socket contexts. From Daniel Borkmann and Alexei
    Starovoitov"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits)
    netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
    net: Add a test to see if a skb is freeable in irq context
    qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port'
    net: ptp: move PTP classifier in its own file
    net: sxgbe: make "core_ops" static
    net: sxgbe: fix logical vs bitwise operation
    net: sxgbe: sxgbe_mdio_register() frees the bus
    Call efx_set_channels() before efx->type->dimension_resources()
    xen-netback: disable rogue vif in kthread context
    net/mlx4: Set proper build dependancy with vxlan
    be2net: fix build dependency on VxLAN
    mac802154: make csma/cca parameters per-wpan
    mac802154: allow only one WPAN to be up at any given time
    net: filter: minor: fix kdoc in __sk_run_filter
    netlink: don't compare the nul-termination in nla_strcmp
    can: c_can: Avoid led toggling for every packet.
    can: c_can: Simplify TX interrupt cleanup
    can: c_can: Store dlc private
    can: c_can: Reduce register access
    can: c_can: Make the code readable
    ...

    Linus Torvalds
     

01 Apr, 2014

1 commit

  • Pull workqueue changes from Tejun Heo:
    "PREPARE_[DELAYED_]WORK() were used to change the work function of work
    items without fully reinitializing it; however, this makes workqueue
    consider the work item as a different one from before and allows the
    work item to start executing before the previous instance is finished
    which can lead to extremely subtle issues which are painful to debug.

    The interface has never been popular. This pull request contains
    patches to remove existing usages and kill the interface. As one of
    the changes was routed during the last devel cycle and another
    depended on a pending change in nvme, for-3.15 contains a couple merge
    commits.

    In addition, interfaces which were deprecated quite a while ago -
    __cancel_delayed_work() and WQ_NON_REENTRANT - are removed too"

    * 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: remove deprecated WQ_NON_REENTRANT
    workqueue: Spelling s/instensive/intensive/
    workqueue: remove PREPARE_[DELAYED_]WORK()
    staging/fwserial: don't use PREPARE_WORK
    afs: don't use PREPARE_WORK
    nvme: don't use PREPARE_WORK
    usb: don't use PREPARE_DELAYED_WORK
    floppy: don't use PREPARE_[DELAYED_]WORK
    ps3-vuart: don't use PREPARE_WORK
    wireless/rt2x00: don't use PREPARE_WORK in rt2800usb.c
    workqueue: Remove deprecated __cancel_delayed_work()

    Linus Torvalds
     

29 Mar, 2014

1 commit

  • Tejun Heo made WQ_NON_REENTRANT useless in commit dbf2576e37
    ("workqueue: make all workqueues non-reentrant"). So remove its
    usages and definition.

    This patch doesn't introduce any behavior changes.

    tj: minor description updates.

    Signed-off-by: ZhangZhen
    Signed-off-by: Tejun Heo
    Acked-by: James Chapman
    Acked-by: Ulf Hansson

    ZhangZhen
     

15 Mar, 2014

1 commit


11 Mar, 2014

1 commit

  • net/l2tp/l2tp_core.c:1111:15: warning: unused variable
    'sk' [-Wunused-variable]

    Fixes: 31c70d5956fc ("l2tp: keep original skb ownership")
    Reported-by: kbuild test robot
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

08 Mar, 2014

1 commit

  • There is no reason to orphan skb in l2tp.

    This breaks things like per socket memory limits, TCP Small queues...

    Fix this before more people copy/paste it.

    This is very similar to commit 8f646c922d550
    ("vxlan: keep original skb ownership")

    Signed-off-by: Eric Dumazet
    Cc: James Chapman
    Signed-off-by: David S. Miller

    Eric Dumazet
     

07 Mar, 2014

2 commits

  • As pppol2tp_recv() never queues up packets to plain L2TP sockets,
    pppol2tp_recvmsg() never returns data to userspace, thus making
    the recv*() system calls unusable.

    Instead of dropping packets when the L2TP socket isn't bound to a PPP
    channel, this patch adds them to its reception queue.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • Commit e0d4435f "l2tp: Update PPP-over-L2TP driver to work over L2TPv3"
    broke the PPPOL2TP_SO_SENDSEQ setsockopt. The L2TP header length was
    previously computed by pppol2tp_l2t_header_len() before each call to
    l2tp_xmit_skb(). Now that header length is retrieved from the hdr_len
    session field, this field must be updated every time the L2TP header
    format is modified, or l2tp_xmit_skb() won't push the right amount of
    data for the L2TP header.

    This patch uses l2tp_session_set_header_len() to adjust hdr_len every
    time sequencing is (de)activated from userspace (either by the
    PPPOL2TP_SO_SENDSEQ setsockopt or the L2TP_ATTR_SEND_SEQ netlink
    attribute).

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
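
    A minimal sketch of the idea in the commit above: keep the cached
    header length in step with the sequencing flag. The struct and function
    names are hypothetical, and the sizes assume an L2TPv2 data header
    without the optional length and offset fields (6 bytes, plus 4 bytes
    for Ns/Nr when sequencing is on).

```c
#include <assert.h>

/* Hypothetical, simplified session state (L2TPv2 only). */
struct session {
    int send_seq;           /* non-zero when data sequence numbers are sent */
    unsigned int hdr_len;   /* cached L2TP header length in bytes */
};

/* Recompute the cached header length from the current settings. */
void session_set_header_len(struct session *s)
{
    s->hdr_len = 6;         /* flags/version, tunnel id, session id */
    if (s->send_seq)
        s->hdr_len += 4;    /* Ns and Nr fields */
}

/* Every path that toggles sequencing must refresh hdr_len as well. */
void session_set_send_seq(struct session *s, int on)
{
    s->send_seq = on ? 1 : 0;
    session_set_header_len(s);
}
```

    Routing both the setsockopt and the netlink path through one setter
    like session_set_send_seq() is what keeps the transmit path from
    pushing a stale header size.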
     

14 Feb, 2014

1 commit

  • One of my pet coding style peeves is the practice of
    adding an extra return; at the end of a function.
    Kill several instances of this in network code.

    I suppose some coccinelle wizardry could do this automatically.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     

22 Jan, 2014

1 commit

  • Some ipv6 protocols cannot handle ipv4 addresses, so we must not allow
    connecting and binding to them. sendmsg logic does already check msg->name
    for this but must trust already connected sockets which could be set up
    for connection to ipv4 address family.

    The per-socket flag ipv6only is of no use here, as it is under the
    user's control via setsockopt.

    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

19 Jan, 2014

1 commit

  • This is a follow-up patch to f3d3342602f8bc ("net: rework recvmsg
    handler msg_name and msg_namelen logic").

    DECLARE_SOCKADDR validates that the structure we use for writing the
    name information to is not larger than the buffer which is reserved
    for msg->msg_name (which is 128 bytes). Also use DECLARE_SOCKADDR
    consistently in sendmsg code paths.

    Signed-off-by: Steffen Hurrle
    Suggested-by: Hannes Frederic Sowa
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Steffen Hurrle
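
    The size check described above can be approximated in userspace with a
    compile-time assertion. This is only a sketch: NAME_BUF_SIZE,
    CHECK_ADDR_SIZE and struct small_addr are hypothetical names, and the
    kernel's DECLARE_SOCKADDR macro is implemented differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_BUF_SIZE 128   /* size reserved for msg->msg_name, per the commit text */

/* Hypothetical address type written into the name buffer. */
struct small_addr {
    unsigned short family;
    unsigned char data[14];
};

/* Fails the build if the address type cannot fit in the name buffer. */
#define CHECK_ADDR_SIZE(type) \
    _Static_assert(sizeof(type) <= NAME_BUF_SIZE, "address type too large")

/* Writes an address into name_buf and returns the number of bytes written. */
size_t fill_name(unsigned char *name_buf, unsigned short family)
{
    struct small_addr addr = { family, { 0 } };

    CHECK_ADDR_SIZE(struct small_addr);
    memcpy(name_buf, &addr, sizeof(addr));
    return sizeof(addr);
}
```

    Because the check runs at compile time, an oversized address structure
    is rejected before it can overrun the buffer at run time.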
     

14 Jan, 2014

1 commit


20 Dec, 2013

1 commit

  • Steffen Klassert says:

    ====================
    pull request (net-next): ipsec-next 2013-12-19

    1) Use the user supplied policy index instead of a generated one
    if present. From Fan Du.

    2) Make xfrm migration namespace aware. From Fan Du.

    3) Make the xfrm state and policy locks namespace aware. From Fan Du.

    4) Remove ancient sleeping when the SA is in acquire state,
    we now queue packets to the policy instead. This replaces the
    sleeping code.

    5) Remove FLOWI_FLAG_CAN_SLEEP. This was used to notify xfrm about the
    possibility to sleep. The sleeping code is gone, so remove it.

    6) Check user specified spi for IPComp. The spi for IPcomp is only
    16 bit wide, so check for a valid value. From Fan Du.

    7) Export verify_userspi_info to check for valid user supplied spi ranges
    with pfkey and netlink. From Fan Du.

    8) RFC3173 states that if the total size of a compressed payload and the IPComp
    header is not smaller than the size of the original payload, the IP datagram
    must be sent in the original non-compressed form. These packets are dropped
    by the inbound policy check because they are not transformed. Document the need
    to set 'level use' for IPcomp to receive such packets anyway. From Fan Du.

    Please pull or let me know if there are problems.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

11 Dec, 2013

1 commit

  • This patch is following b579035ff766c9412e2b92abf5cab794bff102b6
    "ipv6: remove old conditions on flow label sharing"

    Since there is no reason to restrict a label to a
    destination, we should not erase the destination value of a
    socket with the value contained in the flow label storage.

    This patch makes it possible to really use the same flow label for
    more than one destination.

    Signed-off-by: Florent Fourcot
    Reviewed-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Florent Fourcot
     

06 Dec, 2013

1 commit


24 Nov, 2013

1 commit

  • Commit bceaa90240b6019ed73b49965eac7d167610be69 ("inet: prevent leakage
    of uninitialized memory to user in recv syscalls") conditionally updated
    addr_len if the msg_name is written to. The recv_error and rxpmtu
    functions relied on the recvmsg functions to set up addr_len before.

    As this does not happen any more we have to pass addr_len to those
    functions as well and set it to the size of the corresponding sockaddr
    length.

    This broke traceroute and such.

    Fixes: bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls")
    Reported-by: Brad Spengler
    Reported-by: Tom Labanowski
    Cc: mpb
    Cc: David S. Miller
    Cc: Eric Dumazet
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

21 Nov, 2013

1 commit


20 Nov, 2013

1 commit

  • As suggested by David Miller, make genl_register_family_with_ops()
    a macro and pass only the array, evaluating ARRAY_SIZE() in the
    macro, this is a little safer.

    The openvswitch code has some indirection, so assign ops/n_ops
    directly in that code. This might ultimately just assign the pointers
    in the family initializations, saving the struct genl_family_and_ops
    and code (once mcast groups are handled differently).

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
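
    The macro approach described above can be sketched as follows. The
    names here are hypothetical and much simpler than the generic netlink
    API; the point is only that the element count is derived from the
    array itself, so the caller cannot pass a mismatched length.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical, minimal stand-ins for an operation table and its family. */
struct op { int cmd; };
struct family { const struct op *ops; size_t n_ops; };

/* Underlying function: still takes an explicit count. */
int register_family_with_ops_n(struct family *f, const struct op *ops, size_t n_ops)
{
    f->ops = ops;
    f->n_ops = n_ops;
    return 0;
}

/* Wrapper macro: callers pass only the array; the count is computed here. */
#define register_family_with_ops(f, ops) \
    register_family_with_ops_n((f), (ops), ARRAY_SIZE(ops))
```

    Note that ARRAY_SIZE() only works on a real array, not on a pointer,
    which is why the wrapper has to be a macro rather than a function.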
     

19 Nov, 2013

1 commit

  • Only update *addr_len when we actually fill in sockaddr, otherwise we
    can return uninitialized memory from the stack to the caller in the
    recvfrom, recvmmsg and recvmsg syscalls. Drop the (addr_len == NULL)
    checks because we only get called with a valid addr_len pointer either
    from sock_common_recvmsg or inet_recvmsg.

    If a blocking read waits on a socket which is concurrently shut down we
    now return zero and set msg_msgnamelen to 0.

    Reported-by: mpb
    Suggested-by: Eric Dumazet
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
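
    The rule from the commit above, reduced to a userspace sketch with
    hypothetical names: the length is written only on the path that
    actually fills in the address, so the caller never reads a length that
    describes uninitialized bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, minimal address type. */
struct addr {
    unsigned short family;
    unsigned short port;
};

/* Copy the peer address out only when the caller supplied a name buffer. */
void recv_fill_name(void *msg_name, int *addr_len, const struct addr *peer)
{
    if (msg_name != NULL) {
        memcpy(msg_name, peer, sizeof(*peer));
        *addr_len = (int)sizeof(*peer);   /* updated only when data was written */
    }
    /* Otherwise *addr_len keeps whatever the caller initialized it to. */
}
```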
     

15 Nov, 2013

1 commit

  • Now that genl_ops are no longer modified in place when
    registering, they can be made const. This patch was done
    mostly with spatch:

    @@
    identifier ops;
    @@
    +const
    struct genl_ops ops[] = {
    ...
    };

    (except the struct thing in net/openvswitch/datapath.c)

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

24 Oct, 2013

1 commit


20 Oct, 2013

1 commit

  • There is a mix of function prototypes with and without extern
    in the kernel sources. Standardize on not using extern for
    function prototypes.

    Function prototypes don't need to be written with extern.
    extern is assumed by the compiler. Its use is as unnecessary as
    using auto to declare automatic/local variables in a block.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

12 Oct, 2013

1 commit

  • François Cachereul made a very nice bug report and suspected
    the bh_lock_sock() / bh_unlock_sock() pair used in l2tp_xmit_skb() from
    process context was not good.

    This problem was added by commit 6af88da14ee284aaad6e4326da09a89191ab6165
    ("l2tp: Fix locking in l2tp_core.c").

    l2tp_eth_dev_xmit() runs from BH context, so we must disable BH
    from other l2tp_xmit_skb() users.

    [ 452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662]
    [ 452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox
    ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod
    virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan]
    [ 452.064012] CPU 1
    [ 452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643]
    [ 452.080015] CPU 2
    [ 452.080015]
    [ 452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
    [ 452.080015] RIP: 0010:[] [] do_raw_spin_lock+0x17/0x1f
    [ 452.080015] RSP: 0018:ffff88007125fc18 EFLAGS: 00000293
    [ 452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000
    [ 452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110
    [ 452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000
    [ 452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286
    [ 452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000
    [ 452.080015] FS: 00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000
    [ 452.080015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0
    [ 452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [ 452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0)
    [ 452.080015] Stack:
    [ 452.080015] ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1
    [ 452.080015] ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e
    [ 452.080015] 000000000000005c 000000080000000e 0000000000000000 ffff880071170600
    [ 452.080015] Call Trace:
    [ 452.080015] [] _raw_spin_lock+0xe/0x10
    [ 452.080015] [] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
    [ 452.080015] [] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
    [ 452.080015] [] __sock_sendmsg_nosec+0x22/0x24
    [ 452.080015] [] sock_sendmsg+0xa1/0xb6
    [ 452.080015] [] ? __schedule+0x5c1/0x616
    [ 452.080015] [] ? __dequeue_signal+0xb7/0x10c
    [ 452.080015] [] ? fget_light+0x75/0x89
    [ 452.080015] [] ? sockfd_lookup_light+0x20/0x56
    [ 452.080015] [] sys_sendto+0x10c/0x13b
    [ 452.080015] [] system_call_fastpath+0x16/0x1b
    [ 452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3
    [ 452.080015] Call Trace:
    [ 452.080015] [] _raw_spin_lock+0xe/0x10
    [ 452.080015] [] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
    [ 452.080015] [] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
    [ 452.080015] [] __sock_sendmsg_nosec+0x22/0x24
    [ 452.080015] [] sock_sendmsg+0xa1/0xb6
    [ 452.080015] [] ? __schedule+0x5c1/0x616
    [ 452.080015] [] ? __dequeue_signal+0xb7/0x10c
    [ 452.080015] [] ? fget_light+0x75/0x89
    [ 452.080015] [] ? sockfd_lookup_light+0x20/0x56
    [ 452.080015] [] sys_sendto+0x10c/0x13b
    [ 452.080015] [] system_call_fastpath+0x16/0x1b
    [ 452.064012]
    [ 452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
    [ 452.064012] RIP: 0010:[] [] do_raw_spin_lock+0x19/0x1f
    [ 452.064012] RSP: 0018:ffff8800b6e83ba0 EFLAGS: 00000297
    [ 452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002
    [ 452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110
    [ 452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c
    [ 452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18
    [ 452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0
    [ 452.064012] FS: 00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000
    [ 452.064012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0
    [ 452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [ 452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410)
    [ 452.064012] Stack:
    [ 452.064012] ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a
    [ 452.064012] ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62
    [ 452.064012] 0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276
    [ 452.064012] Call Trace:
    [ 452.064012]
    [ 452.064012] [] _raw_spin_lock+0xe/0x10
    [ 452.064012] [] spin_lock+0x9/0xb
    [ 452.064012] [] udp_queue_rcv_skb+0x186/0x269
    [ 452.064012] [] __udp4_lib_rcv+0x297/0x4ae
    [ 452.064012] [] ? raw_rcv+0xe9/0xf0
    [ 452.064012] [] udp_rcv+0x1a/0x1c
    [ 452.064012] [] ip_local_deliver_finish+0x12b/0x1a5
    [ 452.064012] [] ip_local_deliver+0x53/0x84
    [ 452.064012] [] ip_rcv_finish+0x2bc/0x2f3
    [ 452.064012] [] ip_rcv+0x210/0x269
    [ 452.064012] [] ? kvm_clock_get_cycles+0x9/0xb
    [ 452.064012] [] __netif_receive_skb+0x3a5/0x3f7
    [ 452.064012] [] netif_receive_skb+0x57/0x5e
    [ 452.064012] [] ? __netdev_alloc_skb+0x1f/0x3b
    [ 452.064012] [] virtnet_poll+0x4ba/0x5a4 [virtio_net]
    [ 452.064012] [] net_rx_action+0x73/0x184
    [ 452.064012] [] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
    [ 452.064012] [] __do_softirq+0xc3/0x1a8
    [ 452.064012] [] ? ack_APIC_irq+0x10/0x12
    [ 452.064012] [] ? _raw_spin_lock+0xe/0x10
    [ 452.064012] [] call_softirq+0x1c/0x26
    [ 452.064012] [] do_softirq+0x45/0x82
    [ 452.064012] [] irq_exit+0x42/0x9c
    [ 452.064012] [] do_IRQ+0x8e/0xa5
    [ 452.064012] [] common_interrupt+0x6e/0x6e
    [ 452.064012]
    [ 452.064012] [] ? kfree+0x8a/0xa3
    [ 452.064012] [] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
    [ 452.064012] [] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
    [ 452.064012] [] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
    [ 452.064012] [] __sock_sendmsg_nosec+0x22/0x24
    [ 452.064012] [] sock_sendmsg+0xa1/0xb6
    [ 452.064012] [] ? __schedule+0x5c1/0x616
    [ 452.064012] [] ? __dequeue_signal+0xb7/0x10c
    [ 452.064012] [] ? fget_light+0x75/0x89
    [ 452.064012] [] ? sockfd_lookup_light+0x20/0x56
    [ 452.064012] [] sys_sendto+0x10c/0x13b
    [ 452.064012] [] system_call_fastpath+0x16/0x1b
    [ 452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48
    [ 452.064012] Call Trace:
    [ 452.064012] [] _raw_spin_lock+0xe/0x10
    [ 452.064012] [] spin_lock+0x9/0xb
    [ 452.064012] [] udp_queue_rcv_skb+0x186/0x269
    [ 452.064012] [] __udp4_lib_rcv+0x297/0x4ae
    [ 452.064012] [] ? raw_rcv+0xe9/0xf0
    [ 452.064012] [] udp_rcv+0x1a/0x1c
    [ 452.064012] [] ip_local_deliver_finish+0x12b/0x1a5
    [ 452.064012] [] ip_local_deliver+0x53/0x84
    [ 452.064012] [] ip_rcv_finish+0x2bc/0x2f3
    [ 452.064012] [] ip_rcv+0x210/0x269
    [ 452.064012] [] ? kvm_clock_get_cycles+0x9/0xb
    [ 452.064012] [] __netif_receive_skb+0x3a5/0x3f7
    [ 452.064012] [] netif_receive_skb+0x57/0x5e
    [ 452.064012] [] ? __netdev_alloc_skb+0x1f/0x3b
    [ 452.064012] [] virtnet_poll+0x4ba/0x5a4 [virtio_net]
    [ 452.064012] [] net_rx_action+0x73/0x184
    [ 452.064012] [] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
    [ 452.064012] [] __do_softirq+0xc3/0x1a8
    [ 452.064012] [] ? ack_APIC_irq+0x10/0x12
    [ 452.064012] [] ? _raw_spin_lock+0xe/0x10
    [ 452.064012] [] call_softirq+0x1c/0x26
    [ 452.064012] [] do_softirq+0x45/0x82
    [ 452.064012] [] irq_exit+0x42/0x9c
    [ 452.064012] [] do_IRQ+0x8e/0xa5
    [ 452.064012] [] common_interrupt+0x6e/0x6e
    [ 452.064012] [] ? kfree+0x8a/0xa3
    [ 452.064012] [] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
    [ 452.064012] [] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
    [ 452.064012] [] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
    [ 452.064012] [] __sock_sendmsg_nosec+0x22/0x24
    [ 452.064012] [] sock_sendmsg+0xa1/0xb6
    [ 452.064012] [] ? __schedule+0x5c1/0x616
    [ 452.064012] [] ? __dequeue_signal+0xb7/0x10c
    [ 452.064012] [] ? fget_light+0x75/0x89
    [ 452.064012] [] ? sockfd_lookup_light+0x20/0x56
    [ 452.064012] [] sys_sendto+0x10c/0x13b
    [ 452.064012] [] system_call_fastpath+0x16/0x1b

    Reported-by: François Cachereul
    Tested-by: François Cachereul
    Signed-off-by: Eric Dumazet
    Cc: James Chapman
    Signed-off-by: David S. Miller

    Eric Dumazet
     

09 Oct, 2013

2 commits

  • TCP listener refactoring, part 4 :

    To speed up inet lookups, we moved IPv4 addresses from inet to struct
    sock_common

    Now it is time to do the same for IPv6, because it permits us to have
    fast lookups for all kinds of sockets, including upcoming SYN_RECV.

    Getting IPv6 addresses in TCP lookups currently requires two extra cache
    lines, plus a dereference (and memory stall).

    inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6

    This patch is way bigger than its IPv4 counterpart, because for IPv4,
    we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
    it's not doable easily.

    inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
    inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr

    Timewait sockets also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
    at the same offset.

    We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
    macro.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • net/l2tp/l2tp_core.c: In function ‘l2tp_verify_udp_checksum’:
    net/l2tp/l2tp_core.c:499:22: warning: unused variable ‘tunnel’ [-Wunused-variable]

    Create a helper "l2tp_tunnel()" to facilitate this, and as a side
    effect get rid of a bunch of unnecessary void pointer casts.

    Signed-off-by: David S. Miller

    David S. Miller
     

03 Oct, 2013

1 commit

  • IPv4 mapped addresses cause kernel panic.
    The patch just checks whether the IPv6 address is an IPv4 mapped
    address. If so, use IPv4 API instead of IPv6.

    [ 940.026915] general protection fault: 0000 [#1]
    [ 940.026915] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppox ppp_generic slhc loop psmouse
    [ 940.026915] CPU: 0 PID: 3184 Comm: memcheck-amd64- Not tainted 3.11.0+ #1
    [ 940.026915] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
    [ 940.026915] task: ffff880007130e20 ti: ffff88000737e000 task.ti: ffff88000737e000
    [ 940.026915] RIP: 0010:[] [] ip6_xmit+0x276/0x326
    [ 940.026915] RSP: 0018:ffff88000737fd28 EFLAGS: 00010286
    [ 940.026915] RAX: c748521a75ceff48 RBX: ffff880000c30800 RCX: 0000000000000000
    [ 940.026915] RDX: ffff88000075cc4e RSI: 0000000000000028 RDI: ffff8800060e5a40
    [ 940.026915] RBP: ffff8800060e5a40 R08: 0000000000000000 R09: ffff88000075cc90
    [ 940.026915] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88000737fda0
    [ 940.026915] R13: 0000000000000000 R14: 0000000000002000 R15: ffff880005d3b580
    [ 940.026915] FS: 00007f163dc5e800(0000) GS:ffffffff81623000(0000) knlGS:0000000000000000
    [ 940.026915] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 940.026915] CR2: 00000004032dc940 CR3: 0000000005c25000 CR4: 00000000000006f0
    [ 940.026915] Stack:
    [ 940.026915] ffff88000075cc4e ffffffff81694e90 ffff880000c30b38 0000000000000020
    [ 940.026915] 11000000523c4bac ffff88000737fdb4 0000000000000000 ffff880000c30800
    [ 940.026915] ffff880005d3b580 ffff880000c30b38 ffff8800060e5a40 0000000000000020
    [ 940.026915] Call Trace:
    [ 940.026915] [] ? inet6_csk_xmit+0xa4/0xc4
    [ 940.026915] [] ? l2tp_xmit_skb+0x503/0x55a [l2tp_core]
    [ 940.026915] [] ? pskb_expand_head+0x161/0x214
    [ 940.026915] [] ? pppol2tp_xmit+0xf2/0x143 [l2tp_ppp]
    [ 940.026915] [] ? ppp_channel_push+0x36/0x8b [ppp_generic]
    [ 940.026915] [] ? ppp_write+0xaf/0xc5 [ppp_generic]
    [ 940.026915] [] ? vfs_write+0xa2/0x106
    [ 940.026915] [] ? SyS_write+0x56/0x8a
    [ 940.026915] [] ? system_call_fastpath+0x16/0x1b
    [ 940.026915] Code: 00 49 8b 8f d8 00 00 00 66 83 7c 11 02 00 74 60 49
    8b 47 58 48 83 e0 fe 48 8b 80 18 01 00 00 48 85 c0 74 13 48 8b 80 78 02
    00 00 ff 40 28 41 8b 57 68 48 01 50 30 48 8b 54 24 08 49 c7 c1 51
    [ 940.026915] RIP [] ip6_xmit+0x276/0x326
    [ 940.026915] RSP
    [ 940.057945] ---[ end trace be8aba9a61c8b7f3 ]---
    [ 940.058583] Kernel panic - not syncing: Fatal exception in interrupt

    Signed-off-by: François CACHEREUL
    Signed-off-by: David S. Miller

    François Cachereul
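
    The check described above has a direct userspace analogue in the POSIX
    macro IN6_IS_ADDR_V4MAPPED(). The sketch below uses it to pick a
    transmit path; the enum and function names are hypothetical, and the
    kernel uses its own helper rather than this macro.

```c
#include <assert.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical labels for the two transmit paths. */
enum xmit_path { XMIT_IPV4, XMIT_IPV6 };

/* An IPv4-mapped destination (::ffff:a.b.c.d) must go out through the IPv4 path. */
enum xmit_path choose_xmit_path(const struct in6_addr *daddr)
{
    return IN6_IS_ADDR_V4MAPPED(daddr) ? XMIT_IPV4 : XMIT_IPV6;
}
```

    For example, ::ffff:192.0.2.1 selects XMIT_IPV4, while 2001:db8::1
    selects XMIT_IPV6.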
     

03 Jul, 2013

3 commits

  • If L2TP data sequence numbers are enabled and reordering is not
    enabled, data reception stops if a packet is lost since the kernel
    waits for a sequence number that is never resent. (When reordering is
    enabled, data reception restarts when the reorder timeout expires.) If
    no reorder timeout is set, we should count the number of in-sequence
    packets after the out-of-sequence (OOS) condition is detected, and reset
    sequence number state after a number of such packets are received.

    For now, the number of in-sequence packets while in OOS state which
    cause the sequence number state to be reset is hard-coded to 5. This
    could be configurable later.

    Signed-off-by: James Chapman
    Signed-off-by: David S. Miller

    James Chapman
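
    A simplified sketch of the recovery logic described above, for the
    case where reordering is disabled. The struct, its field names and the
    exact counting rule are assumptions made for illustration, and
    sequence number wraparound is left out; only the threshold of 5 comes
    from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define OOS_RESET_COUNT 5   /* hard-coded threshold named in the commit message */

/* Hypothetical receive-side sequence state. */
struct seq_state {
    uint32_t nr;            /* next expected sequence number */
    uint32_t oos_next;      /* next number expected while out of sequence */
    unsigned int oos_count; /* in-sequence packets seen since the first OOS packet */
    int in_oos;             /* non-zero once an OOS packet has been seen */
};

/* Returns 1 if the packet is accepted, 0 if it is dropped. */
int seq_rx(struct seq_state *s, uint32_t ns)
{
    if (ns == s->nr) {              /* normal case: exactly the expected packet */
        s->nr = ns + 1;
        s->in_oos = 0;
        return 1;
    }

    /* Out of sequence: count packets that are consecutive among themselves. */
    if (s->in_oos && ns == s->oos_next)
        s->oos_count++;
    else
        s->oos_count = 0;           /* first OOS packet, or the run was broken */
    s->in_oos = 1;
    s->oos_next = ns + 1;

    if (s->oos_count >= OOS_RESET_COUNT) {
        s->nr = ns + 1;             /* give up on the lost packet and resync */
        s->in_oos = 0;
        return 1;
    }
    return 0;
}
```

    With nr at 10 and packet 10 lost, packets 11 to 15 are dropped in this
    sketch, packet 16 triggers the reset and is accepted, and reception
    continues normally from 17.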
     
  • The L2TP datapath is not currently RFC-compliant when sequence numbers
    are used in L2TP data packets. According to the L2TP RFC, any received
    sequence number NR greater than or equal to the next expected NR is
    acceptable, where the "greater than or equal to" test is determined by
    the NR wrap point. This differs for L2TPv2 and L2TPv3, so add state in
    the session context to hold the max NR value and the NR window size in
    order to do the acceptable sequence number value check. These might be
    configurable later, but for now we derive it from the tunnel L2TP
    version, which determines the sequence number field size.

    Signed-off-by: James Chapman
    Signed-off-by: David S. Miller

    James Chapman
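
    The wrap-aware "greater than or equal to" test can be sketched like
    this. The names are hypothetical; the sketch assumes a 16-bit sequence
    space for L2TPv2, a 24-bit space for L2TPv3, and a window of half the
    sequence space, which may not match the kernel's exact choices.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-session sequence window state. */
struct seq_window {
    uint32_t nr;            /* next expected sequence number */
    uint32_t nr_max;        /* largest representable sequence number */
    uint32_t window_size;   /* how far ahead of nr is still acceptable */
};

/* Derive the limits from the L2TP version (assumed field sizes, see above). */
void seq_window_init(struct seq_window *w, int l2tp_version)
{
    w->nr = 0;
    w->nr_max = (l2tp_version == 2) ? 0xffffu : 0xffffffu;
    w->window_size = w->nr_max / 2;
}

/* Returns 1 if nr is at or ahead of the expected number, modulo the wrap point. */
int seq_in_window(const struct seq_window *w, uint32_t nr)
{
    uint32_t ahead;

    if (nr >= w->nr)
        ahead = nr - w->nr;
    else
        ahead = (w->nr_max + 1) - (w->nr - nr);   /* distance across the wrap */

    return ahead < w->window_size;
}
```

    With a 16-bit space and nr at 0xfff0, a packet numbered 0x0005 is
    treated as ahead of the expected value (it has wrapped), whereas
    0xffef is treated as old and rejected.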
     
  • This change moves some code handling data sequence numbers into a
    separate function to avoid too much indentation. This is to prepare
    for some changes to data sequence number handling in subsequent
    patches.

    Signed-off-by: James Chapman
    Signed-off-by: David S. Miller

    James Chapman
     

02 Jul, 2013

1 commit


13 Jun, 2013

2 commits


08 Apr, 2013

2 commits


23 Mar, 2013

1 commit


21 Mar, 2013

3 commits

  • If we postpone unhashing of l2tp sessions until the structure is freed, we
    risk:

    1. further packets arriving and getting queued while the pseudowire is being
    closed down
    2. the recv path hitting "scheduling while atomic" errors in the case that
    recv drops the last reference to a session and calls l2tp_session_free
    while in atomic context

    As such, l2tp sessions should be unhashed from l2tp_core data structures early
    in the teardown process prior to calling pseudowire close. For pseudowires
    like l2tp_ppp which have multiple shutdown codepaths, provide an unhash hook.

    Signed-off-by: Tom Parkin
    Signed-off-by: James Chapman
    Signed-off-by: David S. Miller

    Tom Parkin
     
  • l2tp's u64_stats writers were incorrectly synchronised, making it possible to
    deadlock a 64bit machine running a 32bit kernel simply by sending the l2tp
    code netlink commands while passing data through l2tp sessions.

    Previous discussion on netdev determined that alternative solutions such as
    spinlock writer synchronisation or per-cpu data would bring unjustified
    overhead, given that most users interested in high volume traffic will likely
    be running 64bit kernels on 64bit hardware.

    As such, this patch replaces l2tp's use of u64_stats with atomic_long_t,
    thereby avoiding the deadlock.

    Ref:
    http://marc.info/?l=linux-netdev&m=134029167910731&w=2
    http://marc.info/?l=linux-netdev&m=134079868111131&w=2

    Signed-off-by: Tom Parkin
    Signed-off-by: James Chapman
    Signed-off-by: David S. Miller

    Tom Parkin
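
    As a rough userspace analogue of the counter change described above,
    the sketch below uses C11 atomic_long in place of the kernel's
    atomic_long_t. The struct and function names are hypothetical. Each
    update is a single atomic operation, so readers and writers need no
    additional synchronisation.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical per-session statistics using lock-free counters. */
struct stats {
    atomic_long tx_packets;
    atomic_long tx_bytes;
};

/* Account for one transmitted packet of len bytes. */
void stats_tx(struct stats *st, long len)
{
    atomic_fetch_add(&st->tx_packets, 1);
    atomic_fetch_add(&st->tx_bytes, len);
}

/* Read a counter without taking any lock. */
long stats_read_tx_bytes(struct stats *st)
{
    return atomic_load(&st->tx_bytes);
}
```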
     
  • If userspace deletes a ppp pseudowire using the netlink API, either by
    directly deleting the session or by deleting the tunnel that contains the
    session, we need to tear down the corresponding pppox channel.

    Rather than trying to manage two pppox unbind codepaths, switch the netlink
    and l2tp_core session_close handlers to close via the l2tp_ppp socket
    .release handler.

    Signed-off-by: Tom Parkin
    Signed-off-by: James Chapman
    Signed-off-by: David S. Miller

    Tom Parkin