09 Feb, 2013

2 commits

  • On sctp_endpoint_destroy, previously used sensitive keying material
    should be zeroed out before the memory is returned, as we already do
    with e.g. auth keys when released.

    Signed-off-by: Daniel Borkmann
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • In sctp_setsockopt_auth_key, we create a temporary copy of the user
    passed shared auth key for the endpoint or association and after
    internal setup, we free it right away. Since it's sensitive data, we
    should zero out the key before returning the memory back to the
    allocator. Thus, use kzfree instead of kfree, just as we do in
    sctp_auth_key_put().

    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
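
    The wipe-before-free pattern described in the two commits above can be
    sketched in user space as follows. This is only an illustration, not the
    kernel code: secure_zero() and zero_and_free() are hypothetical helpers
    standing in for kzfree(), and the caller passes the length explicitly
    because free() does not track it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Overwrite len bytes at p with zeros through a volatile pointer, so the
 * compiler is not free to drop the stores as dead writes before free(). */
static void secure_zero(void *p, size_t len)
{
    volatile unsigned char *vp = p;

    while (len--)
        *vp++ = 0;
}

/* Hypothetical user-space analogue of kzfree(): wipe first, then release. */
static void zero_and_free(void *p, size_t len)
{
    if (p == NULL)
        return;
    secure_zero(p, len);
    free(p);
}
```

    A temporary key copy would be released with zero_and_free(key, key_len)
    instead of a plain free(key).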
     

08 Feb, 2013

1 commit


28 Jan, 2013

2 commits

  • Per-net sysctl table needs to be explicitly freed at
    net exit. Otherwise we see the following with kmemleak:

    unreferenced object 0xffff880402d08000 (size 2048):
    comm "chrome_sandbox", pid 18437, jiffies 4310887172 (age 9097.630s)
    hex dump (first 32 bytes):
    b2 68 89 81 ff ff ff ff 20 04 04 f8 01 88 ff ff .h...... .......
    04 00 00 00 a4 01 00 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] kmemleak_alloc+0x21/0x3e
    [] slab_post_alloc_hook+0x28/0x2a
    [] __kmalloc_track_caller+0xf1/0x104
    [] kmemdup+0x1b/0x30
    [] sctp_sysctl_net_register+0x1f/0x72
    [] sctp_net_init+0x100/0x39f
    [] ops_init+0xc6/0xf5
    [] setup_net+0x4c/0xd0
    [] copy_net_ns+0x6d/0xd6
    [] create_new_namespaces+0xd7/0x147
    [] copy_namespaces+0x63/0x99
    [] copy_process+0xa65/0x1233
    [] do_fork+0x10b/0x271
    [] sys_clone+0x23/0x25
    [] stub_clone+0x13/0x20
    [] 0xffffffffffffffff

    I fixed the spelling of sysctl_header so the code actually
    compiles. -- EWB.

    Reported-by: Martin Mokrejs
    Signed-off-by: Vlad Yasevich
    Acked-by: Neil Horman
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Vlad Yasevich
     
  • While sctp is handling a duplicate COOKIE-ECHO and the action is
    'Association restart', sctp_sf_do_dupcook_a() will process
    the unexpected COOKIE-ECHO for peer restart, but it does not set
    the association state to SCTP_STATE_ESTABLISHED, so the association
    could get stuck in SCTP_STATE_SHUTDOWN_PENDING state forever.
    This violates the sctp specification:
    RFC 4960 5.2.4. Handle a COOKIE ECHO when a TCB Exists
    Action
    A) In this case, the peer may have restarted. .....
    After this, the endpoint shall enter the ESTABLISHED state.

    To resolve this problem, add a SCTP_CMD_NEW_STATE cmd to the
    command list before the SCTP_CMD_REPLY cmd. This sets the restarted
    association to SCTP_STATE_ESTABLISHED state properly and also avoids
    the I-bit being set in the DATA chunk header when COOKIE_ACK is bundled
    with DATA chunks.

    Signed-off-by: Xufeng Zhang
    Acked-by: Neil Horman
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Xufeng Zhang
     

18 Jan, 2013

1 commit

  • Jamie Parsons reported a problem recently, in which the re-initialization of an
    association (the duplicate init case) resulted in a loss of receive window
    space. He tracked down the root cause to sctp_outq_teardown, which discarded
    all the data on an outq during a re-initialization of the corresponding
    association, but never reset the outq->outstanding_data field to zero. I wrote,
    and he tested, this fix, which does a proper full re-initialization of the outq,
    fixing this problem, and hopefully future-proofing us from similar issues down
    the road.

    Signed-off-by: Neil Horman
    Reported-by: Jamie Parsons
    Tested-by: Jamie Parsons
    CC: Jamie Parsons
    CC: Vlad Yasevich
    CC: "David S. Miller"
    CC: netdev@vger.kernel.org
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Neil Horman
     

08 Jan, 2013

1 commit

  • Commit 0d0863b02002 ("sctp: Change defaults on cookie hmac selection")
    added a "choice" to the sctp Kconfig file. It introduced a bug which
    led to an infinite loop while running "make oldconfig".

    The problem is that the wrong symbol was defined as the default value
    for the choice. Using the correct value gets rid of the infinite loop.

    Note: if CONFIG_SCTP_COOKIE_HMAC_SHA1=y was present in the input
    config file, both that and CONFIG_SCTP_COOKIE_HMAC_MD5=y will be present
    in the generated config file.

    Signed-off-by: Alex Elder
    Signed-off-by: Linus Torvalds

    Alex Elder
     

16 Dec, 2012

2 commits

  • Commit 24cb81a6a (sctp: Push struct net down into all of the
    state machine functions) introduced the net structure into all
    state machine functions, but jsctp_sf_eat_sack was not updated,
    hence when SCTP association probing is enabled in the kernel,
    any simple SCTP client/server program from userspace will panic
    the kernel.

    Cc: Vlad Yasevich
    Signed-off-by: Daniel Borkmann
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Recently I posted commit 3c68198e75 which made the cookie hmac
    algorithm selectable. This is all well and good, but Linus noted that it
    changes the default config:
    http://marc.info/?l=linux-netdev&m=135536629004808&w=2

    I've modified the sctp Kconfig file to reflect the recommended way of making
    this choice, using the thermal driver example specified, and brought the
    defaults back into line with the way they were prior to my original patch.

    Also, on Linus' suggestion, re-add the ability to select a default 'none' hmac
    algorithm, so we don't needlessly bloat the kernel by forcing a non-none
    default. This also led me to note that we won't honor the default none
    condition properly because of how sctp_net_init is encoded. Fix that up as
    well.

    Tested by myself (albeit fairly quickly). All configuration combinations seem
    to work soundly.

    Signed-off-by: Neil Horman
    CC: David Miller
    CC: Linus Torvalds
    CC: Vlad Yasevich
    CC: linux-sctp@vger.kernel.org
    Signed-off-by: David S. Miller

    Neil Horman
     

14 Dec, 2012

1 commit

  • Pull trivial branch from Jiri Kosina:
    "Usual stuff -- comment/printk typo fixes, documentation updates, dead
    code elimination."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
    HOWTO: fix double words typo
    x86 mtrr: fix comment typo in mtrr_bp_init
    propagate name change to comments in kernel source
    doc: Update the name of profiling based on sysfs
    treewide: Fix typos in various drivers
    treewide: Fix typos in various Kconfig
    wireless: mwifiex: Fix typo in wireless/mwifiex driver
    messages: i2o: Fix typo in messages/i2o
    scripts/kernel-doc: check that non-void fcts describe their return value
    Kernel-doc: Convention: Use a "Return" section to describe return values
    radeon: Fix typo and copy/paste error in comments
    doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c
    various: Fix spelling of "asynchronous" in comments.
    Fix misspellings of "whether" in comments.
    eisa: Fix spelling of "asynchronous".
    various: Fix spelling of "registered" in comments.
    doc: fix quite a few typos within Documentation
    target: iscsi: fix comment typos in target/iscsi drivers
    treewide: fix typo of "suport" in various comments and Kconfig
    treewide: fix typo of "suppport" in various comments
    ...

    Linus Torvalds
     

08 Dec, 2012

3 commits

  • peer.transport_addr_list is currently only protected by sk_lock
    which is impractical to acquire for procfs dumping purposes.

    This patch adds RCU protection allowing for the procfs readers to
    enter RCU read-side critical sections.

    Modification of the list continues to be serialized via sk_lock.

    V2: Use list_del_rcu() in sctp_association_free() to be safe
    Skip transports marked dead when dumping for procfs

    Cc: Vlad Yasevich
    Cc: Neil Horman
    Signed-off-by: Thomas Graf
    Acked-by: Vlad Yasevich
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Thomas Graf
     
  • address_list is protected via the socket lock or RCU. Since we don't want
    to take the socket lock for each assoc we dump in procfs, an RCU read-side
    critical section must be entered.

    V2: Skip local addresses marked as dead

    Cc: Vlad Yasevich
    Cc: Neil Horman
    Signed-off-by: Thomas Graf
    Acked-by: Vlad Yasevich
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Thomas Graf
     
  • WARNING: net/sctp/sctp.o(.text+0x72f1): Section mismatch in reference
    from the function sctp_net_init() to the function
    .init.text:sctp_proc_init()
    The function sctp_net_init() references
    the function __init sctp_proc_init().
    This is often because sctp_net_init lacks a __init
    annotation or the annotation of sctp_proc_init is wrong.

    And put __net_init after 'int' for sctp_proc_init - as it is done
    everywhere else in the sctp-stack.

    Signed-off-by: Christoph Paasch
    Acked-by: Neil Horman
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Christoph Paasch
     

04 Dec, 2012

1 commit

  • The current SCTP stack is lacking a mechanism to have per association
    statistics. This is an implementation modeled after OpenSolaris'
    SCTP_GET_ASSOC_STATS.

    Userspace part will follow on lksctp if/when there is a general ACK on
    this.
    V4:
    - Move ipackets++ before q->immediate.func() for consistency reasons
    - Move sctp_max_rto() at the end of sctp_transport_update_rto() to avoid
    returning bogus RTO values
    - return asoc->rto_min when max_obs_rto value has not changed

    V3:
    - Increase ictrlchunks in sctp_assoc_bh_rcv() as well
    - Move ipackets++ to sctp_inq_push()
    - return 0 when no rto updates took place since the last call

    V2:
    - Implement partial retrieval of stat struct to cope for future expansion
    - Kill the rtxpackets counter as it cannot be precise anyway
    - Rename outseqtsns to outofseqtsns to make it clearer that these are out
    of sequence unexpected TSNs
    - Move asoc->ipackets++ under a lock to avoid potential miscounts
    - Fold asoc->opackets++ into the already existing asoc check
    - Kill unneeded (q->asoc) test when increasing rtxchunks
    - Do not count octrlchunks if sending failed (SCTP_XMIT_OK != 0)
    - Don't count SHUTDOWNs as SACKs
    - Move SCTP_GET_ASSOC_STATS to the private space API
    - Adjust the len check in sctp_getsockopt_assoc_stats() to allow for
    future struct growth
    - Move association statistics in their own struct
    - Update idupchunks when we send a SACK with dup TSNs
    - return min_rto in max_rto when RTO has not changed. Also return the
    transport when max_rto last changed.

    Signed-off: Michele Baldessari
    Acked-by: Vlad Yasevich

    Signed-off-by: David S. Miller

    Michele Baldessari
     

01 Dec, 2012

2 commits

  • If the variable parameter length provided in the mandatory
    heartbeat information parameter exceeds the calculated payload
    length the packet has been corrupted. Reply with a parameter
    length protocol violation message.

    Signed-off-by: Thomas Graf
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Thomas Graf
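
    The length check described above can be sketched as a small user-space
    function. This is a minimal illustration under stated assumptions:
    param_len_fits() and PARAM_HDR_LEN are hypothetical names, the buffer is
    assumed to start at the parameter header (16-bit type, 16-bit length, both
    big-endian), and rejecting lengths shorter than the header is an extra
    sanity check not mentioned in the commit message. Building the protocol
    violation reply is not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define PARAM_HDR_LEN 4 /* 16-bit type followed by 16-bit length */

/* Return 1 if the length field of the parameter at the start of buf fits
 * inside the buf_len bytes actually received, 0 if the packet has to be
 * treated as corrupted. */
static int param_len_fits(const uint8_t *buf, size_t buf_len)
{
    uint16_t param_len;

    if (buf_len < PARAM_HDR_LEN)
        return 0;

    /* The length field is big-endian and counts the header itself. */
    param_len = (uint16_t)((buf[2] << 8) | buf[3]);

    return param_len >= PARAM_HDR_LEN && param_len <= buf_len;
}
```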
     
  • Trinity (the syscall fuzzer) triggered the following BUG, reproducible
    only when the kernel is configured with CONFIG_SCTP_DBG_MSG=y.

    When CONFIG_SCTP_DBG_MSG is not set, the null pointer is never
    dereferenced.

    ---[ end trace a4de0bfcb38a3642 ]---
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000100
    IP: [] ip6_string+0x1e/0xa0
    PGD 4eead067 PUD 4e472067 PMD 0
    Oops: 0000 [#1] PREEMPT SMP
    Modules linked in:
    CPU 3
    Pid: 21324, comm: trinity-child11 Tainted: G W 3.7.0-rc7+ #61 ASUSTeK Computer INC. EB1012/EB1012
    RIP: 0010:[] [] ip6_string+0x1e/0xa0
    RSP: 0018:ffff88004e4637a0 EFLAGS: 00010046
    RAX: ffff88004e4637da RBX: ffff88004e4637da RCX: 0000000000000000
    RDX: ffffffff8246e92a RSI: 0000000000000100 RDI: ffff88004e4637da
    RBP: ffff88004e4637a8 R08: 000000000000ffff R09: 000000000000ffff
    R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8289d600
    R13: ffffffff8289d230 R14: ffffffff8246e928 R15: ffffffff8289d600
    FS: 00007fed95153700(0000) GS:ffff88005fd80000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000100 CR3: 000000004eeac000 CR4: 00000000000007e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process trinity-child11 (pid: 21324, threadinfo ffff88004e462000, task ffff8800524b0000)
    Stack:
    ffff88004e4637da ffff88004e463828 ffffffff81368eee 000000004e4637d8
    ffffffff0000ffff ffff88000000ffff 0000000000000000 000000004e4637f8
    ffffffff826285d8 ffff88004e4637f8 0000000000000000 ffff8800524b06b0
    Call Trace:
    [] ip6_addr_string.isra.11+0x3e/0xa0
    [] pointer.isra.12+0x233/0x2d0
    [] ? vprintk_emit+0x1ba/0x450
    [] ? trace_hardirqs_on_caller+0x10d/0x1a0
    [] vsnprintf+0x187/0x5d0
    [] vscnprintf+0x12/0x30
    [] vprintk_emit+0xa8/0x450
    [] printk+0x49/0x4b
    [] sctp_v6_get_dst+0x731/0x780
    [] ? sctp_v6_get_dst+0x325/0x780
    [] sctp_transport_route+0x46/0x120
    [] sctp_assoc_add_peer+0x161/0x350
    [] sctp_sendmsg+0x6cd/0xcb0
    [] ? inet_create+0x670/0x670
    [] inet_sendmsg+0x10b/0x220
    [] ? inet_create+0x670/0x670
    [] ? sock_update_classid+0xa4/0x2b0
    [] ? sock_update_classid+0xf0/0x2b0
    [] sock_sendmsg+0xdc/0xf0
    [] ? might_fault+0x85/0x90
    [] ? might_fault+0x3c/0x90
    [] sys_sendto+0xfa/0x130
    [] ? do_setitimer+0x197/0x380
    [] ? sysret_check+0x22/0x5d
    [] system_call_fastpath+0x16/0x1b
    Code: 01 eb 89 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 f8 31 c9 48 89 e5 53 eb 12 0f 1f 40 00 48 83 c1 01 48 83 c0 04 48 83 f9 08 74 70 b6 3c 4e 89 fb 83 e7 0f c0 eb 04 41 89 d8 41 83 e0 0f 0f b6
    RIP [] ip6_string+0x1e/0xa0
    RSP
    CR2: 0000000000000100
    ---[ end trace a4de0bfcb38a3643 ]---

    Signed-off-by: Tommi Rantala
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Tommi Rantala
     

30 Nov, 2012

1 commit


29 Nov, 2012

3 commits

  • The calculation of RTTVAR involves the subtraction of two unsigned
    numbers, which may cause rollover and result in very high values of
    RTTVAR when RTT > SRTT.
    With this patch it is possible to set RTOmin = 1 to get the minimum of RTO at
    4 times the clock granularity.

    Change Notes:

    v2)
    *Replaced abs() by abs64() and long by __s64, changed patch
    description.

    Signed-off-by: Christian Schoch
    CC: Vlad Yasevich
    CC: Sridhar Samudrala
    CC: Neil Horman
    CC: linux-sctp@vger.kernel.org
    Acked-by: Vlad Yasevich
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Schoch Christian
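
    The rollover described above is easy to reproduce outside the kernel. The
    sketch below is not the patched kernel function; the helper names are
    hypothetical, and the update rule is the RFC 4960 section 6.3.1 formula
    RTTVAR <- (1 - 1/4) * RTTVAR + 1/4 * |SRTT - R'| written with shifts.

```c
#include <stdint.h>
#include <stdlib.h>

/* Problematic form: unsigned subtraction wraps around when rtt > srtt. */
static uint32_t rtt_delta_unsigned(uint32_t srtt, uint32_t rtt)
{
    return srtt - rtt;
}

/* Safer form: widen to signed 64-bit first, then take the magnitude. */
static uint32_t rtt_delta_signed(uint32_t srtt, uint32_t rtt)
{
    int64_t diff = (int64_t)srtt - (int64_t)rtt;

    return (uint32_t)llabs(diff);
}

/* RTTVAR update with beta = 1/4, using the signed difference. */
static uint32_t update_rttvar(uint32_t rttvar, uint32_t srtt, uint32_t rtt)
{
    return rttvar - (rttvar >> 2) + (rtt_delta_signed(srtt, rtt) >> 2);
}
```

    With srtt = 100 and rtt = 150, the unsigned difference is 4294967246
    instead of 50, which is what inflates RTTVAR.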
     
  • Consider the following program, that sets the second argument to the
    sendto() syscall incorrectly:

    #include <string.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void)
    {
            int fd;
            struct sockaddr_in sa;

            fd = socket(AF_INET, SOCK_STREAM, 132 /*IPPROTO_SCTP*/);
            if (fd < 0)
                    return 1;

            memset(&sa, 0, sizeof(sa));
            sa.sin_family = AF_INET;
            sa.sin_addr.s_addr = inet_addr("127.0.0.1");
            sa.sin_port = htons(11111);

            /* NULL buffer with a non-zero length: the invalid argument. */
            sendto(fd, NULL, 1, 0, (struct sockaddr *)&sa, sizeof(sa));

            return 0;
    }

    We get -ENOMEM:

    $ strace -e sendto ./demo
    sendto(3, NULL, 1, 0, {sa_family=AF_INET, sin_port=htons(11111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ENOMEM (Cannot allocate memory)

    Propagate the error code from sctp_user_addto_chunk(), so that we will
    tell user space what actually went wrong:

    $ strace -e sendto ./demo
    sendto(3, NULL, 1, 0, {sa_family=AF_INET, sin_port=htons(11111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EFAULT (Bad address)

    Noticed while running Trinity (the syscall fuzzer).

    Signed-off-by: Tommi Rantala
    Acked-by: Vlad Yasevich
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Tommi Rantala
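
    The difference between collapsing and propagating an error code can be
    sketched as follows. All three functions are hypothetical user-space
    stand-ins (copy_payload() plays the role of the copy step that fails on a
    bad pointer); they are not the kernel functions named in the commit.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the copy step: fails with -EFAULT on a bad source pointer. */
static int copy_payload(void *dst, const void *src, size_t len)
{
    if (src == NULL)
        return -EFAULT;
    memcpy(dst, src, len);
    return 0;
}

/* Old behaviour: every failure is reported to the caller as -ENOMEM. */
static int send_collapsing_errors(void *dst, const void *src, size_t len)
{
    if (copy_payload(dst, src, len) < 0)
        return -ENOMEM;
    return 0;
}

/* New behaviour: hand the callee's error code back unchanged. */
static int send_propagating_errors(void *dst, const void *src, size_t len)
{
    int err = copy_payload(dst, src, len);

    if (err < 0)
        return err;
    return 0;
}
```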
     
  • Trinity (the syscall fuzzer) discovered a memory leak in SCTP,
    reproducible e.g. with the sendto() syscall by passing invalid
    user space pointer in the second argument:

    #include <string.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void)
    {
            int fd;
            struct sockaddr_in sa;

            fd = socket(AF_INET, SOCK_STREAM, 132 /*IPPROTO_SCTP*/);
            if (fd < 0)
                    return 1;

            memset(&sa, 0, sizeof(sa));
            sa.sin_family = AF_INET;
            sa.sin_addr.s_addr = inet_addr("127.0.0.1");
            sa.sin_port = htons(11111);

            /* Invalid user space pointer in the second argument. */
            sendto(fd, NULL, 1, 0, (struct sockaddr *)&sa, sizeof(sa));

            return 0;
    }

    As far as I can tell, the leak has been around since ~2003.

    Signed-off-by: Tommi Rantala
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Tommi Rantala
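
    The general shape of such a leak, and of its fix, is an error path that
    returns after an allocation without releasing it. The sketch below is a
    user-space illustration only: struct msg, msg_alloc(), msg_free(),
    msg_from_user() and the live_msgs counter are hypothetical and exist just
    to make the cleanup observable.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct msg {
    size_t len;
    unsigned char *data;
};

static int live_msgs; /* allocation counter, for demonstration only */

static struct msg *msg_alloc(size_t len)
{
    struct msg *m = malloc(sizeof(*m));

    if (m == NULL)
        return NULL;
    m->data = malloc(len ? len : 1);
    if (m->data == NULL) {
        free(m);
        return NULL;
    }
    m->len = len;
    live_msgs++;
    return m;
}

static void msg_free(struct msg *m)
{
    if (m == NULL)
        return;
    free(m->data);
    free(m);
    live_msgs--;
}

/* Build a message from a caller buffer. If the copy step fails after the
 * allocation succeeded, the message is released before returning. */
static struct msg *msg_from_user(const void *src, size_t len, int *err)
{
    struct msg *m = msg_alloc(len);

    if (m == NULL) {
        *err = -ENOMEM;
        return NULL;
    }
    if (src == NULL) {  /* stand-in for a failed copy from user space */
        msg_free(m);    /* without this call, m would leak */
        *err = -EFAULT;
        return NULL;
    }
    memcpy(m->data, src, len);
    *err = 0;
    return m;
}
```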
     

21 Nov, 2012

1 commit

  • In the event that an association exceeds its max_retrans attempts, we should
    send an ABORT chunk indicating that we are closing the association as a result.
    Because of the nature of the error, it's unlikely to be received, but it's a nice
    clean way to close the association if it does make it through, and it will give
    anyone watching via tcpdump a clue as to what happened.

    Change notes:
    v2)
    * Removed erroneous changes from sctp_make_violation_parmlen

    Signed-off-by: Neil Horman
    CC: Vlad Yasevich
    CC: "David S. Miller"
    CC: linux-sctp@vger.kernel.org
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Neil Horman
     

19 Nov, 2012

2 commits


18 Nov, 2012

2 commits


16 Nov, 2012

1 commit

  • Commit 13d782f ("sctp: Make the proc files per network namespace.")
    changed the /proc/net/sctp/ struct file_operations opener functions to
    use single_open_net() and seq_open_net().

    Avoid leaking memory by using single_release_net() and seq_release_net()
    as the release functions.

    Discovered with Trinity (the syscall fuzzer).

    Signed-off-by: Tommi Rantala
    Acked-by: Neil Horman
    Cc: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Tommi Rantala
     

11 Nov, 2012

1 commit


04 Nov, 2012

1 commit

  • Lots of points in the sctp_cmd_interpreter function treat the sctp_cmd_t arg as
    a void pointer, even though they are written as various other types. There's no
    need for this as doing so just leads to possible type-punning issues that could
    cause crashes, and if we remain type-consistent we can actually just remove the
    void * member of the union entirely.

    Change Notes:

    v2)
    * Dropped chunk that modified SCTP_NULL to create a marker pattern
    should anyone try to use a SCTP_NULL() assigned sctp_arg_t. Assigning
    to .zero provides the same effect and should be faster, per Vlad Y.

    v3)
    * Reverted part of V2, opting to use memset instead of .zero, so that
    the entire union is initialized, thus avoiding the ia64 speculative load
    problems previously encountered, per Dave M. Also rewrote
    SCTP_[NO]FORCE so as to use common infrastructure a little more.

    Signed-off-by: Neil Horman
    CC: "David S. Miller"
    CC: linux-sctp@vger.kernel.org
    Signed-off-by: David S. Miller

    Neil Horman
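
    The idea of a type-consistent argument union can be sketched like this.
    It is an assumption-laden illustration, not the kernel's sctp_arg_t:
    cmd_arg_t and its constructors are hypothetical, there is deliberately no
    void * member, every value is written and read through the same typed
    member, and the whole union is cleared with memset before a member is set.

```c
#include <stdint.h>
#include <string.h>

typedef union {
    int32_t i32;
    uint32_t u32;
    uint16_t u16;
    const char *str;
} cmd_arg_t;

/* Clear the entire union, not just one member. */
static cmd_arg_t cmd_arg_null(void)
{
    cmd_arg_t a;

    memset(&a, 0, sizeof(a));
    return a;
}

/* One constructor per member keeps writers and readers type-consistent. */
static cmd_arg_t cmd_arg_i32(int32_t v)
{
    cmd_arg_t a = cmd_arg_null();

    a.i32 = v;
    return a;
}

static cmd_arg_t cmd_arg_u16(uint16_t v)
{
    cmd_arg_t a = cmd_arg_null();

    a.u16 = v;
    return a;
}

static cmd_arg_t cmd_arg_str(const char *s)
{
    cmd_arg_t a = cmd_arg_null();

    a.str = s;
    return a;
}
```

    A consumer that expects a 16-bit value reads arg.u16 and nothing else, so
    the compiler can check the types on both sides.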
     

01 Nov, 2012

1 commit


26 Oct, 2012

1 commit

  • Currently sctp allows for the optional use of md5 or sha1 hmac algorithms to
    generate cookie values when establishing new connections via two build time
    config options. There's no real reason to make this a static selection. We can
    add a sysctl that allows for the dynamic selection of these algorithms at run
    time, with the default value determined by the corresponding crypto library
    availability.
    This comes in handy when, for example, running a system in FIPS mode, where use
    of md5 is disallowed, but SHA1 is permitted.

    Note: This new sysctl has no corresponding socket option to select the cookie
    hmac algorithm. I chose not to implement that intentionally, as RFC 6458
    contains no option for this value, and I opted not to pollute the socket option
    namespace.

    Change notes:
    v2)
    * Updated subject to have the proper sctp prefix as per Dave M.
    * Replaced default selection options with new options that allow
    developers to explicitly select available hmac algs at build time
    as per suggestion by Vlad Y.

    Signed-off-by: Neil Horman
    CC: Vlad Yasevich
    CC: "David S. Miller"
    CC: netdev@vger.kernel.org
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Neil Horman
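
    Run-time selection of this kind boils down to mapping a user-supplied
    name onto an algorithm and rejecting what policy forbids. The sketch below
    is a hypothetical user-space model: the enum, select_cookie_hmac(), the
    md5_allowed flag and the accepted strings "none", "md5" and "sha1" are
    assumptions for illustration, not the sysctl handler itself.

```c
#include <string.h>

enum cookie_hmac {
    COOKIE_HMAC_NONE,
    COOKIE_HMAC_MD5,
    COOKIE_HMAC_SHA1
};

/* Parse an algorithm name. Returns 0 and stores the choice on success,
 * -1 if the name is unknown or if md5 is requested while md5_allowed is 0
 * (modelling a FIPS-like policy). *out is left untouched on failure. */
static int select_cookie_hmac(const char *name, int md5_allowed,
                              enum cookie_hmac *out)
{
    if (strcmp(name, "none") == 0) {
        *out = COOKIE_HMAC_NONE;
    } else if (strcmp(name, "sha1") == 0) {
        *out = COOKIE_HMAC_SHA1;
    } else if (strcmp(name, "md5") == 0) {
        if (!md5_allowed)
            return -1;
        *out = COOKIE_HMAC_MD5;
    } else {
        return -1;
    }
    return 0;
}
```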
     

17 Oct, 2012

1 commit


06 Oct, 2012

1 commit

  • Pull networking changes from David Miller:
    "The most important bit in here is the fix for input route caching from
    Eric Dumazet, it's a shame we couldn't fully analyze this in time for
    3.6 as it's a 3.6 regression introduced by the routing cache removal.

    Anyways, will send quickly to -stable after you pull this in.

    Other changes of note:

    1) Fix lockdep splats in team and bonding, from Eric Dumazet.

    2) IPV6 adds link local route even when there is no link local
    address, from Nicolas Dichtel.

    3) Fix ixgbe PTP implementation, from Jacob Keller.

    4) Fix excessive stack usage in cxgb4 driver, from Vipul Pandya.

    5) MAC length computed improperly in VLAN demux, from Antonio
    Quartulli."

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
    ipv6: release reference of ip6_null_entry's dst entry in __ip6_del_rt
    Remove noisy printks from llcp_sock_connect
    tipc: prevent dropped connections due to rcvbuf overflow
    silence some noisy printks in irda
    team: set qdisc_tx_busylock to avoid LOCKDEP splat
    bonding: set qdisc_tx_busylock to avoid LOCKDEP splat
    sctp: check src addr when processing SACK to update transport state
    sctp: fix a typo in prototype of __sctp_rcv_lookup()
    ipv4: add a fib_type to fib_info
    can: mpc5xxx_can: fix section type conflict
    can: peak_pcmcia: fix error return code
    can: peak_pci: fix error return code
    cxgb4: Fix build error due to missing linux/vmalloc.h include.
    bnx2x: fix ring size for 10G functions
    cxgb4: Dynamically allocate memory in t4_memory_rw() and get_vpd_params()
    ixgbe: add support for X540-AT1
    ixgbe: fix poll loop for FDIRCTRL.INIT_DONE bit
    ixgbe: fix PTP ethtool timestamping function
    ixgbe: (PTP) Fix PPS interrupt code
    ixgbe: Fix PTP X540 SDP alignment code for PPS signal
    ...

    Linus Torvalds
     

05 Oct, 2012

2 commits

  • Suppose we have an SCTP connection with two paths. After connection is
    established, path1 is not available, thus this path is marked as inactive. Then
    traffic goes through path2, but for some reasons packets are delayed (after
    rto.max). Because packets are delayed, the retransmit mechanism will switch
    again to path1. At this time, we receive a delayed SACK from path2. When we
    update the state of the path in sctp_check_transmitted(), we do not take into
    account the source address of the SACK, hence we update the wrong path.

    Signed-off-by: Nicolas Dichtel
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Nicolas Dichtel
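
    The fix amounts to comparing the SACK's source address with the path
    being updated. A minimal sketch, assuming IPv4 addresses held as plain
    integers; struct path, enum path_state and sack_update_path() are
    hypothetical and much simpler than the kernel's transport handling.

```c
#include <stdint.h>

enum path_state { PATH_INACTIVE, PATH_ACTIVE };

struct path {
    uint32_t peer_addr;     /* address of this path, as a plain integer */
    enum path_state state;
};

/* Mark a path active only when the SACK really arrived from that path's
 * address. A SACK received over another path proves nothing about it. */
static void sack_update_path(struct path *p, uint32_t sack_src_addr)
{
    if (p->peer_addr == sack_src_addr)
        p->state = PATH_ACTIVE;
}
```

    In the scenario above, a delayed SACK from path2 then leaves the inactive
    path1 untouched.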
     
  • Just to avoid confusion when people only read this prototype.

    Signed-off-by: Nicolas Dichtel
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     

03 Oct, 2012

1 commit

  • Pull vfs update from Al Viro:

    - big one - consolidation of descriptor-related logics; almost all of
    that is moved to fs/file.c

    (BTW, I'm seriously tempted to rename the result to fd.c. As it is,
    we have a situation when file_table.c is about handling of struct
    file and file.c is about handling of descriptor tables; the reasons
    are historical - file_table.c used to be about a static array of
    struct file we used to have way back).

    A lot of stray ends got cleaned up and converted to saner primitives,
    disgusting mess in android/binder.c is still disgusting, but at least
    doesn't poke so much in descriptor table guts anymore. A bunch of
    relatively minor races got fixed in process, plus an ext4 struct file
    leak.

    - related thing - fget_light() partially unuglified; see fdget() in
    there (and yes, it generates the code as good as we used to have).

    - also related - bits of Cyrill's procfs stuff that got entangled into
    that work; _not_ all of it, just the initial move to fs/proc/fd.c and
    switch of fdinfo to seq_file.

    - Alex's fs/coredump.c splitoff - the same story, had been easier to
    take that commit than mess with conflicts. The rest is a separate
    pile, this was just a mechanical code movement.

    - a few misc patches all over the place. Not all for this cycle,
    there'll be more (and quite a few currently sit in akpm's tree)."

    Fix up trivial conflicts in the android binder driver, and some fairly
    simple conflicts due to two different changes to the sock_alloc_file()
    interface ("take descriptor handling from sock_alloc_file() to callers"
    vs "net: Providing protocol type via system.sockprotoname xattr of
    /proc/PID/fd entries" adding a dentry name to the socket)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
    MAX_LFS_FILESIZE should be a loff_t
    compat: fs: Generic compat_sys_sendfile implementation
    fs: push rcu_barrier() from deactivate_locked_super() to filesystems
    btrfs: reada_extent doesn't need kref for refcount
    coredump: move core dump functionality into its own file
    coredump: prevent double-free on an error path in core dumper
    usb/gadget: fix misannotations
    fcntl: fix misannotations
    ceph: don't abuse d_delete() on failure exits
    hypfs: ->d_parent is never NULL or negative
    vfs: delete surplus inode NULL check
    switch simple cases of fget_light to fdget
    new helpers: fdget()/fdput()
    switch o2hb_region_dev_write() to fget_light()
    proc_map_files_readdir(): don't bother with grabbing files
    make get_file() return its argument
    vhost_set_vring(): turn pollstart/pollstop into bool
    switch prctl_set_mm_exe_file() to fget_light()
    switch xfs_find_handle() to fget_light()
    switch xfs_swapext() to fget_light()
    ...

    Linus Torvalds
     

27 Sep, 2012

1 commit

  • Both modular callers of sock_map_fd() had been buggy; sctp one leaks
    descriptor and file if copy_to_user() fails, 9p one shouldn't be
    exposing file in the descriptor table at all.

    Switch both to sock_alloc_file(), export it, unexport sock_map_fd() and
    make it static.

    Signed-off-by: Al Viro

    Al Viro
     

15 Sep, 2012

1 commit

  • Conflicts:
    net/netfilter/nfnetlink_log.c
    net/netfilter/xt_LOG.c

    Rather easy conflict resolution, the 'net' tree had bug fixes to make
    sure we checked if a socket is a time-wait one or not and elide the
    logging code if so.

    Whereas on the 'net-next' side we are calculating the UID and GID from
    the creds using different interfaces due to the user namespace changes
    from Eric Biederman.

    Signed-off-by: David S. Miller

    David S. Miller
     

05 Sep, 2012

1 commit


04 Sep, 2012

1 commit

  • SCTP charges wmem_alloc via sctp_set_owner_w() in sctp_sendmsg() and via
    skb_set_owner_w() in sctp_packet_transmit(). If a sender runs out of
    sndbuf it will sleep in sctp_wait_for_sndbuf() and expects to be woken up
    by __sctp_write_space().

    Buffer space charged via sctp_set_owner_w() is released in sctp_wfree()
    which calls __sctp_write_space() directly.

    Buffer space charged via skb_set_owner_w() is released via sock_wfree()
    which calls sk->sk_write_space() _if_ SOCK_USE_WRITE_QUEUE is not set.
    sctp_endpoint_init() sets SOCK_USE_WRITE_QUEUE on all sockets.

    Therefore if sctp_packet_transmit() manages to queue up more than sndbuf
    bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is
    interrupted by a signal.

    This could be fixed by clearing the SOCK_USE_WRITE_QUEUE flag but ...

    Charging for the data twice does not make sense in the first place, it
    leads to overcharging sndbuf by a factor of 2. Therefore this patch only
    charges a single byte in wmem_alloc when transmitting an SCTP packet to
    ensure that the socket stays alive until the packet has been released.

    This means that control chunks are no longer accounted for in wmem_alloc
    which I believe is not a problem as skb->truesize will typically lead
    to overcharging anyway and thus compensates for any control overhead.

    Signed-off-by: Thomas Graf
    CC: Vlad Yasevich
    CC: Neil Horman
    CC: David Miller
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Thomas Graf
     

25 Aug, 2012

1 commit