12 Jan, 2016

1 commit


11 Jan, 2016

1 commit


16 Dec, 2015

1 commit

  • A pf_retrans value greater than or equal to max_retrans_path
    disables the pf state. Both pf_retrans and max_retrans_path
    can be changed by a userspace application.

    Sometimes the user expects the pf state to be disabled, but the two
    variables are changed in a way that enables it. So it is necessary
    to introduce a new variable that disables the pf state.

    According to the suggestions from Vlad Yasevich, extra1 and extra2
    are removed. The initialization of pf_enable is added.

    Acked-by: Vlad Yasevich
    Signed-off-by: Zhu Yanjun
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Zhu Yanjun
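
The relationship described in this message can be sketched in a few lines of userspace C. This is only an illustration: the struct and the function `pf_state_reachable` are hypothetical names, not the kernel's own code, and the sketch assumes that the pf state is usable only when the new switch is on and pf_retrans stays below max_retrans_path.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-namespace SCTP settings. */
struct sctp_pf_config {
    bool pf_enable;        /* the new explicit on/off switch */
    int pf_retrans;        /* errors tolerated before the pf state */
    int max_retrans_path;  /* errors tolerated before the path is inactive */
};

/* The pf state can be entered only when it is switched on and the
 * two thresholds leave room for it. */
bool pf_state_reachable(const struct sctp_pf_config *cfg)
{
    return cfg->pf_enable && cfg->pf_retrans < cfg->max_retrans_path;
}
```

With the explicit switch, later changes to the two thresholds can no longer re-enable the pf state by accident.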
     

25 Mar, 2015

1 commit


03 Jul, 2014

1 commit


20 Jun, 2014

1 commit

  • When writing to the sysctl field net.sctp.auth_enable, it can well
    be that the user buffer we handed over to proc_dointvec() via
    proc_sctp_do_auth() handler contains something other than integers.

    In that case, we would set an uninitialized 4-byte value from the
    stack to net->sctp.auth_enable that can be leaked back when reading
    the sysctl variable, and it can unintentionally turn auth_enable
    on/off based on the stack content since auth_enable is interpreted
    as a boolean.

    Fix it up by making sure proc_dointvec() returned successfully.

    Fixes: b14878ccb7fa ("net: sctp: cache auth_enable per endpoint")
    Reported-by: Florian Westphal
    Signed-off-by: Daniel Borkmann
    Acked-by: Neil Horman
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Daniel Borkmann
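
The pattern behind this fix can be sketched in userspace C. The helper `parse_int` is a hypothetical stand-in for proc_dointvec() and `write_auth_enable` is an invented name. The point is only that the temporary is copied into the setting after the parser has reported success, never before.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical parser: fills *out only on success and returns a
 * negative errno value otherwise. */
int parse_int(const char *buf, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(buf, &end, 10);
    if (end == buf || *end != '\0')
        return -EINVAL;              /* not an integer at all */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -ERANGE;              /* integer, but out of range */
    *out = (int)v;
    return 0;
}

/* Commit the parsed value only when parsing succeeded, so that an
 * uninitialized temporary never reaches the setting. */
int write_auth_enable(const char *buf, int *auth_enable)
{
    int new_value;   /* left uninitialized on purpose, like a stack slot */
    int ret = parse_int(buf, &new_value);

    if (ret == 0)
        *auth_enable = new_value ? 1 : 0;
    return ret;      /* the parser's own error is passed on */
}
```

Writing "abc" through this sketch returns -EINVAL and leaves the previous setting untouched.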
     

19 Jun, 2014

1 commit

  • The sysctl handlers proc_sctp_do_hmac_alg(), proc_sctp_do_rto_min() and
    proc_sctp_do_rto_max() do not properly reflect some error cases
    when writing values via sysctl from internal proc functions such
    as proc_dointvec() and proc_dostring().

    In all these cases we pass the test for write != 0 and partially
    do additional work just to notice that additional sanity checks
    fail and we return with hard-coded -EINVAL while proc_do*
    functions might also return different errors. So fix this up by
    simply testing a successful return of proc_do* right after
    calling it.

    This also allows its return value to be propagated onwards to the user.
    While touching this, also fix up some minor style issues.

    Fixes: 4f3fdf3bc59c ("sctp: add check rto_min and rto_max in sysctl")
    Fixes: 3c68198e7511 ("sctp: Make hmac algorithm selection for cookie generation dynamic")
    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

15 Jun, 2014

1 commit

  • Commit 3fd091e73b81 ("[SCTP]: Remove multiple levels of msecs
    to jiffies conversions.") has silently changed permissions for
    rto_alpha and rto_beta knobs from 0644 to 0444. The purpose of
    this was to discourage users from tweaking rto_alpha and
    rto_beta knobs in production environments since they are key
    to correctly compute rtt/srtt.

    RFC4960 under section 6.3.1. RTO Calculation says regarding
    rto_alpha and rto_beta under rule C3 and C4:

    [...]
    C3) When a new RTT measurement R' is made, set

    RTTVAR
    Cc: Vlad Yasevich
    Signed-off-by: David S. Miller

    Daniel Borkmann
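
For reference, rule C3 of RFC 4960 section 6.3.1, which the message above begins to quote, updates RTTVAR and SRTT from a new measurement R' and then recomputes the RTO. The sketch below writes that rule in floating point. The struct and function names are hypothetical, and the kernel's own fixed-point arithmetic is not reproduced here.

```c
/* Hypothetical holder for the smoothed estimates (any time unit). */
struct rto_state {
    double srtt;    /* smoothed round-trip time */
    double rttvar;  /* round-trip time variation */
    double rto;     /* retransmission timeout */
};

/* RFC 4960, 6.3.1, rule C3:
 *   RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'|
 *   SRTT   <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R'
 *   RTO    <- SRTT + 4 * RTTVAR
 * RTTVAR is updated first, using the old SRTT. */
void rto_update(struct rto_state *s, double r, double alpha, double beta)
{
    double diff = s->srtt > r ? s->srtt - r : r - s->srtt;

    s->rttvar = (1.0 - beta) * s->rttvar + beta * diff;
    s->srtt = (1.0 - alpha) * s->srtt + alpha * r;
    s->rto = s->srtt + 4.0 * s->rttvar;
}
```

Because alpha and beta weight every new sample, changing them alters how quickly the estimates follow the network, which is why the message calls them key to computing rtt/srtt correctly.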
     

10 May, 2014

2 commits

  • When register_net_sysctl failed, we should free the
    sysctl_table.

    Signed-off-by: Wang Weidong
    Signed-off-by: David S. Miller

    wangweidong
     
  • This reverts commit efb842c45 ("sctp: optimize the sctp_sysctl_net_register").
    Since that commit doesn't kmemdup a sysctl_table for init_net, the
    init_net->sctp.sysctl_header->ctl_table_arg points to sctp_net_table,
    which is a static array. So when doing sctp_sysctl_net_unregister,
    it will free sctp_net_table, and we get a NULL pointer dereference
    like this:

    [ 262.948220] BUG: unable to handle kernel NULL pointer dereference at 000000000000006c
    [ 262.948232] IP: [] kfree+0x80/0x420
    [ 262.948260] PGD db80a067 PUD dae12067 PMD 0
    [ 262.948268] Oops: 0000 [#1] SMP
    [ 262.948273] Modules linked in: sctp(-) crc32c_generic libcrc32c
    ...
    [ 262.948338] task: ffff8800db830190 ti: ffff8800dad00000 task.ti: ffff8800dad00000
    [ 262.948344] RIP: 0010:[] [] kfree+0x80/0x420
    [ 262.948353] RSP: 0018:ffff8800dad01d88 EFLAGS: 00010046
    [ 262.948358] RAX: 0100000000000000 RBX: ffffffffa0227940 RCX: ffffea0000707888
    [ 262.948363] RDX: ffffea0000707888 RSI: 0000000000000001 RDI: ffffffffa0227940
    [ 262.948369] RBP: ffff8800dad01de8 R08: 0000000000000000 R09: ffff8800d9e983a9
    [ 262.948374] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0227940
    [ 262.948380] R13: ffffffff8187cfc0 R14: 0000000000000000 R15: ffffffff8187da10
    [ 262.948386] FS: 00007fa2a2658700(0000) GS:ffff880112800000(0000) knlGS:0000000000000000
    [ 262.948394] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    [ 262.948400] CR2: 000000000000006c CR3: 00000000cddc0000 CR4: 00000000000006e0
    [ 262.948410] Stack:
    [ 262.948413] ffff8800dad01da8 0000000000000286 0000000020227940 ffffffffa0227940
    [ 262.948422] ffff8800dad01dd8 ffffffff811b7fa1 ffffffffa0227940 ffffffffa0227940
    [ 262.948431] ffffffff8187d960 ffffffff8187cfc0 ffffffff8187d960 ffffffff8187da10
    [ 262.948440] Call Trace:
    [ 262.948457] [] ? unregister_sysctl_table+0x51/0xa0
    [ 262.948476] [] sctp_sysctl_net_unregister+0x21/0x30 [sctp]
    [ 262.948490] [] sctp_net_exit+0x12d/0x150 [sctp]
    [ 262.948512] [] ops_exit_list+0x39/0x60
    [ 262.948522] [] unregister_pernet_operations+0x3d/0x70
    [ 262.948530] [] unregister_pernet_subsys+0x22/0x40
    [ 262.948544] [] sctp_exit+0x3c/0x12d [sctp]
    [ 262.948562] [] SyS_delete_module+0x194/0x210
    [ 262.948577] [] ? trace_hardirqs_on_thunk+0x3a/0x3f
    [ 262.948587] [] system_call_fastpath+0x16/0x1b

    With this revert, the Oops no longer occurs.

    Signed-off-by: Wang Weidong
    Signed-off-by: David S. Miller

    wangweidong
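
The ownership rule behind the two commits of this day can be sketched in userspace C, with malloc()/memcpy() standing in for kmemdup(). All names are hypothetical. The sketch assumes every namespace gets its own heap copy of the table, so that freeing on the error path and on unregister never touches the static template.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct ctl_entry { const char *name; int *data; };

static int template_rto_min = 1000;

/* Static template: must never be passed to free(). */
static const struct ctl_entry template_table[] = {
    { "rto_min", &template_rto_min },
    { NULL, NULL },
};

struct net_ns { struct ctl_entry *table; int rto_min; };

/* registration_fails models a failing register_net_sysctl() call. */
int ns_sysctl_register(struct net_ns *ns, int registration_fails)
{
    struct ctl_entry *copy = malloc(sizeof(template_table));

    if (!copy)
        return -ENOMEM;
    memcpy(copy, template_table, sizeof(template_table));
    copy[0].data = &ns->rto_min;      /* point at per-namespace data */

    if (registration_fails) {
        free(copy);                   /* do not leak the copy on failure */
        return -ENOMEM;
    }
    ns->table = copy;
    return 0;
}

void ns_sysctl_unregister(struct net_ns *ns)
{
    free(ns->table);                  /* always a heap copy in this sketch */
    ns->table = NULL;
}
```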
     

19 Apr, 2014

1 commit

  • Currently, it is possible to create an SCTP socket, then switch
    auth_enable via sysctl setting to 1 and crash the system on connect:

    Oops[#1]:
    CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1
    task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000
    [...]
    Call Trace:
    [] sctp_auth_asoc_set_default_hmac+0x68/0x80
    [] sctp_process_init+0x5e0/0x8a4
    [] sctp_sf_do_5_1B_init+0x234/0x34c
    [] sctp_do_sm+0xb4/0x1e8
    [] sctp_endpoint_bh_rcv+0x1c4/0x214
    [] sctp_rcv+0x588/0x630
    [] sctp6_rcv+0x10/0x24
    [] ip6_input+0x2c0/0x440
    [] __netif_receive_skb_core+0x4a8/0x564
    [] process_backlog+0xb4/0x18c
    [] net_rx_action+0x12c/0x210
    [] __do_softirq+0x17c/0x2ac
    [] irq_exit+0x54/0xb0
    [] ret_from_irq+0x0/0x4
    [] rm7k_wait_irqoff+0x24/0x48
    [] cpu_startup_entry+0xc0/0x148
    [] start_kernel+0x37c/0x398
    Code: dd0900b8 000330f8 0126302d 50c0fff1 0047182a a48306a0
    03e00008 00000000
    ---[ end trace b530b0551467f2fd ]---
    Kernel panic - not syncing: Fatal exception in interrupt

    What happens while auth_enable=0 in that case is, that
    ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs()
    when endpoint is being created.

    After that point, if an admin switches over to auth_enable=1,
    the machine can crash due to NULL pointer dereference during
    reception of an INIT chunk. When we enter sctp_process_init()
    via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk,
    the INIT verification succeeds and while we walk and process
    all INIT params via sctp_process_param() we find that
    net->sctp.auth_enable is set, therefore do not fall through,
    but invoke sctp_auth_asoc_set_default_hmac() instead, and thus,
    dereference what we have set to NULL during endpoint
    initialization phase.

    The fix is to make auth_enable immutable by caching its value
    during endpoint initialization, so that its original value is
    being carried along until destruction. The bug seems to originate
    from the very first days.

    Fix in joint work with Daniel Borkmann.

    Reported-by: Joshua Kinard
    Signed-off-by: Vlad Yasevich
    Signed-off-by: Daniel Borkmann
    Acked-by: Neil Horman
    Tested-by: Joshua Kinard
    Signed-off-by: David S. Miller

    Vlad Yasevich
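
The caching idea can be sketched in userspace C as follows. The names are hypothetical and the logic is reduced to the essentials: the endpoint copies the global setting once at creation, and later processing consults only that copy, so flipping the global switch afterwards cannot lead to a dereference of data that was never set up.

```c
#include <stddef.h>

int global_auth_enable;  /* stands in for net->sctp.auth_enable */

struct endpoint {
    int auth_enable;         /* cached at creation, constant afterwards */
    const char *auth_hmacs;  /* set up only when auth was enabled */
};

void endpoint_init(struct endpoint *ep)
{
    ep->auth_enable = global_auth_enable;
    ep->auth_hmacs = ep->auth_enable ? "hmac-list" : NULL;
}

/* Returns 1 if the auth data was used, 0 if auth processing was skipped. */
int process_init(const struct endpoint *ep)
{
    if (!ep->auth_enable)    /* decided by the cached value, not the global */
        return 0;
    return ep->auth_hmacs[0] != '\0';
}
```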
     

14 Feb, 2014

2 commits

  • Here, when the net is init_net, we needn't kmemdup the ctl_table
    again. So add a check for net. This also saves some memory.

    Signed-off-by: Wang Weidong
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    wangweidong
     
  • Commit 3c68198e75111a90 ("sctp: Make hmac algorithm selection for
    cookie generation dynamic") missed the .data initialization.
    If net namespaces are not used, the problem, that parts of the
    sysctl configuration are not isolated, does not occur.

    In sctp_sysctl_net_register(), we register the sysctl table for each
    net. The for() loop uses 'table[i].data' as its condition, so
    when 'i' is the index of sctp_hmac_alg, the data is NULL and the
    loop breaks. So add the .data initialization.

    Acked-by: Neil Horman
    Signed-off-by: Wang Weidong
    Signed-off-by: David S. Miller

    wangweidong
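
Why a missing .data stops the loop early can be seen in a small userspace sketch. The struct and function names are hypothetical. Only the loop condition mirrors the 'table[i].data' check described above.

```c
#include <stddef.h>

struct knob { const char *procname; void *data; };

/* Counts the entries visited by a loop that runs while .data is set. */
size_t entries_visited(const struct knob *table)
{
    size_t i;

    for (i = 0; table[i].data; i++)
        ;   /* a real loop would rebase table[i].data per namespace here */
    return i;
}
```

With an entry in the middle of the table whose .data is NULL, every entry after it is silently skipped, so those entries keep pointing at shared rather than per-namespace data.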
     

27 Dec, 2013

1 commit


19 Dec, 2013

1 commit


11 Dec, 2013

2 commits

  • Fix up the spacing of proc_sctp_do_hmac_alg to match
    proc_sctp_do_rto_min[max] in sysctl.c.

    Suggested-by: Daniel Borkmann
    Signed-off-by: Wang Weidong
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    wangweidong
     
  • rto_min should be smaller than rto_max while rto_max should be larger
    than rto_min. Add two proc_handlers to check this.

    Suggested-by: Vlad Yasevich
    Signed-off-by: Wang Weidong
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    wangweidong
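
A minimal sketch of the two checks, in userspace C with hypothetical names. It assumes that equal values are acceptable and that rto_min must be at least 1; neither bound is stated in the message above.

```c
#include <errno.h>

struct rto_config { int rto_min; int rto_max; };

/* Accept a new rto_min only if it stays within [1, rto_max]. */
int set_rto_min(struct rto_config *cfg, int value)
{
    if (value < 1 || value > cfg->rto_max)
        return -EINVAL;
    cfg->rto_min = value;
    return 0;
}

/* Accept a new rto_max only if it is not below rto_min. */
int set_rto_max(struct rto_config *cfg, int value)
{
    if (value < cfg->rto_min)
        return -EINVAL;
    cfg->rto_max = value;
    return 0;
}
```

A rejected write leaves the stored value unchanged, so the pair can never end up with rto_min above rto_max.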
     

07 Dec, 2013

1 commit

  • Several files refer to an old address for the Free Software Foundation
    in the file header comment. Resolve by replacing the address with
    the URL so that we do not have to keep
    updating the header comments anytime the address changes.

    CC: Vlad Yasevich
    CC: Neil Horman
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jeff Kirsher
     

10 Aug, 2013

3 commits


25 Jul, 2013

1 commit

  • The SCTP mailing list address to send patches or questions
    to is linux-sctp@vger.kernel.org and not
    lksctp-developers@lists.sourceforge.net anymore. Therefore,
    update all occurrences.

    Signed-off-by: Daniel Borkmann
    Acked-by: Neil Horman
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

13 Jun, 2013

1 commit

  • Reduce the uses of this unnecessary typedef.

    Done via perl script:

    $ git grep --name-only -w ctl_table net | \
    xargs perl -p -i -e '\
    sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \
    s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge'

    Reflow the modified lines that now exceed 80 columns.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

28 Jan, 2013

1 commit

  • Per-net sysctl table needs to be explicitly freed at
    net exit. Otherwise we see the following with kmemleak:

    unreferenced object 0xffff880402d08000 (size 2048):
    comm "chrome_sandbox", pid 18437, jiffies 4310887172 (age 9097.630s)
    hex dump (first 32 bytes):
    b2 68 89 81 ff ff ff ff 20 04 04 f8 01 88 ff ff .h...... .......
    04 00 00 00 a4 01 00 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] kmemleak_alloc+0x21/0x3e
    [] slab_post_alloc_hook+0x28/0x2a
    [] __kmalloc_track_caller+0xf1/0x104
    [] kmemdup+0x1b/0x30
    [] sctp_sysctl_net_register+0x1f/0x72
    [] sctp_net_init+0x100/0x39f
    [] ops_init+0xc6/0xf5
    [] setup_net+0x4c/0xd0
    [] copy_net_ns+0x6d/0xd6
    [] create_new_namespaces+0xd7/0x147
    [] copy_namespaces+0x63/0x99
    [] copy_process+0xa65/0x1233
    [] do_fork+0x10b/0x271
    [] sys_clone+0x23/0x25
    [] stub_clone+0x13/0x20
    [] 0xffffffffffffffff

    I fixed the spelling of sysctl_header so the code actually
    compiles. -- EWB.

    Reported-by: Martin Mokrejs
    Signed-off-by: Vlad Yasevich
    Acked-by: Neil Horman
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

26 Oct, 2012

1 commit

  • Currently sctp allows for the optional use of md5 or sha1 hmac algorithms to
    generate cookie values when establishing new connections via two build time
    config options. There's no real reason to make this a static selection. We can
    add a sysctl that allows for the dynamic selection of these algorithms at run
    time, with the default value determined by the corresponding crypto library
    availability.
    This comes in handy when, for example, running a system in FIPS mode, where use
    of md5 is disallowed, but SHA1 is permitted.

    Note: This new sysctl has no corresponding socket option to select the cookie
    hmac algorithm. I chose not to implement that intentionally, as RFC 6458
    contains no option for this value, and I opted not to pollute the socket option
    namespace.

    Change notes:
    v2)
    * Updated subject to have the proper sctp prefix as per Dave M.
    * Replaced default selection options with new options that allow
    developers to explicitly select available hmac algs at build time
    as per suggestion by Vlad Y.

    Signed-off-by: Neil Horman
    CC: Vlad Yasevich
    CC: "David S. Miller"
    CC: netdev@vger.kernel.org
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Neil Horman
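
Run-time selection by name can be sketched like this in userspace C. The enum, the function and the accepted strings "md5", "sha1" and "none" are assumptions made for the illustration; the message above does not spell out the accepted values.

```c
#include <string.h>

enum cookie_hmac {
    COOKIE_HMAC_NONE,
    COOKIE_HMAC_MD5,
    COOKIE_HMAC_SHA1,
    COOKIE_HMAC_INVALID   /* unknown name: caller should reject the write */
};

/* Map the string written to the (assumed) sysctl to an algorithm. */
enum cookie_hmac parse_cookie_hmac(const char *name)
{
    if (strcmp(name, "md5") == 0)
        return COOKIE_HMAC_MD5;
    if (strcmp(name, "sha1") == 0)
        return COOKIE_HMAC_SHA1;
    if (strcmp(name, "none") == 0)
        return COOKIE_HMAC_NONE;
    return COOKIE_HMAC_INVALID;
}
```

On a system in FIPS mode, an administrator would then write the sha1 name at run time instead of rebuilding the kernel with a different config option.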
     

15 Aug, 2012

2 commits


23 Jul, 2012

1 commit

  • I've seen several attempts recently made to do quick failover of sctp transports
    by reducing various retransmit timers and counters. While it's possible to
    implement a faster failover on multihomed sctp associations, it's not
    particularly robust, in that it can lead to unneeded retransmits, as well as
    false connection failures due to intermittent latency on a network.

    Instead, let's implement the new ietf quick failover draft found here:
    http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05

    This will let the sctp stack quickly identify transports that have had a small
    number of errors, and avoid using them until their reliability can be
    re-established. I've tested this out on two virt guests connected via multiple
    isolated virt networks and believe it's in compliance with the above draft and
    works well.

    Signed-off-by: Neil Horman
    CC: Vlad Yasevich
    CC: Sridhar Samudrala
    CC: "David S. Miller"
    CC: linux-sctp@vger.kernel.org
    CC: joe@perches.com
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Neil Horman
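
The idea of an intermediate "potentially failed" state can be sketched as a classification by error count. This is a simplified userspace illustration with hypothetical names, assuming two thresholds: a low one for entering the pf state and the usual path maximum for declaring the transport inactive.

```c
enum path_state { PATH_ACTIVE, PATH_PF, PATH_INACTIVE };

/* Classify a transport by its consecutive error count.
 * pf_retrans:       errors tolerated before the path is potentially failed
 * path_max_retrans: errors tolerated before the path is inactive */
enum path_state classify_path(int error_count, int pf_retrans,
                              int path_max_retrans)
{
    if (error_count > path_max_retrans)
        return PATH_INACTIVE;
    if (error_count > pf_retrans)
        return PATH_PF;       /* avoid for new data until it recovers */
    return PATH_ACTIVE;
}
```

When pf_retrans is not below path_max_retrans, the middle branch can never be taken, so the pf state is effectively switched off.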
     

21 Apr, 2012

2 commits

  • This results in code with less boilerplate that is a bit easier
    to read.

    It additionally stops us from using compatibility code in the sysctl
    core, hastening the day when the compatibility code can be removed.

    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This makes it clearer which sysctls are relative to your current network
    namespace.

    This makes it a little less error prone by not exposing sysctls for the
    initial network namespace in other namespaces.

    This is the same way we handle all of our other network interfaces to
    userspace and I can't honestly remember why we didn't do this for
    sysctls right from the start.

    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

20 Dec, 2011

1 commit

  • Commit 8ffd3208 voids the previous patches f6778aab and 810c0719 for
    limiting the autoclose value. If userspace passes in -1 on 32-bit
    platform, the overflow check didn't work and autoclose would be set
    to 0xffffffff.

    This patch defines a max_autoclose (in seconds) for limiting the value
    and exposes it through sysctl, with the following intentions.

    1) Avoid overflowing autoclose * HZ.

    2) Keep the default autoclose bound consistent across 32- and 64-bit
    platforms (INT_MAX / HZ in this patch).

    3) Keep the autoclose value consistent between setsockopt() and
    getsockopt() calls.

    Suggested-by: Vlad Yasevich
    Signed-off-by: Xi Wang
    Signed-off-by: David S. Miller

    Xi Wang
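
The bound described above can be sketched in userspace C. HZ is fixed at an assumed 1000 ticks per second and the function names are hypothetical. The sketch only shows how clamping to INT_MAX / HZ keeps autoclose * HZ from overflowing.

```c
#include <limits.h>

#define HZ 1000UL   /* assumed tick rate, for illustration only */

/* Largest autoclose value in seconds whose tick count fits in an int. */
unsigned long max_autoclose(void)
{
    return INT_MAX / HZ;
}

/* Clamp a user-supplied value (for example 0xffffffff from a -1). */
unsigned int clamp_autoclose(unsigned int requested)
{
    unsigned long max = max_autoclose();

    return requested > max ? (unsigned int)max : requested;
}

/* Tick count for the clamped value; cannot exceed INT_MAX. */
unsigned long autoclose_ticks(unsigned int requested)
{
    return clamp_autoclose(requested) * HZ;
}
```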
     

02 Jun, 2011

1 commit


11 Nov, 2010

1 commit

  • Robin Holt tried to boot a 16TB machine and found some limits were
    reached : sysctl_tcp_mem[2], sysctl_udp_mem[2]

    We can switch infrastructure to use "long" instead of "int", now
    that atomic_long_t primitives are available for free.

    Signed-off-by: Eric Dumazet
    Reported-by: Robin Holt
    Reviewed-by: Robin Holt
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Eric Dumazet
     

09 Dec, 2009

1 commit

  • I messed up the merge in d7fc02c7bae7b1cf69269992cf880a43a350cdaa, where
    the conflict in question wasn't just about CTL_UNNUMBERED being removed,
    but the 'strategy' field is too (sysctl handling is now done through the
    /proc interface, with no duplicate protocols for reading the data).

    Reported-by: Larry Finger
    Reported-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

08 Dec, 2009

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
    mac80211: fix reorder buffer release
    iwmc3200wifi: Enable wimax core through module parameter
    iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
    iwmc3200wifi: Coex table command does not expect a response
    iwmc3200wifi: Update wiwi priority table
    iwlwifi: driver version track kernel version
    iwlwifi: indicate uCode type when fail dump error/event log
    iwl3945: remove duplicated event logging code
    b43: fix two warnings
    ipw2100: fix rebooting hang with driver loaded
    cfg80211: indent regulatory messages with spaces
    iwmc3200wifi: fix NULL pointer dereference in pmkid update
    mac80211: Fix TX status reporting for injected data frames
    ath9k: enable 2GHz band only if the device supports it
    airo: Fix integer overflow warning
    rt2x00: Fix padding bug on L2PAD devices.
    WE: Fix set events not propagated
    b43legacy: avoid PPC fault during resume
    b43: avoid PPC fault during resume
    tcp: fix a timewait refcnt race
    ...

    Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
    CTL_UNNUMBERED removed) in
    kernel/sysctl_check.c
    net/ipv4/sysctl_net_ipv4.c
    net/ipv6/addrconf.c
    net/sctp/sysctl.c

    Linus Torvalds
     

24 Nov, 2009

1 commit

  • We currently send window update SACKs every time we free up 1 PMTU
    worth of data. That's a lot more SACKs than necessary. Instead, we'll
    now send back the actual window every time we send a SACK, and do
    window-update SACKs when a fraction of the receive buffer has been
    opened. The fraction is controlled with a sysctl.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
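
One way to express "a fraction of the receive buffer" is a right shift of the buffer size. The sketch below assumes that form, and also assumes the threshold never drops below one PMTU. Both are assumptions of this illustration, as are all names in it.

```c
/* Threshold in bytes: rcvbuf / 2^shift, but at least one PMTU. */
unsigned int window_update_threshold(unsigned int rcvbuf, unsigned int shift,
                                     unsigned int pmtu)
{
    unsigned int fraction = shift < 32 ? rcvbuf >> shift : 0;

    return fraction > pmtu ? fraction : pmtu;
}

/* Send a window-update SACK once the window has opened by at least
 * the threshold since the last value advertised to the peer. */
int should_send_window_update(unsigned int advertised_rwnd,
                              unsigned int actual_rwnd,
                              unsigned int rcvbuf, unsigned int shift,
                              unsigned int pmtu)
{
    if (actual_rwnd <= advertised_rwnd)
        return 0;
    return actual_rwnd - advertised_rwnd >=
           window_update_threshold(rcvbuf, shift, pmtu);
}
```

With a 64 KiB buffer and a shift of 4, the update is sent after 4096 bytes have been freed rather than after every 1500-byte PMTU.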
     

19 Nov, 2009

1 commit


12 Nov, 2009

1 commit

  • Now that sys_sysctl is a compatibility wrapper around /proc/sys,
    all sysctl strategy routines, and all ctl_name and strategy
    entries in the sysctl tables, are unused and can be
    removed.

    In addition neigh_sysctl_register has been modified to no longer
    take a strategy argument and its callers have been modified not
    to pass one.

    Cc: "David Miller"
    Cc: Hideaki YOSHIFUJI
    Cc: netdev@vger.kernel.org
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

05 Sep, 2009

1 commit

  • This patch introduces a new sysctl option to make IPv4 Address Scoping
    configurable.

    In networking environments where DNAT rules in iptables prerouting
    chains convert destination IP's to link-local/private IP addresses,
    SCTP connections fail to establish as the INIT chunk is dropped by the
    kernel due to address scope match failure.
    For example to support overlapping IP addresses (same IP address with
    different vlan id) a Layer-5 application listens on link local IP's,
    and there is a DNAT rule that maps the destination IP to a link local
    IP. Such applications never get the SCTP INIT if the address-scoping
    draft is strictly followed.

    This sysctl configuration allows SCTP to function in such
    unconventional networking environments.

    Sysctl options:
    0 - Disable IPv4 address scoping draft altogether
    1 - Enable IPv4 address scoping (default, current behavior)
    2 - Enable address scoping but allow IPv4 private addresses in init/init-ack
    3 - Enable address scoping but allow IPv4 link local address in init/init-ack

    Signed-off-by: Bhaskar Dutta
    Signed-off-by: Vlad Yasevich

    Bhaskar Dutta
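
The four option values listed above can be read as a small policy table. The sketch below is a simplified userspace illustration with hypothetical names. It assumes a peer with global scope, so that strict scoping would normally reject private and link-local addresses carried in INIT/INIT-ACK.

```c
/* Values follow the sysctl options listed in the message above. */
enum scope_policy {
    SCOPE_DISABLE = 0,           /* no address scoping at all */
    SCOPE_ENABLE = 1,            /* strict scoping (default) */
    SCOPE_ALLOW_PRIVATE = 2,     /* scoping, but allow private addresses */
    SCOPE_ALLOW_LINK_LOCAL = 3   /* scoping, but allow link-local addresses */
};

/* Simplified address classes, assumed for the illustration. */
enum addr_class { ADDR_GLOBAL, ADDR_PRIVATE, ADDR_LINK_LOCAL };

/* Returns 1 if an address of the given class is accepted. */
int addr_accepted(enum scope_policy policy, enum addr_class cls)
{
    if (policy == SCOPE_DISABLE || cls == ADDR_GLOBAL)
        return 1;
    if (cls == ADDR_PRIVATE)
        return policy == SCOPE_ALLOW_PRIVATE;
    return policy == SCOPE_ALLOW_LINK_LOCAL;
}
```

In the DNAT scenario described above, option 3 (or 0) is what would let an application listening on a link-local address receive the INIT.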
     

03 Jun, 2009

1 commit