15 Dec, 2011

1 commit

  • net/ipv4/sysctl_net_ipv4.c:78:6: warning: symbol 'inet_get_ping_group_range_table'
    was not declared. Should it be static?

    net/ipv4/sysctl_net_ipv4.c:119:31: warning: incorrect type in argument 2
    (different signedness)
    net/ipv4/sysctl_net_ipv4.c:119:31: expected int *range
    net/ipv4/sysctl_net_ipv4.c:119:31: got unsigned int *

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

13 Dec, 2011

2 commits

  • This patch uses the "tcp.limit_in_bytes" field of the kmem_cgroup to
    effectively control the amount of kernel memory pinned by a cgroup.

    This value is ignored in the root cgroup, and in all others,
    caps the value specified by the admin in the net namespaces'
    view of tcp_sysctl_mem.

    If namespaces are being used, the admin is allowed to set a
    value bigger than cgroup's maximum, the same way it is allowed
    to set pretty much unlimited values in a real box.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyouki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     
  • This patch allows each namespace to independently set up
    its levels for tcp memory pressure thresholds. This patch
    alone does not buy much: we need to make this values
    per group of process somehow. This is achieved in the
    patches that follows in this patchset.

    Signed-off-by: Glauber Costa
    Reviewed-by: KAMEZAWA Hiroyuki
    CC: David S. Miller
    CC: Eric W. Biederman
    Signed-off-by: David S. Miller

    Glauber Costa
     

09 Jun, 2011

1 commit

  • Andi Kleen and Tim Chen reported huge contention on inetpeer
    unused_peers.lock, on memcached workload on a 40 core machine, with
    disabled route cache.

    It appears we constantly flip peers refcnt between 0 and 1 values, and
    we must insert/remove peers from unused_peers.list, holding a contended
    spinlock.

    Remove this list completely and perform a garbage collection on-the-fly,
    at lookup time, using the expired nodes we met during the tree
    traversal.

    This removes a lot of code, makes locking more standard, and obsoletes
    two sysctls (inet_peer_gc_mintime and inet_peer_gc_maxtime). This also
    removes two pointers in inet_peer structure.

    There is still a false sharing effect because refcnt is in first cache
    line of object [were the links and keys used by lookups are located], we
    might move it at the end of inet_peer structure to let this first cache
    line mostly read by cpus.

    Signed-off-by: Eric Dumazet
    CC: Andi Kleen
    CC: Tim Chen
    Signed-off-by: David S. Miller

    Eric Dumazet
     

18 May, 2011

1 commit

  • If CONFIG_PROC_SYSCTL=n the building process fails:

    ping.c:(.text+0x52af3): undefined reference to `inet_get_ping_group_range_net'

    Moved inet_get_ping_group_range_net() to ping.c.

    Reported-by: Randy Dunlap
    Signed-off-by: Vasiliy Kulikov
    Acked-by: Eric Dumazet
    Acked-by: Randy Dunlap
    Signed-off-by: David S. Miller

    Vasiliy Kulikov
     

14 May, 2011

1 commit

  • This patch adds IPPROTO_ICMP socket kind. It makes it possible to send
    ICMP_ECHO messages and receive the corresponding ICMP_ECHOREPLY messages
    without any special privileges. In other words, the patch makes it
    possible to implement setuid-less and CAP_NET_RAW-less /bin/ping. In
    order not to increase the kernel's attack surface, the new functionality
    is disabled by default, but is enabled at bootup by supporting Linux
    distributions, optionally with restriction to a group or a group range
    (see below).

    Similar functionality is implemented in Mac OS X:
    http://www.manpagez.com/man/4/icmp/

    A new ping socket is created with

    socket(PF_INET, SOCK_DGRAM, PROT_ICMP)

    Message identifiers (octets 4-5 of ICMP header) are interpreted as local
    ports. Addresses are stored in struct sockaddr_in. No port numbers are
    reserved for privileged processes, port 0 is reserved for API ("let the
    kernel pick a free number"). There is no notion of remote ports, remote
    port numbers provided by the user (e.g. in connect()) are ignored.

    Data sent and received include ICMP headers. This is deliberate to:
    1) Avoid the need to transport headers values like sequence numbers by
    other means.
    2) Make it easier to port existing programs using raw sockets.

    ICMP headers given to send() are checked and sanitized. The type must be
    ICMP_ECHO and the code must be zero (future extensions might relax this,
    see below). The id is set to the number (local port) of the socket, the
    checksum is always recomputed.

    ICMP reply packets received from the network are demultiplexed according
    to their id's, and are returned by recv() without any modifications.
    IP header information and ICMP errors of those packets may be obtained
    via ancillary data (IP_RECVTTL, IP_RETOPTS, and IP_RECVERR). ICMP source
    quenches and redirects are reported as fake errors via the error queue
    (IP_RECVERR); the next hop address for redirects is saved to ee_info (in
    network order).

    socket(2) is restricted to the group range specified in
    "/proc/sys/net/ipv4/ping_group_range". It is "1 0" by default, meaning
    that nobody (not even root) may create ping sockets. Setting it to "100
    100" would grant permissions to the single group (to either make
    /sbin/ping g+s and owned by this group or to grant permissions to the
    "netadmins" group), "0 4294967295" would enable it for the world, "100
    4294967295" would enable it for the users, but not daemons.

    The existing code might be (in the unlikely case anyone needs it)
    extended rather easily to handle other similar pairs of ICMP messages
    (Timestamp/Reply, Information Request/Reply, Address Mask Request/Reply
    etc.).

    Userspace ping util & patch for it:
    http://openwall.info/wiki/people/segoon/ping

    For Openwall GNU/*/Linux it was the last step on the road to the
    setuid-less distro. A revision of this patch (for RHEL5/OpenVZ kernels)
    is in use in Owl-current, such as in the 2011/03/12 LiveCD ISOs:
    http://mirrors.kernel.org/openwall/Owl/current/iso/

    Initially this functionality was written by Pavel Kankovsky for
    Linux 2.4.32, but unfortunately it was never made public.

    All ping options (-b, -p, -Q, -R, -s, -t, -T, -M, -I), are tested with
    the patch.

    PATCH v3:
    - switched to flowi4.
    - minor changes to be consistent with raw sockets code.

    PATCH v2:
    - changed ping_debug() to pr_debug().
    - removed CONFIG_IP_PING.
    - removed ping_seq_fops.owner field (unused for procfs).
    - switched to proc_net_fops_create().
    - switched to %pK in seq_printf().

    PATCH v1:
    - fixed checksumming bug.
    - CAP_NET_RAW may not create icmp sockets anymore.

    RFC v2:
    - minor cleanups.
    - introduced sysctl'able group range to restrict socket(2).

    Signed-off-by: Vasiliy Kulikov
    Signed-off-by: David S. Miller

    Vasiliy Kulikov
     

13 Apr, 2011

1 commit

  • controlling igmp_max_membership is useful even when IP_MULTICAST
    is off.
    Quagga(an OSPF deamon) uses multicast addresses for all interfaces
    using a single socket and hits igmp_max_membership limit when
    there are 20 interfaces or more.
    Always export sysctl igmp_max_memberships in proc, just like
    igmp_max_msf

    Signed-off-by: Joakim Tjernlund
    Signed-off-by: David S. Miller

    Joakim Tjernlund
     

14 Dec, 2010

1 commit


13 Dec, 2010

1 commit

  • Always go through a new ip4_dst_hoplimit() helper, just like ipv6.

    This allowed several simplifications:

    1) The interim dst_metric_hoplimit() can go as it's no longer
    userd.

    2) The sysctl_ip_default_ttl entry no longer needs to use
    ipv4_doint_and_flush, since the sysctl is not cached in
    routing cache metrics any longer.

    3) ipv4_doint_and_flush no longer needs to be exported and
    therefore can be marked static.

    When ipv4_doint_and_flush_strategy was removed some time ago,
    the external declaration in ip.h was mistakenly left around
    so kill that off too.

    We have to move the sysctl_ip_default_ttl declaration into
    ipv4's route cache definition header net/route.h, because
    currently net/ip.h (where the declaration lives now) has
    a back dependency on net/route.h

    Signed-off-by: David S. Miller

    David S. Miller
     

29 Nov, 2010

1 commit

  • tcp_win_from_space() does the following:

    if (sysctl_tcp_adv_win_scale > (-sysctl_tcp_adv_win_scale);
    else
    return space - (space >> sysctl_tcp_adv_win_scale);

    "space" is int.

    As per C99 6.5.7 (3) shifting int for 32 or more bits is
    undefined behaviour.

    Indeed, if sysctl_tcp_adv_win_scale is exactly 32,
    space >> 32 equals space and function returns 0.

    Which means we busyloop in tcp_fixup_rcvbuf().

    Restrict net.ipv4.tcp_adv_win_scale to [-31, 31].

    Fix https://bugzilla.kernel.org/show_bug.cgi?id=20312

    Steps to reproduce:

    echo 32 >/proc/sys/net/ipv4/tcp_adv_win_scale
    wget www.kernel.org
    [softlockup]

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

11 Nov, 2010

1 commit

  • Robin Holt tried to boot a 16TB machine and found some limits were
    reached : sysctl_tcp_mem[2], sysctl_udp_mem[2]

    We can switch infrastructure to use long "instead" of "int", now
    atomic_long_t primitives are available for free.

    Signed-off-by: Eric Dumazet
    Reported-by: Robin Holt
    Reviewed-by: Robin Holt
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Eric Dumazet
     

16 May, 2010

1 commit

  • (Dropped the infiniband part, because Tetsuo modified the related code,
    I will send a separate patch for it once this is accepted.)

    This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which
    allows users to reserve ports for third-party applications.

    The reserved ports will not be used by automatic port assignments
    (e.g. when calling connect() or bind() with port number 0). Explicit
    port allocation behavior is unchanged.

    Signed-off-by: Octavian Purdila
    Signed-off-by: WANG Cong
    Cc: Neil Horman
    Cc: Eric Dumazet
    Cc: Eric W. Biederman
    Signed-off-by: David S. Miller

    Amerigo Wang
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

19 Feb, 2010

2 commits

  • This patch enables fast retransmissions after one dupACK for
    TCP if the stream is identified as thin. This will reduce
    latencies for thin streams that are not able to trigger fast
    retransmissions due to high packet interarrival time. This
    mechanism is only active if enabled by iocontrol or syscontrol
    and the stream is identified as thin.

    Signed-off-by: Andreas Petlund
    Signed-off-by: David S. Miller

    Andreas Petlund
     
  • This patch will make TCP use only linear timeouts if the
    stream is thin. This will help to avoid the very high latencies
    that thin stream suffer because of exponential backoff. This
    mechanism is only active if enabled by iocontrol or syscontrol
    and the stream is identified as thin. A maximum of 6 linear
    timeouts is tried before exponential backoff is resumed.

    Signed-off-by: Andreas Petlund
    Signed-off-by: David S. Miller

    Andreas Petlund
     

08 Dec, 2009

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
    mac80211: fix reorder buffer release
    iwmc3200wifi: Enable wimax core through module parameter
    iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
    iwmc3200wifi: Coex table command does not expect a response
    iwmc3200wifi: Update wiwi priority table
    iwlwifi: driver version track kernel version
    iwlwifi: indicate uCode type when fail dump error/event log
    iwl3945: remove duplicated event logging code
    b43: fix two warnings
    ipw2100: fix rebooting hang with driver loaded
    cfg80211: indent regulatory messages with spaces
    iwmc3200wifi: fix NULL pointer dereference in pmkid update
    mac80211: Fix TX status reporting for injected data frames
    ath9k: enable 2GHz band only if the device supports it
    airo: Fix integer overflow warning
    rt2x00: Fix padding bug on L2PAD devices.
    WE: Fix set events not propagated
    b43legacy: avoid PPC fault during resume
    b43: avoid PPC fault during resume
    tcp: fix a timewait refcnt race
    ...

    Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
    CTL_UNNUMBERED removed) in
    kernel/sysctl_check.c
    net/ipv4/sysctl_net_ipv4.c
    net/ipv6/addrconf.c
    net/sctp/sysctl.c

    Linus Torvalds
     

03 Dec, 2009

1 commit

  • Define sysctl (tcp_cookie_size) to turn on and off the cookie option
    default globally, instead of a compiled configuration option.

    Define per socket option (TCP_COOKIE_TRANSACTIONS) for setting constant
    data values, retrieving variable cookie values, and other facilities.

    Move inline tcp_clear_options() unchanged from net/tcp.h to linux/tcp.h,
    near its corresponding struct tcp_options_received (prior to changes).

    This is a straightforward re-implementation of an earlier (year-old)
    patch that no longer applies cleanly, with permission of the original
    author (Adam Langley):

    http://thread.gmane.org/gmane.linux.network/102586

    These functions will also be used in subsequent patches that implement
    additional features.

    Requires:
    net: TCP_MSS_DEFAULT, TCP_MSS_DESIRED

    Signed-off-by: William.Allen.Simpson@gmail.com
    Signed-off-by: David S. Miller

    William Allen Simpson
     

26 Nov, 2009

1 commit

  • Generated with the following semantic patch

    @@
    struct net *n1;
    struct net *n2;
    @@
    - n1 == n2
    + net_eq(n1, n2)

    @@
    struct net *n1;
    struct net *n2;
    @@
    - n1 != n2
    + !net_eq(n1, n2)

    applied over {include,net,drivers/net}.

    Signed-off-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Octavian Purdila
     

12 Nov, 2009

1 commit

  • Now that sys_sysctl is a compatiblity wrapper around /proc/sys
    all sysctl strategy routines, and all ctl_name and strategy
    entries in the sysctl tables are unused, and can be
    revmoed.

    In addition neigh_sysctl_register has been modified to no longer
    take a strategy argument and it's callers have been modified not
    to pass one.

    Cc: "David Miller"
    Cc: Hideaki YOSHIFUJI
    Cc: netdev@vger.kernel.org
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

24 Sep, 2009

1 commit

  • It's unused.

    It isn't needed -- read or write flag is already passed and sysctl
    shouldn't care about the rest.

    It _was_ used in two places at arch/frv for some reason.

    Signed-off-by: Alexey Dobriyan
    Cc: David Howells
    Cc: "Eric W. Biederman"
    Cc: Al Viro
    Cc: Ralf Baechle
    Cc: Martin Schwidefsky
    Cc: Ingo Molnar
    Cc: "David S. Miller"
    Cc: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

04 Nov, 2008

1 commit

  • I want to compile out proc_* and sysctl_* handlers totally and
    stub them to NULL depending on config options, however usage of &
    will prevent this, since taking adress of NULL pointer will break
    compilation.

    So, drop & in front of every ->proc_handler and every ->strategy
    handler, it was never needed in fact.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

28 Oct, 2008

1 commit

  • This is a patch to provide on demand route cache rebuilding. Currently, our
    route cache is rebulid periodically regardless of need. This introduced
    unneeded periodic latency. This patch offers a better approach. Using code
    provided by Eric Dumazet, we compute the standard deviation of the average hash
    bucket chain length while running rt_check_expire. Should any given chain
    length grow to larger that average plus 4 standard deviations, we trigger an
    emergency hash table rebuild for that net namespace. This allows for the common
    case in which chains are well behaved and do not grow unevenly to not incur any
    latency at all, while those systems (which may be being maliciously attacked),
    only rebuild when the attack is detected. This patch take 2 other factors into
    account:
    1) chains with multiple entries that differ by attributes that do not affect the
    hash value are only counted once, so as not to unduly bias system to rebuilding
    if features like QOS are heavily used
    2) if rebuilding crosses a certain threshold (which is adjustable via the added
    sysctl in this patch), route caching is disabled entirely for that net
    namespace, since constant rebuilding is less efficient that no caching at all

    Tested successfully by me.

    Signed-off-by: Neil Horman
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Neil Horman
     

17 Oct, 2008

1 commit

  • name and nlen parameters passed to ->strategy hook are unused, remove
    them. In general ->strategy hook should know what it's doing, and don't
    do something tricky for which, say, pointer to original userspace array
    may be needed (name).

    Signed-off-by: Alexey Dobriyan
    Acked-by: David S. Miller [ networking bits ]
    Cc: Ralf Baechle
    Cc: David Howells
    Cc: Matt Mackall
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

09 Oct, 2008

1 commit

  • I noticed sysctl_local_port_range[] and its associated seqlock
    sysctl_local_port_range_lock were on separate cache lines.
    Moreover, sysctl_local_port_range[] was close to unrelated
    variables, highly modified, leading to cache misses.

    Moving these two variables in a structure can help data
    locality and moving this structure to read_mostly section
    helps sharing of this data among cpus.

    Cleanup of extern declarations (moved in include file where
    they belong), and use of inet_get_local_port_range()
    accessor instead of direct access to ports values.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

04 Aug, 2008

1 commit

  • Commit 76e6ebfb40a2455c18234dcb0f9df37533215461 ("netns: add namespace
    parameter to rt_cache_flush") acceses the extra2 parameter of the
    ip_default_ttl ctl_table, but it is never set to a meaningful
    value. When e84f84f276473dcc673f360e8ff3203148bdf0e2 ("netns: place
    rt_genid into struct net") is applied, we'll oops in
    rt_cache_invalidate(). Set extra2 to init_net, to avoid that.

    Reported-by: Marcin Slusarz
    Signed-off-by: Sven Wegener
    Tested-by: Marcin Slusarz
    Acked-by: Denis V. Lunev
    Signed-off-by: David S. Miller

    Sven Wegener
     

28 Jul, 2008

1 commit

  • Piss-poor sysctl registration API strikes again, film at 11...

    What we really need is _pathname_ required to be present in already
    registered table, so that kernel could warn about bad order. That's the
    next target for sysctl stuff (and generally saner and more explicit
    order of initialization of ipv[46] internals wouldn't hurt either).

    For the time being, here are full fixups required by ..._rotable()
    stuff; we make per-net sysctl sets descendents of "ro" one and make sure
    that sufficient skeleton is there before we start registering per-net
    sysctls.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     

27 Jul, 2008

1 commit


02 Jul, 2008

1 commit

  • Convert the sysctl values for icmp ratelimit to use milliseconds instead
    of jiffies which is based on kernel configured HZ.
    Internal kernel jiffies are not a proper unit for any userspace API.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

12 Jun, 2008

1 commit


26 Mar, 2008

3 commits


01 Feb, 2008

1 commit

  • In strategy_allowed_congestion_control of the 2.6.24 kernel, when
    sysctl_string return 1 on success,it should call
    tcp_set_allowed_congestion_control to set the allowed congestion
    control.But, it don't. the sysctl_string return 1 on success,
    otherwise return negative, never return 0.The patch fix the problem.

    Signed-off-by: Shan Wei
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Shan Wei
     

29 Jan, 2008

6 commits

  • This is a preparation for sysctl netns-ization.
    Move the ctl tables to the files, where the tuning
    variables reside. Plus make the helpers to register
    the tables.

    This will simplify the later patches and will keep
    similar things closer to each other.

    ipv4, ipv6 and conntrack_reasm are patched differently,
    but the result is all the tables are in appropriate files.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • This includes the most simple cases for netfilter.

    The first part is tne queue modules for ipv4 and ipv6,
    on which the net/ipv4/ and net/ipv6/ paths are reused
    from the appropriate ipv4 and ipv6 code.

    The conntrack module is also patched, but this hunk is
    very small and simple.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Signed-off-by: Takahiro Yasui
    Signed-off-by: Hideo Aoki
    Signed-off-by: David S. Miller

    Hideo Aoki
     
  • AFAIS these two entries should do the same thing - change the
    forwarding state on ipv4_devconf and on all the devices.

    I propose to merge the handlers together using ctl paths.

    The inet_forward_change() is static after this and I move
    it higher to be closer to other "propagation" helpers and
    to avoid diff making patches based on { and } matching :)
    i.e. - make them easier to read.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • This is the same as I did for the net/core/ table in the
    second patch in his series: use the paths and isolate the
    whole table in the .c file.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • This includes several cleanups:

    * tune Makefile to compile out this file when SYSCTL=n. Now
    it looks like net/core/sysctl_net_core.c one;
    * move the ipv4_config to af_inet.c to exist all the time;
    * remove additional sysctl_ip_nonlocal_bind declaration
    (it is already declared in net/ip.h);
    * remove no nonger needed ifdefs from this file.

    This is a preparation for using ctl paths for net/ipv4/
    sysctl table.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

20 Nov, 2007

1 commit

  • From: "Sam Jansen"

    sysctl_tcp_congestion_control seems to have a bug that prevents it
    from actually calling the tcp_set_default_congestion_control
    function. This is not so apparent because it does not return an error
    and generally the /proc interface is used to configure the default TCP
    congestion control algorithm. This is present in 2.6.18 onwards and
    probably earlier, though I have not inspected 2.6.15--2.6.17.

    sysctl_tcp_congestion_control calls sysctl_string and expects a successful
    return code of 0. In such a case it actually sets the congestion control
    algorithm with tcp_set_default_congestion_control. Otherwise, it returns the
    value returned by sysctl_string. This was correct in 2.6.14, as sysctl_string
    returned 0 on success. However, sysctl_string was updated to return 1 on
    success around about 2.6.15 and sysctl_tcp_congestion_control was not updated.
    Even though sysctl_tcp_congestion_control returns 1, do_sysctl_strategy
    converts this return code to '0', so the caller never notices the error.

    Signed-off-by: David S. Miller

    Sam Jansen