19 Jul, 2019

1 commit

  • In the sysctl code the proc_dointvec_minmax() function is often used to
    validate that the user-supplied value lies within an allowed range. This
    function uses the extra1 and extra2 members from struct ctl_table as
    the minimum and maximum allowed values.

    Where sysctl handlers are declared, every source file has some
    read-only variables containing just an integer, whose addresses are
    assigned to the extra1 and extra2 members so the sysctl range is enforced.

    The special values 0, 1 and INT_MAX are very often used as range
    boundaries, leading to duplication of variables like zero=0, one=1,
    int_max=INT_MAX in different source files:

    $ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l
    248

    Add a const int array containing the most commonly used values, some
    macros to refer more easily to the correct array member, and use them
    instead of creating a local one for every object file.
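
    The idea can be sketched in userspace as follows. This is a minimal
    illustration, not the kernel code: struct demo_ctl_table and
    demo_set_int_minmax() are hypothetical stand-ins for struct ctl_table
    and proc_dointvec_minmax(), and the -1 error return is an assumption.

```c
#include <limits.h>
#include <stddef.h>

/* One shared read-only array instead of a zero/one/int_max copy per file. */
const int sysctl_vals[] = { 0, 1, INT_MAX };

/* Macros that name the array members, in the spirit of the commit. */
#define SYSCTL_ZERO    ((const void *)&sysctl_vals[0])
#define SYSCTL_ONE     ((const void *)&sysctl_vals[1])
#define SYSCTL_INT_MAX ((const void *)&sysctl_vals[2])

/* Hypothetical, cut-down stand-in for struct ctl_table. */
struct demo_ctl_table {
    const char *procname;
    int *data;
    const void *extra1; /* pointer to minimum, or NULL for no lower bound */
    const void *extra2; /* pointer to maximum, or NULL for no upper bound */
};

/* Store val if it lies in [*extra1, *extra2]; return 0 on success, -1 otherwise. */
static int demo_set_int_minmax(const struct demo_ctl_table *t, int val)
{
    if (t->extra1 && val < *(const int *)t->extra1)
        return -1;
    if (t->extra2 && val > *(const int *)t->extra2)
        return -1;
    *t->data = val;
    return 0;
}
```

    A boolean knob would then be declared with .extra1 = SYSCTL_ZERO and
    .extra2 = SYSCTL_ONE, with no per-file zero or one variable.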

    This is the bloat-o-meter output comparing the old and new binary
    compiled with the default Fedora config:

    # scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
    add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
    Data old new delta
    sysctl_vals - 12 +12
    __kstrtab_sysctl_vals - 12 +12
    max 14 10 -4
    int_max 16 - -16
    one 68 - -68
    zero 128 28 -100
    Total: Before=20583249, After=20583085, chg -0.00%

    [mcroce@redhat.com: tipc: remove two unused variables]
    Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com
    [akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c]
    [arnd@arndb.de: proc/sysctl: make firmware loader table conditional]
    Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de
    [akpm@linux-foundation.org: fix fs/eventpoll.c]
    Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.com
    Signed-off-by: Matteo Croce
    Signed-off-by: Arnd Bergmann
    Acked-by: Kees Cook
    Reviewed-by: Aaron Tomlin
    Cc: Matthew Wilcox
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matteo Croce
     

02 Jul, 2019

2 commits


06 Jun, 2019

1 commit

  • When RST packets are sent because no socket could be found,
    it makes sense to use flowlabel_reflect sysctl to decide
    if a reflection of the flowlabel is requested.

    This extends commit 22b6722bfa59 ("ipv6: Add sysctl for per
    namespace flow label reflection"), for some TCP RST packets.

    In order to provide full control of this new feature,
    flowlabel_reflect becomes a bitmask.
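
    A bitmask sysctl of this kind can be tested one bit at a time. The
    sketch below is illustrative only: the macro names and bit values are
    assumptions made for the example, not taken from the commit text.

```c
#include <stdbool.h>

/* Assumed bit assignments, for illustration only. */
#define DEMO_REFLECT_ESTABLISHED 1 /* reflect on established flows */
#define DEMO_REFLECT_TCP_RESET   2 /* reflect on RSTs sent when no socket was found */

/* True if the sysctl value asks for reflection on established flows. */
static bool demo_reflect_on_established(int flowlabel_reflect)
{
    return (flowlabel_reflect & DEMO_REFLECT_ESTABLISHED) != 0;
}

/* True if the sysctl value asks for reflection on socket-less TCP RSTs. */
static bool demo_reflect_on_tcp_reset(int flowlabel_reflect)
{
    return (flowlabel_reflect & DEMO_REFLECT_TCP_RESET) != 0;
}
```

    Under these assumed values, setting the sysctl to 3 would enable both
    behaviours, while 1 keeps only the pre-existing one.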

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

26 Apr, 2018

1 commit

  • ECMP (equal-cost multipath) hashes are typically computed on the packets'
    5-tuple (src IP, dst IP, src port, dst port, L4 proto).

    For encapsulated packets, the L4 data is not readily available and ECMP
    hashing will often revert to (src IP, dst IP). This will lead to traffic
    polarization on a single ECMP path, causing congestion and waste of network
    capacity.

    In IPv6, the 20-bit flow label field is also used as part of the ECMP hash.
    In the absence of L4 data, the hashing will be on (src IP, dst IP, flow
    label). Having a non-zero flow label is thus important for proper traffic
    load balancing when L4 data is unavailable (i.e., when packets are
    encapsulated).

    Currently, the seg6_do_srh_encap() function extracts the original packet's
    flow label and sets it as the outer IPv6 flow label. There are two issues
    with this behaviour:

    a) There is no guarantee that the inner flow label is set by the source.
    b) If the original packet is not IPv6, the flow label will be set to
    zero (e.g., IPv4 or L2 encap).

    This patch adds a function, named seg6_make_flowlabel(), that computes a
    flow label from a given skb. It supports IPv6, IPv4 and L2 payloads, and
    leverages the per-namespace 'seg6_flowlabel' sysctl value.

    The currently supported behaviours are as follows:
    -1 set flowlabel to zero.
    0 copy flowlabel from the inner packet in case of inner IPv6
    (set flowlabel to 0 in case of IPv4/L2)
    1 compute the flowlabel using seg6_make_flowlabel()
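
    The three modes can be sketched as a small selection function. This is
    a userspace illustration under stated assumptions: demo_outer_flowlabel()
    is a hypothetical name, computed_hash stands in for the result of
    seg6_make_flowlabel(), and treating any other mode value as zero is an
    assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FLOWLABEL_MASK 0x000FFFFFu /* the IPv6 flow label is 20 bits wide */

/* Pick the outer flow label for an encapsulated packet. */
static uint32_t demo_outer_flowlabel(int mode, bool inner_is_ipv6,
                                     uint32_t inner_label,
                                     uint32_t computed_hash)
{
    switch (mode) {
    case -1: /* always zero */
        return 0;
    case 0:  /* copy from inner IPv6, zero for IPv4/L2 payloads */
        return inner_is_ipv6 ? (inner_label & DEMO_FLOWLABEL_MASK) : 0;
    case 1:  /* use the computed label for every payload type */
        return computed_hash & DEMO_FLOWLABEL_MASK;
    default: /* assumption: unknown modes behave like -1 */
        return 0;
    }
}
```

    Only mode 1 yields a non-zero label for IPv4 and L2 payloads, which is
    the case the commit is concerned with.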

    This patch has been tested for IPv6, IPv4, and L2 traffic.

    Signed-off-by: Ahmed Abdelsalam
    Acked-by: David Lebrun
    Signed-off-by: David S. Miller

    Ahmed Abdelsalam
     

28 Mar, 2018

1 commit


05 Mar, 2018

1 commit

  • Some operators prefer IPv6 path selection to use a standard 5-tuple
    hash rather than just an L3 hash with the flow label. To that end,
    add support to IPv6 for multipath hash policy similar to bf4e0a3db97eb
    ("net: ipv4: add support for ECMP hash policy choice"). The default
    is still L3 which covers source and destination addresses along with
    flow label and IPv6 protocol.
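
    The difference between the two policies is which fields feed the hash.
    The sketch below is a userspace illustration: struct demo_flow, the
    policy numbering (0 for L3, 1 for 5-tuple) and the use of FNV-1a as
    the hash are all assumptions of the example, not the kernel's
    implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow description; field names are illustrative. */
struct demo_flow {
    uint8_t  saddr[16];
    uint8_t  daddr[16];
    uint32_t flowlabel; /* 20-bit IPv6 flow label */
    uint8_t  proto;     /* IPv6 protocol / L4 protocol number */
    uint16_t sport;
    uint16_t dport;
};

/* FNV-1a step, used only as a stand-in for the real flow hashing. */
static uint32_t demo_fnv1a(uint32_t h, const void *buf, size_t len)
{
    const uint8_t *p = buf;

    while (len--) {
        h ^= *p++;
        h *= 16777619u;
    }
    return h;
}

/* policy 0: addresses, protocol, flow label; policy 1: addresses, protocol, ports. */
static uint32_t demo_multipath_hash(int policy, const struct demo_flow *f)
{
    uint32_t h = 2166136261u; /* FNV offset basis */

    h = demo_fnv1a(h, f->saddr, sizeof(f->saddr));
    h = demo_fnv1a(h, f->daddr, sizeof(f->daddr));
    h = demo_fnv1a(h, &f->proto, sizeof(f->proto));
    if (policy == 1) {
        h = demo_fnv1a(h, &f->sport, sizeof(f->sport));
        h = demo_fnv1a(h, &f->dport, sizeof(f->dport));
    } else {
        h = demo_fnv1a(h, &f->flowlabel, sizeof(f->flowlabel));
    }
    return h;
}
```

    With policy 0, two flows that differ only in their ports hash
    identically and therefore share a path; with policy 1 the ports
    contribute to path selection and the flow label does not.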

    Signed-off-by: David Ahern
    Reviewed-by: Ido Schimmel
    Tested-by: Ido Schimmel
    Reviewed-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    David Ahern
     

20 Feb, 2018

1 commit

  • These pernet_operations create and destroy sysctl tables.
    They are not touched by another net pernet_operations.
    So, it's possible to execute them in parallel with others.

    Signed-off-by: Kirill Tkhai
    Signed-off-by: David S. Miller

    Kirill Tkhai
     

04 Nov, 2017

1 commit


03 Nov, 2017

1 commit

  • RFC 8200 (IPv6) defines Hop-by-Hop options and Destination options
    extension headers. Both of these carry a list of TLVs which is
    only limited by the maximum length of the extension header (2048
    bytes). By the spec a host must process all the TLVs in these
    options; however, these could be used as a fairly obvious
    denial of service attack. I think this could in fact be
    a significant DoS vector on the Internet. One mitigating
    factor might be that many FWs drop all packets with EH (and
    obviously this is only IPv6), so an Internet-wide attack might not
    be so effective (yet!).

    By my calculation, the worst case packet with TLVs in a standard
    1500 byte MTU packet that would be processed by the stack contains
    1282 individual TLVs (including pad TLVs) or 724 two byte TLVs. I
    wrote a quick test program that floods a whole bunch of these
    packets to a host and sure enough there is substantial time spent
    in ip6_parse_tlv. These packets contain nothing but unknown TLVs
    (that are ignored), TLV padding, and a bogus UDP header with zero
    payload length.

    25.38% [kernel] [k] __fib6_clean_all
    21.63% [kernel] [k] ip6_parse_tlv
    4.21% [kernel] [k] __local_bh_enable_ip
    2.18% [kernel] [k] ip6_pol_route.isra.39
    1.98% [kernel] [k] fib6_walk_continue
    1.88% [kernel] [k] _raw_write_lock_bh
    1.65% [kernel] [k] dst_release

    This patch adds configurable limits to Destination and Hop-by-Hop
    options. There are three limits that may be set:
    - Limit the number of options in a Hop-by-Hop or Destination options
    extension header.
    - Limit the byte length of a Hop-by-Hop or Destination options
    extension header.
    - Disallow unrecognized options in a Hop-by-Hop or Destination
    options extension header.

    The limits are set in corresponding sysctls:

    ipv6.sysctl.max_dst_opts_cnt
    ipv6.sysctl.max_hbh_opts_cnt
    ipv6.sysctl.max_dst_opts_len
    ipv6.sysctl.max_hbh_opts_len

    If a max_*_opts_cnt is less than zero then unknown TLVs are disallowed.
    The number of known TLVs that are allowed is the absolute value of
    this number.

    If a limit is exceeded when processing an extension header the packet is
    dropped.
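
    The limit semantics described above can be sketched as a small
    userspace TLV walker. This is an illustration under assumptions, not
    ip6_parse_tlv(): the function names are hypothetical, the length limit
    is applied to the option area passed in, padding is assumed not to
    count towards the option limit, and only one option type is treated
    as known.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_TLV_PAD1 0 /* single-byte padding, no length field */
#define DEMO_TLV_PADN 1 /* multi-byte padding */

/* Hypothetical "known option" predicate: only Router Alert (type 5). */
static bool demo_tlv_known(uint8_t type)
{
    return type == 5;
}

/* Return true to accept the option area, false to drop the packet. */
static bool demo_check_tlvs(const uint8_t *opts, size_t len,
                            int max_cnt, int max_len)
{
    bool disallow_unknown = max_cnt < 0; /* negative count forbids unknown TLVs */
    int limit = abs(max_cnt);            /* allowed number of options */
    int count = 0;
    size_t off = 0;

    if (max_len >= 0 && len > (size_t)max_len)
        return false;

    while (off < len) {
        uint8_t type = opts[off];
        size_t optlen;

        if (type == DEMO_TLV_PAD1) {
            off += 1;
            continue;
        }
        if (off + 2 > len)
            return false; /* truncated type/length pair */
        optlen = (size_t)opts[off + 1] + 2;
        if (off + optlen > len)
            return false; /* option runs past the header */
        if (type != DEMO_TLV_PADN) {
            if (++count > limit)
                return false;
            if (disallow_unknown && !demo_tlv_known(type))
                return false;
        }
        off += optlen;
    }
    return true;
}
```

    The walker stops at the first violated limit, so the cost of a hostile
    packet is bounded by the configured option count rather than by the
    packet size.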

    Default values are set to 8 for options counts, and set to INT_MAX
    for maximum length. Note the choice to limit options to 8 is an
    arbitrary guess (roughly based on the fact that the stack supports
    three HBH options and just one destination option).

    These limits have been proposed in draft-ietf-6man-rfc6434-bis.

    Tested (by Martin Lau)

    I tested out 1 thread (i.e. one raw_udp process).

    I changed the net.ipv6.max_dst_(opts|hbh)_number between 8 and 2048.
    With the sysctls set to 2048, the softirq% is pegged at 100%.
    With 8, the softirq% is almost unnoticeable in mpstat.

    v2:
    - Code and documentation cleanup.
    - Change references of RFC2460 to be RFC8200.
    - Add reference to RFC6434-bis where the limits will be in standard.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to apply to
    a file was done in a spreadsheet of side by side results of the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

25 Aug, 2017

1 commit

  • Reflecting IPv6 Flow Label at server nodes is useful in environments
    that employ multipath routing to load balance the requests. As "IPv6
    Flow Label Reflection" standard draft [1] points out - ICMPv6 PTB error
    messages generated in response to downstream packets from the server
    can be routed by a load balancer back to the original server without
    looking at transport headers, if the server applies the flow label
    reflection. This enables the Path MTU Discovery past the ECMP router in
    load-balance or anycast environments where each server node is reachable
    by only one path.

    Introduce a sysctl to enable flow label reflection per net namespace for
    all newly created sockets. Previously the same could be achieved only
    per socket, by setting the IPV6_FL_F_REFLECT flag for the
    IPV6_FLOWLABEL_MGR socket option.

    [1] https://tools.ietf.org/html/draft-wang-6man-flow-label-reflection-01

    Signed-off-by: Jakub Sitnicki
    Signed-off-by: David S. Miller

    Jakub Sitnicki
     

28 Jun, 2016

1 commit

  • This works in exactly the same way as the CIPSO label cache.
    The idea is to allow the lsm to cache the result of a secattr
    lookup so that it doesn't need to perform the lookup for
    every skbuff.

    It introduces two sysctl controls:
    calipso_cache_enable - enables/disables the cache.
    calipso_cache_bucket_size - sets the size of a cache bucket.

    Signed-off-by: Huw Davies
    Signed-off-by: Paul Moore

    Huw Davies
     

01 Aug, 2015

1 commit

  • Change the meaning of net.ipv6.auto_flowlabels to provide a mode for
    automatic flow labels generation. There are four modes:

    0: flow labels are disabled
    1: flow labels are enabled, sockets can opt-out
    2: flow labels are allowed, sockets can opt-in
    3: flow labels are enabled and enforced, no opt-out for sockets

    np->autoflowlabel is initialized according to the sysctl value.
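
    The four modes can be sketched as two small decisions: what a new
    socket starts with, and whether a packet ends up with an automatic
    label. This is an illustration only; the enum and function names are
    hypothetical.

```c
#include <stdbool.h>

/* Mode values as listed in the commit message; names are illustrative. */
enum demo_auto_flowlabel_mode {
    DEMO_AUTO_FLOWLABEL_OFF    = 0,
    DEMO_AUTO_FLOWLABEL_OPTOUT = 1,
    DEMO_AUTO_FLOWLABEL_OPTIN  = 2,
    DEMO_AUTO_FLOWLABEL_FORCED = 3,
};

/* Initial per-socket flag (the role of np->autoflowlabel) for a given mode. */
static bool demo_default_autoflowlabel(int mode)
{
    return mode == DEMO_AUTO_FLOWLABEL_OPTOUT ||
           mode == DEMO_AUTO_FLOWLABEL_FORCED;
}

/* Whether a packet gets an automatic label, given the socket's current flag. */
static bool demo_use_auto_flowlabel(int mode, bool sock_autoflowlabel)
{
    switch (mode) {
    case DEMO_AUTO_FLOWLABEL_FORCED:
        return true;               /* no opt-out for sockets */
    case DEMO_AUTO_FLOWLABEL_OPTOUT:
    case DEMO_AUTO_FLOWLABEL_OPTIN:
        return sock_autoflowlabel; /* the socket's choice decides */
    case DEMO_AUTO_FLOWLABEL_OFF:
    default:
        return false;              /* disabled system wide */
    }
}
```

    Modes 1 and 2 differ only in the socket's starting value; modes 0 and
    3 ignore the socket's choice entirely.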

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

10 Jul, 2015

1 commit

  • Add support to allow non-local binds similar to how this was done for IPv4.
    Non-local binds are very useful in emulating the Internet in a box, etc.

    This adds the ip_nonlocal_bind sysctl under ipv6.

    Testing:

    Set up nonlocal binding and receive routing on a host, e.g.:

    ip -6 rule add from ::/0 iif eth0 lookup 200
    ip -6 route add local 2001:0:0:1::/64 dev lo proto kernel scope host table 200
    sysctl -w net.ipv6.ip_nonlocal_bind=1

    Set up routing to 2001:0:0:1::/64 on peer to go to first host

    ping6 -I 2001:0:0:1::1 peer-address -- to verify

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

04 May, 2015

1 commit

  • This patch divides the IPv6 flow label space into two ranges:
    0-7ffff is reserved for flow label manager, 80000-fffff will be
    used for creating auto flow labels (per RFC6438). This only affects how
    labels are set on transmit, it does not affect receive. This range split
    can be disabled by sysctl.
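
    Splitting a 20-bit space at 0x80000 amounts to reserving the top bit.
    The sketch below illustrates that under stated assumptions: the macro
    and function names are hypothetical, and split_ranges stands in for
    the sysctl that enables or disables the split.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FLOWLABEL_MASK           0x000FFFFFu /* 20-bit flow label */
#define DEMO_FLOWLABEL_STATELESS_FLAG 0x00080000u /* top bit: 80000-fffff */

/* Derive an automatic label from a hash, forced into the upper range if split. */
static uint32_t demo_auto_flowlabel(uint32_t hash, bool split_ranges)
{
    uint32_t label = hash & DEMO_FLOWLABEL_MASK;

    if (split_ranges)
        label |= DEMO_FLOWLABEL_STATELESS_FLAG;
    return label;
}

/* With the split on, the flow label manager may only use 0-7ffff. */
static bool demo_manager_label_allowed(uint32_t label, bool split_ranges)
{
    if (label > DEMO_FLOWLABEL_MASK)
        return false;
    return !split_ranges || !(label & DEMO_FLOWLABEL_STATELESS_FLAG);
}
```

    Because the two ranges are disjoint, a stateless label generated this
    way cannot collide with a stateful one handed out by the manager.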

    Background:

    IPv6 flow labels have been an unmitigated disappointment thus far
    in the lifetime of IPv6. Support in HW devices to use them for ECMP
    is lacking, and OSes don't turn them on by default. If we had these
    we could get much better hashing in IPv6 networks without resorting
    to DPI, possibly eliminating some of the motivations to define new
    encaps in UDP just for getting ECMP.

    Unfortunately, the initial specifications of IPv6 did not clarify
    how they are to be used. There has always been a vague concept that
    these can be used for ECMP, flow hashing, etc. and we do now have a
    good standard for how to do this in RFC6438. The problem is that flow labels
    can be either stateful or stateless (as in RFC6438), and we are
    presented with the possibility that a stateless label may collide
    with a stateful one. Attempts to split the flow label space were
    rejected in IETF. When we added support in Linux for RFC6438, we
    could not turn on flow labels by default due to this conflict.

    This patch splits the flow label space and should give us
    a path to enabling auto flow labels by default for all IPv6 packets.
    This is an API change so we need to consider compatibility with
    existing deployment. The stateful range is chosen to be the lower
    values in hopes that most uses would have chosen small numbers.

    Once we resolve the stateless/stateful issue, we can proceed to
    look at enabling RFC6438 flow labels by default (starting with
    scaled testing).

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

01 Apr, 2015

1 commit

  • The ipv6 code uses a mixture of coding styles. In some instances check for NULL
    pointer is done as x == NULL and sometimes as !x. !x is preferred according to
    checkpatch and this patch makes the code consistent by adopting the latter
    form.

    No changes detected by objdiff.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     

24 Mar, 2015

1 commit


05 Sep, 2014

1 commit

  • This patch adds a new sysctl_mld_qrv knob to configure the mldv1/v2 query
    robustness variable. It specifies how many times unsolicited mld reports
    should be retransmitted. Admins might want to tune this on lossy links.

    Also reset mld state on interface down/up, so we pick up new sysctl
    settings during interface up event.

    IPv6 certification requests this knob to be available.

    I didn't make this knob netns specific, as it is mostly a setting in a
    physical environment and should be per host.

    Cc: Flavio Leitner
    Signed-off-by: Hannes Frederic Sowa
    Acked-by: Flavio Leitner
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

06 Aug, 2014

1 commit

  • Conflicts:
    drivers/net/Makefile
    net/ipv6/sysctl_net_ipv6.c

    Two ipv6_table_template[] additions overlap, so the index
    of the ipv6_table[x] assignments needed to be adjusted.

    In the drivers/net/Makefile case, we've gotten rid of the
    garbage whereby we had to list every single USB networking
    driver in the top-level Makefile, there is just one
    "USB_NETWORKING" that guards everything.

    Signed-off-by: David S. Miller

    David S. Miller
     

03 Aug, 2014

1 commit


08 Jul, 2014

1 commit

  • Automatically generate flow labels for IPv6 packets on transmit.
    The flow label is computed based on skb_get_hash. The flow label will
    only automatically be set when it is zero otherwise (i.e. flow label
    manager hasn't set one). This supports the transmit side functionality
    of RFC 6438.
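
    The rule that an automatic label is only applied when none is set can
    be sketched as below. This is a userspace illustration, not the kernel
    helper: demo_make_flowlabel() is a hypothetical name and skb_hash
    stands in for the value returned by skb_get_hash.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FLOWLABEL_MASK 0x000FFFFFu /* only the low 20 bits fit in the header */

/* Keep an existing label; otherwise derive one from the packet hash if enabled. */
static uint32_t demo_make_flowlabel(uint32_t flowlabel, uint32_t skb_hash,
                                    bool auto_enabled)
{
    if (flowlabel == 0 && auto_enabled)
        flowlabel = skb_hash & DEMO_FLOWLABEL_MASK;
    return flowlabel;
}
```

    A label already chosen through the flow label manager therefore always
    takes precedence over the automatic one.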

    Added an IPv6 sysctl auto_flowlabels to enable/disable this behavior
    system wide, and added IPV6_AUTOFLOWLABEL socket option to enable this
    functionality per socket.

    By default, auto flowlabels are disabled to avoid possible conflicts
    with flow label manager, however if this feature proves useful we
    may want to enable it by default.

    It should also be noted that FreeBSD has already implemented automatic
    flow labels (including the sysctl and socket option). In FreeBSD,
    automatic flow labels default to enabled.

    Performance impact:

    Running super_netperf with 200 flows for TCP_RR and UDP_RR for
    IPv6. Note that in the UDP case, __skb_get_hash will be called for
    every packet, which explains the slight regression. In the TCP case
    the hash is saved in the socket so there is no regression.

    Automatic flow labels disabled:

    TCP_RR:
    86.53% CPU utilization
    127/195/322 90/95/99% latencies
    1.40498e+06 tps

    UDP_RR:
    90.70% CPU utilization
    118/168/243 90/95/99% latencies
    1.50309e+06 tps

    Automatic flow labels enabled:

    TCP_RR:
    85.90% CPU utilization
    128/199/337 90/95/99% latencies
    1.40051e+06 tps

    UDP_RR:
    92.61% CPU utilization
    115/164/236 90/95/99% latencies
    1.4687e+06 tps

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

14 May, 2014

1 commit

  • Kernel-originated IP packets that have no user socket associated
    with them (e.g., ICMP errors and echo replies, TCP RSTs, etc.)
    are emitted with a mark of zero. Add a sysctl to make them have
    the same mark as the packet they are replying to.

    This allows an administrator that wishes to do so to use
    mark-based routing, firewalling, etc. for these replies by
    marking the original packets inbound.
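
    The behaviour reduces to a single conditional. A minimal sketch,
    assuming a hypothetical helper name and a boolean standing in for the
    new sysctl:

```c
#include <stdbool.h>
#include <stdint.h>

/* Mark for a kernel-originated reply: reflect the incoming mark or use zero. */
static uint32_t demo_reply_mark(bool fwmark_reflect, uint32_t incoming_mark)
{
    return fwmark_reflect ? incoming_mark : 0;
}
```

    With the sysctl off the result is always zero, which matches the
    behaviour before this change.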

    Tested using user-mode linux:
    - ICMP/ICMPv6 echo replies and errors.
    - TCP RST packets (IPv4 and IPv6).

    Signed-off-by: Lorenzo Colitti
    Signed-off-by: David S. Miller

    Lorenzo Colitti
     

20 Jan, 2014

1 commit

  • With the introduction of IPV6_FL_F_REFLECT, there is no guarantee of
    flow label uniqueness. This patch introduces a new sysctl to protect the old
    behaviour, enabled by default.

    Changelog of V3:
    * rename ip6_flowlabel_consistency to flowlabel_consistency
    * use net_info_ratelimited()
    * checkpatch cleanups

    Signed-off-by: Florent Fourcot
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Florent Fourcot
     

15 Jan, 2014

1 commit


08 Jan, 2014

1 commit

  • This change makes it possible to follow a recommendation of RFC4942.

    - Add "anycast_src_echo_reply" sysctl to control the use of anycast addresses
    as source addresses for ICMPv6 echo reply. This sysctl is false by default
    to preserve existing behavior.
    - Add inline check ipv6_anycast_destination().
    - Use them in icmpv6_echo_reply().
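
    The source address choice for the echo reply can be sketched as
    follows. This is an illustration under assumptions: the function name
    and parameters are hypothetical, addresses are passed as strings for
    brevity, and returning NULL stands for leaving source selection to
    the stack.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Decide whether the echo request's destination address may be reused as
 * the reply's source address. Returns req_daddr if so, NULL otherwise.
 */
static const char *demo_echo_reply_saddr(const char *req_daddr,
                                         bool daddr_is_unicast,
                                         bool daddr_is_anycast,
                                         bool anycast_src_echo_reply)
{
    if (daddr_is_unicast)
        return req_daddr;
    if (daddr_is_anycast && anycast_src_echo_reply)
        return req_daddr;
    return NULL;
}
```

    With the sysctl at its default of false, a request sent to an anycast
    address is answered from an address picked by normal source selection,
    which preserves the existing behaviour.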

    Reference:
    RFC4942 - IPv6 Transition/Coexistence Security Considerations
    (http://tools.ietf.org/html/rfc4942#section-2.1.6)

    2.1.6. Anycast Traffic Identification and Security

    [...]
    To avoid exposing knowledge about the internal structure of the
    network, it is recommended that anycast servers now take advantage of
    the ability to return responses with the anycast address as the
    source address if possible.

    Signed-off-by: Francois-Xavier Le Bail
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    FX Le Bail
     

13 Jun, 2013

1 commit

  • Reduce the uses of this unnecessary typedef.

    Done via perl script:

    $ git grep --name-only -w ctl_table net | \
    xargs perl -p -i -e '\
    sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \
    s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge'

    Reflow the modified lines that now exceed 80 columns.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

21 Apr, 2012

5 commits

  • We don't use struct ctl_path anymore so delete the exported constants.

    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • The sysctl core no longer natively understands sysctl tables
    with .child entries.

    Split the ipv6_table to remove the .child entries.

    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • sysctl no longer requires explicit creation of directories. The neigh
    directory is always populated with at least a default entry so this
    should cause no user visible changes.

    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This makes it clearer which sysctls are relative to your current network
    namespace.

    This makes it a little less error prone by not exposing sysctls for the
    initial network namespace in other namespaces.

    This is the same way we handle all of our other network interfaces to
    userspace and I can't honestly remember why we didn't do this for
    sysctls right from the start.

    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • register_sysctl_rotable never caught on as an interesting way to
    register sysctls. My take on the situation is that what we want are
    sysctls that we can only see in the initial network namespace. What we
    have implemented with register_sysctl_rotable are sysctls that we can
    see in all of the network namespaces and can only change in the initial
    network namespace.

    That is a very silly way to go. Just register the network sysctls
    in the initial network namespace and we don't have any weird special
    cases to deal with.

    The sysctls affected are:
    /proc/sys/net/ipv4/ipfrag_secret_interval
    /proc/sys/net/ipv4/ipfrag_max_dist
    /proc/sys/net/ipv6/ip6frag_secret_interval
    /proc/sys/net/ipv6/mld_max_msf

    I really don't expect anyone will miss them if they can't read them in a
    child user namespace.

    CC: Pavel Emelyanov
    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

01 Nov, 2011

1 commit


22 Mar, 2011

1 commit

  • When I was fixing issues with unregistering tables under /proc/sys/net/ipv6/neigh
    by adding a mount point it appears I missed a critical ordering issue in the
    ipv6 initialization. I had not realized that ipv6_sysctl_register is called
    at the very end of the ipv6 initialization and in particular after we call
    neigh_sysctl_register from ndisc_init.

    "neigh" needs to be initialized in ipv6_static_sysctl_register which is
    the first ipv6 table to be initialized, and definitely before ndisc_init.
    This removes the weirdness of duplicate tables while still providing a
    "neigh" mount point which prevents races in sysctl unregistering.

    This was initially reported at https://bugzilla.kernel.org/show_bug.cgi?id=31232
    Reported-by: sunkan@zappa.cx
    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

01 Feb, 2011

1 commit

  • In my testing of 2.6.37 I was occasionally getting a warning about
    sysctl table entries being unregistered in the wrong order. Digging
    in, it turns out this dates back to the last great sysctl reorg
    where Al Viro introduced the requirement that sysctl directories
    needed to be created before and destroyed after the files in them.

    It turns out that in that great reorg /proc/sys/net/ipv6/neigh was
    overlooked. So this patch fixes that oversight and makes an annoying
    warning message go away.

    >------------[ cut here ]------------
    >WARNING: at kernel/sysctl.c:1992 unregister_sysctl_table+0x134/0x164()
    >Pid: 23951, comm: kworker/u:3 Not tainted 2.6.37-350888.2010AroraKernelBeta.fc14.x86_64 #1
    >Call Trace:
    > [] warn_slowpath_common+0x80/0x98
    > [] warn_slowpath_null+0x15/0x17
    > [] unregister_sysctl_table+0x134/0x164
    > [] ? kfree+0xc4/0xd1
    > [] neigh_sysctl_unregister+0x22/0x3a
    > [] addrconf_ifdown+0x33f/0x37b [ipv6]
    > [] ? skb_dequeue+0x5f/0x6b
    > [] addrconf_notify+0x69b/0x75c [ipv6]
    > [] ? ip6mr_device_event+0x98/0xa9 [ipv6]
    > [] notifier_call_chain+0x32/0x5e
    > [] raw_notifier_call_chain+0xf/0x11
    > [] call_netdevice_notifiers+0x45/0x4a
    > [] rollback_registered_many+0x118/0x201
    > [] unregister_netdevice_many+0x16/0x6d
    > [] default_device_exit_batch+0xa4/0xb8
    > [] ? cleanup_net+0x0/0x194
    > [] ops_exit_list+0x4e/0x56
    > [] cleanup_net+0xf4/0x194
    > [] process_one_work+0x187/0x280
    > [] worker_thread+0xff/0x19f
    > [] ? worker_thread+0x0/0x19f
    > [] kthread+0x7d/0x85
    > [] kernel_thread_helper+0x4/0x10
    > [] ? kthread+0x0/0x85
    > [] ? kernel_thread_helper+0x0/0x10
    >---[ end trace 8a7e9310b35e9486 ]---

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    The percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

18 Jan, 2010

1 commit


12 Nov, 2009

1 commit

  • Now that sys_sysctl is a compatibility wrapper around /proc/sys,
    all sysctl strategy routines, and all ctl_name and strategy
    entries in the sysctl tables, are unused and can be
    removed.

    In addition, neigh_sysctl_register has been modified to no longer
    take a strategy argument and its callers have been modified not
    to pass one.

    Cc: "David Miller"
    Cc: Hideaki YOSHIFUJI
    Cc: netdev@vger.kernel.org
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

03 Aug, 2009

1 commit

  • This renames away a variable clash:
    * ipv6_table[] is declared as a static global table;
    * ipv6_sysctl_net_init() uses ipv6_table to refer to and destroy dynamic memory;
    * ipv6_sysctl_net_exit() also uses ipv6_table for the same purpose;
    * both of the last two functions call kfree() on ipv6_table.

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
     

09 Jan, 2009

1 commit

  • Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Acked-by: Theodore Ts'o
    Acked-by: Mark Fasheh
    Acked-by: David S. Miller
    Cc: James Morris
    Acked-by: Casey Schaufler
    Acked-by: Takashi Iwai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fernando Carrijo