02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few thousand files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

22 Jan, 2014

1 commit

  • Jakub Zawadzki noticed that some divisions by reciprocal_divide()
    were not correct [1][2], which he could also show with BPF code
    after divisions are transformed into reciprocal_value() for runtime
    invariance which can be passed to reciprocal_divide() later on;
    reverse in BPF dump ended up with a different, off-by-one K in
    some situations.

    This has been fixed by Eric Dumazet in commit aee636c4809fa5
    ("bpf: do not use reciprocal divide"). This follow-up patch
    improves reciprocal_value() and reciprocal_divide() to work in
    all cases by using the Granlund and Montgomery method, so that also
    future use is safe and without any non-obvious side-effects.
    Known problems with the old implementation were that division by 1
    always returned 0 and some off-by-ones when the dividend and divisor
    were very large. This seemed to not be problematic with its
    current users, as far as we can tell. Eric Dumazet checked the
    slab usage; we cannot say so with certainty for flex_array.
    Still, in order to fix that, we propose an extension from the
    original implementation from commit 6a2d7a955d8d resp. [3][4],
    by using the algorithm proposed in "Division by Invariant Integers
    Using Multiplication" [5], Torbjörn Granlund and Peter L.
    Montgomery, that is, pseudocode for q = n/d where q, n, d are in
    the u32 universe:

    1) Initialization:

    int l = ceil(log_2 d)
    uword m' = floor((1<<
    Cc: Eric Dumazet
    Cc: Austin S Hemmelgarn
    Cc: linux-kernel@vger.kernel.org
    Cc: Jesse Gross
    Cc: Jamal Hadi Salim
    Cc: Stephen Hemminger
    Cc: Matt Mackall
    Cc: Pekka Enberg
    Cc: Christoph Lameter
    Cc: Andy Gospodarek
    Cc: Veaceslav Falico
    Cc: Jay Vosburgh
    Cc: Jakub Zawadzki
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller
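
    A minimal userspace sketch of the Granlund and Montgomery scheme named
    above, assuming 32-bit unsigned operands and C99 fixed-width types in
    place of the kernel's u32/u64. The struct layout and the names rcp,
    rcp_init() and rcp_divide() are hypothetical and only illustrate the
    technique; they are not taken from the patch itself:

```c
#include <stdint.h>

/* Hypothetical container for the precomputed constants of one divisor. */
struct rcp {
	uint32_t m;   /* multiplier m' */
	uint8_t sh1;  /* first shift:  min(l, 1) */
	uint8_t sh2;  /* second shift: max(l - 1, 0) */
};

/* l = ceil(log2(d)) for d >= 1; result is in the range 0..32. */
static int ceil_log2_u32(uint32_t d)
{
	int l = 0;

	while (l < 32 && (1ULL << l) < d)
		l++;
	return l;
}

/* Initialization step: done once per divisor d (d must be non-zero). */
struct rcp rcp_init(uint32_t d)
{
	struct rcp r;
	int l = ceil_log2_u32(d);
	/* m' = floor(2^32 * (2^l - d) / d) + 1, always fits in 32 bits */
	uint64_t m = ((1ULL << 32) * ((1ULL << l) - d)) / d + 1;

	r.m = (uint32_t)m;
	r.sh1 = (uint8_t)(l < 1 ? l : 1);
	r.sh2 = (uint8_t)(l > 1 ? l - 1 : 0);
	return r;
}

/* Division step: one multiply, one subtract, one add and two shifts. */
uint32_t rcp_divide(uint32_t n, struct rcp r)
{
	/* t = high 32 bits of n * m' */
	uint32_t t = (uint32_t)(((uint64_t)n * r.m) >> 32);

	return (t + ((n - t) >> r.sh1)) >> r.sh2;
}
```

    With this form, a divisor of 1 yields m' = 1 and both shifts 0, so
    rcp_divide(n, rcp_init(1)) returns n rather than 0.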

    Hannes Frederic Sowa
     

09 Dec, 2011

1 commit

  • Adaptive RED AQM for Linux, based on the paper by Sally Floyd,
    Ramakrishna Gummadi, and Scott Shenker, August 2001:

    http://icir.org/floyd/papers/adaptiveRed.pdf

    The goal of Adaptive RED is to make max_p a dynamic value between 1% and
    50% to reach the target average queue : min_th + (max_th - min_th) / 2

    Every 500 ms:
    if (avg > target and max_p <= 0.5)
      increase max_p : max_p += alpha;
    else if (avg < target and max_p >= 0.01)
      decrease max_p : max_p *= beta;

    target :[min_th + 0.4*(max_th - min_th),
    min_th + 0.6*(max_th - min_th)].
    alpha : min(0.01, max_p / 4)
    beta : 0.9
    max_P is a Q0.32 fixed point number (unsigned, with 32 bits mantissa)
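
    The adaptation rule from the paper (additive increase of max_p by alpha
    when the average queue is above the target range, multiplicative
    decrease by beta when it is below) can be sketched as follows. This is
    an illustration only: it uses double instead of the Q0.32 fixed point
    format mentioned above, takes max_th - min_th as the queue span as in
    the paper, and the names ared_state, ared_set_target() and ared_adapt()
    are hypothetical:

```c
/* Hypothetical state for one queue; floating point for readability. */
struct ared_state {
	double max_p;       /* marking probability, steered within [0.01, 0.5] */
	double target_min;  /* lower bound of the target average queue */
	double target_max;  /* upper bound of the target average queue */
};

/* Derive the target range from the RED thresholds. */
void ared_set_target(struct ared_state *s, double min_th, double max_th)
{
	double span = max_th - min_th;

	s->target_min = min_th + 0.4 * span;
	s->target_max = min_th + 0.6 * span;
}

/* Called periodically (every 500 ms in the text above). */
void ared_adapt(struct ared_state *s, double avg)
{
	if (avg > s->target_max && s->max_p <= 0.5) {
		/* alpha = min(0.01, max_p / 4) */
		double alpha = s->max_p / 4 < 0.01 ? s->max_p / 4 : 0.01;

		s->max_p += alpha;
	} else if (avg < s->target_min && s->max_p >= 0.01) {
		/* beta = 0.9 */
		s->max_p *= 0.9;
	}
}
```

    With min 30000 and max 90000 as in the tc example below, the target
    range is roughly 54000 to 66000 bytes of average queue.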

    Changes against our RED implementation are :

    max_p is no longer a negative power of two (1/(2^Plog)), but a Q0.32
    fixed point number, to allow the full range described in the Adaptive
    RED paper.

    To deliver a random number, we now use a reciprocal divide (that's really
    a multiply), but this operation is done once per marked/dropped packet
    when in the RED_BETWEEN_TRESH window, so the added cost (compared to the
    previous AND operation) is near zero.

    dump operation gives current max_p value in a new TCA_RED_MAX_P
    attribute.

    Example on a 10Mbit link :

    tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 8sec red \
    limit 400000 min 30000 max 90000 avpkt 1000 \
    burst 55 ecn adaptative bandwidth 10Mbit

    # tc -s -d qdisc show dev eth3
    ...
    qdisc red 10: parent 1:1 limit 400000b min 30000b max 90000b ecn
    adaptative ewma 5 max_p=0.113335 Scell_log 15
    Sent 50414282 bytes 34504 pkt (dropped 35, overlimits 1392 requeues 0)
    rate 9749Kbit 831pps backlog 72056b 16p requeues 0
    marked 1357 early 35 pdrop 0 other 0

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

14 Dec, 2006

1 commit

  • When some objects are allocated by one CPU but freed by another CPU we can
    consume a lot of cycles doing divides in obj_to_index().

    (Typical load on a dual processor machine where network interrupts are
    handled by one particular CPU (allocating skbufs), and the other CPU is
    running the application (consuming and freeing skbufs))

    Here on one production server (dual-core AMD Opteron 285), I noticed this
    divide took 1.20 % of CPU_CLK_UNHALTED events in kernel. But Opterons are
    quite modern CPUs, and the divide is much more expensive on older
    architectures :

    On a 200 MHz sparcv9 machine, the division takes 64 cycles instead of 1
    cycle for a multiply.

    Doing some math, we can use a reciprocal multiplication instead of a divide.

    If we want to compute V = (A / B) (A and B being u32 quantities)
    we can instead use :

    V = ((u64)A * RECIPROCAL(B)) >> 32 ;

    where RECIPROCAL(B) is precalculated to ((1LL << 32) + (B - 1)) / B
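
    A minimal userspace sketch of the reciprocal multiplication above,
    assuming C99 fixed-width types in place of the kernel's u32/u64. The
    function names reciprocal_value_simple() and reciprocal_divide_simple()
    are hypothetical, chosen for illustration:

```c
#include <stdint.h>

/*
 * Precompute RECIPROCAL(b) = ((1 << 32) + (b - 1)) / b, truncated to
 * 32 bits. b must be non-zero.
 */
uint32_t reciprocal_value_simple(uint32_t b)
{
	uint64_t val = (1ULL << 32) + (b - 1);

	return (uint32_t)(val / b);
}

/* Approximate a / b as the high 32 bits of a * RECIPROCAL(b). */
uint32_t reciprocal_divide_simple(uint32_t a, uint32_t r)
{
	return (uint32_t)(((uint64_t)a * r) >> 32);
}
```

    For example, with B = 10 the precomputed value is 429496730, and
    1000 / 10 evaluates to 100 with one multiply and one shift. B = 1 is a
    degenerate case in this simple form: the reciprocal is 2^32, which does
    not fit in 32 bits and truncates to 0, so every quotient comes out as 0.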

    Note :

    I wrote pure C code for clarity. gcc output for i386 is not optimal but
    acceptable :

    mull 0x14(%ebx)
    mov %edx,%eax // part of the >> 32
    xor %edx,%edx // useless
    mov %eax,(%esp) // could be avoided
    mov %edx,0x4(%esp) // useless
    mov (%esp),%ebx

    [akpm@osdl.org: small cleanups]
    Signed-off-by: Eric Dumazet
    Cc: Christoph Lameter
    Cc: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet