14 Dec, 2006

7 commits

  • Run this:

    #!/bin/sh
    for f in $(grep -Erl "\([^\)]*\) *k[cmz]alloc" *) ; do
    echo "De-casting $f..."
    perl -pi -e "s/ ?= ?\([^\)]*\) *(k[cmz]alloc) *\(/ = \1\(/" $f
    done

    And then go through and reinstate those cases where code is casting pointers
    to non-pointers.

    And then drop a few hunks which conflicted with outstanding work.

    Cc: Russell King, Ian Molton
    Cc: Mikael Starvik
    Cc: Yoshinori Sato
    Cc: Roman Zippel
    Cc: Geert Uytterhoeven
    Cc: Ralf Baechle
    Cc: Paul Mackerras
    Cc: Kyle McMartin
    Cc: Benjamin Herrenschmidt
    Cc: Martin Schwidefsky
    Cc: "David S. Miller"
    Cc: Jeff Dike
    Cc: Greg KH
    Cc: Jens Axboe
    Cc: Paul Fulghum
    Cc: Alan Cox
    Cc: Karsten Keil
    Cc: Mauro Carvalho Chehab
    Cc: Jeff Garzik
    Cc: James Bottomley
    Cc: Ian Kent
    Cc: Steven French
    Cc: David Woodhouse
    Cc: Neil Brown
    Cc: Jaroslav Kysela
    Cc: Takashi Iwai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
     
  • Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • There's no point deferring something just to immediately fail the deferral,
    especially now that we can do something more useful in the failure case by
    returning an error.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J.Bruce Fields
     
  • To avoid tying up server threads when nfsd makes an upcall (to mountd to get
    export options, to idmapd for NFSv4 name-id mapping, etc.), we temporarily
    "drop" the request and save enough information so that we can revisit it
    later.

    Certain failures during the deferral process can cause us to really drop the
    request and never revisit it.

    This is often less than ideal, and is unacceptable in the NFSv4 case--RFC 3530
    forbids the server from dropping a request without also closing the
    connection.

    As a first step, we modify the deferral code to return -ETIMEDOUT (which is
    translated to nfserr_jukebox in the v3 and v4 cases, and remains a drop in the
    v2 case).

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J.Bruce Fields
     
  • The memory leak here is embarrassingly obvious.

    This fixes a problem that causes the kernel to leak a small amount of memory
    every time it receives an integrity-protected request.

    Thanks to Aim Le Rouzic for the bug report.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J.Bruce Fields
     
  • Signed-off-by: Al Viro
    Acked-by: Marcel Holtmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Al Viro
     
  • All kcalloc() calls of the form "kcalloc(1,...)" are converted to the
    equivalent kzalloc() calls, and a few kcalloc() calls with the incorrect
    ordering of the first two arguments are fixed.

    Signed-off-by: Robert P. J. Day
    Cc: Jeff Garzik
    Cc: Alan Cox
    Cc: Dominik Brodowski
    Cc: Adam Belay
    Cc: James Bottomley
    Cc: Greg KH
    Cc: Mark Fasheh
    Cc: Trond Myklebust
    Cc: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
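
    The kcalloc(1, ...) to kzalloc() conversion described above relies on both
    calls returning one zeroed object of the requested size. A minimal userspace
    sketch of that equivalence follows, with calloc() standing in for the kernel
    allocators. The names calloc_sketch, zalloc_sketch, struct item and is_zeroed
    are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for kcalloc(): count first, size second. */
static void *calloc_sketch(size_t n, size_t size)
{
    return calloc(n, size);
}

/* Hypothetical stand-in for kzalloc(): exactly one zeroed object. */
static void *zalloc_sketch(size_t size)
{
    return calloc_sketch(1, size);
}

/* Illustrative payload type. */
struct item {
    int id;
    char name[16];
};

/* Returns 1 if the first len bytes at p are all zero, 0 otherwise. */
static int is_zeroed(const void *p, size_t len)
{
    const unsigned char *b = p;
    size_t i;

    assert(p != NULL);
    for (i = 0; i < len; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

    Allocating a struct item through either helper yields the same zero-filled
    object, which is why the single-element form can be written with the shorter
    call. Keeping the count as the first argument and the element size as the
    second matches the ordering that the commit restores.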
     

13 Dec, 2006

3 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
    Fix inotify maintainers entry
    Fix typo in new debug options.
    Jon needs a new shift key.
    fs: Convert kmalloc() + memset() to kzalloc() in fs/.
    configfs.h: Remove dead macro definitions.
    kconfig: Standardize "depends" -> "depends on" in Kconfig files
    e100: replace kmalloc with kcalloc
    um: replace kmalloc+memset with kzalloc
    fix typo in net/ipv4/ip_fragment.c
    include/linux/compiler.h: reject gcc 3 < gcc 3.2
    Kconfig: fix spelling error in config KALLSYMS help text
    Remove duplicate "have to" in comment
    Fix small typo in drivers/serial/icom.c
    Use consistent casing in help message
    EXT{2,3,4}_FS: remove outdated part of the help text

    Linus Torvalds
     
  • Signed-off-by: Adrian Bunk

    Peter Zijlstra
     
  • current -git doesn't boot on my laptop due to netpoll not unlocking the
    tx lock in the else branch.

    booted this up on my laptop with lockdep enabled and there are no
    locking complaints and it works fine.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

12 Dec, 2006

28 commits

  • During boot we get:

    netconsole: device eth0 not up yet, forcing it
    e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
    WARNING (!__warned) at kernel/softirq.c:137 local_bh_enable()

    Call Trace:
    [] local_bh_enable+0x41/0xa3
    [] netpoll_send_skb+0x116/0x144
    [] netpoll_send_udp+0x263/0x271
    [] write_msg+0x42/0x5e
    [] __call_console_drivers+0x5f/0x70
    [] _call_console_drivers+0x6d/0x71
    [] release_console_sem+0x148/0x1ec
    [] register_console+0x1b1/0x1ba
    [] init_netconsole+0x54/0x68
    [] init+0x152/0x308
    [] _spin_unlock_irq+0x14/0x30
    [] schedule_tail+0x43/0x9f
    [] child_rip+0xa/0x12

    Herbert sayeth:

    Normally networking isn't invoked with interrupts turned off, but I
    suppose we don't have a choice here. This is unique being a place where you
    can get called with BH on, off, or IRQs off.

    Given that this is only used for printk, the easiest solution is probably
    just to disable local IRQs instead of BH.

    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Andrew Morton
     
  • Signed-off-by: Simon Horman
    Signed-off-by: David S. Miller

    Simon Horman
     
  • Dean Manners notices that when the IPVS synchronisation daemons are
    started the system load slowly climbs up to 1. This seems to be related
    to the call to ssleep(1) (aka msleep(1000)) in the main loop. Replacing
    this with a call to msleep_interruptible() seems to make the problem go
    away, though I'm not sure that it is correct.

    This is the second edition of this patch, which replaces ssleep()
    in the main loop for both the master and backup threads, as well
    as in some thread synchronisation code. The latter is just for thoroughness
    as it shouldn't be causing any problems.

    Signed-Off-By: Simon Horman
    Signed-off-by: David S. Miller

    Simon Horman
     
  • Fix foobar in 15b1c0e822f578306332d4f4c449250db5c5dceb and
    e8cc49bb0fdb9e18a99e6780073d1400ba2b0d1f patch series.

    Signed-off-by: Ralf Baechle
    Signed-off-by: David S. Miller

    Ralf Baechle
     
  • That accumulated over the last months' hackathon, shame on me for not
    using the git-apply whitespace helping hand, will do that from now on.

    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Spotted by David Miller when compiling on sparc64; I reproduced it here on
    parisc64. These are the only platforms that define __kernel_suseconds_t as
    an 'int'; all the others, x86_64 and x86 included, typedef it as a 'long'.
    Going by the definition of suseconds_t, it should just be an 'int' on
    platforms where it is >= 32 bits. That would avoid all the casts from
    suseconds_t to (int) when printk'ing variables of this type, casts that are
    not needed on parisc64 and sparc64.

    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • This fixes conversion errors which arose by not properly type-casting
    from u32 to __u64. Fixed by explicitly casting each type which is not
    __u64, or by performing operation after assignment.

    The patch further adds missing debug information to track the current
    value of X_recv.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
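
    The conversion errors fixed above arise because an expression made only of
    32-bit operands is evaluated in 32 bits, even when the result is stored in a
    64-bit variable. A minimal sketch of the difference follows. The function
    names are hypothetical, and uint32_t/uint64_t stand in for the kernel's
    u32/__u64.

```c
#include <assert.h>
#include <stdint.h>

/* The wrap-around shown below assumes uint32_t is not promoted to a wider int. */
static_assert(sizeof(int) <= sizeof(uint32_t), "int wider than 32 bits");

/* Both operands are 32 bits wide, so the product wraps modulo 2^32
 * before it is widened to the 64-bit return type. */
static uint64_t scale_wrapping(uint32_t bytes, uint32_t factor)
{
    return bytes * factor;
}

/* Casting one operand first makes the whole multiplication 64 bits wide. */
static uint64_t scale_widened(uint32_t bytes, uint32_t factor)
{
    return (uint64_t)bytes * factor;
}
```

    With bytes = 5000 and factor = 1000000 the widened version returns
    5000000000, while the wrapping version returns 705032704, which is
    5000000000 modulo 2^32.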
     
  • No code change at all.

    This reorders the source file to follow the same order as the corresponding
    header file.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • No code change at all.

    To make the header file easier to read, the following ordering is established
    among the declarations:
    * hist_new
    * hist_delete
    * hist_entry_new
    * hist_head
    * hist_find_entry
    * hist_add_entry
    * hist_entry_delete
    * hist_purge

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This patch does not alter any algorithm, just the debug message format:

    * s#%s, sk=%p#%s(%p)#g

    * when a statename is present, it now uses %s(%p, state=%s)

    * when only function entry is debugged, it adds an `- entry'

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This migrates all packet history operations into the routine
    ccid3_hc_tx_packet_sent, thereby removing synchronization problems
    that occur when, as before, the operations are spread over multiple
    routines.
    The following minor simplifications are also applied:
    * several tests that are no longer required after this change are removed
    * one unnecessary variable (dp) is removed

    Justification:

    Currently packet history operations span two different routines,
    one of which is likely to pass through several iterations of sleeping
    and awakening.
    The first routine, ccid3_hc_tx_send_packet, allocates an entry and
    sets a few fields. The remaining fields are filled in when the second
    routine (which is not within a sleeping context), ccid3_hc_tx_packet_sent,
    is called. This has several strong drawbacks:
    * it is not necessary to split history operations - all fields can be
    filled in by the second routine
    * the first routine is called multiple times, until a packet can be sent,
    and sleeps meanwhile - this causes a lot of difficulties with regard to
    keeping the list consistent
    * since the two routines have no producer-consumer-like synchronization,
    it is very difficult to maintain data across calls to these routines
    * the fact that the routines are called in different contexts (sleeping, not
    sleeping) adds further problems

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This removes the `dccphtx_ccval' field since it is nowhere used in the code and
    in fact not necessary for the accounting.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This puts the window counter computation [RFC 4342, 8.1] into a separate
    function which is called whenever a new packet is ready for immediate
    transmission in ccid3_hc_tx_send_packet.

    Justification:

    The window counter update was previously computed after the packet was sent.
    This has two drawbacks, both fixed by this patch:
    1) another timestamp had to be computed almost directly after the packet was
    sent (expensive),
    2) the CCVal for the window counter is needed at the instant the packet is sent.

    Further details:

    The initialisation of the window counter is left in the state NO_SENT, as before.
    The algorithm will do nothing if either RTT is initialised to 0 (which is ok) or if
    the RTT value remains below 4 microseconds (which is almost pathological).

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
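
    The window counter update described above can be sketched as follows. This
    is an illustration only, not the patch itself: struct wc_state and
    update_win_count are hypothetical names, and the cap of 5 per update and the
    4-bit wrap-around are assumptions taken from a reading of RFC 4342, 8.1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sender state; the field names are illustrative only. */
struct wc_state {
    uint32_t rtt_us;     /* current RTT estimate in microseconds */
    uint64_t t_last_us;  /* time of the last window counter update */
    uint8_t  win_count;  /* 4-bit window counter, wraps modulo 16 */
};

/* Advance the window counter by the number of quarter-RTTs that have
 * elapsed since the last update, capped at 5 (assumed from RFC 4342). */
static void update_win_count(struct wc_state *s, uint64_t now_us)
{
    uint32_t quarter_rtt;
    uint64_t quarters;

    assert(s != NULL);
    quarter_rtt = s->rtt_us / 4;
    if (quarter_rtt == 0)   /* RTT of 0..3 microseconds: do nothing */
        return;

    quarters = (now_us - s->t_last_us) / quarter_rtt;
    if (quarters > 0) {
        if (quarters > 5)
            quarters = 5;
        s->t_last_us = now_us;
        s->win_count = (uint8_t)((s->win_count + quarters) & 0xF);
    }
}
```

    The early return covers both cases named in the commit message: an RTT that
    is still 0, and an RTT below 4 microseconds, for which the quarter-RTT
    rounds down to 0.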
     
  • CCID3 performance depends much on the accuracy of RTT samples. If RTT
    samples grow too large, performance can be catastrophically poor.

    To limit the amount of possible damage in such cases, the patch
    * introduces an upper limit which identifies a maximum `sane' RTT value;
    * uses a macro to enforce this upper limit.

    Using a macro was given preference, since it is necessary to identify the
    calling function in the warning message. Since exceeding this threshold
    identifies a critical condition, DCCP_CRIT is used and not DCCP_WARN.

    Many thanks to Ian McDonald for collaboration on this issue.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
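
    The reason given above for preferring a macro is that the warning has to
    name the calling function. A minimal userspace sketch of that idea follows.
    SANE_RTT_MAX_US, SANE_RTT and clamp_rtt are hypothetical names, the 4 second
    limit is an assumed value, and fprintf() stands in for DCCP_CRIT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SANE_RTT_MAX_US 4000000U  /* assumed upper limit: 4 seconds */

/*
 * A macro rather than a function, so that __func__ expands to the name
 * of the caller and the message shows where the bad sample came from.
 */
#define SANE_RTT(rtt_var)                                           \
    do {                                                            \
        if ((rtt_var) > SANE_RTT_MAX_US) {                          \
            fprintf(stderr, "%s: RTT sample %u us too large, "      \
                    "capping at %u us\n", __func__,                 \
                    (unsigned)(rtt_var), SANE_RTT_MAX_US);          \
            (rtt_var) = SANE_RTT_MAX_US;                            \
        }                                                           \
    } while (0)

/* Example caller: the warning, if any, is attributed to clamp_rtt. */
static void clamp_rtt(uint32_t *rtt_us)
{
    assert(rtt_us != NULL);
    SANE_RTT(*rtt_us);
}
```

    Had the check been an ordinary function, __func__ would always expand to
    the name of that helper instead of the routine that produced the sample.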
     
  • In both the sender and the receiver it is possible that the stored
    RTT value is accessed before an actual RTT estimate has been computed.

    This patch
    * initialises the sender RTT to 0
    - the sender always accesses the RTT in ccid3_hc_tx_packet_sent
    - the RTT is further needed for the window counter algorithm

    * replaces the receiver initialisation of 5msec with 0
    - which has the same effect and removes an `XXX'
    - the RTT value is needed in ccid3_hc_rx_packet_recv as rtt_prev

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • The function ccid3_hc_tx_insert_options only does a redundant no-op,
    as the operation

    DCCP_SKB_CB(skb)->dccpd_ccval = hctx->ccid3hctx_last_win_count;

    is already performed _unconditionally_ in ccid3_hc_tx_send_packet.

    Since there is no other current use for this function, it is removed
    entirely. Furthermore, since there is no present need for the interface
    function ccid_hc_tx_insert_options either, it is removed as well, to
    clean up the interface.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This adds a (debug) warning message which is triggered whenever a packet is
    discarded due to send failure.

    It also adds a conditional, so that an interruption during dccp_wait_for_ccid
    is not treated as a `BUG': the rationale is that interruptions are external,
    whereas bug warnings are concerned with the internals.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This is an optimisation to reduce CPU load. The received feedback is now
    only directed to the active CCID component, without requiring processing
    also by the inactive one.

    As a consequence, a similar test in ccid3.c is now redundant and is
    also removed.

    Justification:

    Currently DCCP works as a unidirectional service, i.e. a listening server
    is not at the same time a connecting client.
    As far as I can see, several modifications are necessary until that
    becomes possible.
    At the present time, received feedback is fed to both the rx and tx CCID
    modules. In unidirectional service, only one of these is active at any
    one time.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • In migrating towards using the newer functions scaled_div/scaled_div32
    for TFRC computations mapped from floating-point onto integer arithmetic,
    this completes the last stage of modifications.

    In particular, the overflow case for computing X_calc is circumvented by
    * breaking the computation into two stages
    * the first stage, res = (s*1E6)/R, cannot overflow due to use of u64
    * in the second stage, res = (res*1E6)/f, overflow on u32 is avoided due
    to (i) returning UINT_MAX in this case (which is logically appropriate)
    and (ii) issuing a warning message into the system log (since very likely
    there is a problem somewhere else with the parameters)

    Lastly, all such scaling operations are now exported into tfrc.h, since
    actually this form of scaled computation is specific to TFRC and not to CCID3.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
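
    The two-stage computation of X_calc described above can be sketched in
    userspace C as follows. The names scaled_div_sketch and calc_x_sketch are
    hypothetical, f is assumed to be scaled by 1E6 as in the commit message,
    UINT32_MAX stands in for UINT_MAX, and fprintf() stands in for the system
    log.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: (a * 1E6) / b, evaluated in 64 bits. */
static uint64_t scaled_div_sketch(uint64_t a, uint32_t b)
{
    assert(b != 0);
    return (a * 1000000ULL) / b;
}

/*
 * s:      packet size in bytes
 * rtt_us: round-trip time R in microseconds
 * f:      value of the TFRC function f(p), scaled by 1E6
 */
static uint32_t calc_x_sketch(uint16_t s, uint32_t rtt_us, uint32_t f)
{
    /* Stage 1: res = (s * 1E6) / R fits easily into 64 bits. */
    uint64_t res = scaled_div_sketch(s, rtt_us);

    /* Stage 2: res = (res * 1E6) / f may exceed 32 bits. */
    res = scaled_div_sketch(res, f);
    if (res > UINT32_MAX) {
        fprintf(stderr, "calc_x_sketch: overflow, check parameters\n");
        return UINT32_MAX;
    }
    return (uint32_t)res;
}
```

    For s = 1460 bytes, R = 100000 microseconds and f = 1.0 (1000000 when
    scaled) the sketch returns 14600 bytes per second. With the extreme inputs
    s = 65535, R = 1 and f = 1 the second stage exceeds 32 bits and the result
    is clamped.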
     
  • Problem:

    Most target types in the CCID3 code are u32, so subtle conversion errors
    can occur if signed time calculations yield negative results: the original
    values are lost in the conversion to unsigned, and calculation errors go
    undetected.

    This patch therefore
    * sets all critical time types from unsigned to suseconds_t
    * avoids comparison between signed/unsigned via type-casting
    * provides ample warning messages in case time calculations are negative

    These warning messages can be removed at a later stage when the code
    has undergone more testing.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
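
    The problem described above can be reproduced in a few lines of userspace C.
    This is a sketch under stated assumptions: the function names are
    hypothetical, and int64_t stands in for suseconds_t.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned arithmetic: a "negative" time difference wraps to a huge value. */
static uint32_t delta_unsigned(uint32_t now_us, uint32_t then_us)
{
    return now_us - then_us;
}

/* Signed arithmetic: the negative difference stays visible to the caller. */
static int64_t delta_signed(uint32_t now_us, uint32_t then_us)
{
    return (int64_t)now_us - (int64_t)then_us;
}

/* Usage example: "now" lies 5 microseconds before "then". */
void delta_example(void)
{
    assert(delta_unsigned(100, 105) == 4294967291u);
    assert(delta_signed(100, 105) == -5);
}
```

    The unsigned result looks like an elapsed time of more than an hour, so
    nothing downstream can tell that the calculation went wrong. The signed
    result can be tested for being negative and reported.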
     
  • This simplifies the calculation of a value p for a given fval when the
    first loss interval is computed (RFC 3448, 6.3.1). It makes use of the
    two new functions scaled_div/scaled_div32 to provide overflow protection.

    Additionally, protection against divide-by-zero is extended - in this
    case the function will return the maximally possible value of p=100%.

    Background:

    The maximum fval, f(100%), is approximately 244, i.e. the scaled value of fval
    should never exceed 244E6, which fits easily into u32. The problem is the scaling
    by 10^6, since additionally R(TT) is in microseconds.
    This is resolved by breaking the division into two stages: the first stage
    computes fval=(s*10^6)/R, stores that into u64; the second stage computes
    fval = (fval*10^6)/X_recv and complains if overflow is reached for u32.
    This case is safe since the TFRC reverse-lookup routine then returns p=100%.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This replaces the remaining uses of usecs_div with scaled_div32, which
    internally uses 64bit division and produces a warning on overflow.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This patch
    * resolves a bug where packets smaller than 32/64 bytes resulted in sending rates of 0
    * supports all sending rates from 1/64 bytes/second up to 4Gbyte/second
    * simplifies the present overflow problems in calculations

    Current sending rate X and the cached value X_recv of the receiver-estimated
    sending rate are both scaled by 64 (2^6) in order to
    * cope with low sending rates (minimally 1 byte/second)
    * allow upgrading to use a packets-per-second implementation of CCID 3
    * avoid calculation errors due to integer arithmetic cut-off

    The patch implements a revised strategy from
    http://www.mail-archive.com/dccp@vger.kernel.org/msg01040.html

    The only difference with regard to that strategy is that t_ipi is already
    used in the calculation of the nofeedback timeout, which saves one division.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
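
    The effect of scaling the sending rate by 64 can be illustrated with a short
    fixed-point sketch. RATE_SHIFT and the helper names are hypothetical, and
    holding the scaled value in a 64-bit type is an assumption made for this
    illustration.

```c
#include <assert.h>
#include <stdint.h>

#define RATE_SHIFT 6  /* rates are stored multiplied by 2^6 = 64 */

/* Convert a rate in bytes/second to units of 1/64 byte/second. */
static uint64_t rate_scale(uint32_t bytes_per_sec)
{
    return (uint64_t)bytes_per_sec << RATE_SHIFT;
}

/* Convert back to whole bytes/second, discarding the fraction. */
static uint32_t rate_unscale(uint64_t scaled)
{
    assert((scaled >> RATE_SHIFT) <= UINT32_MAX);
    return (uint32_t)(scaled >> RATE_SHIFT);
}

/* Halve a scaled rate without leaving the scaled representation. */
static uint64_t rate_halve(uint64_t scaled)
{
    return scaled / 2;
}
```

    Halving a rate of 1 byte/second gives 0 in plain integer arithmetic, whereas
    the scaled representation keeps the value 32, that is 0.5 bytes/second. The
    smallest non-zero scaled value, 1, corresponds to 1/64 bytes/second.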
     
  • This fixes
    1) a bug in the recomputation of the sending rate by the nofeedback
    timer when no feedback at all has so far been sent by the receiver:
    min_t was used instead of max_t, which is wrong (cf. RFC 3448, p. 10);

    2) an error in the computation of larger initial windows: instead of
    min(... max()) (cf. RFC 4342, 5.), the code had used max(... max()).

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
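
    The second fix above can be illustrated with the initial window formula. The
    commit message elides the arguments, so the sketch below assumes the form
    min(4*s, max(2*s, 4380)) from RFC 3390, which RFC 4342, 5. refers to. All
    function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* Assumed form of the initial window in bytes for packet size s. */
static uint32_t initial_window(uint32_t s)
{
    assert(s <= UINT32_MAX / 4);  /* keep 4 * s from wrapping */
    return min_u32(4 * s, max_u32(2 * s, 4380));
}

/* The faulty variant: max(... max()) in place of min(... max()). */
static uint32_t initial_window_buggy(uint32_t s)
{
    assert(s <= UINT32_MAX / 4);
    return max_u32(4 * s, max_u32(2 * s, 4380));
}
```

    For s = 500 bytes the intended formula yields 2000 bytes, while the faulty
    variant yields 4380 bytes, more than twice the intended window. For
    s = 1460 the results are 4380 and 5840 bytes respectively.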
     
  • This performs two optimisations for the recomputation of the sending rate.

    1) Currently the target sending rate X_calc is recalculated whenever
    a) the nofeedback timer expires, or
    b) a feedback packet is received.
    In the (a) case, recomputing X_calc is redundant, since

    * the parameters p and RTT do not change in between the
    reception of feedback packets;

    * the parameter X_recv is either modified from received
    feedback or via the nofeedback timer;

    * a test (`p == 0') in the nofeedback timer avoids using
    a stale/undefined value of X_calc if p was previously 0.

    2) The nofeedback timer now only recomputes a timestamp when p == 0.
    This is according to step (4) of [RFC 3448, 4.3] and avoids
    unnecessarily determining a timestamp.

    A debug statement about not updating X is also removed - it helps very
    little in debugging and just clutters the logs.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • This patch follows a suggestion by Ian McDonald and ensures that in
    the current code the value of p cannot exceed 100%. Such a value is
    illegal and would consequently cause a bug condition in tfrc_calc_x().

    The receiver case is also tested, and a warning message is added.

    Signed-off-by: Gerrit Renker
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
     
  • It simplifies waiting for the CCID module to signal that a packet
    is ready to be sent. Other simplifications flow on from this such as
    removing constants.

    As a result of this EAGAIN is not returned any more by dccp_wait_for_ccid
    (which would otherwise lead to unnecessarily discarding the packet in
    dccp_write_xmit).

    Signed-off-by: Ian McDonald
    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Ian McDonald
     
  • Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Andrew Morton
     

11 Dec, 2006

2 commits