17 Feb, 2010

1 commit

  • Add __percpu sparse annotations to net.

    These annotations are to make sparse consider percpu variables to be
    in a different address space and warn if accessed without going
    through percpu accessors. This patch doesn't affect normal builds.

    The macro and type tricks around snmp stats make things a bit
    interesting. DEFINE/DECLARE_SNMP_STAT() macros mark the target field
    as __percpu and SNMP_UPD_PO_STATS() macro is updated accordingly. All
    snmp_mib_*() users which used to cast the argument to (void **) are
    updated to cast it to (void __percpu **).
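
    A minimal user-space sketch of how such an annotation can behave: it
    expands to nothing in a normal build and only changes what sparse sees.
    The address-space number, NR_CPUS_SKETCH, struct mib_sketch and both
    helper functions are assumptions made for illustration, not the
    kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch only: under sparse (__CHECKER__) the annotation puts the pointer
 * in a separate, non-dereferenceable address space; in a normal build it
 * expands to nothing. The address-space number is an assumption. */
#ifdef __CHECKER__
# define __percpu __attribute__((noderef, address_space(3)))
# define __force  __attribute__((force))
#else
# define __percpu
# define __force
#endif

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count */

struct mib_sketch {
	unsigned long counters[2];
};

/* Hypothetical accessor: the only place where the annotation is cast away. */
static struct mib_sketch *per_cpu_ptr_sketch(struct mib_sketch __percpu *base,
					     int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	return (struct mib_sketch __force *)base + cpu;
}

/* Sum one counter over all CPUs, always going through the accessor. */
static unsigned long fold_counter(struct mib_sketch __percpu *base, size_t idx)
{
	unsigned long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += per_cpu_ptr_sketch(base, cpu)->counters[idx];
	return sum;
}
```

    With this shape, a direct base->counters[0] compiles in a normal build
    but is the kind of access sparse is expected to flag.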

    Signed-off-by: Tejun Heo
    Acked-by: David S. Miller
    Cc: Patrick McHardy
    Cc: Arnaldo Carvalho de Melo
    Cc: Vlad Yasevich
    Cc: netdev@vger.kernel.org
    Signed-off-by: David S. Miller

    Tejun Heo
     

13 Feb, 2010

2 commits

  • DCCP is datagram-oriented but lacks UDP's support for MSG_TRUNC as defined in
    recvmsg(2)/recv(2). Hence the following 'Hello world\0' receiver

    len = recv(fd, buf, 10, MSG_PEEK | MSG_TRUNC);

    wrongly (always) returns 10, while in UDP it returns 12 as expected.
    This patch adds the missing MSG_TRUNC support to recvmsg().
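
    For comparison, the UDP behaviour described above can be reproduced
    with a small loopback test. This is a sketch assuming Linux semantics
    for MSG_TRUNC as documented in recv(2); the function name is
    hypothetical.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send a 12-byte datagram over loopback UDP and peek at it with a
 * 10-byte buffer. Returns what recv() reports, or -1 on failure. */
static ssize_t peek_truncated_len(void)
{
	static const char msg[] = "Hello world";	/* 11 chars + '\0' */
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	char buf[10];
	ssize_t len = -1;
	int rx, tx;

	assert(sizeof(msg) == 12);

	rx = socket(AF_INET, SOCK_DGRAM, 0);
	tx = socket(AF_INET, SOCK_DGRAM, 0);
	if (rx < 0 || tx < 0)
		goto out;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */
	if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(rx, (struct sockaddr *)&addr, &alen) < 0)
		goto out;

	if (sendto(tx, msg, sizeof(msg), 0, (struct sockaddr *)&addr,
		   sizeof(addr)) != (ssize_t)sizeof(msg))
		goto out;

	/* MSG_TRUNC asks for the real datagram length, not the copied one;
	 * MSG_PEEK leaves the datagram queued. */
	len = recv(rx, buf, sizeof(buf), MSG_PEEK | MSG_TRUNC);
out:
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return len;
}
```

    On a Linux UDP socket the call reports the full datagram length even
    though only 10 bytes fit into buf.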

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This fixes a problem in the DCCP getsockopt() API: currently there is no way
    for a user to a priori know the number of built-in CCIDs, other than trying
    DCCP_SOCKOPT_AVAILABLE_CCIDS in a loop, incrementing the option length until
    EINVAL is no longer returned.

    This patch truncates the array to the user-provided length. No copy is made
    when the length is
    Signed-off-by: David S. Miller

    Gerrit Renker
     

04 Feb, 2010

3 commits

  • David S. Miller
     
  • This fixes commit (38ff3e6bb987ec583268da8eb22628293095d43b) ("dccp_probe:
    Fix module load dependencies between dccp and dccp_probe", from 15 Jan).

    It fixes the construction of the first argument of try_then_request_module(),
    where only valid return codes from the first argument should be returned.

    What we do now is assign the result of register_jprobe() to ret, without
    the side effect of the comparison.
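
    The difference can be sketched in user space. The macro below only
    emulates "evaluate x; if it is false, request the module and evaluate
    x again"; it is a portable stand-in, not the kernel macro, and both
    *_stub functions are hypothetical replacements for request_module()
    and register_jprobe().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Portable emulation for boolean expressions; not the kernel definition. */
#define try_then_request_module_sketch(x, mod) \
	((x) ? 1 : (request_module_stub(mod), (x)))

static bool dccp_loaded;	/* stands in for "the dccp module is loaded" */

/* Hypothetical stub: requesting the module always loads it. */
static int request_module_stub(const char *name)
{
	assert(name != NULL);
	dccp_loaded = true;
	return 0;
}

/* Hypothetical stub: 0 on success, -EINVAL while the symbol is missing. */
static int register_jprobe_stub(void)
{
	return dccp_loaded ? 0 : -EINVAL;
}

/* Broken form: ret receives the truth value of the comparison. */
static int probe_init_broken(void)
{
	int ret;

	dccp_loaded = false;
	ret = try_then_request_module_sketch(register_jprobe_stub() == 0,
					     "dccp");
	return ret;
}

/* Fixed form: ret is assigned inside the expression, so it keeps the
 * return code of the last registration attempt. */
static int probe_init_fixed(void)
{
	int ret;

	dccp_loaded = false;
	(void)try_then_request_module_sketch(
		(ret = register_jprobe_stub()) == 0, "dccp");
	return ret;
}
```

    In the broken form a successful registration leaves ret at 1, which a
    module init routine would report as a failure; the fixed form returns
    0 on success and the real error code otherwise.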

    Acked-by: Gerrit Renker
    Signed-off-by: Neil Horman
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This fixes a bug introduced in commit de4ef86cfce60d2250111f34f8a084e769f23b16
    ("dccp: fix dccp rmmod when kernel configured to use slub", 17 Jan): the
    vsnprintf used sizeof(slab_name_fmt), which became truncated to 4 bytes, since
    slab_name_fmt is now a 4-byte pointer and no longer a 32-character array.

    This led to error messages such as
    FATAL: Error inserting dccp: No buffer space available

    >> kernel: [ 1456.341501] kmem_cache_create: duplicate cache cci
    generated due to the truncation after the 3rd character.

    Fixed for the moment by introducing a symbolic constant. Tested to fix the bug.
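
    The truncation is easy to reproduce outside the kernel. In this
    sketch the format string, the constant SLAB_NAME_LEN_SKETCH and the
    helper name are assumptions for illustration; a size of 4 models
    sizeof() applied to a pointer on a 32-bit machine.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define SLAB_NAME_LEN_SKETCH 32	/* hypothetical symbolic constant */

/* Format a name into dst, writing at most size bytes including the NUL.
 * Returns the length of the string that actually ended up in dst. */
static size_t format_slab_name(char *dst, size_t size, const char *fmt, ...)
{
	va_list args;

	assert(size > 0);
	va_start(args, fmt);
	vsnprintf(dst, size, fmt, args);
	va_end(args);
	return strlen(dst);
}
```

    Called with size 4, as sizeof(pointer) yields on 32-bit, only three
    characters plus the terminator fit, so every cache name collapses to
    "cci"; called with the symbolic constant the full name is kept.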

    Signed-off-by: Gerrit Renker
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Gerrit Renker
     

23 Jan, 2010

2 commits


19 Jan, 2010

1 commit

  • Hey all-
    I was tinkering with dccp recently and noticed that I BUG halted the
    kernel when I rmmod-ed the dccp module. The bug halt occurred because the page
    that I passed to kfree failed the PageCompound and PageSlab test in the slub
    implementation of kfree. I tracked the problem down to the following set of
    events:

    1) dccp, unlike all other uses of kmem_cache_create, allocates a string
    dynamically when registering a slab cache. This allocated string is freed when
    the cache is destroyed.

    2) Normally, (1) is not an issue, but when Slub is in use, it is possible that
    caches are 'merged'. This process causes multiple caches of similar
    configuration to use the same cache data structure. When this happens, the new
    name of the cache is effectively dropped.

    3) (2) results in kmem_cache_name returning an ambiguous value:
    ccid_kmem_cache_destroy, which uses this function to retrieve the name pointer
    for freeing, is no longer guaranteed to get back the string it assigned.

    4) If such a merge event occurs, ccid_kmem_cache_destroy frees the wrong pointer,
    which trips over the BUG in the slub implementation of kfree (since it's likely
    not a slab allocation, but rather a pointer into the static string table
    section).

    So, what to do about this? At first blush this is pretty clearly a leak in the
    information that slub owns, and as such a slub bug. Unfortunately, there's no
    really good way to fix it, without exposing slub specific implementation details
    to the generic slab interface. Also, even if we could fix this in slub cleanly,
    I think the RCU free option would force us to do lots of string duplication, not
    only in slub, but in every slab allocator. As such, I'd like to propose this
    solution. Basically, I just move the storage for the kmem cache name to the
    ccid_operations structure. In so doing, we don't have to do the kstrdup or
    kfree when we allocate/free the various caches for dccp, and so we avoid the
    problem, by storing names with static memory, rather than heap, the way all
    other calls to kmem_cache_create do.

    I've tested this out myself here, and it solves the problem quite well.

    Signed-off-by: Neil Horman
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Neil Horman
     

18 Jan, 2010

1 commit


15 Jan, 2010

1 commit

  • This was just recently reported to me. When built as modules, the
    dccp_probe module has a silent dependency on the dccp module. This
    stems from the fact that the module_init routine of dccp_probe
    registers a jprobe on the dccp_sendmsg symbol. Since the symbol is
    only referenced as a text string (the .symbol_name field in the jprobe
    struct) rather than the address of the symbol itself, depmod never
    picks this dependency up, and so if you load the dccp_probe module
    without the dccp module loaded, the register_jprobe call fails with an
    -EINVAL, and the whole module load fails.

    The fix is pretty easy: we can just wrap the register_jprobe call in a
    try_then_request_module call, which forces the dependency to get
    satisfied prior to the probe registration.

    Signed-off-by: Neil Horman
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Neil Horman
     

23 Dec, 2009

4 commits

  • Rename kfifo_put... to kfifo_in... to prevent misuse by old
    out-of-tree drivers.

    Ditto for kfifo_get... -> kfifo_out...

    Improve the prototypes of kfifo_in and kfifo_out to make the kerneldoc
    annotations more readable.

    Add mini "howto porting to the new API" in kfifo.h

    Signed-off-by: Stefani Seibold
    Acked-by: Greg Kroah-Hartman
    Acked-by: Mauro Carvalho Chehab
    Acked-by: Andi Kleen
    Acked-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stefani Seibold
     
  • change name of __kfifo_* functions to kfifo_*, because the prefix __kfifo
    should be reserved for internal functions only.

    Signed-off-by: Stefani Seibold
    Acked-by: Greg Kroah-Hartman
    Acked-by: Mauro Carvalho Chehab
    Acked-by: Andi Kleen
    Acked-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stefani Seibold
     
  • Move the pointer to the spinlock out of struct kfifo. Most users in
    tree do not actually use a spinlock, so the few exceptions now have to
    call kfifo_{get,put}_locked, which take an extra argument pointing to a
    spinlock.

    Signed-off-by: Stefani Seibold
    Acked-by: Greg Kroah-Hartman
    Acked-by: Mauro Carvalho Chehab
    Acked-by: Andi Kleen
    Acked-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stefani Seibold
     
  • This is a new generic kernel FIFO implementation.

    The current kernel fifo API is not very widely used, because it has too
    many constraints. Only 17 files in the current 2.6.31-rc5 used it.
    FIFOs are, like lists, a very basic thing, and a kfifo API which handles
    the most common use cases would save a lot of development time and memory
    resources.

    I think these are the reasons why kfifo is not in use:

    - The API is too simple, important functions are missing
    - A fifo can only be allocated dynamically
    - There is a requirement of a spinlock whether you need it or not
    - There is no support for data records inside a fifo

    So I decided to extend the kfifo in a more generic way without blowing up
    the API too much. The new API has the following benefits:

    - Generic usage: For kernel internal use and/or device drivers.
    - Provides an API for the most common use cases.
    - Slim API: The whole API provides 25 functions.
    - Linux style habit.
    - DECLARE_KFIFO, DEFINE_KFIFO and INIT_KFIFO macros.
    - Direct copy_to_user from the fifo and copy_from_user into the fifo.
    - The kfifo itself is an in-place member of the using data structure; this
      saves an indirection access and does not waste the kernel allocator.
    - Lockless access: if only one reader and one writer is active on the fifo,
      which is the common use case, no additional locking is necessary.
    - Remove spinlock: give the user the freedom of choice what kind of locking
      to use if one is required.
    - Ability to handle records. Three types of records are supported:
      - Variable length records between 0-255 bytes, with a record size
        field of 1 byte.
      - Variable length records between 0-65535 bytes, with a record size
        field of 2 bytes.
      - Fixed size records, with no record size field.
    - Preserve memory resource.
    - Performance!
    - Easy to use!

    This patch:

    Since most users want to have the kfifo as part of another object,
    reorganize the code to allow including struct kfifo in another data
    structure. This requires changing the kfifo_alloc and kfifo_init
    prototypes so that we pass an existing kfifo pointer into them. This
    patch changes the implementation and all existing users.
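
    The idea of an embedded FIFO that is initialised through a pointer to
    an existing object can be sketched in user space as follows. All
    toy_* names are hypothetical and this is not the kfifo API; the
    sketch is single-threaded and leaves out the memory barriers a real
    lockless reader/writer pair would need.

```c
#include <assert.h>
#include <stddef.h>

/* Toy byte FIFO; names are hypothetical, not the kernel API. */
struct toy_fifo {
	unsigned char *buffer;	/* caller-provided storage */
	unsigned int size;	/* must be a power of two */
	unsigned int in;	/* free-running write index */
	unsigned int out;	/* free-running read index */
};

/* Initialise a FIFO that already exists, e.g. as a struct member. */
static void toy_fifo_init(struct toy_fifo *f, unsigned char *buf,
			  unsigned int size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	f->buffer = buf;
	f->size = size;
	f->in = f->out = 0;
}

/* Bytes currently stored; unsigned wrap-around keeps this correct. */
static unsigned int toy_fifo_len(const struct toy_fifo *f)
{
	return f->in - f->out;
}

/* Copy up to len bytes in; returns the number actually stored. */
static unsigned int toy_fifo_in(struct toy_fifo *f, const unsigned char *src,
				unsigned int len)
{
	unsigned int avail = f->size - toy_fifo_len(f);
	unsigned int i;

	if (len > avail)
		len = avail;
	for (i = 0; i < len; i++)
		f->buffer[(f->in + i) & (f->size - 1)] = src[i];
	f->in += len;
	return len;
}

/* Copy up to len bytes out; returns the number actually read. */
static unsigned int toy_fifo_out(struct toy_fifo *f, unsigned char *dst,
				 unsigned int len)
{
	unsigned int used = toy_fifo_len(f);
	unsigned int i;

	if (len > used)
		len = used;
	for (i = 0; i < len; i++)
		dst[i] = f->buffer[(f->out + i) & (f->size - 1)];
	f->out += len;
	return len;
}

/* A device structure with the FIFO embedded in place. */
struct toy_device {
	int id;
	struct toy_fifo fifo;
	unsigned char storage[8];
};
```

    Because struct toy_fifo lives inside struct toy_device, setting it up
    needs no separate allocation for the FIFO header and no extra pointer
    dereference to reach it.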

    [akpm@linux-foundation.org: fix warning]
    Signed-off-by: Stefani Seibold
    Acked-by: Greg Kroah-Hartman
    Acked-by: Mauro Carvalho Chehab
    Acked-by: Andi Kleen
    Acked-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stefani Seibold
     

09 Dec, 2009

1 commit

  • First patch changes __inet_hash_nolisten() and __inet6_hash()
    to take a timewait parameter, so that the timewait socket can be unhashed
    from ehash at the same time the new socket is inserted in the hash.

    This makes sure the timewait socket won't be found by a concurrent
    writer in __inet_check_established().

    Reported-by: kapil dakhane
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

08 Dec, 2009

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
    mac80211: fix reorder buffer release
    iwmc3200wifi: Enable wimax core through module parameter
    iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
    iwmc3200wifi: Coex table command does not expect a response
    iwmc3200wifi: Update wiwi priority table
    iwlwifi: driver version track kernel version
    iwlwifi: indicate uCode type when fail dump error/event log
    iwl3945: remove duplicated event logging code
    b43: fix two warnings
    ipw2100: fix rebooting hang with driver loaded
    cfg80211: indent regulatory messages with spaces
    iwmc3200wifi: fix NULL pointer dereference in pmkid update
    mac80211: Fix TX status reporting for injected data frames
    ath9k: enable 2GHz band only if the device supports it
    airo: Fix integer overflow warning
    rt2x00: Fix padding bug on L2PAD devices.
    WE: Fix set events not propagated
    b43legacy: avoid PPC fault during resume
    b43: avoid PPC fault during resume
    tcp: fix a timewait refcnt race
    ...

    Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
    CTL_UNNUMBERED removed) in
    kernel/sysctl_check.c
    net/ipv4/sysctl_net_ipv4.c
    net/ipv6/addrconf.c
    net/sctp/sysctl.c

    Linus Torvalds
     

03 Dec, 2009

1 commit

  • Add optional function parameters associated with sending SYNACK.
    These parameters are not needed after sending SYNACK, and are not
    used for retransmission. Avoids extending struct tcp_request_sock,
    and avoids allocating kernel memory.

    Also affects DCCP as it uses common struct request_sock_ops,
    but this parameter is currently reserved for future use.

    Signed-off-by: William.Allen.Simpson@gmail.com
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    William Allen Simpson
     

12 Nov, 2009

1 commit

  • Now that sys_sysctl is a compatibility wrapper around /proc/sys,
    all sysctl strategy routines, and all ctl_name and strategy
    entries in the sysctl tables, are unused and can be
    removed.

    In addition neigh_sysctl_register has been modified to no longer
    take a strategy argument and its callers have been modified not
    to pass one.

    Cc: "David Miller"
    Cc: Hideaki YOSHIFUJI
    Cc: netdev@vger.kernel.org
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

06 Nov, 2009

1 commit

  • struct can_proto had a capability field which wasn't ever used. It is
    dropped entirely.

    struct inet_protosw had a capability field which can be more clearly
    expressed in the code by just checking if sock->type == SOCK_RAW.

    Signed-off-by: Eric Paris
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Eric Paris
     

21 Oct, 2009

1 commit

  • dst_negative_advice() should check for changed dst and reset
    sk_tx_queue_mapping accordingly. Pass sock to the callers of
    dst_negative_advice.

    (sk_reset_txq is defined just for use by dst_negative_advice. The
    only way I could find to get around this is to move dst_negative_()
    from dst.h to dst.c, include sock.h in dst.c, etc)

    Signed-off-by: Krishna Kumar
    Signed-off-by: David S. Miller

    Krishna Kumar
     

19 Oct, 2009

1 commit

  • In order to have better cache layouts of struct sock (separate zones
    for rx/tx paths), we need this preliminary patch.

    The goal is to transfer fields used at lookup time into the first
    read-mostly cache line (inside struct sock_common) and move sk_refcnt
    to a separate cache line (only written by the rx path).

    This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
    sport and id fields. This allows a future patch to define these
    fields as macros, like sk_refcnt, without name clashes.
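
    Why the prefix matters can be shown with a small sketch. The struct
    layouts and the macro below are assumptions for illustration, not the
    definitions a later patch would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sock_common_sketch {
	uint32_t skc_daddr;	/* hypothetical future home of the field */
	int skc_refcnt;
};

struct inet_sock_sketch {
	struct sock_common_sketch common;
	uint16_t inet_sport;
};

/* Possible only because the name is unique: a macro called plain "daddr"
 * would also rewrite every local variable or parameter named daddr. */
#define inet_daddr common.skc_daddr

static void set_peer(struct inet_sock_sketch *inet, uint32_t daddr)
{
	assert(inet != NULL);
	inet->inet_daddr = daddr;	/* expands to inet->common.skc_daddr */
}
```

    Note that set_peer() keeps a parameter named daddr; with an
    unprefixed field macro that parameter name would be rewritten too and
    the function would no longer compile.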

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

13 Oct, 2009

1 commit


08 Oct, 2009

5 commits

  • Might as well use the ipv6_addr_set_v4mapped() inline we created last
    year.
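
    For readers unfamiliar with the address format, a user-space analogue
    of building a v4-mapped IPv6 address (::ffff:a.b.c.d) might look like
    this. The function name is hypothetical and the body is a sketch, not
    the kernel inline.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* addr is an IPv4 address in network byte order. */
static void set_v4mapped_sketch(in_addr_t addr, struct in6_addr *v4mapped)
{
	assert(v4mapped != NULL);
	memset(v4mapped, 0, sizeof(*v4mapped));	/* bytes 0..9 stay zero */
	v4mapped->s6_addr[10] = 0xff;		/* bytes 10..11 are ffff */
	v4mapped->s6_addr[11] = 0xff;
	memcpy(&v4mapped->s6_addr[12], &addr, sizeof(addr)); /* bytes 12..15 */
}
```

    The result satisfies IN6_IS_ADDR_V4MAPPED() from <netinet/in.h>.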

    Signed-off-by: Brian Haley
    Signed-off-by: David S. Miller

    Brian Haley
     
  • This continues the previous patch, by applying the same change to CCID-3.

    Signed-off-by: Gerrit Renker
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This removes a redundancy in the CCID half-connection (hc) naming scheme:
    * instead of 'hctx->tx_...', write 'hc->tx_...';
    * instead of 'hcrx->rx_...', write 'hc->rx_...';

    which works because the 'type' of the half-connection is encoded in the
    'rx_' / 'tx_' prefixes.

    Signed-off-by: Gerrit Renker
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This implements the new naming scheme also for CCID-3.

    Signed-off-by: Gerrit Renker
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • This patch starts a less problematic naming convention for CCID structs.

    The old naming convention used 'hc{tx,rx}->ccid?hc{tx,rx}->...' as
    recurring prefixes, which made the code
    * hard to write (not easy to fit into 80 characters);
    * hard to read (most of the space is occupied by prefixes).

    The new naming scheme:
    * struct entries for the TX socket are prefixed by 'tx_';
    * and those for the RX socket are prefixed by 'rx_'.

    The identifiers then remain distinguishable when grep-ing through the tree:
    (a) RX/TX sockets are distinguished by the naming scheme,
    (b) individual CCIDs are distinguished by filename (ccid{2,3,4}.{c,h}).

    This first patch implements the scheme for CCID-2.

    Signed-off-by: Gerrit Renker
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Gerrit Renker
     

01 Oct, 2009

1 commit

  • This provides safety against negative optlen at the type
    level instead of depending upon (sometimes non-trivial)
    checks against this sprinkled all over the place, in
    each and every implementation.
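
    A minimal sketch of the kind of check this protects. MAX_OPT_LEN and
    the function names are hypothetical; the point is only that the same
    upper-bound test behaves differently for signed and unsigned lengths.

```c
#include <assert.h>
#include <errno.h>

#define MAX_OPT_LEN 64	/* hypothetical upper bound for an option value */

/* Signed length: if the "optlen < 0" test is forgotten, negative values
 * pass the upper-bound check. */
static int check_len_signed(int optlen)
{
	if (optlen > MAX_OPT_LEN)
		return -EINVAL;
	return 0;	/* accepted */
}

/* Unsigned length: a negative value supplied by the caller arrives as a
 * huge number, so the same single check rejects it. */
static int check_len_unsigned(unsigned int optlen)
{
	if (optlen > MAX_OPT_LEN)
		return -EINVAL;
	return 0;
}

/* Usage example: what happens to a caller-supplied -1. */
static void optlen_demo(void)
{
	assert(check_len_signed(-1) == 0);	/* slips through */
	assert(check_len_unsigned((unsigned int)-1) == -EINVAL); /* caught */
}
```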

    Based upon work done by Arjan van de Ven and feedback
    from Linus Torvalds.

    Signed-off-by: David S. Miller

    David S. Miller
     

22 Sep, 2009

1 commit

  • Sizing of memory allocations shouldn't depend on the number of physical
    pages found in a system, as that generally includes (perhaps a huge amount
    of) non-RAM pages. The amount of what actually is usable as storage
    should instead be used as a basis here.

    Some of the calculations (i.e. those not intending to use high memory)
    should likely even use (totalram_pages - totalhigh_pages).

    Signed-off-by: Jan Beulich
    Acked-by: Rusty Russell
    Acked-by: Ingo Molnar
    Cc: Dave Airlie
    Cc: Kyle McMartin
    Cc: Jeremy Fitzhardinge
    Cc: Pekka Enberg
    Cc: Hugh Dickins
    Cc: "David S. Miller"
    Cc: Patrick McHardy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
     

15 Sep, 2009

4 commits


02 Sep, 2009

1 commit


13 Aug, 2009

1 commit


10 Aug, 2009

1 commit


06 Aug, 2009

2 commits

  • String literals are constant, and usually, we can also tag the array
    of pointers const too, moving it to the .rodata section.
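
    As an illustration, the two const qualifiers look like this. The
    table contents and names are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* The first const protects the characters, the second the pointers, so
 * the toolchain is free to place the whole table in read-only data. */
static const char *const state_names_sketch[] = {
	"OPEN",
	"REQUESTING",
	"CLOSED",
};

#define NAME_COUNT_SKETCH \
	(sizeof(state_names_sketch) / sizeof(state_names_sketch[0]))

static const char *state_name_sketch(size_t state)
{
	assert(state < NAME_COUNT_SKETCH);
	return state_names_sketch[state];
}
```

    With only "const char *names[]" the strings are read-only but the
    pointer array itself remains writable data.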

    Signed-off-by: Jan Engelhardt
    Signed-off-by: David S. Miller

    Jan Engelhardt
     
  • The percpu counter dccp_orphan_count is initialised in dccp_init() by
    percpu_counter_init() when the dccp module is loaded, but the matching
    destroy is missing when the module is unloaded, which triggers
    a kernel WARNING. It can be reproduced with the
    following commands:

    $ modprobe dccp
    $ rmmod dccp
    $ modprobe dccp

    WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c()
    Hardware name: VMware Virtual Platform
    list_add corruption. next->prev should be prev (c080c0c4), but was (null). (next=ca7188cc).
    Modules linked in: dccp(+) nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc
    Pid: 1956, comm: modprobe Not tainted 2.6.31-rc5 #55
    Call Trace:
    [] warn_slowpath_common+0x6a/0x81
    [] ? __list_add+0x27/0x5c
    [] warn_slowpath_fmt+0x29/0x2c
    [] __list_add+0x27/0x5c
    [] __percpu_counter_init+0x4d/0x5d
    [] dccp_init+0x19/0x2ed [dccp]
    [] do_one_initcall+0x4f/0x111
    [] ? dccp_init+0x0/0x2ed [dccp]
    [] ? notifier_call_chain+0x26/0x48
    [] ? __blocking_notifier_call_chain+0x45/0x51
    [] sys_init_module+0xac/0x1bd
    [] sysenter_do_call+0x12/0x22

    Signed-off-by: Wei Yongjun
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Wei Yongjun
     

30 Jul, 2009

1 commit

  • The DCCP protocol tries to allocate some large hash tables during
    initialisation using the largest size possible. This can be larger than
    what the page allocator can provide so it prints a warning. However, the
    caller is able to handle the situation so this patch suppresses the
    warning.

    Signed-off-by: Mel Gorman
    Acked-by: Arnaldo Carvalho de Melo
    Cc: "David S. Miller"
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman