05 Oct, 2011

1 commit

  • * git://github.com/davem330/net:
    pch_gbe: Fixed the issue on which a network freezes
    pch_gbe: Fixed the issue on which PC was frozen when link was downed.
    make PACKET_STATISTICS getsockopt report consistently between ring and non-ring
    net: xen-netback: correctly restart Tx after a VM restore/migrate
    bonding: properly stop queuing work when requested
    can bcm: fix incomplete tx_setup fix
    RDSRDMA: Fix cleanup of rds_iw_mr_pool
    net: Documentation: Fix type of variables
    ibmveth: Fix oops on request_irq failure
    ipv6: nullify ipv6_ac_list and ipv6_fl_list when creating new socket
    cxgb4: Fix EEH on IBM P7IOC
    can bcm: fix tx_setup off-by-one errors
    MAINTAINERS: tehuti: Alexander Indenbaum's address bounces
    dp83640: reduce driver noise
    ptp: fix L2 event message recognition

    Linus Torvalds
     

04 Oct, 2011

1 commit

  • This is a minor change.

    Up until kernel 2.6.32, getsockopt(fd, SOL_PACKET, PACKET_STATISTICS,
    ...) would return total and dropped packets since its last invocation. The
    introduction of socket queue overflow reporting [1] changed drop
    rate calculation in the normal packet socket path, but not when using a
    packet ring. As a result, the getsockopt now returns different statistics
    depending on the reception method used. With a ring, it still returns the
    count since the last call, as counts are incremented in tpacket_rcv and
    reset in getsockopt. Without a ring, it returns 0 if no drops occurred
    since the last getsockopt and the total drops over the lifespan of
    the socket otherwise. The culprit is this line in packet_rcv, executed
    on a drop:

    drop_n_acct:
    po->stats.tp_drops = atomic_inc_return(&sk->sk_drops);

    As it shows, the new drop number is taken from the socket drop counter,
    which is not reset at getsockopt. I put together a small example
    that demonstrates the issue [2]. It runs for 10 seconds and overflows
    the queue/ring on every odd second. The reported drop rates are:
    ring: 16, 0, 16, 0, 16, ...
    non-ring: 0, 15, 0, 30, 0, 46, 0, 60, 0, 74.

    Note how the even non-ring counts monotonically increase. Because the
    getsockopt adds tp_drops to tp_packets, total counts are similarly
    reported cumulatively. Long story short, reinstating the original code, as
    the below patch does, fixes the issue at the cost of additional per-packet
    cycles. Another solution that does not introduce per-packet overhead
    is to keep the current data path, record the value of sk_drops at
    getsockopt() call N in a new field in struct packet_sock and subtract
    that when reporting at call N+1. I'll be happy to code that instead;
    it's just messier.

    [1] http://patchwork.ozlabs.org/patch/35665/
    [2] http://kernel.googlecode.com/files/test-packetsock-getstatistics.c
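
    The alternative described above (snapshot sk_drops at each getsockopt and
    report the difference) can be sketched in user space as follows. This is
    only a model under stated assumptions: sock_model, stats_model,
    drops_snapshot and the model_* functions are hypothetical names, not
    kernel code, and the real data path is more involved.

```c
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel API. */
struct sock_model {
    uint32_t sk_drops;       /* lifetime drop counter, never reset */
    uint32_t tp_packets;     /* packets delivered since the last read */
    uint32_t drops_snapshot; /* assumed new field: sk_drops at the last read */
};

struct stats_model {
    uint32_t tp_packets;
    uint32_t tp_drops;
};

/* Data path: a drop only bumps the lifetime counter (no extra work). */
static void model_drop(struct sock_model *s)
{
    s->sk_drops++;
}

/* Data path: a delivered packet bumps the per-interval counter. */
static void model_receive(struct sock_model *s)
{
    s->tp_packets++;
}

/* getsockopt-like read: report deltas since the previous call. */
static struct stats_model model_get_stats(struct sock_model *s)
{
    struct stats_model st;

    /* Unsigned subtraction stays correct if sk_drops wraps around. */
    st.tp_drops = s->sk_drops - s->drops_snapshot;
    /* As in the commit text, the total also counts dropped packets. */
    st.tp_packets = s->tp_packets + st.tp_drops;

    s->drops_snapshot = s->sk_drops;
    s->tp_packets = 0;
    return st;
}
```

    With this scheme, the only added cost is one subtraction and one store
    per getsockopt call; the per-packet path is unchanged.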

    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

30 Sep, 2011

3 commits

  • * 'for-linus' of git://github.com/NewDreamNetwork/ceph-client:
    libceph: fix pg_temp mapping update
    libceph: fix pg_temp mapping calculation
    libceph: fix linger request requeuing
    libceph: fix parse options memory leak
    libceph: initialize ack_stamp to avoid unnecessary connection reset

    Linus Torvalds
     
  • The commit aabdcb0b553b9c9547b1a506b34d55a764745870 ("can bcm: fix tx_setup
    off-by-one errors") fixed only a part of the original problem reported by
    Andre Naujoks. It turned out that the original code needed to be re-ordered
    to reduce complexity and to finally fix the reported frame counting issues.

    Signed-off-by: Oliver Hartkopp
    Signed-off-by: David S. Miller

    Oliver Hartkopp
     
  • In the rds_iw_mr_pool struct the free_pinned field keeps track of
    memory pinned by free MRs. While this field is incremented properly
    upon allocation, it is never decremented upon unmapping. This would
    cause the rds_rdma module to crash the kernel upon unloading, by
    triggering the BUG_ON in the rds_iw_destroy_mr_pool function.

    This change keeps track of the MRs that become unpinned, so that
    free_pinned can be decremented appropriately.
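
    The accounting invariant can be illustrated with a small model. The
    struct and function names below are hypothetical and only mirror the
    description above; the real pool tracks far more state.

```c
/* Hypothetical model of the pinned-page accounting; not the rds code. */
struct mr_pool_model {
    long free_pinned; /* pages pinned by MRs sitting on the free list */
};

/* An MR with `pages` pinned pages is handed back to the pool. */
static void pool_account_pinned(struct mr_pool_model *p, long pages)
{
    p->free_pinned += pages;
}

/* The MR is unmapped, so its pages are no longer pinned. */
static void pool_account_unpinned(struct mr_pool_model *p, long pages)
{
    p->free_pinned -= pages;
}

/* Teardown is only safe once every pinned page has been accounted for. */
static int pool_can_destroy(const struct mr_pool_model *p)
{
    return p->free_pinned == 0;
}
```

    If the decrement is skipped, pool_can_destroy() never becomes true, which
    corresponds to the BUG_ON firing at module unload.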

    Signed-off-by: Jonathan Lallinger
    Signed-off-by: Steve Wise
    Signed-off-by: David S. Miller

    Jonathan Lallinger
     

29 Sep, 2011

4 commits

  • ipv6_ac_list and ipv6_fl_list from listening socket are inadvertently
    shared with new socket created for connection.

    Signed-off-by: Zheng Yan
    Signed-off-by: David S. Miller

    Yan, Zheng
     
  • This patch fixes two off-by-one errors that canceled each other out.
    Checking for the same condition two times in bcm_tx_timeout_tsklet() reduced
    the count of frames to be sent by one. This did not show up the first time
    tx_setup is invoked as an additional frame is sent due to TX_ANNOUNCE.
    Invoking a second tx_setup on the same item led to a reduced (by 1) number of
    sent frames.

    Reported-by: Andre Naujoks
    Signed-off-by: Oliver Hartkopp
    Signed-off-by: David S. Miller

    Oliver Hartkopp
     
  • The incremental map updates have a record for each pg_temp mapping that is
    to be added/updated (len > 0) or removed (len == 0). The old code was
    written as if the updates were a complete enumeration; that was just wrong.
    Update the code to remove 0-length entries and drop the rbtree traversal.

    This avoids misdirected (and hung) requests that manifest as server
    errors like

    [WRN] client4104 10.0.1.219:0/275025290 misdirected client4104.1:129 0.1 to osd0 not [1,0] in e11/11
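
    The incremental semantics can be sketched with a flat array standing in
    for the rbtree. Everything here (MAX_PG, pg_temp_map, pg_temp_update,
    apply_incremental) is a hypothetical simplification, not the libceph
    implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PG 8 /* arbitrary size for the sketch */

/* len[pgid] == 0 means "no pg_temp mapping for this pgid". */
struct pg_temp_map {
    int len[MAX_PG];
};

/* One record of an incremental update. */
struct pg_temp_update {
    uint32_t pgid;
    int len; /* > 0: add or update, == 0: remove */
};

/*
 * Apply only the listed records. Entries that the update does not
 * mention are left untouched, because the update is not a complete
 * enumeration of all mappings.
 */
static void apply_incremental(struct pg_temp_map *m,
                              const struct pg_temp_update *u, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (u[i].pgid >= MAX_PG)
            continue; /* out of range for this toy table */
        m->len[u[i].pgid] = u[i].len; /* storing 0 removes the mapping */
    }
}
```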

    Signed-off-by: Sage Weil

    Sage Weil
     
  • We need to apply the modulo pg_num calculation before looking up a pgid in
    the pg_temp mapping rbtree. This fixes pg_temp mappings, and fixes
    (some) misdirected requests that result in messages like

    [WRN] client4104 10.0.1.219:0/275025290 misdirected client4104.1:129 0.1 to osd0 not [1,0] in e11/11

    on the server and make the client block without getting a reply (at
    least until the pg_temp mapping goes away, but that can take a long
    time).

    Reorder calc_pg_raw() a bit to make more sense.
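
    The ordering issue can be shown with a toy lookup: normalize the
    placement seed first, then search the pg_temp table. This sketch uses a
    plain modulo and a linear scan; the names are hypothetical, and the real
    code uses its own modulo helper and an rbtree.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one pg_temp mapping. */
struct pg_temp_entry {
    uint32_t pgid;
    int primary_osd;
};

/* Linear scan standing in for the rbtree lookup. */
static const struct pg_temp_entry *
pg_temp_lookup(const struct pg_temp_entry *tab, size_t n, uint32_t pgid)
{
    for (size_t i = 0; i < n; i++)
        if (tab[i].pgid == pgid)
            return &tab[i];
    return NULL;
}

/*
 * Returns the overriding OSD from pg_temp, or -1 if there is none.
 * pg_num must be non-zero. The raw seed is reduced modulo pg_num
 * BEFORE the lookup, so it matches the key the table was built with.
 */
static int map_pg(const struct pg_temp_entry *tab, size_t n,
                  uint32_t raw_ps, uint32_t pg_num)
{
    uint32_t pgid = raw_ps % pg_num;
    const struct pg_temp_entry *e = pg_temp_lookup(tab, n, pgid);

    return e ? e->primary_osd : -1;
}
```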

    Signed-off-by: Sage Weil

    Sage Weil
     

28 Sep, 2011

7 commits


23 Sep, 2011

1 commit

  • corrects a critical bug of the GW feature. This bug made all the unicast
    packets destined to a GW to be sent as broadcast. This bug is present even if
    the sender GW feature is configured as OFF. It's an urgent bug fix and should
    be committed as soon as possible.

    This was a regression introduced by 43676ab590c3f8686fd047d34c3e33803eef71f0

    Signed-off-by: Antonio Quartulli
    Signed-off-by: Marek Lindner

    Antonio Quartulli
     

22 Sep, 2011

3 commits

  • Incorrect variable was used in validating the akm_suites array from
    NL80211_ATTR_AKM_SUITES. In addition, there was no explicit
    validation of the array length (we only have room for
    NL80211_MAX_NR_AKM_SUITES).

    This can result in a buffer write overflow for stack variables with
    arbitrary data from user space. The nl80211 commands using the affected
    functionality require GENL_ADMIN_PERM, so this is only exposed to admin
    users.
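
    The missing length validation amounts to a bounds check before the copy.
    A minimal sketch, assuming a hypothetical crypto_model struct and a
    stand-in limit MAX_AKM_SUITES in place of NL80211_MAX_NR_AKM_SUITES (the
    value used here is chosen only for illustration):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_AKM_SUITES 2 /* stand-in for NL80211_MAX_NR_AKM_SUITES */

/* Hypothetical destination for the parsed attribute. */
struct crypto_model {
    uint32_t akm_suites[MAX_AKM_SUITES];
    int n_akm_suites;
};

/*
 * Copy a user-supplied array of 32-bit suite selectors.
 * Returns 0 on success, -1 if the length is malformed or too large.
 */
static int parse_akm_suites(struct crypto_model *c,
                            const void *data, size_t len)
{
    size_t n = len / sizeof(uint32_t);

    if (len % sizeof(uint32_t) != 0)
        return -1; /* not a whole number of entries */
    if (n > MAX_AKM_SUITES)
        return -1; /* would overflow the fixed-size array */

    memcpy(c->akm_suites, data, len);
    c->n_akm_suites = (int)n;
    return 0;
}
```

    The key point is that the count derived from the attribute being parsed,
    not some other variable, is compared against the array capacity before
    anything is written.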

    Cc: stable@kernel.org
    Signed-off-by: Jouni Malinen
    Signed-off-by: John W. Linville

    Jouni Malinen
     
  • When asynchronous crypto algorithms are used, there might be many
    packets that passed the xfrm replay check, but the replay advance
    function is not called yet for these packets. So the replay check
    function would accept a replay of all of these packets. Also the
    system might crash if there are more packets in async processing
    than the size of the anti replay window, because the replay advance
    function would try to update the replay window beyond the bounds.

    This patch adds a second replay check after resuming from the async
    processing to fix these issues.
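
    The race can be modelled with a simple sliding-window anti-replay check.
    This is a sketch under stated assumptions: replay_model, replay_check,
    replay_advance and resume_input are hypothetical names and a 32-entry
    bitmap window is assumed; it is not the xfrm implementation.

```c
#include <stdint.h>

#define WINDOW 32u /* assumed window size for the sketch */

struct replay_model {
    uint32_t top;    /* highest sequence number accepted so far */
    uint32_t bitmap; /* bit i set: sequence (top - i) was already seen */
};

/* Returns 0 if seq is acceptable, -1 if it is a replay or too old. */
static int replay_check(const struct replay_model *r, uint32_t seq)
{
    uint32_t diff;

    if (seq == 0)
        return -1;
    if (seq > r->top)
        return 0;
    diff = r->top - seq;
    if (diff >= WINDOW)
        return -1; /* fell out of the window */
    return (r->bitmap & (1u << diff)) ? -1 : 0;
}

/* Record seq as seen. Must only be called after replay_check() passed. */
static void replay_advance(struct replay_model *r, uint32_t seq)
{
    if (seq > r->top) {
        uint32_t diff = seq - r->top;

        r->bitmap = diff < WINDOW ? (r->bitmap << diff) | 1u : 1u;
        r->top = seq;
    } else {
        r->bitmap |= 1u << (r->top - seq);
    }
}

/* Async completion: check again, because the window may have moved. */
static int resume_input(struct replay_model *r, uint32_t seq)
{
    if (replay_check(r, seq) != 0)
        return -1;
    replay_advance(r, seq);
    return 0;
}
```

    Two copies of the same packet can both pass the first replay_check()
    while the crypto is still in flight; the re-check in resume_input()
    rejects the second one and also keeps replay_advance() within bounds.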

    Signed-off-by: Steffen Klassert
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • Adding a new fib rule can trigger a BUG_ON.
    The following shell commands reproduce it:
    ip rule add pref 38
    ip rule add pref 38
    ip rule add to 192.168.3.0/24 goto 38
    ip rule del pref 38
    ip rule add to 192.168.3.0/24 goto 38
    ip rule add pref 38

    The BUG_ON then triggers.
    Remove the BUG_ON and use (ctarget == NULL) to identify whether the rule is unresolved.

    Signed-off-by: Gao feng
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Gao feng
     

21 Sep, 2011

1 commit

  • When snmp6_alloc_dev fails, the snmp6-related memory is already
    freed by snmp6_alloc_dev. Calling in6_dev_finish_destroy
    then frees this memory a second time.

    A double free leads to undefined behavior.

    Signed-off-by: Roy Li
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Roy Li
     

20 Sep, 2011

2 commits


19 Sep, 2011

2 commits

  • D-SACK is allowed to reside below snd_una. But the corresponding check
    in tcp_is_sackblock_valid() is the exact opposite. It looks like a typo.
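
    "Below snd_una" has to be evaluated with wrap-safe sequence arithmetic.
    The sketch below shows that comparison in isolation; seq_before and
    seq_after follow the same idea as the kernel's before()/after() helpers,
    while block_below_una is a hypothetical helper and not the full validity
    check performed by tcp_is_sackblock_valid().

```c
#include <stdint.h>

/* Wrap-safe "a comes before b" in 32-bit sequence space. */
static int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Wrap-safe "a comes after b". */
static int seq_after(uint32_t a, uint32_t b)
{
    return seq_before(b, a);
}

/*
 * Hypothetical helper: is the block [start_seq, end_seq) well formed
 * and does it end at or below snd_una? A D-SACK block is allowed to
 * satisfy this; an ordinary SACK block is not.
 */
static int block_below_una(uint32_t start_seq, uint32_t end_seq,
                           uint32_t snd_una)
{
    return seq_before(start_seq, end_seq) && !seq_after(end_seq, snd_una);
}
```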

    Signed-off-by: Zheng Yan
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Zheng Yan
     
  • * git://github.com/davem330/net: (62 commits)
    ipv6: don't use inetpeer to store metrics for routes.
    can: ti_hecc: include linux/io.h
    IRDA: Fix global type conflicts in net/irda/irsysctl.c v2
    net: Handle different key sizes between address families in flow cache
    net: Align AF-specific flowi structs to long
    ipv4: Fix fib_info->fib_metrics leak
    caif: fix a potential NULL dereference
    sctp: deal with multiple COOKIE_ECHO chunks
    ibmveth: Fix checksum offload failure handling
    ibmveth: Checksum offload is always disabled
    ibmveth: Fix issue with DMA mapping failure
    ibmveth: Fix DMA unmap error
    pch_gbe: support ML7831 IOH
    pch_gbe: added the process of FIFO over run error
    pch_gbe: fixed the issue which receives an unnecessary packet.
    sfc: Use 64-bit writes for TX push where possible
    Revert "sfc: Use write-combining to reduce TX latency" and follow-ups
    bnx2x: Fix ethtool advertisement
    bnx2x: Fix 578xx link LED
    bnx2x: Fix XMAC loopback test
    ...

    Linus Torvalds
     

17 Sep, 2011

11 commits

  • Current IPv6 implementation uses inetpeer to store metrics for
    routes. The problem with inetpeer is that it doesn't take subnet
    prefix length into consideration. If two routes have the same
    address but different prefix lengths, they share the same inetpeer.
    So changing metrics of one route also affects the other. The
    fix is to allocate separate metrics storage for each route.

    Signed-off-by: Zheng Yan
    Signed-off-by: David S. Miller

    Yan, Zheng
     
  • The externs here didn't agree with the declarations in qos.c.

    Better would be probably to move this into a header, but since it's
    common practice to have naked externs with sysctls I left it for now.

    Cc: samuel@sortiz.org
    Signed-off-by: Andi Kleen
    Signed-off-by: David S. Miller

    Andi Kleen
     
  • With the conversion of struct flowi to a union of AF-specific structs, some
    operations on the flow cache need to account for the exact size of the key.
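
    Why the exact key size matters can be shown with a toy union: comparing
    the full union would also compare bytes that a smaller address family
    never fills in. All names below (fam, key_v4, key_v6, flow_key_model,
    key_size, key_equal) are hypothetical and much simpler than the real
    flowi structures.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum fam { FAM_V4, FAM_V6 };

struct key_v4 { uint32_t saddr, daddr; };
struct key_v6 { uint8_t saddr[16], daddr[16]; };

/* One storage slot large enough for any family's key. */
union flow_key_model {
    struct key_v4 v4;
    struct key_v6 v6;
};

/* Number of meaningful key bytes for the given family. */
static size_t key_size(enum fam f)
{
    return f == FAM_V4 ? sizeof(struct key_v4) : sizeof(struct key_v6);
}

/* Compare only the bytes that belong to this family's key. */
static int key_equal(enum fam f, const union flow_key_model *a,
                     const union flow_key_model *b)
{
    return memcmp(a, b, key_size(f)) == 0;
}
```

    Two IPv4 keys with identical addresses compare equal here even if the
    unused tail of the union holds different leftover bytes.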

    Signed-off-by: David Ward
    Signed-off-by: David S. Miller

    dpward
     
  • Commit 4670994d (net,rcu: convert call_rcu(fc_rport_free_rcu) to
    kfree_rcu()) introduced a memory leak. This patch reverts it.

    Signed-off-by: Zheng Yan
    Signed-off-by: David S. Miller

    Yan, Zheng
     
  • Commit bd30ce4bc0b7 (caif: Use RCU instead of spin-lock in caif_dev.c)
    added a potential NULL dereference in case alloc_percpu() fails.

    caif_device_alloc() can also use GFP_KERNEL instead of GFP_ATOMIC.

    Signed-off-by: Eric Dumazet
    CC: Sjur Brændeland
    Acked-by: Sjur Brændeland
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • An attempt to reduce the number of IP packets emitted in response to a
    single SCTP packet (2e3216cd) introduced a complication: if a packet
    contains two COOKIE_ECHO chunks and nothing else, the SCTP state machine
    corks the socket while processing the first COOKIE_ECHO, then loses the
    association and forgets to uncork the socket. To deal with the issue, add
    a new SCTP command which can be used to set the association explicitly.
    Use this new command when processing the second COOKIE_ECHO chunk to
    restore the context for the SCTP state machine.

    Signed-off-by: Max Matveev
    Signed-off-by: David S. Miller

    Max Matveev
     
  • The scan request received from cfg80211_connect does not
    have a proper rate mask. So the probe requests sent on each
    channel do not have a proper supported rates IE.

    Cc: stable@kernel.org
    Reviewed-by: Johannes Berg
    Signed-off-by: Rajkumar Manoharan
    Signed-off-by: John W. Linville

    Rajkumar Manoharan
     
  • During the association, the regulatory is updated by country IE
    that reaps the previously found beacons. The impact is that
    after a STA disconnects *or* when for any reason a regulatory
    domain change happens the beacon hint flag is not cleared
    therefore preventing future beacon hints from being learned.
    This is important as a regulatory domain change or a restore
    of regulatory settings would set back the passive scan and no-ibss
    flags on the channel. This is the right place to do this given that
    it covers any regulatory domain change.

    Cc: stable@kernel.org
    Reviewed-by: Luis R. Rodriguez
    Signed-off-by: Rajkumar Manoharan
    Acked-by: Luis R. Rodriguez
    Signed-off-by: John W. Linville

    Rajkumar Manoharan
     
  • The r_req_lru_item list node moves between several lists, and that cycle
    is not directly related (and does not begin) with __register_request().
    Initialize it in the request constructor, not __register_request(). This
    fixes later badness (below) when OSDs restart underneath an rbd mount.

    Crashes we've seen due to this include:

    [ 213.974288] kernel BUG at net/ceph/messenger.c:2193!

    and

    [ 144.035274] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
    [ 144.035278] IP: [] con_work+0x1463/0x2ce0 [libceph]

    Signed-off-by: Sage Weil

    Sage Weil
     
  • ceph_destroy_options does not free opt->mon_addr that
    is allocated in ceph_parse_options.

    Signed-off-by: Noah Watkins
    Signed-off-by: Sage Weil

    Noah Watkins
     
  • Commit 4cf9d544631c recorded when an outgoing ceph message was ACKed,
    in order to avoid unnecessary connection resets when an OSD is busy.

    However, ack_stamp is uninitialized, so there is a window between
    when the message is sent and when it is ACKed in which handle_timeout()
    interprets the uninitialized value as an expired timeout, and resets
    the connection unnecessarily.

    Close the window by initializing ack_stamp.

    Signed-off-by: Jim Schutt
    Signed-off-by: Sage Weil

    Jim Schutt
     

16 Sep, 2011

4 commits