09 Jan, 2012

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
    Kconfig: acpi: Fix typo in comment.
    misc latin1 to utf8 conversions
    devres: Fix a typo in devm_kfree comment
    btrfs: free-space-cache.c: remove extra semicolon.
    fat: Spelling s/obsolate/obsolete/g
    SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
    tools/power turbostat: update fields in manpage
    mac80211: drop spelling fix
    types.h: fix comment spelling for 'architectures'
    typo fixes: aera -> area, exntension -> extension
    devices.txt: Fix typo of 'VMware'.
    sis900: Fix enum typo 'sis900_rx_bufer_status'
    decompress_bunzip2: remove invalid vi modeline
    treewide: Fix comment and string typo 'bufer'
    hyper-v: Update MAINTAINERS
    treewide: Fix typos in various parts of the kernel, and fix some comments.
    clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
    gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
    leds: Kconfig: Fix typo 'D2NET_V2'
    sound: Kconfig: drop unknown symbol ARCH_CLPS7500
    ...

    Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
    kconfig additions, close to removed commented-out old ones)

    Linus Torvalds
     

24 Dec, 2011

1 commit


21 Dec, 2011

1 commit

  • When checking whether a DATA chunk fits into the estimated rwnd a
    full sizeof(struct sk_buff) is added to the needed chunk size. This
    quickly exhausts the available rwnd space and leads to packets being
    sent which are much below the PMTU limit. This can lead to much worse
    performance.
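
The effect of the per-chunk overhead can be illustrated with a little arithmetic. The following is only a sketch: `chunks_fitting()` and `ASSUMED_SKB_OVERHEAD` are hypothetical names, and the overhead value is an assumed stand-in for sizeof(struct sk_buff), which varies with kernel version and configuration.

```c
#include <assert.h>

/* Assumption: illustrative stand-in for sizeof(struct sk_buff). */
#define ASSUMED_SKB_OVERHEAD 232u

/*
 * Number of DATA chunks of payload_len bytes that fit into rwnd when
 * every chunk is charged payload_len + overhead bytes.
 */
static unsigned int chunks_fitting(unsigned int rwnd,
                                   unsigned int payload_len,
                                   unsigned int overhead)
{
    assert(payload_len > 0); /* avoid dividing by zero */
    return rwnd / (payload_len + overhead);
}
```

With a 64 KiB window and 16-byte chunks, charging the assumed overhead shrinks the number of chunks that fit from 4096 to 264, which is why small chunks exhaust the window so quickly.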

    The reason for this behaviour was to avoid putting too much memory
    pressure on the receiver. The concept is not completely irrational
    because a Linux receiver does in fact clone an skb for each DATA chunk
    delivered. However, Linux also reserves half the available socket
    buffer space for data structures, so that usage is already
    accounted for.

    When proposing to change this the last time it was noted that this
    behaviour was introduced to solve a performance issue caused by rwnd
    overusage in combination with small DATA chunks.

    Trying to reproduce this I found that with the sk_buff overhead removed,
    the performance would improve significantly unless socket buffer limits
    are increased.

    The following numbers have been gathered using a patched iperf
    supporting SCTP over a live 1 Gbit ethernet network. The -l option
    was used to limit DATA chunk sizes. The numbers listed are based on
    the average of 3 test runs each. Default values have been used for
    sk_(r|w)mem.

    Chunk
     Size  Unpatched       No Overhead
    -------------------------------------
        4   15.2 Kbit [!]   12.2 Mbit [!]
        8   35.8 Kbit [!]   26.0 Mbit [!]
       16   95.5 Kbit [!]   54.4 Mbit [!]
       32  106.7 Mbit      102.3 Mbit
       64  189.2 Mbit      188.3 Mbit
      128  331.2 Mbit      334.8 Mbit
      256  537.7 Mbit      536.0 Mbit
      512  766.9 Mbit      766.6 Mbit
     1024  810.1 Mbit      808.6 Mbit

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller

    Thomas Graf
     

20 Dec, 2011

1 commit

  • Commit 8ffd3208 voids the previous patches f6778aab and 810c0719 for
    limiting the autoclose value. If userspace passes in -1 on a 32-bit
    platform, the overflow check does not work and autoclose is set
    to 0xffffffff.

    This patch defines a max_autoclose (in seconds) for limiting the value
    and exposes it through sysctl, with the following intentions.

    1) Avoid overflowing autoclose * HZ.

    2) Keep the default autoclose bound consistent across 32- and 64-bit
    platforms (INT_MAX / HZ in this patch).

    3) Keep the autoclose value consistent between setsockopt() and
    getsockopt() calls.
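
The three intentions above can be sketched in userspace C. This is not the patch itself: `ASSUMED_HZ`, `max_autoclose_default()`, `clamp_autoclose()` and `autoclose_to_jiffies()` are hypothetical names, and clamping (rather than rejecting) an oversized value is an assumption made for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: tick rate used only for this illustration. */
#define ASSUMED_HZ 1000

/* Default bound in seconds, the same on 32- and 64-bit platforms. */
static unsigned int max_autoclose_default(void)
{
    return INT_MAX / ASSUMED_HZ;
}

/* Limit a requested autoclose value (seconds) to the configured bound. */
static unsigned int clamp_autoclose(unsigned int requested,
                                    unsigned int max_autoclose)
{
    return requested > max_autoclose ? max_autoclose : requested;
}

/* Convert seconds to ticks; the precondition keeps secs * HZ within INT_MAX. */
static unsigned long autoclose_to_jiffies(unsigned int secs)
{
    assert(secs <= INT_MAX / ASSUMED_HZ);
    return (unsigned long)secs * ASSUMED_HZ;
}
```

Because the clamped value is what gets stored, a later read returns the same number that is used for the timer, and a request of -1 (0xffffffff) ends up at the bound instead of overflowing.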

    Suggested-by: Vlad Yasevich
    Signed-off-by: Xi Wang
    Signed-off-by: David S. Miller

    Xi Wang
     

12 Dec, 2011

1 commit


03 Dec, 2011

1 commit


02 Dec, 2011

1 commit


30 Nov, 2011

1 commit

  • The check from commit 30c2235c is incomplete and cannot prevent
    cases like key_len = 0x80000000 (INT_MAX + 1). In that case, the
    left-hand side of the check (INT_MAX - key_len), which is unsigned,
    becomes 0xffffffff (UINT_MAX) and bypasses the check.
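
The wraparound can be reproduced in a few lines of userspace C. This is a sketch under stated assumptions: `ASSUMED_HDR_SIZE` is a hypothetical stand-in for the size added to key_len, both function names are invented for illustration, and int is assumed to be 32 bits wide.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

static_assert(sizeof(int) == 4, "sketch assumes a 32-bit int");

/* Assumption: illustrative size of the header allocated with the key. */
#define ASSUMED_HDR_SIZE 16u

/*
 * Incomplete form: INT_MAX - key_len is evaluated in unsigned arithmetic,
 * so it wraps to a huge value once key_len exceeds INT_MAX.
 */
static bool incomplete_check_rejects(unsigned int key_len)
{
    return (INT_MAX - key_len) < ASSUMED_HDR_SIZE;
}

/* More correct form: compare key_len against the largest allowed length. */
static bool corrected_check_rejects(unsigned int key_len)
{
    return key_len > (INT_MAX - ASSUMED_HDR_SIZE);
}
```

For key_len = 0x80000000 the first form lets the value through while the second rejects it; for ordinary lengths such as 65535 both forms agree.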

    However this shouldn't be a security issue. The function is called
    from the following two code paths:

    1) setsockopt()

    2) sctp_auth_asoc_set_secret()

    In case (1), sca_keylength is never going to exceed 65535 since it's
    bounded by a u16 from the user API. As such, the key length will
    never overflow.

    In case (2), sca_keylength is computed based on the user key (1 short)
    and 2 * key_vector (3 shorts) for a total of 7 * USHRT_MAX, which still
    will not overflow.

    In other words, this overflow check is not really necessary. Just
    make it more correct.

    Signed-off-by: Xi Wang
    Cc: Vlad Yasevich
    Signed-off-by: David S. Miller

    Xi Wang
     

23 Nov, 2011

1 commit


09 Nov, 2011

2 commits


07 Nov, 2011

1 commit

  • * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
    Revert "tracing: Include module.h in define_trace.h"
    irq: don't put module.h into irq.h for tracking irqgen modules.
    bluetooth: macroize two small inlines to avoid module.h
    ip_vs.h: fix implicit use of module_get/module_put from module.h
    nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
    include: replace linux/module.h with "struct module" wherever possible
    include: convert various register fcns to macros to avoid include chaining
    crypto.h: remove unused crypto_tfm_alg_modname() inline
    uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
    pm_runtime.h: explicitly requires notifier.h
    linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
    miscdevice.h: fix up implicit use of lists and types
    stop_machine.h: fix implicit use of smp.h for smp_processor_id
    of: fix implicit use of errno.h in include/linux/of.h
    of_platform.h: delete needless include
    acpi: remove module.h include from platform/aclinux.h
    miscdevice.h: delete unnecessary inclusion of module.h
    device_cgroup.h: delete needless include
    net: sch_generic remove redundant use of
    net: inet_timewait_sock doesnt need
    ...

    Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
    - drivers/media/dvb/frontends/dibx000_common.c
    - drivers/media/video/{mt9m111.c,ov6650.c}
    - drivers/mfd/ab3550-core.c
    - include/linux/dmaengine.h

    Linus Torvalds
     

01 Nov, 2011

1 commit


27 Oct, 2011

1 commit

  • commit 66b13d99d96a (ipv4: tcp: fix TOS value in ACK messages sent from
    TIME_WAIT) fixed IPv4 only.

    This part is for the IPv6 side, adding a tclass param to ip6_xmit()

    We alias tw_tclass and tw_tos if the socket family is INET6.

    [ if the socket is IPv4-mapped, only the IP_TOS socket option is used
    to fill the TOS field; TCLASS is not taken into account ]

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

14 Oct, 2011

1 commit

  • skb truesize currently accounts for sk_buff struct and part of skb head.
    kmalloc() roundings are also ignored.

    Considering that skb_shared_info is larger than sk_buff, it's time to
    take it into account for better memory accounting.

    This patch introduces SKB_TRUESIZE(X) macro to centralize various
    assumptions into a single place.
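
As a rough userspace illustration of what such a macro centralizes, the sketch below adds the cache-line-aligned sizes of the two bookkeeping structures to the data length. The names `ASSUMED_CACHE_LINE`, `align_up()` and `sketch_truesize()` are hypothetical, and the structure sizes are passed in as parameters because the real ones depend on the kernel build.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: cache line size used for alignment in this sketch. */
#define ASSUMED_CACHE_LINE 64u

static_assert((ASSUMED_CACHE_LINE & (ASSUMED_CACHE_LINE - 1)) == 0,
              "alignment must be a power of two");

/* Round n up to the next multiple of the assumed cache line size. */
static size_t align_up(size_t n)
{
    return (n + ASSUMED_CACHE_LINE - 1) & ~(size_t)(ASSUMED_CACHE_LINE - 1);
}

/*
 * Memory charged for a buffer holding data_len bytes: the data itself plus
 * the aligned sizes of the buffer header and the shared-info structure.
 */
static size_t sketch_truesize(size_t data_len, size_t skb_size,
                              size_t shinfo_size)
{
    return data_len + align_up(skb_size) + align_up(shinfo_size);
}
```

With assumed sizes of 232 and 320 bytes for the two structures, 1000 bytes of data would be charged as 1000 + 256 + 320 = 1576 bytes.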

    At skb alloc phase, we put skb_shared_info struct at the exact end of
    skb head, to allow a better use of memory (lowering number of
    reallocations), since kmalloc() gives us power-of-two memory blocks.

    Unless SLUB/SLUB debug is active, both skb->head and skb_shared_info are
    aligned to cache lines, as before.

    Note: This patch might trigger performance regressions because of
    misconfigured protocol stacks, hitting per socket or global memory
    limits that were previously not reached. But it's a necessary step for
    more accurate memory accounting.

    Signed-off-by: Eric Dumazet
    CC: Andi Kleen
    CC: Ben Hutchings
    Signed-off-by: David S. Miller

    Eric Dumazet
     

22 Sep, 2011

1 commit

  • Conflicts:
    MAINTAINERS
    drivers/net/Kconfig
    drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
    drivers/net/ethernet/broadcom/tg3.c
    drivers/net/wireless/iwlwifi/iwl-pci.c
    drivers/net/wireless/iwlwifi/iwl-trans-tx-pcie.c
    drivers/net/wireless/rt2x00/rt2800usb.c
    drivers/net/wireless/wl12xx/main.c

    David S. Miller
     

17 Sep, 2011

1 commit

  • The attempt to reduce the number of IP packets emitted in response to a
    single SCTP packet (2e3216cd) introduced a complication: if a packet
    contains two COOKIE_ECHO chunks and nothing else, then the SCTP state
    machine corks the socket while processing the first COOKIE_ECHO, then
    loses the association and forgets to uncork the socket. To deal with
    the issue, add a new SCTP command which can be used to set the
    association explicitly. Use this new command when processing the second
    COOKIE_ECHO chunk to restore the context for the SCTP state machine.

    Signed-off-by: Max Matveev
    Signed-off-by: David S. Miller

    Max Matveev
     

25 Aug, 2011

2 commits


22 Jul, 2011

1 commit


15 Jul, 2011

1 commit

  • Packets to devices without NETIF_F_SCTP_CSUM (including NETIF_F_NO_CSUM)
    should be properly checksummed because the packets can be diverted or
    rerouted after construction. This still leaves packets diverted from
    NETIF_F_SCTP_CSUM-enabled devices with broken checksums. Fixing this
    needs implementing software offload fallback in networking core.

    For users of sctp_checksum_disable, skb->ip_summed should be left as
    CHECKSUM_NONE and not CHECKSUM_UNNECESSARY as per include/linux/skbuff.h.

    Signed-off-by: Michał Mirosław
    Signed-off-by: David S. Miller

    Michał Mirosław
     

14 Jul, 2011

1 commit


09 Jul, 2011

1 commit

  • Trigger user ABORT if application closes a socket which has data
    queued on the socket receive queue or chunks waiting on the
    reassembly or ordering queue as this would imply data being lost
    which defeats the point of a graceful shutdown.

    This behavior is already practiced in TCP.

    We do not check the input queue because that would mean parsing
    all chunks on it to look for unacknowledged data, which seems too
    much of an effort. Control chunks or duplicated chunks may also
    be in the input queue and should not stop a graceful
    shutdown.

    Signed-off-by: Thomas Graf
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Thomas Graf
     

08 Jul, 2011

1 commit

  • When initiating a graceful shutdown while having data chunks
    on the retransmission queue with a peer which is in zero
    window mode the shutdown is never completed because the
    retransmission error count is reset periodically by the
    following two rules:

    - Do not timeout association while doing zero window probe.
    - Reset overall error count when a heartbeat request has
    been acknowledged.

    The graceful shutdown will wait for all outstanding TSN to
    be acknowledged before sending the SHUTDOWN request. This
    never happens due to the peer's zero window not acknowledging
    the continuously retransmitted data chunks. Although the
    error counter is incremented for each failed retransmission,
    the receiving of the SACK announcing the zero window clears
    the error count again immediately. Also heartbeat requests
    continue to be sent periodically. The peer acknowledges these
    requests causing the error counter to be reset as well.

    This patch changes behaviour to only reset the overall error
    counter for the above rules while not in shutdown. After
    reaching the maximum number of retransmission attempts, the
    T5 shutdown guard timer is scheduled to give the receiver
    some additional time to recover. The timer is stopped as soon
    as the receiver acknowledges any data.
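
The changed reset rule can be pictured with a tiny state sketch. Everything here is hypothetical (`struct assoc_sketch`, its fields and `on_peer_feedback()` are invented for illustration and do not mirror the kernel's data structures); it only shows the counter being preserved once a shutdown is pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal view of an association for this sketch. */
struct assoc_sketch {
    unsigned int overall_error_count;
    bool shutdown_pending;
};

/*
 * Called when a zero window probe is answered or a heartbeat request is
 * acknowledged. The error count is reset only while not in shutdown, so
 * retransmission failures can accumulate during a graceful shutdown.
 */
static void on_peer_feedback(struct assoc_sketch *asoc)
{
    assert(asoc != NULL);
    if (!asoc->shutdown_pending)
        asoc->overall_error_count = 0;
}
```

Once the preserved counter reaches the retransmission limit, the text above describes the T5 shutdown guard timer taking over; that part is not modelled here.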

    The issue can be easily reproduced by establishing a sctp
    association over the loopback device, constantly queueing
    data at the sender while not reading any at the receiver.
    Wait for the window to reach zero, then initiate a shutdown
    by killing both processes simultaneously. The association
    will never be freed and the chunks on the retransmission
    queue will be retransmitted indefinitely.

    Signed-off-by: Thomas Graf
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Thomas Graf
     

07 Jul, 2011

2 commits

  • We forgot to send up the SCTP_SENDER_DRY_EVENT notification when the
    user app subscribes to this event and there is no data to be
    sent or retransmitted.

    This is required by the Socket API and used by the DTLS/SCTP
    implementation.

    Reported-by: Michael Tüxen
    Signed-off-by: Wei Yongjun
    Tested-by: Robin Seggelmann
    Signed-off-by: David S. Miller

    Wei Yongjun
     
  • Current tcp/udp/sctp global memory limits are not taking into account
    hugepages allocations, and allow 50% of ram to be used by buffers of a
    single protocol [ not counting space used by sockets / inodes ...]

    Let's use nr_free_buffer_pages() and allow a default of 1/8 of kernel ram
    per protocol, and a minimum of 128 pages.
    Sysadmins of heavy duty machines probably need to tweak limits anyway.
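
The default described above amounts to a one-line computation. The sketch below takes the page count as a parameter instead of calling nr_free_buffer_pages(), and `sketch_default_limit_pages()`, `SKETCH_FRACTION` and `SKETCH_MIN_PAGES` are hypothetical names chosen for illustration.

```c
#include <assert.h>

#define SKETCH_FRACTION  8UL   /* 1/8 of the available buffer pages */
#define SKETCH_MIN_PAGES 128UL /* lower bound on the limit, in pages */

static_assert(SKETCH_FRACTION > 0, "fraction must be non-zero");

/* Default per-protocol memory limit in pages for a given page count. */
static unsigned long sketch_default_limit_pages(unsigned long free_buffer_pages)
{
    unsigned long limit = free_buffer_pages / SKETCH_FRACTION;

    return limit < SKETCH_MIN_PAGES ? SKETCH_MIN_PAGES : limit;
}
```

A machine reporting 80000 buffer pages would get a limit of 10000 pages, while a very small one falls back to the 128-page minimum.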

    References: https://bugzilla.stlinux.com/show_bug.cgi?id=38032
    Reported-by: starlight
    Suggested-by: Andrew Morton
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

02 Jul, 2011

1 commit

  • Make the case labels the same indent as the switch.

    git diff -w shows useless break;s removed after returns
    and a comment added to an unnecessary default: break;
    because of a dubious gcc warning.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

17 Jun, 2011

1 commit

  • Unnecessary casts of void * clutter the code.

    These are the remainder casts after several specific
    patches to remove netdev_priv and dev_priv.

    Done via coccinelle script:

    $ cat cast_void_pointer.cocci
    @@
    type T;
    T *pt;
    void *pv;
    @@

    - pt = (T *)pv;
    + pt = pv;

    Signed-off-by: Joe Perches
    Acked-by: Paul Moore
    Signed-off-by: David S. Miller

    Joe Perches
     

12 Jun, 2011

1 commit


07 Jun, 2011

1 commit


02 Jun, 2011

5 commits

  • In this case, the SCTP association transmits an ASCONF packet
    including addition of the new IP address and deletion of the old
    address. This patch implements this functionality.
    In this case, the ASCONF chunk is added to the beginning of the
    queue, because the other chunks cannot be transmitted in this state.

    Signed-off-by: Michio Honda
    Signed-off-by: YOSHIFUJI Hideaki
    Acked-by: Wei Yongjun
    Signed-off-by: David S. Miller

    Michio Honda
     
  • This patch allows the application to operate Auto-ASCONF on/off
    behavior via setsockopt() and getsockopt().

    Signed-off-by: Michio Honda
    Signed-off-by: YOSHIFUJI Hideaki
    Acked-by: Wei Yongjun
    Signed-off-by: David S. Miller

    Michio Honda
     
  • This patch allows the system administrator to change the default
    Auto-ASCONF on/off behavior via a sysctl value.

    Signed-off-by: Michio Honda
    Signed-off-by: YOSHIFUJI Hideaki
    Acked-by: Wei Yongjun
    Signed-off-by: David S. Miller

    Michio Honda
     
  • SCTP can reconfigure the IP addresses in an association by using
    ASCONF chunks as described in RFC5061. For example, we can
    start to use a newly configured IP address in an existing
    association. This patch implements automatic ASCONF operation
    in the SCTP stack, driven by address events in the host computer,
    which is called auto_asconf.

    Signed-off-by: Michio Honda
    Signed-off-by: YOSHIFUJI Hideaki
    Acked-by: Wei Yongjun
    Signed-off-by: David S. Miller

    Michio Honda
     
  • This patch fixes the problem that the original code cannot delete
    the remote address where the corresponding transport is currently
    directed, even when the ASCONF is sent from the other address (this
    situation happens when the single-homed sender transmits ASCONF
    with ADD and DEL.)

    Signed-off-by: Michio Honda
    Signed-off-by: YOSHIFUJI Hideaki
    Acked-by: Wei Yongjun
    Signed-off-by: David S. Miller

    Michio Honda
     

01 Jun, 2011

1 commit


26 May, 2011

1 commit


24 May, 2011

1 commit

  • The %pK format specifier is designed to hide exposed kernel pointers,
    specifically via /proc interfaces. Exposing these pointers provides an
    easy target for kernel write vulnerabilities, since they reveal the
    locations of writable structures containing easily triggerable function
    pointers. The behavior of %pK depends on the kptr_restrict sysctl.

    If kptr_restrict is set to 0, no deviation from the standard %p behavior
    occurs. If kptr_restrict is set to 1, the default, if the current user
    (intended to be a reader via seq_printf(), etc.) does not have CAP_SYSLOG
    (currently in the LSM tree), kernel pointers using %pK are printed as 0's.
    If kptr_restrict is set to 2, kernel pointers using %pK are printed as
    0's regardless of privileges. Replacing with 0's was chosen over the
    default "(null)", which cannot be parsed by userland %p, which expects
    "(nil)".
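
The three kptr_restrict settings described above reduce to a small decision function. This is a userspace sketch only: `pk_output()` and `enum ptr_output` are hypothetical names, and the capability check is modelled as a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* What a %pK conversion would print in this sketch. */
enum ptr_output {
    PTR_SHOWN,  /* same as plain %p */
    PTR_ZEROED  /* printed as 0's */
};

/* Decide the output for a given kptr_restrict value and reader privilege. */
static enum ptr_output pk_output(int kptr_restrict, bool has_cap_syslog)
{
    assert(kptr_restrict >= 0 && kptr_restrict <= 2);

    switch (kptr_restrict) {
    case 0:
        return PTR_SHOWN;                               /* no deviation */
    case 1:
        return has_cap_syslog ? PTR_SHOWN : PTR_ZEROED; /* privileged only */
    default:
        return PTR_ZEROED;                              /* regardless of privileges */
    }
}
```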

    The supporting code for kptr_restrict and %pK is currently in the -mm
    tree. This patch converts users of %p in net/ to %pK. Cases of printing
    pointers to the syslog are not covered, since this would eliminate useful
    information for postmortem debugging and the reading of the syslog is
    already optionally protected by the dmesg_restrict sysctl.

    Signed-off-by: Dan Rosenberg
    Cc: James Morris
    Cc: Eric Dumazet
    Cc: Thomas Graf
    Cc: Eugene Teo
    Cc: Kees Cook
    Cc: Ingo Molnar
    Cc: David S. Miller
    Cc: Peter Zijlstra
    Cc: Eric Paris
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Dan Rosenberg
     

21 May, 2011

2 commits

  • Commit c182f90bc1f22ce5039b8722e45621d5f96862c2 ("SCTP: fix race
    between sctp_bind_addr_free() and sctp_bind_addr_conflict()") and
    commit 1231f0baa547a541a7481119323b7f964dda4788 ("net,rcu: convert
    call_rcu(sctp_local_addr_free) to kfree_rcu()"), happening in
    different trees, introduced a build failure.

    Simply make the SCTP race fix use kfree_rcu() too.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits)
    macvlan: fix panic if lowerdev in a bond
    tg3: Add braces around 5906 workaround.
    tg3: Fix NETIF_F_LOOPBACK error
    macvlan: remove one synchronize_rcu() call
    networking: NET_CLS_ROUTE4 depends on INET
    irda: Fix error propagation in ircomm_lmp_connect_response()
    irda: Kill set but unused variable 'bytes' in irlan_check_command_param()
    irda: Kill set but unused variable 'clen' in ircomm_connect_indication()
    rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport()
    be2net: Kill set but unused variable 'req' in lancer_fw_download()
    irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication()
    atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined.
    rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer().
    rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler()
    rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection()
    rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window()
    pkt_sched: Kill set but unused variable 'protocol' in tc_classify()
    isdn: capi: Use pr_debug() instead of ifdefs.
    tg3: Update version to 3.119
    tg3: Apply rx_discards fix to 5719/5720
    ...

    Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c
    as per Davem.

    Linus Torvalds