18 Dec, 2010

1 commit


17 Dec, 2010

1 commit


11 Dec, 2010

1 commit

  • SCTP_SET_PEER_PRIMARY_ADDR does not accpet v4mapped address, using
    v4mapped address in SCTP_SET_PEER_PRIMARY_ADDR socket option will
    get -EADDRNOTAVAIL error if v4map is enabled. This patch try to
    fix it by mapping v4mapped address to v4 address if allowed.

    Signed-off-by: Wei Yongjun
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Wei Yongjun
     

07 Dec, 2010

1 commit


11 Nov, 2010

1 commit

  • Robin Holt tried to boot a 16TB machine and found some limits were
    reached : sysctl_tcp_mem[2], sysctl_udp_mem[2]

    We can switch infrastructure to use long "instead" of "int", now
    atomic_long_t primitives are available for free.

    Signed-off-by: Eric Dumazet
    Reported-by: Robin Holt
    Reviewed-by: Robin Holt
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Eric Dumazet
     

24 Oct, 2010

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1699 commits)
    bnx2/bnx2x: Unsupported Ethtool operations should return -EINVAL.
    vlan: Calling vlan_hwaccel_do_receive() is always valid.
    tproxy: use the interface primary IP address as a default value for --on-ip
    tproxy: added IPv6 support to the socket match
    cxgb3: function namespace cleanup
    tproxy: added IPv6 support to the TPROXY target
    tproxy: added IPv6 socket lookup function to nf_tproxy_core
    be2net: Changes to use only priority codes allowed by f/w
    tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled
    tproxy: added tproxy sockopt interface in the IPV6 layer
    tproxy: added udp6_lib_lookup function
    tproxy: added const specifiers to udp lookup functions
    tproxy: split off ipv6 defragmentation to a separate module
    l2tp: small cleanup
    nf_nat: restrict ICMP translation for embedded header
    can: mcp251x: fix generation of error frames
    can: mcp251x: fix endless loop in interrupt handler if CANINTF_MERRF is set
    can-raw: add msg_flags to distinguish local traffic
    9p: client code cleanup
    rds: make local functions/variables static
    ...

    Fix up conflicts in net/core/dev.c, drivers/net/pcmcia/smc91c92_cs.c and
    drivers/net/wireless/ath/ath9k/debug.c as per David

    Linus Torvalds
     

23 Oct, 2010

1 commit

  • * 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
    vfs: make no_llseek the default
    vfs: don't use BKL in default_llseek
    llseek: automatically add .llseek fop
    libfs: use generic_file_llseek for simple_attr
    mac80211: disallow seeks in minstrel debug code
    lirc: make chardev nonseekable
    viotape: use noop_llseek
    raw: use explicit llseek file operations
    ibmasmfs: use generic_file_llseek
    spufs: use llseek in all file operations
    arm/omap: use generic_file_llseek in iommu_debug
    lkdtm: use generic_file_llseek in debugfs
    net/wireless: use generic_file_llseek in debugfs
    drm: use noop_llseek

    Linus Torvalds
     

15 Oct, 2010

1 commit

  • All file_operations should get a .llseek operation so we can make
    nonseekable_open the default for future file operations without a
    .llseek pointer.

    The three cases that we can automatically detect are no_llseek, seq_lseek
    and default_llseek. For cases where we can we can automatically prove that
    the file offset is always ignored, we use noop_llseek, which maintains
    the current behavior of not returning an error from a seek.

    New drivers should normally not use noop_llseek but instead use no_llseek
    and call nonseekable_open at open time. Existing drivers can be converted
    to do the same when the maintainer knows for certain that no user code
    relies on calling seek on the device file.

    The generated code is often incorrectly indented and right now contains
    comments that clarify for each added line why a specific variant was
    chosen. In the version that gets submitted upstream, the comments will
    be gone and I will manually fix the indentation, because there does not
    seem to be a way to do that using coccinelle.

    Some amount of new code is currently sitting in linux-next that should get
    the same modifications, which I will do at the end of the merge window.

    Many thanks to Julia Lawall for helping me learn to write a semantic
    patch that does all this.

    ===== begin semantic patch =====
    // This adds an llseek= method to all file operations,
    // as a preparation for making no_llseek the default.
    //
    // The rules are
    // - use no_llseek explicitly if we do nonseekable_open
    // - use seq_lseek for sequential files
    // - use default_llseek if we know we access f_pos
    // - use noop_llseek if we know we don't access f_pos,
    // but we still want to allow users to call lseek
    //
    @ open1 exists @
    identifier nested_open;
    @@
    nested_open(...)
    {

    }

    @ open exists@
    identifier open_f;
    identifier i, f;
    identifier open1.nested_open;
    @@
    int open_f(struct inode *i, struct file *f)
    {

    }

    @ read disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {

    }

    @ read_no_fpos disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ write @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {

    }

    @ write_no_fpos @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ fops0 @
    identifier fops;
    @@
    struct file_operations fops = {
    ...
    };

    @ has_llseek depends on fops0 @
    identifier fops0.fops;
    identifier llseek_f;
    @@
    struct file_operations fops = {
    ...
    .llseek = llseek_f,
    ...
    };

    @ has_read depends on fops0 @
    identifier fops0.fops;
    identifier read_f;
    @@
    struct file_operations fops = {
    ...
    .read = read_f,
    ...
    };

    @ has_write depends on fops0 @
    identifier fops0.fops;
    identifier write_f;
    @@
    struct file_operations fops = {
    ...
    .write = write_f,
    ...
    };

    @ has_open depends on fops0 @
    identifier fops0.fops;
    identifier open_f;
    @@
    struct file_operations fops = {
    ...
    .open = open_f,
    ...
    };

    // use no_llseek if we call nonseekable_open
    ////////////////////////////////////////////
    @ nonseekable1 depends on !has_llseek && has_open @
    identifier fops0.fops;
    identifier nso ~= "nonseekable_open";
    @@
    struct file_operations fops = {
    ... .open = nso, ...
    +.llseek = no_llseek, /* nonseekable */
    };

    @ nonseekable2 depends on !has_llseek @
    identifier fops0.fops;
    identifier open.open_f;
    @@
    struct file_operations fops = {
    ... .open = open_f, ...
    +.llseek = no_llseek, /* open uses nonseekable */
    };

    // use seq_lseek for sequential files
    /////////////////////////////////////
    @ seq depends on !has_llseek @
    identifier fops0.fops;
    identifier sr ~= "seq_read";
    @@
    struct file_operations fops = {
    ... .read = sr, ...
    +.llseek = seq_lseek, /* we have seq_read */
    };

    // use default_llseek if there is a readdir
    ///////////////////////////////////////////
    @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier readdir_e;
    @@
    // any other fop is used that changes pos
    struct file_operations fops = {
    ... .readdir = readdir_e, ...
    +.llseek = default_llseek, /* readdir is present */
    };

    // use default_llseek if at least one of read/write touches f_pos
    /////////////////////////////////////////////////////////////////
    @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read.read_f;
    @@
    // read fops use offset
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = default_llseek, /* read accesses f_pos */
    };

    @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ... .write = write_f, ...
    + .llseek = default_llseek, /* write accesses f_pos */
    };

    // Use noop_llseek if neither read nor write accesses f_pos
    ///////////////////////////////////////////////////////////

    @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    identifier write_no_fpos.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ...
    .write = write_f,
    .read = read_f,
    ...
    +.llseek = noop_llseek, /* read and write both use no f_pos */
    };

    @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write_no_fpos.write_f;
    @@
    struct file_operations fops = {
    ... .write = write_f, ...
    +.llseek = noop_llseek, /* write uses no f_pos */
    };

    @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    @@
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = noop_llseek, /* read uses no f_pos */
    };

    @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    @@
    struct file_operations fops = {
    ...
    +.llseek = noop_llseek, /* no read or write fn */
    };
    ===== End semantic patch =====

    Signed-off-by: Arnd Bergmann
    Cc: Julia Lawall
    Cc: Christoph Hellwig

    Arnd Bergmann
     

05 Oct, 2010

1 commit


04 Oct, 2010

3 commits

  • Signed-off-by: David S. Miller

    David S. Miller
     
  • The sctp_asoc_get_hmac() function iterates through a peer's hmac_ids
    array and attempts to ensure that only a supported hmac entry is
    returned. The current code fails to do this properly - if the last id
    in the array is out of range (greater than SCTP_AUTH_HMAC_ID_MAX), the
    id integer remains set after exiting the loop, and the address of an
    out-of-bounds entry will be returned and subsequently used in the parent
    function, causing potentially ugly memory corruption. This patch resets
    the id integer to 0 on encountering an invalid id so that NULL will be
    returned after finishing the loop if no valid ids are found.

    Signed-off-by: Dan Rosenberg
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Dan Rosenberg
     
  • Two user-controlled allocations in SCTP are subsequently dereferenced as
    sockaddr structs, without checking if the dereferenced struct members fall
    beyond the end of the allocated chunk. There doesn't appear to be any
    information leakage here based on how these members are used and
    additional checking, but it's still worth fixing.

    [akpm@linux-foundation.org: remove unfashionable newlines, fix gmail tab->space conversion]
    Signed-off-by: Dan Rosenberg
    Acked-by: Vlad Yasevich
    Cc: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Dan Rosenberg
     

27 Sep, 2010

1 commit


24 Sep, 2010

1 commit


18 Sep, 2010

1 commit

  • sctp_packet_config() is called when getting the packet ready
    for appending of chunks. The function should not touch the
    current state, since it's possible to ping-pong between two
    transports when sending, and that can result packet corruption
    followed by skb overlfow crash.

    Reported-by: Thomas Dreibholz
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

10 Sep, 2010

2 commits


09 Sep, 2010

1 commit


07 Sep, 2010

1 commit


27 Aug, 2010

1 commit

  • Change SCTP_DEBUG_PRINTK and SCTP_DEBUG_PRINTK_IPADDR to
    use do { print } while (0) guards.
    Add SCTP_DEBUG_PRINTK_CONT to fix errors in log when
    lines were continued.
    Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
    Add a missing newline in "Failed bind hash alloc"

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

05 Aug, 2010

1 commit

  • * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (48 commits)
    Documentation: update broken web addresses.
    fix comment typo "choosed" -> "chosen"
    hostap:hostap_hw.c Fix typo in comment
    Fix spelling contorller -> controller in comments
    Kconfig.debug: FAIL_IO_TIMEOUT: typo Faul -> Fault
    fs/Kconfig: Fix typo Userpace -> Userspace
    Removing dead MACH_U300_BS26
    drivers/infiniband: Remove unnecessary casts of private_data
    fs/ocfs2: Remove unnecessary casts of private_data
    libfc: use ARRAY_SIZE
    scsi: bfa: use ARRAY_SIZE
    drm: i915: use ARRAY_SIZE
    drm: drm_edid: use ARRAY_SIZE
    synclink: use ARRAY_SIZE
    block: cciss: use ARRAY_SIZE
    comment typo fixes: charater => character
    fix comment typos concerning "challenge"
    arm: plat-spear: fix typo in kerneldoc
    reiserfs: typo comment fix
    update email address
    ...

    Linus Torvalds
     

26 Jun, 2010

1 commit

  • In preparation for 64bit snmp counters for some mibs,
    add an 'align' parameter to snmp_mib_init(), instead
    of assuming mibs only contain 'unsigned long' fields.

    Callers can use __alignof__(type) to provide correct
    alignment.

    Signed-off-by: Eric Dumazet
    CC: Herbert Xu
    CC: Arnaldo Carvalho de Melo
    CC: Hideaki YOSHIFUJI
    CC: Vlad Yasevich
    Signed-off-by: David S. Miller

    Eric Dumazet
     

17 Jun, 2010

1 commit


11 Jun, 2010

1 commit


03 Jun, 2010

1 commit


18 May, 2010

2 commits

  • This patch removes from net/ (but not any netfilter files)
    all the unnecessary return; statements that precede the
    last closing brace of void functions.

    It does not remove the returns that are immediately
    preceded by a label as gcc doesn't like that.

    Done via:
    $ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
    xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     
  • commit 5fa782c2f5ef6c2e4f04d3e228412c9b4a4c8809
    sctp: Fix skb_over_panic resulting from multiple invalid \
    parameter errors (CVE-2010-1173) (v4)

    cause 'error cause' never be add the the ERROR chunk due to
    some typo when check valid length in sctp_init_cause_fixed().

    Signed-off-by: Wei Yongjun
    Reviewed-by: Neil Horman
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Wei Yongjun
     

17 May, 2010

1 commit


16 May, 2010

2 commits

  • transport may be free before ICMP proto unreachable timer expire, so
    we should delete active ICMP proto unreachable timer when transport
    is going away.

    Signed-off-by: Wei Yongjun
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Wei Yongjun
     
  • (Dropped the infiniband part, because Tetsuo modified the related code,
    I will send a separate patch for it once this is accepted.)

    This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which
    allows users to reserve ports for third-party applications.

    The reserved ports will not be used by automatic port assignments
    (e.g. when calling connect() or bind() with port number 0). Explicit
    port allocation behavior is unchanged.

    Signed-off-by: Octavian Purdila
    Signed-off-by: WANG Cong
    Cc: Neil Horman
    Cc: Eric Dumazet
    Cc: Eric W. Biederman
    Signed-off-by: David S. Miller

    Amerigo Wang
     

12 May, 2010

1 commit


06 May, 2010

1 commit

  • ICMP protocol unreachable handling completely disregarded
    the fact that the user may have locked the socket. It proceeded
    to destroy the association, even though the user may have
    held the lock and had a ref on the association. This resulted
    in the following:

    Attempt to release alive inet socket f6afcc00

    =========================
    [ BUG: held lock freed! ]
    -------------------------
    somenu/2672 is freeing memory f6afcc00-f6afcfff, with a lock still held
    there!
    (sk_lock-AF_INET){+.+.+.}, at: [] sctp_connect+0x13/0x4c
    1 lock held by somenu/2672:
    #0: (sk_lock-AF_INET){+.+.+.}, at: [] sctp_connect+0x13/0x4c

    stack backtrace:
    Pid: 2672, comm: somenu Not tainted 2.6.32-telco #55
    Call Trace:
    [] ? printk+0xf/0x11
    [] debug_check_no_locks_freed+0xce/0xff
    [] kmem_cache_free+0x21/0x66
    [] __sk_free+0x9d/0xab
    [] sk_free+0x1c/0x1e
    [] sctp_association_put+0x32/0x89
    [] __sctp_connect+0x36d/0x3f4
    [] ? sctp_connect+0x13/0x4c
    [] ? autoremove_wake_function+0x0/0x33
    [] sctp_connect+0x31/0x4c
    [] inet_dgram_connect+0x4b/0x55
    [] sys_connect+0x54/0x71
    [] ? lock_release_non_nested+0x88/0x239
    [] ? might_fault+0x42/0x7c
    [] ? might_fault+0x42/0x7c
    [] sys_socketcall+0x6d/0x178
    [] ? trace_hardirqs_on_thunk+0xc/0x10
    [] syscall_call+0x7/0xb

    This was because the sctp_wait_for_connect() would aqcure the socket
    lock and then proceed to release the last reference count on the
    association, thus cause the fully destruction path to finish freeing
    the socket.

    The simplest solution is to start a very short timer in case the socket
    is owned by user. When the timer expires, we can do some verification
    and be able to do the release properly.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

04 May, 2010

1 commit


03 May, 2010

1 commit


02 May, 2010

1 commit

  • sk_callback_lock rwlock actually protects sk->sk_sleep pointer, so we
    need two atomic operations (and associated dirtying) per incoming
    packet.

    RCU conversion is pretty much needed :

    1) Add a new structure, called "struct socket_wq" to hold all fields
    that will need rcu_read_lock() protection (currently: a
    wait_queue_head_t and a struct fasync_struct pointer).

    [Future patch will add a list anchor for wakeup coalescing]

    2) Attach one of such structure to each "struct socket" created in
    sock_alloc_inode().

    3) Respect RCU grace period when freeing a "struct socket_wq"

    4) Change sk_sleep pointer in "struct sock" by sk_wq, pointer to "struct
    socket_wq"

    5) Change sk_sleep() function to use new sk->sk_wq instead of
    sk->sk_sleep

    6) Change sk_has_sleeper() to wq_has_sleeper() that must be used inside
    a rcu_read_lock() section.

    7) Change all sk_has_sleeper() callers to :
    - Use rcu_read_lock() instead of read_lock(&sk->sk_callback_lock)
    - Use wq_has_sleeper() to eventually wakeup tasks.
    - Use rcu_read_unlock() instead of read_unlock(&sk->sk_callback_lock)

    8) sock_wake_async() is modified to use rcu protection as well.

    9) Exceptions :
    macvtap, drivers/net/tun.c, af_unix use integrated "struct socket_wq"
    instead of dynamically allocated ones. They dont need rcu freeing.

    Some cleanups or followups are probably needed, (possible
    sk_callback_lock conversion to a spinlock for example...).

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

01 May, 2010

5 commits

  • When we create the sctp_datamsg and fragment the user data,
    we know exactly if we are sending full segments or not and
    how they might be bundled. During this time, we can mark
    messages a Nagle capable or not. This makes the check at
    transmit time much simpler.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • Right now, if the highest tsn in the SACK doesn't change, we'll
    end up scanning the transmitted lists on the transports twice:
    once for locating the highest _new_ tsn, and once for actually
    tagging chunks as acked. This is a waste, since we can record
    the highest _new_ tsn at the same time as tagging chunks. Long
    ago this was not possible because we would try to mark chunks
    as missing at the same time as tagging them acked and this approach
    didn't work. Now that the two steps are separate, we can re-use
    the old approach.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • According to RFC 4960 Section 7.2.4:
    If an endpoint is in Fast
    Recovery and a SACK arrives that advances the Cumulative TSN Ack
    Point, the miss indications are incremented for all TSNs reported
    missing in the SACK.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • rwnd_press tracks the pressure on the recieve window. Every
    timer the receive buffer overlows, we truncate the receive
    window and then grow it back. However, if we don't track
    the cumulative presser, it's possible to reach a situation
    when receive buffer is empty, but rwnd stays truncated.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich
     
  • SCTP fast recovery algorithm really applies per association
    and impacts all transports.

    Signed-off-by: Vlad Yasevich

    Vlad Yasevich