14 Mar, 2011

1 commit


10 Mar, 2011

1 commit

  • Since a8f80e8ff94ecba629542d9b4b5f5a8ee3eb565c any process with
    CAP_NET_ADMIN may load any module from /lib/modules/. This doesn't mean
    that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are
    limited to /lib/modules/**. However, CAP_NET_ADMIN capability shouldn't
    allow anybody to load any module not related to networking.

    This patch restricts the ability to autoload modules to netdev modules
    with explicit aliases. This fixes CVE-2011-1019.

    Arnd Bergmann suggested leaving the old pre-v2.6.32 behavior of loading
    netdev modules by name (without any prefix) untouched for processes
    with CAP_SYS_MODULE, to maintain compatibility with network scripts
    that autoload netdev modules by aliases like "eth0", "wlan0".

    Currently there are only three users of the feature in the upstream
    kernel: ipip, ip_gre and sit.

    root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) --
    root@albatros:~# grep Cap /proc/$$/status
    CapInh: 0000000000000000
    CapPrm: fffffff800001000
    CapEff: fffffff800001000
    CapBnd: fffffff800001000
    root@albatros:~# modprobe xfs
    FATAL: Error inserting xfs
    (/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted
    root@albatros:~# lsmod | grep xfs
    root@albatros:~# ifconfig xfs
    xfs: error fetching interface information: Device not found
    root@albatros:~# lsmod | grep xfs
    root@albatros:~# lsmod | grep sit
    root@albatros:~# ifconfig sit
    sit: error fetching interface information: Device not found
    root@albatros:~# lsmod | grep sit
    root@albatros:~# ifconfig sit0
    sit0 Link encap:IPv6-in-IPv4
    NOARP MTU:1480 Metric:1

    root@albatros:~# lsmod | grep sit
    sit 10457 0
    tunnel4 2957 1 sit

    For CAP_SYS_MODULE module loading is still relaxed:

    root@albatros:~# grep Cap /proc/$$/status
    CapInh: 0000000000000000
    CapPrm: ffffffffffffffff
    CapEff: ffffffffffffffff
    CapBnd: ffffffffffffffff
    root@albatros:~# ifconfig xfs
    xfs: error fetching interface information: Device not found
    root@albatros:~# lsmod | grep xfs
    xfs 745319 0
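
    The policy shown in the transcripts above can be modelled in user space.
    This is only a sketch, not the kernel code: pick_module_request(), the
    enum and the three boolean parameters are hypothetical stand-ins for the
    capability checks and for whether some module declares a netdev-<name>
    alias.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Outcome of an autoload attempt in this hypothetical model. */
enum autoload_result { AUTOLOAD_NONE, AUTOLOAD_NETDEV_ALIAS, AUTOLOAD_BARE_NAME };

/*
 * Decide which module request (if any) succeeds for interface `ifname`.
 * The booleans stand in for capable(CAP_NET_ADMIN), capable(CAP_SYS_MODULE)
 * and "some module declares the alias netdev-<ifname>". The request string
 * is written to buf; a module with the bare name is assumed to exist.
 */
enum autoload_result pick_module_request(bool has_net_admin, bool has_sys_module,
                                         bool netdev_alias_exists,
                                         const char *ifname, char *buf, size_t len)
{
    assert(ifname != NULL && buf != NULL && len > 0);

    /* CAP_NET_ADMIN only reaches modules with an explicit netdev alias. */
    if (has_net_admin && netdev_alias_exists) {
        snprintf(buf, len, "netdev-%s", ifname);
        return AUTOLOAD_NETDEV_ALIAS;
    }
    /* Compatibility path: CAP_SYS_MODULE may still load by bare name. */
    if (has_sys_module) {
        snprintf(buf, len, "%s", ifname);
        return AUTOLOAD_BARE_NAME;
    }
    buf[0] = '\0';
    return AUTOLOAD_NONE;
}
```

    In this model "ifconfig xfs" with only CAP_NET_ADMIN requests nothing,
    "ifconfig sit0" requests netdev-sit0, and the bare name is used only when
    CAP_SYS_MODULE is present.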

    Reference: https://lkml.org/lkml/2011/2/24/203

    Signed-off-by: Vasiliy Kulikov
    Signed-off-by: Michael Tokarev
    Acked-by: David S. Miller
    Acked-by: Kees Cook
    Signed-off-by: James Morris

    Vasiliy Kulikov
     

06 Mar, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    ceph: no .snap inside of snapped namespace
    libceph: fix msgr standby handling
    libceph: fix msgr keepalive flag
    libceph: fix msgr backoff
    libceph: retry after authorization failure
    libceph: fix handling of short returns from get_user_pages
    ceph: do not clear I_COMPLETE from d_release
    ceph: do not set I_COMPLETE
    Revert "ceph: keep reference to parent inode on ceph_dentry"

    Linus Torvalds
     

05 Mar, 2011

3 commits

  • The standby logic used to be pretty dependent on the work requeueing
    behavior that changed when we switched to WQ_NON_REENTRANT. It was also
    very fragile.

    Restructure things so that:
    - We clear WRITE_PENDING when we set STANDBY. This ensures we will
    requeue work when we wake up later.
    - con_work backs off if STANDBY is set. There is nothing to do if we are
    in standby.
    - clear_standby() helper is called by both con_send() and con_keepalive(),
    the two actions that can wake us up again. Move the connect_seq++
    logic here.
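
    The three rules above can be sketched as a small user-space state
    machine. This is an illustrative model only: struct standby_conn, the
    flag values and the queued counter are assumptions, and the real code
    uses atomic bit operations and a workqueue instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; the names mirror those used in the text. */
#define WRITE_PENDING (1u << 0)
#define STANDBY       (1u << 1)

struct standby_conn {
    unsigned state;        /* bitmask of the flags above */
    unsigned connect_seq;  /* bumped each time we leave standby */
    unsigned queued;       /* how many times work was queued */
};

/* Queue work only if no write is already pending. */
void queue_con(struct standby_conn *c)
{
    if (!(c->state & WRITE_PENDING)) {
        c->state |= WRITE_PENDING;
        c->queued++;
    }
}

/* Entering standby drops WRITE_PENDING so a later wakeup requeues work. */
void enter_standby(struct standby_conn *c)
{
    c->state |= STANDBY;
    c->state &= ~WRITE_PENDING;
}

/* Shared wakeup helper: leave standby and bump connect_seq exactly once. */
void clear_standby(struct standby_conn *c)
{
    if (c->state & STANDBY) {
        c->state &= ~STANDBY;
        c->connect_seq++;
        /* enter_standby() cleared this, and nothing queues while in standby. */
        assert(!(c->state & WRITE_PENDING));
    }
}

/* Both wakeup paths go through clear_standby() before queueing work. */
void con_send(struct standby_conn *c)      { clear_standby(c); queue_con(c); }
void con_keepalive(struct standby_conn *c) { clear_standby(c); queue_con(c); }

/* The worker backs off while in standby; returns true if it did work. */
bool con_work(struct standby_conn *c)
{
    if (c->state & STANDBY)
        return false;
    c->state &= ~WRITE_PENDING;
    return true;
}
```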

    Signed-off-by: Sage Weil

    Sage Weil
     
  • There was some broken keepalive code using a dead variable. Shift to using
    the proper bit flag.

    Signed-off-by: Sage Weil

    Sage Weil
     
  • With commit f363e45f we replaced a bunch of hacky workqueue mutual
    exclusion logic with the WQ_NON_REENTRANT flag. One piece of fallout is
    that the exponential backoff breaks in certain cases:

    * con_work attempts to connect.
    * we get an immediate failure, and the socket state change handler queues
    immediate work.
    * con_work calls con_fault, we decide to back off, but can't queue delayed
    work.

    In this case, we add a BACKOFF bit to make con_work reschedule delayed work
    next time it runs (which should be immediately).
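
    A minimal user-space model of the BACKOFF bit follows. The struct, the
    function names and the doubling delay are assumptions made for
    illustration; the real code relies on the workqueue API.

```c
#include <assert.h>
#include <stdbool.h>

#define BACKOFF (1u << 0)    /* hypothetical flag bit */

struct backoff_conn {
    unsigned state;
    unsigned delay;          /* current backoff delay, arbitrary units */
    bool work_pending;       /* immediate work is already queued */
    unsigned delayed_queued; /* times delayed work was scheduled */
};

/* Fails when work is already queued, as in the failure case above. */
bool backoff_queue_delayed(struct backoff_conn *c)
{
    if (c->work_pending)
        return false;
    c->work_pending = true;
    c->delayed_queued++;
    return true;
}

/* On a fault, grow the delay; if delayed work cannot be queued now,
 * remember that the next worker run has to back off. */
void backoff_fault(struct backoff_conn *c)
{
    c->delay = c->delay ? c->delay * 2 : 1;
    if (!backoff_queue_delayed(c))
        c->state |= BACKOFF;
}

/* The worker reschedules itself with the delay when BACKOFF is set.
 * Returns true if it went on to do real work. */
bool backoff_work(struct backoff_conn *c)
{
    c->work_pending = false;     /* this run consumes the queued work */
    if (c->state & BACKOFF) {
        c->state &= ~BACKOFF;
        assert(c->delay > 0);    /* set by backoff_fault() */
        backoff_queue_delayed(c);
        return false;
    }
    return true;
}
```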

    Signed-off-by: Sage Weil

    Sage Weil
     

04 Mar, 2011

5 commits

  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
    DNS: Fix a NULL pointer deref when trying to read an error key [CVE-2011-1076]

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits)
    MAINTAINERS: Add Andy Gospodarek as co-maintainer.
    r8169: disable ASPM
    RxRPC: Fix v1 keys
    AF_RXRPC: Handle receiving ACKALL packets
    cnic: Fix lost interrupt on bnx2x
    cnic: Prevent status block race conditions with hardware
    net: dcbnl: check correct ops in dcbnl_ieee_set()
    e1000e: disable broken PHY wakeup for ICH10 LOMs, use MAC wakeup instead
    igb: fix sparse warning
    e1000: fix sparse warning
    netfilter: nf_log: avoid oops in (un)bind with invalid nfproto values
    dccp: fix oops on Reset after close
    ipvs: fix dst_lock locking on dest update
    davinci_emac: Add Carrier Link OK check in Davinci RX Handler
    bnx2x: update driver version to 1.62.00-6
    bnx2x: properly calculate lro_mss
    bnx2x: perform statistics "action" before state transition.
    bnx2x: properly configure coefficients for MinBW algorithm (NPAR mode).
    bnx2x: Fix ethtool -t link test for MF (non-pmf) devices.
    bnx2x: Fix nvram test for single port devices.
    ...

    Linus Torvalds
     
  • When a DNS resolver key is instantiated with an error indication, attempts to
    read that key will result in an oops because user_read() is expecting there to
    be a payload - and there isn't one [CVE-2011-1076].

    Give the DNS resolver key its own read handler that returns the error cached in
    key->type_data.x[0] as an error rather than crashing.
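
    The shape of such a read handler can be sketched in user space. struct
    toy_key and both functions are hypothetical; the sketch assumes the
    cached error is stored as a negative errno value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a key: either a payload or a cached error. */
struct toy_key {
    long cached_error;     /* 0, or a negative errno cached at instantiation */
    const char *payload;   /* NULL when the key holds only an error */
    size_t payload_len;
};

/* Generic reader: it requires a payload, which an error-only key lacks. */
long toy_user_read(const struct toy_key *key, char *buf, size_t buflen)
{
    assert(key->payload != NULL);
    if (buf && buflen >= key->payload_len)
        memcpy(buf, key->payload, key->payload_len);
    return (long)key->payload_len;
}

/* Type-specific reader: report the cached error before touching the payload. */
long toy_dns_read(const struct toy_key *key, char *buf, size_t buflen)
{
    if (key->cached_error)
        return key->cached_error;
    return toy_user_read(key, buf, buflen);
}
```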

    Also make the kenter() at the beginning of dns_resolver_instantiate() limit the
    amount of data it prints, since the data is not necessarily NUL-terminated.

    The buggy code was added in:

    commit 4a2d789267e00b5a1175ecd2ddefcc78b83fbf09
    Author: Wang Lei
    Date: Wed Aug 11 09:37:58 2010 +0100
    Subject: DNS: If the DNS server returns an error, allow that to be cached [ver #2]

    This can trivially be reproduced by any user with the following program
    compiled with -lkeyutils:

    #include <stdlib.h>
    #include <keyutils.h>
    #include <err.h>
    static char payload[] = "#dnserror=6";
    int main()
    {
            key_serial_t key;
            key = add_key("dns_resolver", "a", payload, sizeof(payload),
                          KEY_SPEC_SESSION_KEYRING);
            if (key == -1)
                    err(1, "add_key");
            if (keyctl_read(key, NULL, 0) == -1)
                    err(1, "read_key");
            return 0;
    }

    What should happen is that keyctl_read() reports error 6 (ENXIO) to the user:

    dns-break: read_key: No such device or address

    but instead the kernel oopses.

    This cannot be reproduced with the 'keyctl add' or 'keyctl padd' commands
    from keyutils, as both of those trim the data before the NUL terminator
    that must be included in it. Without the NUL, dns_resolver_instantiate()
    will return -EINVAL and the key will not be instantiated such that it can
    be read.

    The oops looks like:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    IP: [] user_read+0x4f/0x8f
    PGD 3bdf8067 PUD 385b9067 PMD 0
    Oops: 0000 [#1] SMP
    last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/irq
    CPU 0
    Modules linked in:

    Pid: 2150, comm: dns-break Not tainted 2.6.38-rc7-cachefs+ #468 /DG965RY
    RIP: 0010:[] [] user_read+0x4f/0x8f
    RSP: 0018:ffff88003bf47f08 EFLAGS: 00010246
    RAX: 0000000000000001 RBX: ffff88003b5ea378 RCX: ffffffff81972368
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003b5ea378
    RBP: ffff88003bf47f28 R08: ffff88003be56620 R09: 0000000000000000
    R10: 0000000000000395 R11: 0000000000000002 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000000 R15: ffffffffffffffa1
    FS: 00007feab5751700(0000) GS:ffff88003e000000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000010 CR3: 000000003de40000 CR4: 00000000000006f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process dns-break (pid: 2150, threadinfo ffff88003bf46000, task ffff88003be56090)
    Stack:
    ffff88003b5ea378 ffff88003b5ea3a0 0000000000000000 0000000000000000
    ffff88003bf47f68 ffffffff811b708e ffff88003c442bc8 0000000000000000
    00000000004005a0 00007fffba368060 0000000000000000 0000000000000000
    Call Trace:
    [] keyctl_read_key+0xac/0xcf
    [] sys_keyctl+0x75/0xb6
    [] system_call_fastpath+0x16/0x1b
    Code: 75 1f 48 83 7b 28 00 75 18 c6 05 58 2b fb 00 01 be bb 00 00 00 48 c7 c7 76 1c 75 81 e8 13 c2 e9 ff 4c 8b b3 e0 00 00 00 4d 85 ed 0f b7 5e 10 74 2d 4d 85 e4 74 28 e8 98 79 ee ff 49 39 dd 48
    RIP [] user_read+0x4f/0x8f
    RSP
    CR2: 0000000000000010

    Signed-off-by: David Howells
    Acked-by: Jeff Layton
    cc: Wang Lei
    Signed-off-by: James Morris

    David Howells
     
  • If we mark the connection CLOSED we will give up trying to reconnect to
    this server instance. That is appropriate for things like a protocol
    version mismatch that won't change until the server is restarted, at which
    point we'll get a new addr and reconnect. An authorization failure like
    this is probably due to the server not properly rotating its secret keys,
    however, and should be treated as transient so that the normal backoff and
    retry behavior kicks in.

    Signed-off-by: Sage Weil

    Sage Weil
     
  • get_user_pages() can return fewer pages than we ask for. We were returning
    a bogus pointer/error code in that case. Instead, loop until we get all
    the pages we want or get an error we can return to the caller.
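
    The loop described above can be sketched as follows. The pin_fn callback
    is a hypothetical stand-in for get_user_pages(), and treating a return of
    zero as -EFAULT is an assumption of this sketch to guarantee progress.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical pinning callback: returns the number of pages obtained
 * starting at index `first` (possibly fewer than `want`) or a negative errno. */
typedef int (*pin_fn)(size_t first, size_t want, void *ctx);

/* Call pin() until all `total` pages are obtained or an error occurs.
 * Returns `total` on success or the negative error. */
int pin_all_pages(pin_fn pin, size_t total, void *ctx)
{
    size_t got = 0;

    assert(pin != NULL);
    while (got < total) {
        int rc = pin(got, total - got, ctx);
        if (rc < 0)
            return rc;          /* hand a real error code to the caller */
        if (rc == 0)
            return -EFAULT;     /* no progress: give up instead of spinning */
        got += (size_t)rc;
    }
    return (int)got;
}

/* Example callback that never returns more than two pages per call
 * and counts its invocations through ctx. */
int pin_at_most_two(size_t first, size_t want, void *ctx)
{
    int *calls = ctx;

    (void)first;
    (*calls)++;
    return want < 2 ? (int)want : 2;
}
```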

    Signed-off-by: Sage Weil

    Sage Weil
     

03 Mar, 2011

3 commits


02 Mar, 2011

3 commits

  • Like many other places, we have to check that the array index is
    within allowed limits, or otherwise, a kernel oops and other nastiness
    can ensue when we access memory beyond the end of the array.

    [ 5954.115381] BUG: unable to handle kernel paging request at 0000004000000000
    [ 5954.120014] IP: __find_logger+0x6f/0xa0
    [ 5954.123979] nf_log_bind_pf+0x2b/0x70
    [ 5954.123979] nfulnl_recv_config+0xc0/0x4a0 [nfnetlink_log]
    [ 5954.123979] nfnetlink_rcv_msg+0x12c/0x1b0 [nfnetlink]
    ...

    The problem goes back to v2.6.30-rc1~1372~1342~31 where nf_log_bind
    was decoupled from nf_log_register.
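
    The kind of check meant here, as a generic user-space sketch. The table,
    its size and bind_logger() are hypothetical and not the netfilter code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NPROTO 13   /* hypothetical number of protocol families */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const char *loggers[NPROTO];

/* Reject an out-of-range protocol family before it is used as an index. */
int bind_logger(unsigned int pf, const char *name)
{
    assert(name != NULL);
    if (pf >= ARRAY_SIZE(loggers))
        return -EINVAL;
    loggers[pf] = name;
    return 0;
}
```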

    Reported-by: Miguel Di Ciurcio Filho ,
    via irc.freenode.net/#netfilter
    Signed-off-by: Jan Engelhardt
    Signed-off-by: Patrick McHardy

    Jan Engelhardt
     
  • This fixes a bug in the order of dccp_rcv_state_process() that still permitted
    reception even after closing the socket. A Reset after close thus causes a NULL
    pointer dereference by not preventing operations on an already torn-down socket.

    dccp_v4_do_rcv()
    |
    | state other than OPEN
    v
    dccp_rcv_state_process()
    |
    | DCCP_PKT_RESET
    v
    dccp_rcv_reset()
    |
    v
    dccp_time_wait()

    WARNING: at net/ipv4/inet_timewait_sock.c:141 __inet_twsk_hashdance+0x48/0x128()
    Modules linked in: arc4 ecb carl9170 rt2870sta(C) mac80211 r8712u(C) crc_ccitt ah
    [] (unwind_backtrace+0x0/0xec) from [] (warn_slowpath_common)
    [] (warn_slowpath_common+0x4c/0x64) from [] (warn_slowpath_n)
    [] (warn_slowpath_null+0x1c/0x24) from [] (__inet_twsk_hashd)
    [] (__inet_twsk_hashdance+0x48/0x128) from [] (dccp_time_wai)
    [] (dccp_time_wait+0x40/0xc8) from [] (dccp_rcv_state_proces)
    [] (dccp_rcv_state_process+0x120/0x538) from [] (dccp_v4_do_)
    [] (dccp_v4_do_rcv+0x11c/0x14c) from [] (release_sock+0xac/0)
    [] (release_sock+0xac/0x110) from [] (dccp_close+0x28c/0x380)
    [] (dccp_close+0x28c/0x380) from [] (inet_release+0x64/0x70)

    The fix is to test the socket state first. Receiving a packet in Closed state
    now also produces the required "No connection" Reset reply of RFC 4340, 8.3.1.

    Reported-and-tested-by: Johan Hovold
    Cc: stable@kernel.org
    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • Fix dst_lock usage in __ip_vs_update_dest. We need
    _bh locking because the destination is updated in user context;
    without it, frequent destination updates can cause lockups.
    Problem reported by Simon Kirby. The bug was introduced
    in 2.6.37 by the "ipvs: changes for local real server"
    change.

    Signed-off-by: Julian Anastasov
    Signed-off-by: Hans Schillstrom
    Signed-off-by: Simon Horman

    Julian Anastasov
     

01 Mar, 2011

1 commit

  • netlink_dump() may fail, but nobody handles its error.
    It generates the next portion of output data once the previous portion
    has been returned to user space. This mechanism is used when the data
    does not all fit in one skb. If we enter netlink_recvmsg() and there is
    no skb in the receive queue, netlink_dump() will not be executed. So if
    netlink_dump() fails once, new data never appears and the reader sleeps
    forever.

    netlink_dump() is called from two places:

    1. from netlink_sendmsg->...->netlink_dump_start().
    Here we can report the error directly and it will be returned
    by sendmsg().

    2. from netlink_recvmsg
    Here we can't report the error directly, because we already have a
    portion of valid output data and call netlink_dump() to prepare the next
    portion. If netlink_dump() fails, the socket is marked as being in error
    and the next recvmsg will fail.

    Signed-off-by: Andrey Vagin
    Signed-off-by: David S. Miller

    Andrey Vagin
     

26 Feb, 2011

3 commits

  • addr_type of 0 means that the type should be adopted from from_dev and
    not from __hw_addr_del_multiple(). Unfortunately this is not what
    happens: addr_type is always considered. Fix this by implementing the
    intended and documented behavior.

    Signed-off-by: Hagen Paul Pfeifer
    Signed-off-by: David S. Miller

    Hagen Paul Pfeifer
     
  • With slab poisoning enabled, I see the following oops:

    Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6b73
    ...
    NIP [c0000000006bc61c] .rxrpc_destroy+0x44/0x104
    LR [c0000000006bc618] .rxrpc_destroy+0x40/0x104
    Call Trace:
    [c0000000feb2bc00] [c0000000006bc618] .rxrpc_destroy+0x40/0x104 (unreliable)
    [c0000000feb2bc90] [c000000000349b2c] .key_cleanup+0x1a8/0x20c
    [c0000000feb2bd40] [c0000000000a2920] .process_one_work+0x2f4/0x4d0
    [c0000000feb2be00] [c0000000000a2d50] .worker_thread+0x254/0x468
    [c0000000feb2bec0] [c0000000000a868c] .kthread+0xbc/0xc8
    [c0000000feb2bf90] [c000000000020e00] .kernel_thread+0x54/0x70

    We aren't initialising token->next, but the code in destroy_context relies
    on the list being NULL terminated. Use kzalloc to zero out all the fields.
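
    A user-space analogue of the fix, with calloc() standing in for
    kzalloc(). struct toy_token and the helpers are hypothetical; the sketch
    assumes the usual platforms where all-zero bytes represent a NULL
    pointer.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_token {
    int kind;
    struct toy_token *next;
};

/* calloc() zero-fills the allocation, so ->next starts out NULL and the
 * list is terminated even if nobody sets the field explicitly. */
struct toy_token *toy_token_new(int kind)
{
    struct toy_token *t = calloc(1, sizeof(*t));

    if (t)
        t->kind = kind;
    return t;
}

/* Walk the list until NULL, freeing every token; returns how many were freed. */
int toy_token_destroy_all(struct toy_token *head)
{
    int n = 0;

    while (head) {
        struct toy_token *next = head->next;
        free(head);
        head = next;
        n++;
    }
    return n;
}
```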

    Signed-off-by: Anton Blanchard
    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    Anton Blanchard
     
  • Before this patch issuing these commands:

    fd = open("/proc/sys/net/ipv6/route/flush")
    unshare(CLONE_NEWNET)
    write(fd, "stuff")

    would flush the newly created net, not the original one.

    The equivalent ipv4 code is correct (stores the net inside ->extra1).

    Acked-by: Daniel Lezcano
    Signed-off-by: David S. Miller

    Lucian Adrian Grijincu
     

24 Feb, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits)
    Added support for usb ethernet (0x0fe6, 0x9700)
    r8169: fix RTL8168DP power off issue.
    r8169: correct settings of rtl8102e.
    r8169: fix incorrect args to oob notify.
    DM9000B: Fix PHY power for network down/up
    DM9000B: Fix reg_save after spin_lock in dm9000_timeout
    net_sched: long word align struct qdisc_skb_cb data
    sfc: lower stack usage in efx_ethtool_self_test
    bridge: Use IPv6 link-local address for multicast listener queries
    bridge: Fix MLD queries' ethernet source address
    bridge: Allow mcast snooping for transient link local addresses too
    ipv6: Add IPv6 multicast address flag defines
    bridge: Add missing ntohs()s for MLDv2 report parsing
    bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report
    bridge: Fix IPv6 multicast snooping by storing correct protocol type
    p54pci: update receive dma buffers before and after processing
    fix cfg80211_wext_siwfreq lock ordering...
    rt2x00: Fix WPA TKIP Michael MIC failures.
    ath5k: Fix fast channel switching
    tcp: undo_retrans counter fixes
    ...

    Linus Torvalds
     

23 Feb, 2011

7 commits

  • David S. Miller
     
  • Currently the bridge multicast snooping feature periodically issues
    IPv6 general multicast listener queries to sense the absence of a
    listener.

    For this, it uses :: as its source address - however RFC 2710 requires:
    "To be valid, the Query message MUST come from a link-local IPv6 Source
    Address". Current Linux kernel versions seem to follow this requirement
    and ignore our bogus MLD queries.

    With this commit a link local address from the bridge interface is being
    used to issue the MLD query, resulting in other Linux devices which are
    multicast listeners in the network to respond with a MLD response (which
    was not the case before).

    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
     
  • Map the IPv6 header's destination multicast address to an ethernet
    destination address instead of the MLD query's multicast address.

    For instance, for a general MLD query (multicast address in the MLD query
    set to ::), this would wrongly be mapped to 33:33:00:00:00:00, although
    an MLD query's destination MAC should always be 33:33:00:00:00:01, which
    matches the IPv6 header's multicast destination ff02::1.
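
    The two MAC addresses quoted above follow from the standard IPv6
    multicast mapping (RFC 2464): 33:33 followed by the low 32 bits of the
    address. A small sketch, with ipv6_mc_to_mac() as a hypothetical helper
    name:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map a 16-byte IPv6 multicast address to its Ethernet multicast MAC:
 * the fixed prefix 33:33 followed by the last four address bytes. */
void ipv6_mc_to_mac(const uint8_t addr[16], uint8_t mac[6])
{
    assert(addr != NULL && mac != NULL);
    mac[0] = 0x33;
    mac[1] = 0x33;
    memcpy(&mac[2], &addr[12], 4);
}
```

    Feeding it ff02::1 yields 33:33:00:00:00:01, while the unspecified
    address :: yields 33:33:00:00:00:00.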

    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
     
  • Currently the multicast bridge snooping support is not active for
    link local multicast. I assume this has been done to leave
    important multicast data untouched, like IPv6 Neighbor Discovery.

    In larger, bridged, local networks it could however be desirable to
    optimize for instance local multicast audio/video streaming too.

    With the transient flag in IPv6 multicast addresses we have an easy
    way to optimize such multimedia traffic without tampering with the
    high priority multicast data from well-known addresses.

    This patch alters the multicast bridge snooping for IPv6, to take
    effect for transient multicast addresses instead of non-link-local
    addresses.

    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
     
  • The nsrcs number is 2 bytes wide, therefore we need to call ntohs()
    before using it.
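
    A short illustration of why the conversion matters. read_nsrcs() is a
    hypothetical helper that reads a 16-bit network-order count from a raw
    packet buffer.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 16-bit big-endian counter at byte offset `off` and return it in
 * host byte order. memcpy() avoids an unaligned access. */
uint16_t read_nsrcs(const uint8_t *pkt, size_t off)
{
    uint16_t be;

    assert(pkt != NULL);
    memcpy(&be, pkt + off, sizeof(be));
    return ntohs(be);
}
```

    On a little-endian host, using the raw value for the bytes 00 03 would
    give 768 instead of 3.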

    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
     
  • We actually want a pointer to grec_nsrcs and not the following
    field. Otherwise we can get very high values for *nsrcs as the first two
    bytes of the IPv6 multicast address are being used instead, leading to
    a failing pskb_may_pull() which results in MLDv2 reports not being
    parsed.

    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
     
  • The protocol type for IPv6 entries in the hash table for multicast
    bridge snooping is falsely set to ETH_P_IP, marking it as an IPv4
    address, instead of setting it to ETH_P_IPV6, which results in negative
    look-ups in the hash table later.

    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
     

22 Feb, 2011

3 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    ceph: keep reference to parent inode on ceph_dentry
    ceph: queue cap_snaps once per realm
    libceph: fix socket write error handling
    libceph: fix socket read error handling

    Linus Torvalds
     
  • I previously managed to reproduce a hang while scanning wireless
    channels (reproducible with airodump-ng hopping channels); subsequent
    lockdep instrumentation revealed a lock ordering issue.

    Without knowing the design intent, it looks like the locks should be
    taken in reverse order; please comment.

    =======================================================
    [ INFO: possible circular locking dependency detected ]
    2.6.38-rc5-341cd #4
    -------------------------------------------------------
    airodump-ng/15445 is trying to acquire lock:
    (&rdev->devlist_mtx){+.+.+.}, at: []
    cfg80211_wext_siwfreq+0xc6/0x100

    but task is already holding lock:
    (&wdev->mtx){+.+.+.}, at: [] cfg80211_wext_siwfreq+0xbc/0x100

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&wdev->mtx){+.+.+.}:
    [] lock_acquire+0xc6/0x280
    [] mutex_lock_nested+0x6e/0x4b0
    [] cfg80211_netdev_notifier_call+0x430/0x5f0
    [] notifier_call_chain+0x8b/0x100
    [] raw_notifier_call_chain+0x11/0x20
    [] call_netdevice_notifiers+0x32/0x60
    [] __dev_notify_flags+0x34/0x80
    [] dev_change_flags+0x40/0x70
    [] do_setlink+0x1fc/0x8d0
    [] rtnl_setlink+0xf2/0x140
    [] rtnetlink_rcv_msg+0x163/0x270
    [] netlink_rcv_skb+0xa1/0xd0
    [] rtnetlink_rcv+0x20/0x30
    [] netlink_unicast+0x2ba/0x300
    [] netlink_sendmsg+0x267/0x3e0
    [] sock_sendmsg+0xe4/0x110
    [] sys_sendmsg+0x253/0x3b0
    [] system_call_fastpath+0x16/0x1b

    -> #0 (&rdev->devlist_mtx){+.+.+.}:
    [] __lock_acquire+0x1622/0x1d10
    [] lock_acquire+0xc6/0x280
    [] mutex_lock_nested+0x6e/0x4b0
    [] cfg80211_wext_siwfreq+0xc6/0x100
    [] ioctl_standard_call+0x5d/0xd0
    [] T.808+0x163/0x170
    [] wext_handle_ioctl+0x3a/0x90
    [] dev_ioctl+0x6f2/0x830
    [] sock_ioctl+0xfd/0x290
    [] do_vfs_ioctl+0x9d/0x590
    [] sys_ioctl+0x4a/0x80
    [] system_call_fastpath+0x16/0x1b

    other info that might help us debug this:

    2 locks held by airodump-ng/15445:
    #0: (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x12/0x20
    #1: (&wdev->mtx){+.+.+.}, at: []
    cfg80211_wext_siwfreq+0xbc/0x100

    stack backtrace:
    Pid: 15445, comm: airodump-ng Not tainted 2.6.38-rc5-341cd #4
    Call Trace:
    [] ? print_circular_bug+0xfa/0x100
    [] ? __lock_acquire+0x1622/0x1d10
    [] ? trace_hardirqs_off_caller+0x29/0xc0
    [] ? lock_acquire+0xc6/0x280
    [] ? cfg80211_wext_siwfreq+0xc6/0x100
    [] ? mark_held_locks+0x67/0x90
    [] ? mutex_lock_nested+0x6e/0x4b0
    [] ? cfg80211_wext_siwfreq+0xc6/0x100
    [] ? mark_held_locks+0x67/0x90
    [] ? cfg80211_wext_siwfreq+0xc6/0x100
    [] ? cfg80211_wext_siwfreq+0xc6/0x100
    [] ? ioctl_standard_call+0x5d/0xd0
    [] ? __dev_get_by_name+0x9b/0xc0
    [] ? ioctl_standard_call+0x0/0xd0
    [] ? T.808+0x163/0x170
    [] ? might_fault+0x72/0xd0
    [] ? wext_handle_ioctl+0x3a/0x90
    [] ? might_fault+0xbb/0xd0
    [] ? dev_ioctl+0x6f2/0x830
    [] ? put_lock_stats+0xe/0x40
    [] ? lock_release_holdtime+0xac/0x150
    [] ? sock_ioctl+0xfd/0x290
    [] ? do_vfs_ioctl+0x9d/0x590
    [] ? fget_light+0x1df/0x3c0
    [] ? sys_ioctl+0x4a/0x80
    [] ? system_call_fastpath+0x16/0x1b

    Signed-off-by: Daniel J Blueman
    Acked-by: Johannes Berg
    Signed-off-by: John W. Linville

    Daniel J Blueman
     
  • Fix a bug where undo_retrans is incorrectly decremented when undo_marker is
    not set or undo_retrans is already 0. This happens when the sender receives
    more DSACK ACKs than packets retransmitted during the current
    undo phase. It may also happen when the sender receives a DSACK after
    the undo operation is completed or cancelled.

    Fix another bug where undo_retrans is incorrectly incremented when the
    sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case
    is rare but not impossible.
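
    The two rules can be sketched with a toy counter. struct toy_tcp and the
    function names are hypothetical; in particular, incrementing by the
    packet count and decrementing by one per DSACK is an assumption about
    how the accounting is meant to work, not a copy of the patch.

```c
#include <assert.h>

struct toy_tcp {
    unsigned undo_marker;   /* non-zero while an undo phase is active */
    int undo_retrans;       /* retransmissions not yet matched by a DSACK */
};

/* Account for every packet carried by a retransmitted skb (pcount > 1
 * when the skb holds several segments, as with TSO). */
void toy_on_retransmit(struct toy_tcp *tp, int pcount)
{
    assert(pcount > 0);
    tp->undo_retrans += pcount;
}

/* Only count a DSACK while an undo phase is active and there is still
 * something left to match; otherwise leave the counter alone. */
void toy_on_dsack(struct toy_tcp *tp)
{
    if (tp->undo_marker && tp->undo_retrans > 0)
        tp->undo_retrans--;
}
```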

    Signed-off-by: Yuchung Cheng
    Acked-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Yuchung Cheng
     

21 Feb, 2011

1 commit

  • From: Eric W. Biederman

    In the beginning with batching unreg_list was a list that was used only
    once in the lifetime of a network device (I think). Now we have calls
    using the unreg_list that can happen multiple times in the life of a
    network device like dev_deactivate and dev_close that are also using the
    unreg_list. In addition in unregister_netdevice_queue we also do a
    list_move because for devices like veth pairs it is possible that
    unregister_netdevice_queue will be called multiple times.

    So I think the change below to fix dev_deactivate which Eric D. missed
    will fix this problem. Now to go test that.

    Signed-off-by: David S. Miller

    Eric W. Biederman
     

20 Feb, 2011

3 commits

  • commit 5fa782c2f5ef6c2e4f04d3e228412c9b4a4c8809 re-worked the
    handling of unknown parameters. sctp_init_cause_fixed() can now
    return -ENOSPC if there is not enough tailroom in the error
    chunk skb. When this happens, the error header is not appended to
    the error chunk. In that case, the payload of the unknown parameter
    should not be appended either.

    Signed-off-by: Jiri Bohac
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Jiri Bohac
     
  • Eric W. Biederman reported a lockdep splat in inet_twsk_deschedule()

    This is caused by inet_twsk_purge(), run from process context,
    and commit 575f4cd5a5b6394577 (net: Use rcu lookups in inet_twsk_purge.)
    removed the BH disabling that was necessary.

    Add the BH disabling back, but fine-grained: right before calling
    inet_twsk_deschedule(), instead of around the whole function.

    With help from Linus Torvalds and Eric W. Biederman

    Reported-by: Eric W. Biederman
    Signed-off-by: Eric Dumazet
    CC: Daniel Lezcano
    CC: Pavel Emelyanov
    CC: Arnaldo Carvalho de Melo
    CC: stable (# 2.6.33+)
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • David S. Miller
     

19 Feb, 2011

4 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits)
    net: deinit automatic LIST_HEAD
    net: dont leave active on stack LIST_HEAD
    net: provide default_advmss() methods to blackhole dst_ops
    tg3: Restrict phy ioctl access
    drivers/net: Call netif_carrier_off at the end of the probe
    ixgbe: work around for DDP last buffer size
    ixgbe: fix panic due to uninitialised pointer
    e1000e: flush all writebacks before unload
    e1000e: check down flag in tasks
    isdn: hisax: Use l2headersize() instead of dup (and buggy) func.
    arp_notify: unconditionally send gratuitous ARP for NETDEV_NOTIFY_PEERS.
    cxgb4vf: Use defined Mailbox Timeout
    cxgb4vf: Quiesce Virtual Interfaces on shutdown ...
    cxgb4vf: Behave properly when CONFIG_DEBUG_FS isn't defined ...
    cxgb4vf: Check driver parameters in the right place ...
    pch_gbe: Fix the MAC Address load issue.
    iwlwifi: Delete iwl3945_good_plcp_health.
    net/can/softing: make CAN_SOFTING_CS depend on CAN_SOFTING
    netfilter: nf_iterate: fix incorrect RCU usage
    pch_gbe: Fix the issue that the receiving data is not normal.
    ...

    Linus Torvalds
     
  • A low-level driver could pass rx frames to us after disassociation, which
    can cause ieee80211_sta_rx_notify() to run conn_mon_timer. That
    is obviously wrong, but nothing happens until we unload the modules and
    resources are used after being freed. If kernel debugging is enabled, the
    following warning can be observed:

    WARNING: at lib/debugobjects.c:259 debug_print_object+0x65/0x70()
    Hardware name: HP xw8600 Workstation
    ODEBUG: free active (active state 0) object type: timer_list
    Modules linked in: iwlagn(-) iwlcore mac80211 cfg80211 aes_x86_64 aes_generic fuse cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod uinput hp_wmi sparse_keymap sg wmi arc4 microcode serio_raw ecb tg3 shpchp rfkill ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif firewire_ohci firewire_core crc_itu_t mptsas mptscsih mptbase scsi_transport_sas ahci libahci pata_acpi ata_generic ata_piix floppy nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: cfg80211]
    Pid: 13827, comm: rmmod Tainted: G W 2.6.38-rc4-wl+ #22
    Call Trace:
    [] ? warn_slowpath_common+0x7f/0xc0
    [] ? warn_slowpath_fmt+0x46/0x50
    [] ? debug_print_object+0x65/0x70
    [] ? debug_check_no_obj_freed+0x125/0x210
    [] ? debug_check_no_locks_freed+0xf7/0x170
    [] ? kfree+0xc2/0x2f0
    [] ? netdev_release+0x45/0x60
    [] ? device_release+0x27/0xa0
    [] ? kobject_release+0x8d/0x1a0
    [] ? kobject_release+0x0/0x1a0
    [] ? kref_put+0x37/0x70
    [] ? kobject_put+0x27/0x60
    [] ? netdev_run_todo+0x1ab/0x270
    [] ? rtnl_unlock+0xe/0x10
    [] ? ieee80211_unregister_hw+0x58/0x120 [mac80211]
    [] ? iwl_pci_remove+0xdb/0x22a [iwlagn]
    [] ? pci_device_remove+0x52/0x120
    [] ? __device_release_driver+0x75/0xe0
    [] ? driver_detach+0xd8/0xe0
    [] ? bus_remove_driver+0x91/0x100
    [] ? driver_unregister+0x62/0xa0
    [] ? pci_unregister_driver+0x44/0xa0
    [] ? iwl_exit+0x15/0x1c [iwlagn]
    [] ? sys_delete_module+0x1a2/0x270
    [] ? trace_hardirqs_on_thunk+0x3a/0x3f
    [] ? system_call_fastpath+0x16/0x1b

    Acked-by: Johannes Berg
    Signed-off-by: Stanislaw Gruszka
    Signed-off-by: John W. Linville

    Stanislaw Gruszka
     
  • commit 9b5e383c11b08784 (net: Introduce
    unregister_netdevice_many()) left an active LIST_HEAD() in
    rollback_registered(), with possible memory corruption.

    Even if the device is freed without touching its unreg_list (and therefore
    without touching the previous memory location holding LIST_HEAD(single)),
    it is better to close the bug for good, since it's really subtle.

    (Same fix for default_device_exit_batch() for completeness)

    Reported-by: Michal Hocko
    Tested-by: Michal Hocko
    Reported-by: Eric W. Biederman
    Tested-by: Eric W. Biederman
    Signed-off-by: Linus Torvalds
    Signed-off-by: Eric Dumazet
    CC: Ingo Molnar
    CC: Octavian Purdila
    CC: stable [.33+]
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Eric W. Biederman and Michal Hocko reported various memory corruptions
    that we suspected to be related to a LIST_HEAD located on the stack that
    was manipulated after the thread left the function frame (and eventually
    exited, so its stack was freed and reused).

    Eric Dumazet suggested the problem was probably coming from commit
    443457242beb (net: factorize sync-rcu call in
    unregister_netdevice_many)

    This patch fixes __dev_close() and dev_close() to properly deinit their
    respective LIST_HEAD(single) before exiting.

    References: https://lkml.org/lkml/2011/2/16/304
    References: https://lkml.org/lkml/2011/2/14/223

    Reported-by: Michal Hocko
    Tested-by: Michal Hocko
    Reported-by: Eric W. Biederman
    Tested-by: Eric W. Biederman
    Signed-off-by: Linus Torvalds
    Signed-off-by: Eric Dumazet
    CC: Ingo Molnar
    CC: Octavian Purdila
    Signed-off-by: David S. Miller

    Linus Torvalds