27 May, 2009

2 commits


26 May, 2009

1 commit


22 May, 2009

3 commits


21 May, 2009

6 commits

  • The use of an unspecified protocol in the IPv6 initial route prevents quagga
    from installing the IPv6 default route:
    # show ipv6 route
    S ::/0 [1/0] via fe80::1, eth1_0
    K>* ::/0 is directly connected, lo, rej
    C>* ::1/128 is directly connected, lo
    C>* fe80::/64 is directly connected, eth1_0

    # ip -6 route
    fe80::/64 dev eth1_0 proto kernel metric 256 mtu 1500 advmss 1440
    hoplimit -1
    ff00::/8 dev eth1_0 metric 256 mtu 1500 advmss 1440 hoplimit -1
    unreachable default dev lo proto none metric -1 error -101 hoplimit 255

    The attached patch sets RTPROT_KERNEL on the default initial route
    and fixes the problem for quagga.
    This is similar to "ipv6: protocol for address routes"
    f410a1fba7afa79d2992620e874a343fdba28332.

    # show ipv6 route
    S>* ::/0 [1/0] via fe80::1, eth1_0
    C>* ::1/128 is directly connected, lo
    C>* fe80::/64 is directly connected, eth1_0

    # ip -6 route
    fe80::/64 dev eth1_0 proto kernel metric 256 mtu 1500 advmss 1440
    hoplimit -1
    fe80::/64 dev eth1_0 proto kernel metric 256 mtu 1500 advmss 1440
    hoplimit -1
    ff00::/8 dev eth1_0 metric 256 mtu 1500 advmss 1440 hoplimit -1
    default via fe80::1 dev eth1_0 proto zebra metric 1024 mtu 1500
    advmss 1440 hoplimit -1
    unreachable default dev lo proto kernel metric -1 error -101 hoplimit 255

    Signed-off-by: Jean-Mickael Guerin
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Jean-Mickael Guerin
     
  • David S. Miller
     
  • Alexander V. Lukyanov found a regression in 2.6.29 and made a complete
    analysis, found at http://bugzilla.kernel.org/show_bug.cgi?id=13339
    Quoted here because it's a perfect one:

    begin_of_quotation
    2.6.29 patch has introduced flexible route cache rebuilding. Unfortunately the
    patch has at least one critical flaw, and another problem.

    rt_intern_hash calculates rthi pointer, which is later used for new entry
    insertion. The same loop calculates cand pointer which is used to clean the
    list. If the pointers are the same, rtable leak occurs, as first the cand is
    removed then the new entry is appended to it.

    This leak leads to unregister_netdevice problem (usage count > 0).

    Another problem of the patch is that it tries to insert the entries in certain
    order, to facilitate counting of entries distinct by all but QoS parameters.
    Unfortunately, referencing an existing rtable entry moves it to list beginning,
    to speed up further lookups, so the carefully built order is destroyed.

    For the first problem the simplest patch is to set rthi=0 when rthi==cand, but
    it will also destroy the ordering.
    end_of_quotation

    Problematic commit is 1080d709fb9d8cd4392f93476ee46a9d6ea05a5b
    (net: implement emergency route cache rebulds when gc_elasticity is exceeded)

    Trying to keep dst_entries ordered is too complex and breaks the fact that
    order should depend on the frequency of use for garbage collection.

    A possible fix is to make rt_intern_hash() simpler and only make
    rt_check_expire() a little bit smarter, able to cope with an arbitrary
    entry order. The added loop runs on cache-hot data while the CPU
    is prefetching the next object, so it should go unnoticed.

    Reported-and-analyzed-by: Alexander V. Lukyanov
    Signed-off-by: Eric Dumazet
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • rt_check_expire() computes the average and standard deviation of chain
    lengths, but does not correctly reset length to 0 at the beginning of each
    chain. On loaded machines this probably overflows sum2 (and sum) instead
    of giving meaningful results.

    Signed-off-by: Eric Dumazet
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Eric Dumazet
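
The effect of the missing reset can be illustrated with a small model. This is only a sketch in Python, not the kernel code: `chains` is a hypothetical stand-in for the route cache hash table, and every entry counts as length 1, which simplifies whatever weighting the kernel applies.

```python
import math


def chain_length_stats(chains):
    """Return (mean, standard deviation) of hash-chain lengths.

    `chains` is a hypothetical list of buckets, each a list of entries.
    """
    total = 0     # plays the role of `sum`
    total_sq = 0  # plays the role of `sum2`
    for chain in chains:
        length = 0  # reset per chain: the step the commit message says was missing
        for _entry in chain:
            length += 1
        total += length
        total_sq += length * length
    n = len(chains)
    if n == 0:
        return 0.0, 0.0
    mean = total / n
    variance = max(total_sq / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)


def sums_without_reset(chains):
    """Same accumulation with `length` carried across chains, for comparison."""
    total = 0
    total_sq = 0
    length = 0  # never reset, so it keeps growing from chain to chain
    for chain in chains:
        for _entry in chain:
            length += 1
        total += length
        total_sq += length * length
    return total, total_sq
```

With three chains of two entries each, the correct sums are 6 and 12, while the variant without the reset accumulates 12 and 56. On a large table the carried-over length grows with every chain, which is how fixed-width sums could overflow.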
     
  • It's possible for cfg80211 to have scheduled the work and for
    the global workqueue to not have kicked in prior to a cfg80211
    driver's regulatory hint or wiphy_apply_custom_regulatory().

    Although this is very unlikely, it's possible, and we should fix
    this race. When this race happens, you can expect to
    hit a NULL pointer dereference panic.

    Cc: stable@kernel.org
    Signed-off-by: Luis R. Rodriguez
    Tested-by: Alan Jenkins
    Signed-off-by: John W. Linville

    Luis R. Rodriguez
     
  • Another design flaw in wireless extensions (is anybody
    surprised?) in the way it handles the iw_encode_ext
    structure: The structure is part of the 'extra' memory
    but contains the key length explicitly, instead of it
    just being the length of the extra buffer - size of
    the struct and using the explicit key length only for
    the get operation (which only writes it).

    Therefore, we have this layout:

    extra: +-------------------------+
           | struct iw_encode_ext {  |
           |     ...                 |
           |     u16 key_len;        |
           |     u8 key[0];          |
           | };                      |
           +-------------------------+
           | key material            |
           +-------------------------+

    Now, all drivers I checked use ext->key_len without
    checking that both key_len and the struct fit into the
    extra buffer that has been copied from userspace. This
    leads to a buffer overrun while reading that buffer.
    Depending on the driver, it may be possible to specify
    an arbitrary key_len, or it may need to be a proper length
    for the key algorithm specified.

    Thankfully, this is only exploitable by root, but root
    can actually cause a segfault or use kernel memory as
    a key (which you can even get back with siocgiwencode
    or siocgiwencodeext from the key buffer).

    Fix this by verifying that key_len fits into the buffer
    along with struct iw_encode_ext.

    Signed-off-by: Johannes Berg
    Signed-off-by: John W. Linville

    Johannes Berg
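
The check described above can be sketched as follows. This is a Python model under stated assumptions, not the kernel code: the `HEADER` layout is a hypothetical, shortened stand-in for struct iw_encode_ext (the real structure has more fields), and `extract_key` is an illustrative name.

```python
import struct

# Hypothetical, simplified header: u32 ext_flags, u16 alg, u16 key_len.
# Only the presence of an explicit key_len field matters for this sketch.
HEADER = struct.Struct("<IHH")


def extract_key(extra: bytes) -> bytes:
    """Return the key material, refusing a key_len that overruns `extra`."""
    if len(extra) < HEADER.size:
        raise ValueError("buffer is smaller than the header")
    _ext_flags, _alg, key_len = HEADER.unpack_from(extra, 0)
    # The header and key_len bytes of key material must both fit in the buffer.
    if HEADER.size + key_len > len(extra):
        raise ValueError("key_len does not fit into the buffer")
    return extra[HEADER.size:HEADER.size + key_len]
```

A buffer built as `HEADER.pack(0, 1, 4) + b"abcd"` yields `b"abcd"`, while the same four key bytes with a claimed key_len of 64 are rejected instead of being read past the end.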
     

19 May, 2009

5 commits

  • Commit e81963b1 ("ipv4: Make INET_LRO a bool instead of tristate.")
    changed this config from tristate to bool. Add default so that it is
    consistent with the help text.

    Signed-off-by: Frans Pop
    Signed-off-by: David S. Miller

    Frans Pop
     
  • When called with a consumed value that is less than skb_headlen(skb)
    bytes into a page frag, skb_seq_read() incorrectly returns an
    offset/length relative to skb->data. Ensure that data which should come
    from a page frag does.

    Signed-off-by: Thomas Chenault
    Tested-by: Shyam Iyer
    Signed-off-by: David S. Miller

    Thomas Chenault
     
  • gen_estimator can overflow bps (bytes per second) with Gb links, while
    it was designed with a u32 API, with a theoretical limit of 34360Mbit
    (2^32 bytes)

    Using 64 bit intermediate avbps/brate counters allows us to reach
    this theoretical limit.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Eric Dumazet
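
The arithmetic behind the limit, and the reason a wider intermediate helps, can be worked through in a few lines. This is a sketch only: `SCALE_SHIFT` is an assumption (that the running average is kept in fixed point with 5 fractional bits), and the function names are hypothetical.

```python
U32_MASK = 2**32 - 1
U64_MASK = 2**64 - 1
SCALE_SHIFT = 5  # assumption: intermediate average carries 5 fractional bits


def theoretical_limit_mbit():
    """2^32 bytes per second, expressed in Mbit/s."""
    return 2**32 * 8 / 1e6


def scaled_average(bytes_per_sec, mask):
    """Store the scaled rate in a counter of the width given by `mask`."""
    return (bytes_per_sec << SCALE_SHIFT) & mask


def reported_bps(scaled):
    """Drop the fractional bits and fit the result into the u32 API."""
    return (scaled >> SCALE_SHIFT) & U32_MASK


TEN_GBIT = 1_250_000_000  # bytes per second on a hypothetical 10 Gbit/s link
```

Here `theoretical_limit_mbit()` is about 34359.7, matching the 34360Mbit figure. For `TEN_GBIT`, a 32-bit intermediate wraps and reports 42040448, while a 64-bit intermediate reports 1250000000, which still fits in the u32 result.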
     
  • It is illegal to dereference a skb after a successful ndo_start_xmit()
    call. We must store skb length in a local variable instead.

    Bug was introduced in 2.6.27 by commit 0abf77e55a2459aa9905be4b226e4729d5b4f0cb
    (net_sched: Add accessor function for packet length for qdiscs)

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Commit 518a09ef11 (tcp: Fix recvmsg MSG_PEEK influence of
    blocking behavior) lets the loop run longer than the race check
    previously expected, so we need to be more careful with this
    check and consider the work we have been doing.

    I tried my best to deal with the urg hole madness too, which happens
    here:
        if (!sock_flag(sk, SOCK_URGINLINE)) {
                ++*seq;
                ...
    by using an additional offset of one, but I certainly have very
    little interest in testing that part.

    Signed-off-by: Ilpo Järvinen
    Tested-by: Frans Pop
    Tested-by: Ian Zimmermann
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     

18 May, 2009

4 commits

  • If a bridge is configured with no STP and a forwarding delay of 0 (which
    is typical for virtualization), then when the link starts it will flood all
    packets for the first 20 seconds.

    This bug was introduced by a combination of earlier changes:
    * forwarding database uses hold time of zero to indicate
      user wants to always flood packets
    * optimization of the case of forwarding delay of 0 avoids the initial
      timer tick

    The fix is to just skip all the topology change detection code if
    kernel STP is not being used.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Currently the bridge catches all STP packets, even if STP is turned
    off. This prevents other systems (which do have STP turned on)
    from being able to detect loops in the network.

    With this patch, if STP is off, then any packet sent to the STP
    multicast group address is forwarded to all ports.

    Based on earlier patch by Joakim Tjernlund with changes
    to go through forwarding (not local chain), and optimization
    that only last octet needs to be checked.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • If a DHCP server is delayed, it's possible for the client to receive the
    DHCPOFFER after it has already sent out a new DHCPDISCOVER message from
    a second interface. The client then sends out a DHCPREQUEST from the
    second interface, but the server doesn't recognize the device and
    rejects the request.

    This patch simply tracks the current device being configured and throws
    away the OFFER if it is not intended for the current device. A more
    sophisticated approach would be to put the OFFER information into the
    struct ic_device rather than storing it globally.

    Signed-off-by: Chris Friesen
    Signed-off-by: David S. Miller

    Chris Friesen
     
  • It looks like the dev in netpoll_poll can be NULL - at least it's
    checked at the function beginning. Thus the dev->netdev_ops dereference
    looks dangerous.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

16 May, 2009

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6:
    Bluetooth: Don't trigger disconnect timeout for security mode 3 pairing
    Bluetooth: Don't use hci_acl_connect_cancel() for incoming connections
    Bluetooth: Fix wrong module refcount when connection setup fails

    Another case of me handling the fallout from Davem's unfortunate
    addiction to shuffleboard.

    Won't anybody think of the children? Join the anti-shuffleboard league
    today!

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6:
    iwlwifi: fix device id registration for 6000 series 2x2 devices
    ath5k: update channel in sw state after stopping RX and TX
    rtl8187: use DMA-aware buffers with usb_control_msg
    mac80211: avoid NULL ptr deref when finding max_rates in PID and minstrel
    airo: airo_get_encode{,ext} potential buffer overflow

    Pulled directly by Linus because Davem is off playing shuffle-board at
    some Alaskan cruise, and the NULL ptr deref issue hits people and should
    get merged sooner rather than later.

    David - make us proud on the shuffle-board tournament!

    Linus Torvalds
     

13 May, 2009

1 commit


12 May, 2009

1 commit

  • "There is another problem with this piece of code. The sband will be NULL
    after second iteration on single band device and cause null pointer
    dereference. Everything is working with dual band card. Sorry, but i
    don't know how to explain this clearly in English. I have looked on the
    second patch for pid algorithm and found similar bug."

    Reported-by: Karol Szuster
    Signed-off-by: John W. Linville

    John W. Linville
     

11 May, 2009

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits)
    bonding: fix panic if initialization fails
    IXP4xx: complete Ethernet netdev setup before calling register_netdev().
    IXP4xx: use "ENODEV" instead of "ENOSYS" in module initialization.
    ipvs: Fix IPv4 FWMARK virtual services
    ipv4: Make INET_LRO a bool instead of tristate.
    net: remove stale reference to fastroute from Kconfig help text
    net: update skb_recycle_check() for hardware timestamping changes
    bnx2: Fix panic in bnx2_poll_work().
    net-sched: fix bfifo default limit
    igb: resolve panic on shutdown when SR-IOV is enabled
    wimax: oops: wimax_dev_add() is the only one that can initialize the state
    wimax: fix oops if netlink fails to add attribute
    Bluetooth: Move dev_set_name() to a context that can sleep
    netfilter: ctnetlink: fix wrong message type in user updates
    netfilter: xt_cluster: fix use of cluster match with 32 nodes
    netfilter: ip6t_ipv6header: fix match on packets ending with NEXTHDR_NONE
    netfilter: add missing linux/types.h include to xt_LED.h
    mac80211: pid, fix memory corruption
    mac80211: minstrel, fix memory corruption
    cfg80211: fix comment on regulatory hint processing
    ...

    Linus Torvalds
     

10 May, 2009

3 commits

  • A remote device in security mode 3 that tries to connect will require
    pairing during the connection setup phase. The disconnect timeout
    is now triggered within 10 milliseconds and causes the pairing to fail.

    If a connection is not fully established and a PIN code request is
    received, don't trigger the disconnect timeout. The connection complete
    event, whether successful or failing, will make sure that the timeout
    is triggered at the right time.

    The biggest problem with security mode 3 is that many Bluetooth 2.0
    and earlier devices use a temporary security mode 3 for dedicated
    bonding.

    Based on a report by Johan Hedberg

    Signed-off-by: Marcel Holtmann
    Tested-by: Johan Hedberg

    Marcel Holtmann
     
  • The connection setup phase takes around 2 seconds or longer and in
    that time it is possible that the need for an ACL connection is no
    longer present. If that happens, the connection attempt will
    be canceled.

    This only applies to outgoing connections, but currently it can also
    be triggered by incoming connections. Don't call hci_acl_connect_cancel()
    on incoming connections since these have to be either accepted or rejected
    in this state. Once they are successfully connected they need to be
    fully disconnected anyway.

    Also remove the wrong hci_acl_disconn() call for SCO and eSCO links
    since at this stage they can't be disconnected either, because the
    connection handle is still unknown.

    Based on a report by Johan Hedberg

    Signed-off-by: Marcel Holtmann
    Tested-by: Johan Hedberg

    Marcel Holtmann
     
  • The module refcount is increased by hci_dev_hold() call in hci_conn_add()
    and decreased by hci_dev_put() call in del_conn(). In case the connection
    setup fails, hci_dev_put() is never called.

    Procedure to reproduce the issue:

    # hciconfig hci0 up
    # lsmod | grep btusb -> "used by" refcount = 1

    # hcitool cc -> will get timeout

    # lsmod | grep btusb -> "used by" refcount = 2
    # hciconfig hci0 down
    # lsmod | grep btusb -> "used by" refcount = 1
    # rmmod btusb -> ERROR: Module btusb is in use

    The hci_dev_put() call got moved into del_conn() with the 2.6.25 kernel
    to fix an issue with hci_dev going away before hci_conn. However that
    change was wrong and introduced this problem.

    When calling hci_conn_del() it has to call hci_dev_put() after freeing
    the connection details. This handling should be fully symmetric. The
    execution of del_conn() is done in a work queue and needs its own calls
    to hci_dev_hold() and hci_dev_put() to ensure that the hci_dev stays
    until the connection cleanup has been finished.

    Based on a report by Bing Zhao

    Signed-off-by: Marcel Holtmann
    Tested-by: Bing Zhao

    Marcel Holtmann
     

09 May, 2009

2 commits

  • This fixes the use of fwmarks to denote IPv4 virtual services
    which was unfortunately broken as a result of the integration
    of IPv6 support into IPVS, which was included in 2.6.28.

    The problem arises because fwmarks are stored in the 4th 32-bit word
    of a union nf_inet_addr .all, however in the case of IPv4 only
    the first word, corresponding to .ip, is assigned and compared.

    In other words, using .all = { 0, 0, 0, htonl(svc->fwmark) } always
    results in a value of 0 (32bits) being stored for IPv4. This means
    that one fwmark can be used, as it ends up being mapped to 0, but things
    break down when multiple fwmarks are used, as they all end up being mapped
    to 0.

    As fwmarks are 32bits a reasonable fix seems to be to just store the fwmark
    in .ip, and to compare and store .ip when fwmarks are used.

    This patch makes the assumption that in calls to ip_vs_ct_in_get()
    and ip_vs_sched_persist() if the proto parameter is IPPROTO_IP then
    we are dealing with an fwmark. I believe this is valid as ip_vs_in()
    does fairly strict filtering on the protocol and IPPROTO_IP should
    not be used in these calls unless explicitly passed when making
    these calls for fwmarks in ip_vs_sched_persist().

    Tested-by: Fabien Duchêne
    Cc: Joseph Mack NA3T
    Cc: Julius Volz
    Signed-off-by: Simon Horman
    Signed-off-by: David S. Miller

    Simon Horman
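
The collision described above can be modelled in a few lines. This is a Python sketch, not the IPVS code: the address union is represented as a list of four 32-bit words, and the function names are hypothetical.

```python
import socket


def addr_all_layout(fwmark):
    """Old layout: .all = { 0, 0, 0, htonl(fwmark) } as four 32-bit words."""
    return [0, 0, 0, socket.htonl(fwmark)]


def addr_ip_layout(fwmark):
    """Layout after the fix: the fwmark is stored where .ip lives."""
    return [socket.htonl(fwmark), 0, 0, 0]


def ipv4_key(words):
    """IPv4 paths only assign and compare .ip, which aliases the first word."""
    return words[0]
```

With the old layout, `ipv4_key(addr_all_layout(1))` and `ipv4_key(addr_all_layout(2))` are both 0, so two services with different fwmarks become indistinguishable. With the fwmark stored in the first word the two keys differ.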
     
  • This code is used as a library by several device drivers,
    which select INET_LRO.

    If some are modules and some are statically built into the
    kernel, we get build failures if INET_LRO is modular.

    Signed-off-by: David S. Miller

    David S. Miller
     

08 May, 2009

1 commit


07 May, 2009

5 commits


06 May, 2009

3 commits

  • Setting the name of a sysfs device has to be done in a context that can
    actually sleep. It allocates its memory with GFP_KERNEL. Previously it
    was a static (size limited) string and that got changed to accommodate
    longer device names. So move the dev_set_name() just before calling
    device_add() which is executed in a work queue.

    This fixes the following error:

    [ 110.012125] BUG: sleeping function called from invalid context at mm/slub.c:1595
    [ 110.012135] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper
    [ 110.012141] 2 locks held by swapper/0:
    [ 110.012145] #0: (hci_task_lock){++.-.+}, at: [] hci_rx_task+0x2f/0x2d0 [bluetooth]
    [ 110.012173] #1: (&hdev->lock){+.-.+.}, at: [] hci_event_packet+0x72/0x25c0 [bluetooth]
    [ 110.012198] Pid: 0, comm: swapper Tainted: G W 2.6.30-rc4-g953cdaa #1
    [ 110.012203] Call Trace:
    [ 110.012207] [] __might_sleep+0x14d/0x170
    [ 110.012228] [] __kmalloc+0x111/0x170
    [ 110.012239] [] kvasprintf+0x64/0xb0
    [ 110.012248] [] kobject_set_name_vargs+0x3b/0xa0
    [ 110.012257] [] dev_set_name+0x76/0xa0
    [ 110.012273] [] ? hci_event_packet+0x72/0x25c0 [bluetooth]
    [ 110.012289] [] hci_conn_add_sysfs+0x3d/0x70 [bluetooth]
    [ 110.012303] [] hci_event_packet+0xbc/0x25c0 [bluetooth]
    [ 110.012312] [] ? sock_def_readable+0x80/0xa0
    [ 110.012328] [] ? hci_send_to_sock+0xfc/0x1c0 [bluetooth]
    [ 110.012343] [] ? sock_def_readable+0x80/0xa0
    [ 110.012347] [] ? _read_unlock+0x75/0x80
    [ 110.012354] [] ? hci_send_to_sock+0xfc/0x1c0 [bluetooth]
    [ 110.012360] [] hci_rx_task+0x203/0x2d0 [bluetooth]
    [ 110.012365] [] tasklet_action+0xb5/0x160
    [ 110.012369] [] __do_softirq+0x9c/0x150
    [ 110.012372] [] ? _spin_unlock+0x3f/0x80
    [ 110.012376] [] call_softirq+0x1c/0x30
    [ 110.012380] [] do_softirq+0x8d/0xe0
    [ 110.012383] [] irq_exit+0xc5/0xe0
    [ 110.012386] [] do_IRQ+0x9d/0x120
    [ 110.012389] [] ret_from_intr+0x0/0xf
    [ 110.012391] [] ? acpi_idle_enter_bm+0x264/0x2a6
    [ 110.012399] [] ? acpi_idle_enter_bm+0x25a/0x2a6
    [ 110.012403] [] ? cpuidle_idle_call+0xc5/0x130
    [ 110.012407] [] ? cpu_idle+0xc4/0x130
    [ 110.012411] [] ? rest_init+0x88/0xb0
    [ 110.012416] [] ? start_kernel+0x3b5/0x412
    [ 110.012420] [] ? x86_64_start_reservations+0x91/0xb5
    [ 110.012424] [] ? x86_64_start_kernel+0xef/0x11b

    Based on a report by Davide Pesavento

    Signed-off-by: Marcel Holtmann
    Tested-by: Hugo Mildenberger
    Tested-by: Bing Zhao

    Marcel Holtmann
     
  • David S. Miller
     
  • David S. Miller