08 Dec, 2010

4 commits

  • In kernel ABI version 7.16 and later, a FUSE_IOCTL_RETRY reply to an
    unrestricted IOCTL request shall return an array of 'struct
    fuse_ioctl_iovec' instead of 'struct iovec'. This fixes the ABI
    ambiguity of 32bit vs. 64bit.

    Reported-by: "ccmail111"
    Signed-off-by: Miklos Szeredi
    CC: Tejun Heo

    Miklos Szeredi
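
The idea of a fixed-width reply entry can be sketched as follows. This is an illustration rather than the kernel header: the struct name comes from the commit message, while the two 64-bit fields and the helper `fill_ioctl_iovec` are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-width entry: same size and layout for 32bit and 64bit userspace. */
struct fuse_ioctl_iovec {
	uint64_t base;	/* user address, widened to 64 bits (assumed field) */
	uint64_t len;	/* length in bytes, widened to 64 bits (assumed field) */
};

/* Hypothetical helper: describe one user buffer in the fixed-width form. */
static struct fuse_ioctl_iovec fill_ioctl_iovec(const void *buf, size_t len)
{
	struct fuse_ioctl_iovec v;

	v.base = (uint64_t)(uintptr_t)buf;
	v.len = (uint64_t)len;
	return v;
}
```

Because both fields have an explicit width, the entry is 16 bytes regardless of the pointer size of the process that produced it.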
     
  • Terje Malmedal reports that a fuse filesystem with 32 million inodes
    on a machine with lots of memory can take up to 30 minutes to process
    FORGET requests when all those inodes are evicted from the icache.

    To solve this, create a BATCH_FORGET request that allows up to about
    8000 FORGET requests to be sent in a single message.

    This request is only sent if userspace supports interface version 7.16
    or later, otherwise fall back to sending individual FORGET messages.

    Reported-by: Terje Malmedal
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Terje Malmedal reports that a fuse filesystem with 32 million inodes
    on a machine with lots of memory can go unresponsive for up to 30
    minutes when all those inodes are evicted from the icache.

    The reason is that FORGET messages, sent when the inode is evicted,
    are queued up together with regular filesystem requests, and while the
    huge queue of FORGET messages is processed no other filesystem
    operation can proceed.

    Since a full fuse request structure is allocated for each inode, these
    take up quite a bit of memory as well.

    To solve these issues, create a slim 'fuse_forget_link' structure
    containing just the minimum of information required to send the FORGET
    request and chain these on a separate queue.

    When userspace is asking for a request, make sure that FORGET and
    non-FORGET requests are selected fairly: for every 8 non-FORGET requests
    allow 16 FORGET requests. This will make sure FORGETs do not pile up, yet
    other requests are also allowed to proceed while the queued FORGETs
    are processed.

    Reported-by: Terje Malmedal
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
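
The 16-to-8 interleaving described above could be sketched like this. It is a simplified model under stated assumptions, not the code from the patch: `struct sched`, `pick_forget` and the two constants are hypothetical names introduced for the example.

```c
#include <stdbool.h>

#define FORGET_BATCH 16	/* FORGETs allowed per round */
#define OTHER_BATCH   8	/* non-FORGET requests allowed per round */

/* Hypothetical scheduler state; start with forget_batch = FORGET_BATCH. */
struct sched {
	int forget_batch;
};

/* Returns true if the next request handed to userspace should be a FORGET. */
static bool pick_forget(struct sched *s, bool forget_pending, bool other_pending)
{
	if (!forget_pending)
		return false;
	/* Send a FORGET if nothing else is waiting or budget remains. */
	if (!other_pending || s->forget_batch-- > 0)
		return true;
	/* Budget spent: after 8 non-FORGET picks, refill it to 16. */
	if (s->forget_batch <= -OTHER_BATCH)
		s->forget_batch = FORGET_BATCH;
	return false;
}
```

With both queues permanently non-empty, this model hands out 16 FORGETs, then 8 other requests, and repeats, so neither queue can starve the other.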
     
  • Get rid of unnecessary page_address()-es.

    Signed-off-by: Miklos Szeredi
    CC: Tejun Heo

    Miklos Szeredi
     

30 Nov, 2010

16 commits

  • Verify that the total length of the iovec returned in FUSE_IOCTL_RETRY
    doesn't overflow iov_length().

    Signed-off-by: Miklos Szeredi
    CC: Tejun Heo
    CC: [2.6.31+]

    Miklos Szeredi
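
An overflow-safe length check of this kind might look like the sketch below. The entry type `struct iov_ent`, the function `iov_total_ok` and the `max_total` limit are assumptions for illustration; the commit message does not say which limit the kernel applies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one returned iovec entry. */
struct iov_ent {
	uint64_t base;
	uint64_t len;
};

/* Returns true if the summed lengths stay within max_total without wrapping. */
static bool iov_total_ok(const struct iov_ent *iov, size_t count,
			 uint64_t max_total)
{
	uint64_t total = 0;

	for (size_t i = 0; i < count; i++) {
		/* total <= max_total holds here, so the subtraction cannot wrap. */
		if (iov[i].len > max_total - total)
			return false;
		total += iov[i].len;
	}
	return true;
}
```

Comparing each length against the remaining budget, instead of adding first and checking afterwards, avoids relying on a sum that may already have wrapped.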
     
  • If a 32bit CUSE server is run on a 64bit kernel, EIO is returned to
    the caller.

    The reason is that FUSE_IOCTL_RETRY reply was defined to use 'struct
    iovec', which is different on 32bit and 64bit archs.

    Work around this by looking at the size of the reply to determine
    which struct was used. This is only needed if CONFIG_COMPAT is
    defined.

    A more permanent fix for the interface will be to use the same struct
    on both 32bit and 64bit.

    Reported-by: "ccmail111"
    Signed-off-by: Miklos Szeredi
    CC: Tejun Heo
    CC: [2.6.31+]

    Miklos Szeredi
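
The size-based workaround can be sketched as follows. The two struct layouts and the names `guess_iov_layout` and `enum iov_layout` are hypothetical stand-ins chosen for the example, not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

struct iovec32 { uint32_t base; uint32_t len; };	/* 32bit layout stand-in */
struct iovec64 { uint64_t base; uint64_t len; };	/* 64bit layout stand-in */

enum iov_layout { IOV_LAYOUT_64, IOV_LAYOUT_32, IOV_LAYOUT_BAD };

/* Guess which struct the server used from the byte size of its reply. */
static enum iov_layout guess_iov_layout(size_t reply_bytes, size_t nr_iovs)
{
	if (reply_bytes == nr_iovs * sizeof(struct iovec64))
		return IOV_LAYOUT_64;
	if (reply_bytes == nr_iovs * sizeof(struct iovec32))
		return IOV_LAYOUT_32;
	return IOV_LAYOUT_BAD;	/* neither layout matches */
}
```

This works only because the expected number of entries is known independently of the reply size, which is also why the commit calls it a workaround and not a permanent fix.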
     
  • Linus Torvalds
     
  • * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
    powerpc: Use call_rcu_sched() for pagetables

    Linus Torvalds
     
  • PowerPC relies on IRQ-disable to guard against RCU quiescent states;
    use the appropriate RCU call version.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Benjamin Herrenschmidt

    Peter Zijlstra
     
  • This reverts commit e0fdace10e75dac67d906213b780ff1b1a4cc360.

    On-list discussion seems to suggest that the robustness fixes for printk
    make this unnecessary and DaveM has also agreed in person at Kernel Summit
    and on list.

    The main problem with this code is that once we hit a lockdep splat we
    keep oops_in_progress set forever. The console layer uses oops_in_progress
    with KMS to decide when it should show the oops instead of X, so this causes
    problems around suspend/resume time, when a userspace resume can cause a
    console switch away from X only if oops_in_progress is set (which is what we
    want if an oops actually is in progress, but not because we had a lockdep
    splat 2 days prior).

    Cc: David S Miller
    Cc: Ingo Molnar
    Signed-off-by: Dave Airlie
    Signed-off-by: Linus Torvalds

    Dave Airlie
     
  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
    tpm: Autodetect itpm devices

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
    af_unix: limit recursion level
    pch_gbe driver: The wrong of initializer entry
    pch_gbe dreiver: chang author
    ucc_geth: fix ucc halt problem in half duplex mode
    inet: Fix __inet_inherit_port() to correctly increment bsockets and num_owners
    ehea: Add some info messages and fix an issue
    hso: fix disable_net
    NET: wan/x25_asy, move lapb_unregister to x25_asy_close_tty
    cxgb4vf: fix setting unicast/multicast addresses ...
    net, ppp: Report correct error code if unit allocation failed
    DECnet: don't leak uninitialized stack byte
    au1000_eth: fix invalid address accessing the MAC enable register
    dccp: fix error in updating the GAR
    tcp: restrict net.ipv4.tcp_adv_win_scale (#20312)
    netns: Don't leak others' openreq-s in proc
    Net: ceph: Makefile: Remove unnessary code
    vhost/net: fix rcu check usage
    econet: fix CVE-2010-3848
    econet: fix CVE-2010-3850
    econet: disallow NULL remote addr for sendmsg(), fixes CVE-2010-3849
    ...

    Linus Torvalds
     
  • …/git/tmlind/linux-omap-2.6

    * 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
    OMAP2+: PM/serial: hold console semaphore while OMAP UARTs are disabled
    OMAP: UART: don't resume UARTs that are not enabled.

    Linus Torvalds
     
  • Some Lenovos have TPMs that require a quirk to function correctly. This can
    be autodetected by checking whether the device has a _HID of INTC0102. This
    is an invalid PNPid, and as such is discarded by the pnp layer - however
    it's still present in the ACPI code, so we can pull it out that way. This
    means that the quirk won't be automatically applied on non-ACPI systems,
    but without ACPI we don't have any way to identify the chip anyway so I
    don't think that's a great concern.

    Signed-off-by: Matthew Garrett
    Acked-by: Rajiv Andrade
    Tested-by: Jiri Kosina
    Tested-by: Andy Isaacson
    Signed-off-by: James Morris

    Matthew Garrett
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (24 commits)
    Btrfs: don't use migrate page without CONFIG_MIGRATION
    Btrfs: deal with DIO bios that span more than one ordered extent
    Btrfs: setup blank root and fs_info for mount time
    Btrfs: fix fiemap
    Btrfs - fix race between btrfs_get_sb() and umount
    Btrfs: update inode ctime when using links
    Btrfs: make sure new inode size is ok in fallocate
    Btrfs: fix typo in fallocate to make it honor actual size
    Btrfs: avoid NULL pointer deref in try_release_extent_buffer
    Btrfs: make btrfs_add_nondir take parent inode as an argument
    Btrfs: hold i_mutex when calling btrfs_log_dentry_safe
    Btrfs: use dget_parent where we can UPDATED
    Btrfs: fix more ESTALE problems with NFS
    Btrfs: handle NFS lookups properly
    btrfs: make 1-bit signed fileds unsigned
    btrfs: Show device attr correctly for symlinks
    btrfs: Set file size correctly in file clone
    btrfs: Check if dest_offset is block-size aligned before cloning file
    Btrfs: handle the space_cache option properly
    btrfs: Fix early enospc because 'unused' calculated with wrong sign.
    ...

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
    EDAC: Fix typos in Documentation/edac.txt
    EDAC, MCE: Fix edac_init_mce_inject error handling
    EDAC: Remove deprecated kbuild goal definitions

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
    GFS2: Userland expects quota limit/warn/usage in 512b blocks

    Linus Torvalds
     
  • It's easy to eat all kernel memory and trigger the NMI watchdog, using an
    exploit program that queues unix sockets on top of others.

    lkml ref : http://lkml.org/lkml/2010/11/25/8

    This mechanism is used in applications, so one option is to impose a
    recursion limit.

    Other limits might be needed as well (if we queue other types of files),
    since the passfd mechanism is currently limited by socket receive queue
    sizes only.

    Add a recursion_level to unix socket, allowing up to 4 levels.

    Each time we send a unix socket through the sendfd mechanism, we copy its
    recursion level (plus one) to the receiver. This recursion level is cleared
    when the socket receive queue is emptied.

    Reported-by: Марк Коренберг
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
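
The recursion-level bookkeeping described above could be modelled like this. It is a minimal sketch: `struct usock`, `pass_socket` and `queue_drained` are hypothetical names, and the exact boundary handling in the kernel may differ from this model.

```c
#include <stdbool.h>

#define MAX_RECURSION_LEVEL 4

/* Hypothetical stand-in for the per-socket state. */
struct usock {
	unsigned char recursion_level;
};

/* Queue `sent` on `receiver`; returns false if the nesting is too deep. */
static bool pass_socket(struct usock *receiver, const struct usock *sent)
{
	unsigned int level = sent->recursion_level + 1u;

	if (level > MAX_RECURSION_LEVEL)
		return false;
	/* The receiver remembers the deepest nesting it currently holds. */
	if (level > receiver->recursion_level)
		receiver->recursion_level = (unsigned char)level;
	return true;
}

/* Called when the receive queue becomes empty. */
static void queue_drained(struct usock *s)
{
	s->recursion_level = 0;
}
```

In this model a chain of sockets queued inside one another is accepted up to four levels deep and refused at the fifth, which bounds the memory an unprivileged program can pin this way.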
     
  • The wrong initializer entry was corrected.

    Signed-off-by: Toshiharu Okada
    Reported-by: Dr. David Alan Gilbert
    Signed-off-by: David S. Miller

    Toshiharu Okada
     
  • This driver's AUTHOR was changed to "Toshiharu Okada" from "Masayuki Ohtake".
    The Kconfig was also updated, renaming "Topcliff" to "EG20T".

    Signed-off-by: Toshiharu Okada
    Signed-off-by: David S. Miller

    Toshiharu Okada
     

29 Nov, 2010

19 commits

  • Fixes compile error

    Signed-off-by: Chris Mason

    Chris Mason
     
  • In commit 58933c64(ucc_geth: Fix the wrong the Rx/Tx FIFO size),
    the UCC_GETH_UTFTT_INIT is set to 512 based on the recommendation
    of the QE Reference Manual. But that will sometimes cause tx halt
    while working in half duplex mode.

    According to errata draft QE_GENERAL-A003(High Tx Virtual FIFO
    threshold size can cause UCC to halt), setting UTFTT less than
    [(UTFS x (M - 8)/M) - 128] will prevent this from happening
    (M is the minimum buffer size).

    The patch changes UTFTT back to 256.

    Signed-off-by: Li Yang
    Cc: Jean-Denis Boyer
    Cc: Andreas Schmitz
    Cc: Anton Vorontsov
    Signed-off-by: David S. Miller

    Yang Li
     
  • inet sockets corresponding to passive connections are added to the bind hash
    using __inet_inherit_port(). These sockets are later removed from the bind
    hash using __inet_put_port(). These two functions are not exactly symmetrical.
    __inet_put_port() decrements hashinfo->bsockets and tb->num_owners, whereas
    __inet_inherit_port() does not increment them. This results in both of these
    going negative.

    This patch fixes this by calling inet_bind_hash() from __inet_inherit_port(),
    which does the right thing.

    'bsockets' and 'num_owners' were introduced by commit a9d8f9110d7e953c
    (inet: Allowing more than 64k connections and heavily optimize bind(0))

    Signed-off-by: Nagendra Singh Tomar
    Acked-by: Eric Dumazet
    Acked-by: Evgeniy Polyakov
    Signed-off-by: David S. Miller

    Nagendra Tomar
     
  • This patch adds some debug information about ehea not being able to
    allocate enough space. It also correctly updates the number of available
    skbs.

    Signed-off-by: Breno Leitao
    Signed-off-by: David S. Miller

    Breno Leitao
     
  • The new DIO bio splitting code has problems when the bio
    spans more than one ordered extent. This will happen as the
    generic DIO code merges our get_blocks calls together into
    a bigger single bio.

    This fixes things by walking forward in the ordered extent
    code finding all the overlapping ordered extents and completing them
    all at once.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • This avoids some include-file hell, and the function isn't really
    important enough to be inlined anyway.

    Reported-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • And in particular, use it in 'pipe_fcntl()'.

    The other pipe functions do not need to use the 'careful' version, since
    they are only ever called for things that are already known to be pipes.

    The normal read/write/ioctl functions are called through the file
    operations structures, so if a file isn't a pipe, they'd never get
    called. But pipe_fcntl() is special, and called directly from the
    generic fcntl code, and needs to use the same careful function that the
    splice code is using.

    Cc: Jens Axboe
    Cc: Andrew Morton
    Cc: Al Viro
    Cc: Dave Jones
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • .. and change it to take the 'file' pointer instead of an inode, since
    that's what all users want anyway.

    The renaming is preparatory to exporting it to other users. The old
    'pipe_info()' name was too generic and is already used elsewhere, so
    before making the function public we need to use a more specific name.

    Cc: Jens Axboe
    Cc: Andrew Morton
    Cc: Al Viro
    Cc: Dave Jones
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf: Fix the software context switch counter
    perf, x86: Fixup Kconfig deps
    x86, perf, nmi: Disable perf if counters are not accessible
    perf: Fix inherit vs. context rotation bug

    Linus Torvalds
     
  • * 'fwnet' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
    firewire: net: throttle TX queue before running out of tlabels
    firewire: net: replace lists by counters
    firewire: net: fix memory leaks
    firewire: net: count stats.tx_packets and stats.tx_bytes

    Linus Torvalds
     
  • The HSO driver incorrectly creates a serial device instead of a net
    device when disable_net is set. It shouldn't create anything for the
    network interface.

    Signed-off-by: Filip Aben
    Reported-by: Piotr Isajew
    Reported-by: Johan Hovold
    Signed-off-by: David S. Miller

    Filip Aben
     
  • We register lapb when tty is created, but unregister it only when the
    device is UP. So move the lapb_unregister to x25_asy_close_tty after
    the device is down.

    The old behaviour causes ldisc switching to fail on every second attempt:
    we record that the device is unused and reuse it the second time, but
    the lapb layer still has it registered, so the registration fails.

    Signed-off-by: Jiri Slaby
    Reported-by: Sergey Lapin
    Cc: Andrew Hendry
    Tested-by: Sergey Lapin
    Tested-by: Mikhail Ulyanov
    Signed-off-by: David S. Miller

    Jiri Slaby
     
  • We were truncating the number of unicast and multicast MAC addresses
    supported. Additionally, we were incorrectly computing the MAC Address
    hash (a "1 << N" where we needed a "1ULL << N").

    Signed-off-by: Casey Leedom
    Signed-off-by: David S. Miller

    Casey Leedom
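
The difference between "1 << N" and "1ULL << N" matters as soon as N can reach the width of int. A small sketch, assuming a 64-bit hash mask (the mask width and the helper name `hash_mask` are assumptions for the example, not taken from the driver):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Build a 64-bit hash mask from bucket indices. The constant must be
 * 1ULL: with a plain int "1", shifting by 31 or more is undefined and
 * the upper half of the mask can never be set correctly.
 */
static uint64_t hash_mask(const unsigned int *buckets, size_t count)
{
	uint64_t mask = 0;

	for (size_t i = 0; i < count; i++)
		mask |= 1ULL << (buckets[i] & 63);
	return mask;
}
```

For the bucket indices 0, 40 and 63 this yields 0x8000010000000001, a value that cannot be produced with 32-bit shifts.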
     
  • Allocating a unit from the idr might return several error codes, not
    only -EAGAIN, so the error code should not be changed but returned
    precisely. At the same time, the unit release procedure should be
    invoked only if the device is unregistering.

    Signed-off-by: Cyrill Gorcunov
    CC: Paul Mackerras
    Signed-off-by: David S. Miller

    Cyrill Gorcunov
     
  • A single uninitialized padding byte is leaked to userspace.

    Signed-off-by: Dan Rosenberg
    CC: stable
    Signed-off-by: David S. Miller

    Dan Rosenberg
     
  • "aup->enable" already holds the address pointing to the MAC enable
    register. The bug was introduced by commit d0e7cb:

    "au1000-eth: remove volatiles, switch to I/O accessors".

    CC: Florian Fainelli
    Signed-off-by: Wolfgang Grandegger
    Acked-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Wolfgang Grandegger
     
  • This fixes a bug in updating the Greatest Acknowledgment number Received (GAR):
    the current implementation does not track the greatest received value -
    lower values in the range AWL..AWH (RFC 4340, 7.5.1) erase higher ones.

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
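
Tracking the greatest value in a circular sequence space needs a wrap-aware comparison. The sketch below assumes 48-bit sequence numbers as used by DCCP; the helper names `seq48_after` and `update_gar` are hypothetical and do not correspond to the kernel's own helpers.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ48_MASK ((1ULL << 48) - 1)

/* True if a comes after b in 48-bit circular sequence space. */
static bool seq48_after(uint64_t a, uint64_t b)
{
	uint64_t delta = (a - b) & SEQ48_MASK;

	return delta != 0 && delta < (1ULL << 47);
}

/* Keep the greatest acknowledgment number seen so far. */
static uint64_t update_gar(uint64_t gar, uint64_t ackno)
{
	return seq48_after(ackno, gar) ? ackno : gar;
}
```

With this rule a lower acknowledgment number arriving late no longer overwrites a higher one, while a value that has wrapped past the top of the 48-bit space is still treated as newer.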
     
  • David S. Miller
     
  • tcp_win_from_space() does the following:

    if (sysctl_tcp_adv_win_scale <= 0)
            return space >> (-sysctl_tcp_adv_win_scale);
    else
            return space - (space >> sysctl_tcp_adv_win_scale);

    "space" is int.

    As per C99 6.5.7 (3), shifting an int by 32 or more bits is
    undefined behaviour.

    Indeed, if sysctl_tcp_adv_win_scale is exactly 32,
    space >> 32 equals space and the function returns 0.

    Which means we busyloop in tcp_fixup_rcvbuf().

    Restrict net.ipv4.tcp_adv_win_scale to [-31, 31].

    Fix https://bugzilla.kernel.org/show_bug.cgi?id=20312

    Steps to reproduce:

    echo 32 >/proc/sys/net/ipv4/tcp_adv_win_scale
    wget www.kernel.org
    [softlockup]

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
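
The effect of keeping the scale within [-31, 31] can be illustrated with a standalone version of the computation. This is a sketch: the commit restricts the sysctl itself, whereas the clamp inside `win_from_space` here is only an illustration device, and both function names are hypothetical.

```c
/* Keep the scale inside the range where the shifts below are defined. */
static int clamp_adv_win_scale(int scale)
{
	if (scale < -31)
		return -31;
	if (scale > 31)
		return 31;
	return scale;
}

/* Same arithmetic as described in the commit message, for space >= 0. */
static int win_from_space(int space, int scale)
{
	scale = clamp_adv_win_scale(scale);
	if (scale <= 0)
		return space >> (-scale);
	return space - (space >> scale);
}
```

With the shift count bounded by 31, a scale of 32 no longer turns the result into 0, so a caller that loops until the window is large enough can make progress.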
     

28 Nov, 2010

1 commit