07 Apr, 2012

2 commits

  • I have a new optimized x86 "strncpy_from_user()" that will use these
    same helper functions for all the same reasons the name lookup code uses
    them. This is preparation for that.

    This moves them into an architecture-specific header file. It's
    architecture-specific for two reasons:

    - some of the functions are likely to want architecture-specific
    implementations. Even if the current code happens to be "generic" in
    the sense that it should work on any little-endian machine, it's
    likely that the "multiply by a big constant and shift" implementation
    is less than optimal for an architecture that has a guaranteed fast
    bit count instruction, for example.

    - I expect that if architectures like sparc want to start playing
    around with this, we'll need to abstract out a few more details (in
    particular the actual unaligned accesses). So we're likely to have
    more architecture-specific stuff if non-x86 architectures start using
    this.

    (and if it turns out that non-x86 architectures don't start using
    this, then having it in an architecture-specific header is still the
    right thing to do, of course)
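
    As a rough illustration of the "multiply by a big constant and shift"
    approach mentioned above, here is a minimal user-space sketch for 64-bit
    words. The names load_le64() and first_zero_byte() are made up for this
    example, and the code is not the header's actual contents. It assumes
    byte 0 of the string sits in the lowest byte of the word, as on a
    little-endian machine.

```c
#include <stdint.h>

/* 0x01 and 0x80 repeated in every byte of a 64-bit word. */
#define ONE_BYTES  0x0101010101010101ULL
#define HIGH_BITS  0x8080808080808080ULL

/* Assemble a word with p[0] in the lowest byte, as a little-endian load would. */
static uint64_t load_le64(const unsigned char *p)
{
    uint64_t w = 0;
    for (int i = 7; i >= 0; i--)
        w = (w << 8) | p[i];
    return w;
}

/* Nonzero iff some byte of a is zero; the lowest set bit marks the first one. */
static uint64_t has_zero(uint64_t a)
{
    return (a - ONE_BYTES) & ~a & HIGH_BITS;
}

/* mask holds k low bytes of 0xff (0 <= k <= 7); multiply and shift yields k. */
static unsigned count_masked_bytes(uint64_t mask)
{
    return (unsigned)((mask * 0x0001020304050608ULL) >> 56);
}

/* Index of the first zero byte in word, or 8 if there is none. */
static unsigned first_zero_byte(uint64_t word)
{
    uint64_t bits = has_zero(word);
    if (!bits)
        return 8;
    bits = (bits - 1) & ~bits;  /* keep only the bits below the lowest marker */
    bits >>= 7;                 /* now 0xff for each byte before the zero byte */
    return count_masked_bytes(bits);
}
```

    On an architecture with a fast bit count or count-trailing-zeros
    instruction, count_masked_bytes() is the piece one would most likely
    want to replace.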

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Pull networking updates from David Miller:

    1) Fix inaccuracies in network driver interface documentation, from Ben
    Hutchings.

    2) Fix handling of negative offsets in BPF JITs, from Jan Seiffert.

    3) Compile warning, locking, and refcounting fixes in netfilter's
    xt_CT, from Pablo Neira Ayuso.

    4) phonet sendmsg needs to validate user length just like any other
    datagram protocol, fix from Sasha Levin.

    5) Ipv6 multicast code uses wrong loop index, from RongQing Li.

    6) Link handling and firmware fixes in bnx2x driver from Yaniv Rosner
    and Yuval Mintz.

    7) mlx4 erroneously allocates 4 pages at a time, regardless of page
    size, fix from Thadeu Lima de Souza Cascardo.

    8) SCTP socket option wasn't extended in a backwards compatible way,
    fix from Thomas Graf.

    9) Add missing address change event emissions to bonding, from Shlomo
    Pongratz.

    10) /proc/net/dev regressed because it uses a private offset to track
    where we are in the hash table, but this doesn't track the offset
    pullback that the seq_file code does, resulting in some entries being
    missed in large dumps.

    Fix from Eric Dumazet.

    11) do_tcp_sendpage() unloads the send queue way too fast, because it
    invokes tcp_push() when it shouldn't. Let the natural sequence
    generated by the splice paths, and the associated MSG_MORE
    settings, guide the tcp_push() calls.

    Otherwise what goes out of TCP is spaghetti and doesn't batch
    effectively into GSO/TSO clusters.

    From Eric Dumazet.

    12) Once we put a SKB into either the netlink receiver's queue or a
    socket error queue, it can be consumed and freed up, therefore we
    cannot touch it after queueing it like that.

    Fixes from Eric Dumazet.

    13) PPP has this annoying behavior in that for every transmit call it
    immediately stops the TX queue, then calls down into the next layer
    to transmit the PPP frame.

    But if that next layer can take it immediately, it just un-stops the
    TX queue right before returning from the transmit method.

    Besides being useless work, it makes several facilities unusable, in
    particular things like the equalizers. Well behaved devices should
    only stop the TX queue when they really are full, and in PPP's case
    when it gets backlogged to the downstream device.

    David Woodhouse therefore fixed PPP to not stop the TX queue until
    its downstream can't take data any more.

    14) IFF_UNICAST_FLT got accidentally lost in some recent stmmac driver
    changes, re-add. From Marc Kleine-Budde.

    15) Fix link flaps in ixgbe, from Eric W. Multanen.

    16) Descriptor writeback fixes in e1000e from Matthew Vick.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
    net: fix a race in sock_queue_err_skb()
    netlink: fix races after skb queueing
    doc, net: Update ndo_start_xmit return type and values
    doc, net: Remove instruction to set net_device::trans_start
    doc, net: Update netdev operation names
    doc, net: Update documentation of synchronisation for TX multiqueue
    doc, net: Remove obsolete reference to dev->poll
    ethtool: Remove exception to the requirement of holding RTNL lock
    MAINTAINERS: update for Marvell Ethernet drivers
    bonding: properly unset current_arp_slave on slave link up
    phonet: Check input from user before allocating
    tcp: tcp_sendpages() should call tcp_push() once
    ipv6: fix array index in ip6_mc_add_src()
    mlx4: allocate just enough pages instead of always 4 pages
    stmmac: re-add IFF_UNICAST_FLT for dwmac1000
    bnx2x: Clear MDC/MDIO warning message
    bnx2x: Fix BCM57711+BCM84823 link issue
    bnx2x: Clear BCM84833 LED after fan failure
    bnx2x: Fix BCM84833 PHY FW version presentation
    bnx2x: Fix link issue for BCM8727 boards.
    ...

    Linus Torvalds
     

06 Apr, 2012

9 commits

  • commit 2f533844242 (tcp: allow splice() to build full TSO packets) added
    a regression for splice() calls using SPLICE_F_MORE.

    We need to call tcp_push() at the end of the last page processed in
    tcp_sendpages(), or else transmits can be deferred and future sends
    stall.

    Add a new internal flag, MSG_SENDPAGE_NOTLAST, acting like MSG_MORE, but
    with different semantics.

    For all sendpage() providers, it's a transparent change. Only
    sock_sendpage() and tcp_sendpages() can differentiate the two different
    flags provided by pipe_to_sendpage().
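
    One way to picture the two flags is the simplified user-space model
    below. It is a sketch of how this message reads, not kernel code: the
    SK_ flag values are placeholders, and the function names are invented
    for the example.

```c
#include <stddef.h>

/* Placeholder bit values for this sketch only; not taken from kernel headers. */
#define SK_MSG_MORE             0x1
#define SK_MSG_SENDPAGE_NOTLAST 0x2

/* Flags a splice-style caller might pass for one page of a larger transfer. */
static int page_flags(int caller_has_more, size_t len, size_t total_len)
{
    int flags = caller_has_more ? SK_MSG_MORE : 0;
    if (len < total_len)            /* more pages of this same call follow */
        flags |= SK_MSG_SENDPAGE_NOTLAST;
    return flags;
}

/* Generic providers only see "more data expected", whichever flag said so. */
static int generic_expects_more(int flags)
{
    return (flags & (SK_MSG_MORE | SK_MSG_SENDPAGE_NOTLAST)) != 0;
}

/* A TCP-like sender pushes once data was copied and this was the last page. */
static int tcp_like_should_push(int flags, size_t copied)
{
    return copied > 0 && !(flags & SK_MSG_SENDPAGE_NOTLAST);
}
```

    In this model the push decision depends only on whether the page is
    the last one of the current call, so a caller that also signals "more"
    still gets one push per call instead of none.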

    Reported-by: Tom Herbert
    Cc: Nandita Dukkipati
    Cc: Neal Cardwell
    Cc: Tom Herbert
    Cc: Yuchung Cheng
    Cc: H.K. Jerry Chu
    Cc: Maciej Żenczykowski
    Cc: Mahesh Bandewar
    Cc: Ilpo Järvinen
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Merge batch of fixes from Andrew Morton:
    "The simple_open() cleanup was held back while I waited for laggards to
    merge things.

    I still need to send a few checkpoint/restore patches. I've been
    wobbly about merging them because I'm wobbly about the overall
    prospects for success of the project. But after speaking with Pavel
    at the LSF conference, it sounds like they're further toward
    completion than I feared - apparently davem is at the "has stopped
    complaining" stage regarding the net changes. So I need to go back
    and re-review those patches and their (lengthy) discussion."

    * emailed from Andrew Morton: (16 patches)
    memcg swap: use mem_cgroup_uncharge_swap fix
    backlight: add driver for DA9052/53 PMIC v1
    C6X: use set_current_blocked() and block_sigmask()
    MAINTAINERS: add entry for sparse checker
    MAINTAINERS: fix REMOTEPROC F: typo
    alpha: use set_current_blocked() and block_sigmask()
    simple_open: automatically convert to simple_open()
    scripts/coccinelle/api/simple_open.cocci: semantic patch for simple_open()
    libfs: add simple_open()
    hugetlbfs: remove unregister_filesystem() when initializing module
    drivers/rtc/rtc-88pm860x.c: fix rtc irq enable callback
    fs/xattr.c:setxattr(): improve handling of allocation failures
    fs/xattr.c:listxattr(): fall back to vmalloc() if kmalloc() failed
    fs/xattr.c: suppress page allocation failure warnings from sys_listxattr()
    sysrq: use SEND_SIG_FORCED instead of force_sig()
    proc: fix mount -t proc -o AAA

    Linus Torvalds
     
  • Many users of debugfs copy the implementation of default_open() when
    they want to support a custom read/write function op. This leads to a
    proliferation of the default_open() implementation across the entire
    tree.

    Now that the common implementation has been consolidated into libfs we
    can replace all the users of this function with simple_open().

    This replacement was done with the following semantic patch:

    @ open @
    identifier open_f != simple_open;
    identifier i, f;
    @@
    -int open_f(struct inode *i, struct file *f)
    -{
    (
    -if (i->i_private)
    -f->private_data = i->i_private;
    |
    -f->private_data = i->i_private;
    )
    -return 0;
    -}

    @ has_open depends on open @
    identifier fops;
    identifier open.open_f;
    @@
    struct file_operations fops = {
    ...
    -.open = open_f,
    +.open = simple_open,
    ...
    };

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Stephen Boyd
    Cc: Greg Kroah-Hartman
    Cc: Al Viro
    Cc: Julia Lawall
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Boyd
     
  • debugfs and a few other drivers use an open-coded version of
    simple_open() to pass a pointer from the file to the read/write file
    ops. Add support for this simple case to libfs so that we can remove
    the many duplicate copies of this simple function.
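
    For reference, the open-coded pattern being consolidated looks roughly
    like the sketch below. The struct definitions are minimal user-space
    stand-ins (an assumption made so the example compiles on its own), not
    the kernel's real struct inode and struct file.

```c
#include <stddef.h>

/* Minimal user-space stand-ins for the kernel structures (assumption). */
struct inode { void *i_private; };
struct file  { void *private_data; };

/* Hand the inode's private pointer to the file so read/write ops can use it. */
static int simple_open(struct inode *inode, struct file *file)
{
    if (inode->i_private)
        file->private_data = inode->i_private;
    return 0;
}
```

    A driver's read or write handler can then pick up its own data from
    file->private_data without needing a private open function.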

    Signed-off-by: Stephen Boyd
    Cc: Al Viro
    Cc: Julia Lawall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Boyd
     
  • It was introduced by d1d5e05ffdc1 ("hugetlbfs: return error code when
    initializing module") but as Al pointed out, is a bad idea.

    Quoted comments from Al:
    "Note that unregister_filesystem() in module init is *always* wrong;
    it's not an issue here (it's done too early to care about and
    realistically the box is not going anywhere - it'll panic when attempt
    to exec /sbin/init fails, if not earlier), but it's a damn bad
    example.

    Consider a normal fs module. Somebody loads it and in parallel with
    that we get a mount attempt on that fs type. It comes between
    register and failure exits that causes unregister; at that point we
    are screwed since grabbing a reference to module as done by mount is
    enough to prevent exit, but not to prevent the failure of init. As
    the result, module will get freed when init fails, mounted fs of that
    type be damned."

    So remove it.

    Signed-off-by: Hillf Danton
    Cc: David Rientjes
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hillf Danton
     
  • This allocation can be as large as 64k.

    - Add __GFP_NOWARN so that a failed kmalloc() is silent

    - Fall back to vmalloc() if the kmalloc() failed
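
    The shape of that fallback can be modelled in user space as below. This
    is only a sketch: primary_alloc() and fallback_alloc() are hypothetical
    stand-ins for kmalloc() and vmalloc(), and the 4096-byte cutoff merely
    simulates a large contiguous allocation failing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct buf {
    void *ptr;
    bool from_fallback;   /* remembers which allocator must release ptr */
};

/* Stand-in for the fast allocator: refuses large requests, stays quiet. */
static void *primary_alloc(size_t size)
{
    return size > 4096 ? NULL : malloc(size);
}

/* Stand-in for the slower allocator that does not need contiguous memory. */
static void *fallback_alloc(size_t size)
{
    return malloc(size);
}

static void primary_free(void *p)  { free(p); }
static void fallback_free(void *p) { free(p); }

/* Try the fast path first; fall back only if it fails. Returns 0 or -1. */
static int buf_alloc(struct buf *b, size_t size)
{
    b->from_fallback = false;
    b->ptr = primary_alloc(size);
    if (!b->ptr) {
        b->ptr = fallback_alloc(size);
        if (!b->ptr)
            return -1;
        b->from_fallback = true;
    }
    return 0;
}

/* Release with the function that matches the allocator actually used. */
static void buf_free(struct buf *b)
{
    if (b->from_fallback)
        fallback_free(b->ptr);
    else
        primary_free(b->ptr);
    b->ptr = NULL;
}
```

    Tracking which allocator succeeded matters because the two kinds of
    memory have to be released by their matching free function.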

    Cc: Dave Chinner
    Cc: Dave Jones
    Cc: David Rientjes
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • This allocation can be as large as 64k. As David points out, "falling
    back to vmalloc here is a much better solution than failing to retrieve
    the attribute - it will work no matter how fragmented memory gets. That
    means we don't get incomplete backups occurring after days or months of
    uptime and successful backups".

    Cc: Dave Chinner
    Cc: Dave Jones
    Cc: David Rientjes
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • This size is user controllable, up to a maximum of XATTR_LIST_MAX (64k).
    So it's trivial for someone to trigger a stream of order:4 page
    allocation errors.
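
    The "order:4" figure follows from the size: assuming 4 KiB pages (the
    page size is not stated here), 64k is 16 contiguous pages, i.e. 2^4. A
    small sketch of that arithmetic, with alloc_order() as a made-up helper
    name:

```c
#include <stddef.h>

/* Smallest order n such that (page_size << n) >= size. */
static unsigned alloc_order(size_t size, size_t page_size)
{
    unsigned order = 0;
    size_t span = page_size;

    while (span < size) {
        span <<= 1;   /* each order doubles the contiguous span */
        order++;
    }
    return order;
}
```

    With these assumptions alloc_order(65536, 4096) evaluates to 4.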

    Signed-off-by: Dave Jones
    Cc: Al Viro
    Cc: Dave Chinner
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jones
     
  • The proc_parse_options() call from proc_mount() runs only once at boot
    time. So on any later mount attempt, any mount options are ignored
    because ->s_root is already initialized.

    As a consequence, "mount -o <options>" will ignore the options. The
    only way to change mount options is "mount -o remount,<options>".

    To fix this, parse the mount options unconditionally.

    Signed-off-by: Vasiliy Kulikov
    Reported-by: Arkadiusz Miskiewicz
    Tested-by: Arkadiusz Miskiewicz
    Cc: Alexey Dobriyan
    Cc: Al Viro
    Cc: Valdis Kletnieks
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vasiliy Kulikov
     

05 Apr, 2012

2 commits


04 Apr, 2012

2 commits

  • The code cleanup of cifs_parse_mount_options resulted in a new bug being
    introduced in the parsing of the UNC. This results in vol->UNC being
    modified before vol->UNC was allocated.

    Reported-by: Steve French
    Signed-off-by: Sachin Prabhu
    Signed-off-by: Steve French

    Sachin Prabhu
     
  • The password parser has an unnecessary check for a NULL value which
    triggers warnings in source checking tools. The code contains artifacts
    from the old parsing code which are no longer required.

    Signed-off-by: Sachin Prabhu
    Reviewed-by: Jeff Layton
    Reported-by: Dan Carpenter
    Signed-off-by: Steve French

    Sachin Prabhu
     

02 Apr, 2012

2 commits

  • We can deadlock if we have a write oplock and two processes
    use the same file handle. In this case the first process can't
    unlock its lock if the second process is blocked on the lock at the
    same time.

    Fix it by using posix_lock_file rather than posix_lock_file_wait
    under cinode->lock_mutex. If we request a blocking lock and
    posix_lock_file indicates that there is another lock that prevents
    us, wait until that lock is released and restart our call.

    Cc: stable@kernel.org
    Acked-by: Jeff Layton
    Signed-off-by: Pavel Shilovsky
    Signed-off-by: Steve French

    Pavel Shilovsky
     
  • Revert previous version of patch to incorporate feedback
    so that we can merge version 3 of the patch instead.

    This reverts commit b5efb978469d152c2c7c0a09746fb0bfc6171868.

    Steve French
     

01 Apr, 2012

23 commits