22 Jan, 2013

2 commits

  • Some architectures (e.g. blackfin) provide gpio API without requiring
    GPIOLIB support (ARCH_WANT_OPTIONAL_GPIOLIB). devm_gpio_* functions
    should also work for these architectures, since they do not really
    depend on GPIOLIB.

    Add a new option GPIO_DEVRES (enabled by default) to control the build
    of devres.c. It also removes the empty version of devm_gpio_*
    functions for !GENERIC_GPIO build from linux/gpio.h, and moves the
    function declarations from asm-generic/gpio.h into linux/gpio.h.

    Signed-off-by: Shawn Guo
    Signed-off-by: Linus Walleij

    Shawn Guo
     
  • The struct gpio_chip is only defined inside #ifdef CONFIG_GPIOLIB,
    but it's referenced by gpiochip_add_pin_range() and
    gpiochip_remove_pin_ranges() which are outside #ifdef CONFIG_GPIOLIB.
    Thus, we see the following warning when building blackfin image, where
    GPIOLIB is not required.

    CC arch/blackfin/kernel/bfin_gpio.o
    CC init/version.o
    In file included from arch/blackfin/include/asm/gpio.h:321,
    from arch/blackfin/kernel/bfin_gpio.c:15:
    include/asm-generic/gpio.h:298: warning: 'struct gpio_chip' declared inside parameter list
    include/asm-generic/gpio.h:298: warning: its scope is only this definition or declaration, which is probably not what you want
    include/asm-generic/gpio.h:304: warning: 'struct gpio_chip' declared inside parameter list

    Move pinctrl chunk into #ifdef CONFIG_GPIOLIB to fix the warning,
    since it appears that pinctrl gpio range support depends on GPIOLIB.

    Signed-off-by: Shawn Guo
    Signed-off-by: Linus Walleij

    Shawn Guo
     

22 Dec, 2012

9 commits

  • Pull watchdog updates from Wim Van Sebroeck:
    "This includes some fixes and code improvements (like
    clk_prepare_enable and clk_disable_unprepare), conversion of the
    omap_wdt and twl4030_wdt drivers to the watchdog framework, addition
    of the SB8x0 chipset support and the DA9055 Watchdog driver and some
    OF support for the davinci_wdt driver."

    * git://www.linux-watchdog.org/linux-watchdog: (22 commits)
    watchdog: mei: avoid oops in watchdog unregister code path
    watchdog: Orion: Fix possible null-deference in orion_wdt_probe
    watchdog: sp5100_tco: Add SB8x0 chipset support
    watchdog: davinci_wdt: add OF support
    watchdog: da9052: Fix invalid free of devm_ allocated data
    watchdog: twl4030_wdt: Change TWL4030_MODULE_PM_RECEIVER to TWL_MODULE_PM_RECEIVER
    watchdog: remove depends on CONFIG_EXPERIMENTAL
    watchdog: Convert dev_printk(KERN_<LEVEL> to dev_<level>(
    watchdog: DA9055 Watchdog driver
    watchdog: omap_wdt: eliminate goto
    watchdog: omap_wdt: delete redundant platform_set_drvdata() calls
    watchdog: omap_wdt: convert to devm_ functions
    watchdog: omap_wdt: convert to new watchdog core
    watchdog: WatchDog Timer Driver Core: fix comment
    watchdog: s3c2410_wdt: use clk_prepare_enable and clk_disable_unprepare
    watchdog: imx2_wdt: Select the driver via ARCH_MXC
    watchdog: cpu5wdt.c: add missing del_timer call
    watchdog: hpwdt.c: Increase version string
    watchdog: Convert twl4030_wdt to watchdog core
    davinci_wdt: preparation for switch to common clock framework
    ...

    Linus Torvalds
     
  • Pull dm update from Alasdair G Kergon:
    "Miscellaneous device-mapper fixes, cleanups and performance
    improvements.

    Of particular note:
    - Disable broken WRITE SAME support in all targets except linear and
    striped. Use it when kcopyd is zeroing blocks.
    - Remove several mempools from targets by moving the data into the
    bio's new front_pad area (which dm calls 'per_bio_data').
    - Fix a race in thin provisioning if discards are misused.
    - Prevent userspace from interfering with the ioctl parameters and
    use kmalloc for the data buffer if it's small instead of vmalloc.
    - Throttle some annoying error messages when I/O fails."

    * tag 'dm-3.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: (36 commits)
    dm stripe: add WRITE SAME support
    dm: remove map_info
    dm snapshot: do not use map_context
    dm thin: dont use map_context
    dm raid1: dont use map_context
    dm flakey: dont use map_context
    dm raid1: rename read_record to bio_record
    dm: move target request nr to dm_target_io
    dm snapshot: use per_bio_data
    dm verity: use per_bio_data
    dm raid1: use per_bio_data
    dm: introduce per_bio_data
    dm kcopyd: add WRITE SAME support to dm_kcopyd_zero
    dm linear: add WRITE SAME support
    dm: add WRITE SAME support
    dm: prepare to support WRITE SAME
    dm ioctl: use kmalloc if possible
    dm ioctl: remove PF_MEMALLOC
    dm persistent data: improve improve space map block alloc failure message
    dm thin: use DMERR_LIMIT for errors
    ...

    Linus Torvalds
     
  • Pull more infiniband changes from Roland Dreier:
    "Second batch of InfiniBand/RDMA changes for 3.8:
    - cxgb4 changes to fix lookup engine hash collisions
    - mlx4 changes to make flow steering usable
    - fix to IPoIB to avoid pinning dst reference for too long"

    * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
    RDMA/cxgb4: Fix bug for active and passive LE hash collision path
    RDMA/cxgb4: Fix LE hash collision bug for passive open connection
    RDMA/cxgb4: Fix LE hash collision bug for active open connection
    mlx4_core: Allow choosing flow steering mode
    mlx4_core: Adjustments to Flow Steering activation logic for SR-IOV
    mlx4_core: Fix error flow in the flow steering wrapper
    mlx4_core: Add QPN enforcement for flow steering rules set by VFs
    cxgb4: Add LE hash collision bug fix path in LLD driver
    cxgb4: Add T4 filter support
    IPoIB: Call skb_dst_drop() once skb is enqueued for sending

    Linus Torvalds
     
  • Pull asm-generic cleanup from Arnd Bergmann:
    "These are a few cleanups for asm-generic:

    - a set of patches from Lars-Peter Clausen to generalize asm/mmu.h
    and use it in the architectures that don't need any special
    handling.
    - A patch from Will Deacon to remove the {read,write}s{b,w,l} as
    discussed during the arm64 review
    - A patch from James Hogan that helps with the meta architecture
    series."

    * tag 'asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
    xtensa: Use generic asm/mmu.h for nommu
    h8300: Use generic asm/mmu.h
    c6x: Use generic asm/mmu.h
    asm-generic/mmu.h: Add support for FDPIC
    asm-generic/mmu.h: Remove unused vmlist field from mm_context_t
    asm-generic: io: remove {read,write} string functions
    asm-generic/io.h: remove asm/cacheflush.h include

    Linus Torvalds
     
  • This patch removes map_info from bio-based device mapper targets.
    map_info is still used for request-based targets.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • This patch moves target_request_nr from map_info to dm_target_io and
    makes it accessible with dm_bio_get_target_request_nr.

    This patch is a preparation for the next patch that removes map_info.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Introduce a field per_bio_data_size in struct dm_target.

    Targets can set this field in the constructor. If a target sets this
    field to a non-zero value, "per_bio_data_size" bytes of auxiliary data
    are allocated for each bio submitted to the target. These data can be
    used for any purpose by the target and help us improve performance by
    removing some per-target mempools.

    Per-bio data is accessed with dm_per_bio_data. The
    argument data_size must be the same as the value per_bio_data_size in
    dm_target.

    If the target has a pointer to per_bio_data, it can get a pointer to
    the bio with dm_bio_from_per_bio_data() function (data_size must be the
    same as the value passed to dm_per_bio_data).
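
    The accessor pair described above can be pictured as plain pointer
    arithmetic around a front-padded allocation. The following is a minimal
    userspace sketch, not the dm implementation: every toy_* name, the struct
    layouts and the use of calloc() are assumptions made for illustration, and
    it assumes a C11 compiler for _Alignof.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct bio and struct dm_target_io. */
struct toy_bio {
    unsigned long sector;
};

struct toy_target_io {
    unsigned int target_request_nr;
    struct toy_bio clone;
};

/* Round the per-bio data size up so the header after it stays aligned. */
static size_t toy_front_pad(size_t data_size)
{
    size_t align = _Alignof(struct toy_target_io);

    return (data_size + align - 1) / align * align;
}

/* One allocation holds: [padding][per-bio data][toy_target_io incl. bio]. */
struct toy_bio *toy_bio_alloc(size_t data_size)
{
    char *base = calloc(1, toy_front_pad(data_size) + sizeof(struct toy_target_io));
    struct toy_target_io *tio;

    if (!base)
        return NULL;
    tio = (struct toy_target_io *)(base + toy_front_pad(data_size));
    return &tio->clone;
}

void toy_bio_free(struct toy_bio *bio, size_t data_size)
{
    char *tio = (char *)bio - offsetof(struct toy_target_io, clone);

    free(tio - toy_front_pad(data_size));
}

/* bio -> per-bio data: step back over the header, then over the data. */
void *toy_per_bio_data(struct toy_bio *bio, size_t data_size)
{
    return (char *)bio - offsetof(struct toy_target_io, clone) - data_size;
}

/* per-bio data -> bio: the inverse walk, so data_size must match. */
struct toy_bio *toy_bio_from_per_bio_data(void *data, size_t data_size)
{
    return (struct toy_bio *)((char *)data + data_size +
                              offsetof(struct toy_target_io, clone));
}
```

    In this sketch, calling toy_per_bio_data(bio, n) and then
    toy_bio_from_per_bio_data(data, n) with the same n returns the original
    bio. A mismatched n would land on the wrong address, which is why both
    calls must agree on data_size.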

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Allow targets to opt in to WRITE SAME support by setting
    'num_write_same_requests' in the dm_target structure.

    A dm device will only advertise WRITE SAME support if all its
    targets and all its underlying devices support it.
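
    The "all targets and all underlying devices" rule amounts to a
    conjunction over the table. A minimal sketch, assuming a hypothetical
    toy_target structure in which the state of the underlying devices has
    already been summarised into one flag (the real dm code queries the
    devices itself):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one target in a table. */
struct toy_target {
    unsigned int num_write_same_requests; /* 0: target did not opt in */
    bool devices_support_write_same;      /* assumed pre-computed summary */
};

/* WRITE SAME is advertised only if no target and no device objects. */
bool toy_table_supports_write_same(const struct toy_target *targets, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (targets[i].num_write_same_requests == 0)
            return false;
        if (!targets[i].devices_support_write_same)
            return false;
    }
    return true;
}
```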

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • When allocating memory for the userspace ioctl data, set some
    appropriate GFP flags directly instead of using PF_MEMALLOC.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     

21 Dec, 2012

25 commits

  • Pull filesystem notification updates from Eric Paris:
    "This pull mostly is about locking changes in the fsnotify system. By
    switching the group lock from a spin_lock() to a mutex() we can now
    hold the lock across things like iput(). This fixes a problem
    involving unmounting a fs and having inodes be busy, first pointed out
    by FAT, but reproducible with tmpfs.

    This also restores signal driven I/O for inotify, which has been
    broken since about 2.6.32."

    Ugh. I *hate* the timing of this. It was rebased after the merge
    window opened, and then left to sit with the pull request coming the day
    before the merge window closes. That's just crap. But apparently the
    patches themselves have been around for over a year, just gathering
    dust, so now it's suddenly critical.

    Fixed up semantic conflict in fs/notify/fdinfo.c as per Stephen
    Rothwell's fixes from -next.

    * 'for-next' of git://git.infradead.org/users/eparis/notify:
    inotify: automatically restart syscalls
    inotify: dont skip removal of watch descriptor if creation of ignored event failed
    fanotify: dont merge permission events
    fsnotify: make fasync generic for both inotify and fanotify
    fsnotify: change locking order
    fsnotify: dont put marks on temporary list when clearing marks by group
    fsnotify: introduce locked versions of fsnotify_add_mark() and fsnotify_remove_mark()
    fsnotify: pass group to fsnotify_destroy_mark()
    fsnotify: use a mutex instead of a spinlock to protect a groups mark list
    fanotify: add an extra flag to mark_remove_from_mask that indicates wheather a mark should be destroyed
    fsnotify: take groups mark_lock before mark lock
    fsnotify: use reference counting for groups
    fsnotify: introduce fsnotify_get_group()
    inotify, fanotify: replace fsnotify_put_group() with fsnotify_destroy_group()

    Linus Torvalds
     
  • Merge the rest of Andrew's patches for -rc1:
    "A bunch of fixes and misc missed-out-on things.

    That'll do for -rc1. I still have a batch of IPC patches which still
    have a possible bug report which I'm chasing down."

    * emailed patches from Andrew Morton: (25 commits)
    keys: use keyring_alloc() to create module signing keyring
    keys: fix unreachable code
    sendfile: allows bypassing of notifier events
    SGI-XP: handle non-fatal traps
    fat: fix incorrect function comment
    Documentation: ABI: remove testing/sysfs-devices-node
    proc: fix inconsistent lock state
    linux/kernel.h: fix DIV_ROUND_CLOSEST with unsigned divisors
    memcg: don't register hotcpu notifier from ->css_alloc()
    checkpatch: warn on uapi #includes that #include
    mm: cma: WARN if freed memory is still in use
    exec: do not leave bprm->interp on stack
    ...

    Linus Torvalds
     
  • Pull VFS update from Al Viro:
    "fscache fixes, ESTALE patchset, vmtruncate removal series, assorted
    misc stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (79 commits)
    vfs: make lremovexattr retry once on ESTALE error
    vfs: make removexattr retry once on ESTALE
    vfs: make llistxattr retry once on ESTALE error
    vfs: make listxattr retry once on ESTALE error
    vfs: make lgetxattr retry once on ESTALE
    vfs: make getxattr retry once on an ESTALE error
    vfs: allow lsetxattr() to retry once on ESTALE errors
    vfs: allow setxattr to retry once on ESTALE errors
    vfs: allow utimensat() calls to retry once on an ESTALE error
    vfs: fix user_statfs to retry once on ESTALE errors
    vfs: make fchownat retry once on ESTALE errors
    vfs: make fchmodat retry once on ESTALE errors
    vfs: have chroot retry once on ESTALE error
    vfs: have chdir retry lookup and call once on ESTALE error
    vfs: have faccessat retry once on an ESTALE error
    vfs: have do_sys_truncate retry once on an ESTALE error
    vfs: fix renameat to retry on ESTALE errors
    vfs: make do_unlinkat retry once on ESTALE errors
    vfs: make do_rmdir retry once on ESTALE errors
    vfs: add a flags argument to user_path_parent
    ...

    Linus Torvalds
     
  • Pull signal handling cleanups from Al Viro:
    "sigaltstack infrastructure + conversion for x86, alpha and um,
    COMPAT_SYSCALL_DEFINE infrastructure.

    Note that there are several conflicts between "unify
    SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline;
    resolution is trivial - just remove definitions of SS_ONSTACK and
    SS_DISABLE from arch/*/uapi/asm/signal.h; they are all identical and
    include/uapi/linux/signal.h contains the unified variant."

    Fixed up conflicts as per Al.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
    alpha: switch to generic sigaltstack
    new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those
    generic compat_sys_sigaltstack()
    introduce generic sys_sigaltstack(), switch x86 and um to it
    new helper: compat_user_stack_pointer()
    new helper: restore_altstack()
    unify SS_ONSTACK/SS_DISABLE definitions
    new helper: current_user_stack_pointer()
    missing user_stack_pointer() instances
    Bury the conditionals from kernel_thread/kernel_execve series
    COMPAT_SYSCALL_DEFINE: infrastructure

    Linus Torvalds
     
  • Commit 263a523d18bc ("linux/kernel.h: Fix warning seen with W=1 due to
    change in DIV_ROUND_CLOSEST") fixes a warning seen with W=1 due to
    change in DIV_ROUND_CLOSEST.

    Unfortunately, the C compiler converts divide operations with unsigned
    divisors to unsigned, even if the dividend is signed and negative (for
    example, -10 / 5U = 858993457). The C standard says "If one operand has
    unsigned int type, the other operand is converted to unsigned int", so
    the compiler is not to blame. As a result, DIV_ROUND_CLOSEST(0, 2U) and
    similar operations now return bad values, since the automatic conversion
    of expressions such as "0 - 2U/2" to unsigned was not taken into
    account.

    Fix by checking for the divisor variable type when deciding which
    operation to perform. This fixes DIV_ROUND_CLOSEST(0, 2U), but still
    returns bad values for negative dividends divided by unsigned divisors.
    Mark the latter case as unsupported.

    One observed effect of this problem is that the s2c_hwmon driver reports
    a value of 4198403 instead of 0 if the ADC reads 0.

    Other impact is unpredictable. Problem is seen if the divisor is an
    unsigned variable or constant and the dividend is less than (divisor/2).
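
    The conversion problem can be reproduced in isolation. The sketch below
    uses plain functions specialised to a signed dividend and an unsigned
    divisor; the function names are made up for illustration and this is not
    the kernel macro, which relies on typeof and statement expressions.

```c
#include <limits.h>

/* Pre-fix logic: the branch is chosen from the dividend alone. */
unsigned int closest_before_fix(int x, unsigned int d)
{
    /* For x <= 0 the subtraction is carried out in unsigned arithmetic. */
    return x > 0 ? (x + d / 2) / d : (x - d / 2) / d;
}

/* Post-fix logic: an unsigned divisor always takes the additive branch. */
unsigned int closest_after_fix(int x, unsigned int d)
{
    return (x + d / 2) / d;
}

/* The conversion rule itself: the dividend -10 becomes unsigned. */
unsigned int minus_ten_over_five_unsigned(void)
{
    return -10 / 5U;
}
```

    With a 32-bit unsigned int, closest_before_fix(0, 2U) yields 2147483647
    (UINT_MAX / 2) while closest_after_fix(0, 2U) yields 0. A negative x is
    still mishandled by the additive branch, which matches the case the
    commit marks as unsupported.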

    Signed-off-by: Guenter Roeck
    Reported-by: Juergen Beisert
    Tested-by: Juergen Beisert
    Cc: Jean Delvare
    Cc: [3.7.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Guenter Roeck
     
  • If a series of scripts are executed, each triggering module loading via
    unprintable bytes in the script header, kernel stack contents can leak
    into the command line.

    Normally execution of binfmt_script and binfmt_misc happens recursively.
    However, when modules are enabled, and unprintable bytes exist in the
    bprm->buf, execution will restart after attempting to load matching
    binfmt modules. Unfortunately, the logic in binfmt_script and
    binfmt_misc does not expect to get restarted. They leave bprm->interp
    pointing to their local stack. This means on restart bprm->interp is
    left pointing into unused stack memory which can then be copied into the
    userspace argv areas.

    After additional study, it seems that both recursion and restart remain
    the desirable way to handle exec with scripts, misc, and modules. As
    such, we need to protect the changes to interp.

    This changes the logic to require allocation for any changes to the
    bprm->interp. To avoid adding a new kmalloc to every exec, the default
    value is left as-is. Only when passing through binfmt_script or
    binfmt_misc does an allocation take place.
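
    The allocate-on-change rule can be sketched in userspace as follows. The
    toy_binprm structure and both helpers are hypothetical stand-ins, with
    malloc()/free() used where kernel code would use its own allocator; the
    point is only that interp never ends up pointing at a caller's stack
    buffer.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reduction of the exec parameter block to two fields. */
struct toy_binprm {
    const char *filename;
    const char *interp; /* starts out equal to filename: no allocation */
};

/* Replace interp with a heap copy, releasing any earlier heap copy. */
int toy_change_interp(const char *interp, struct toy_binprm *bprm)
{
    size_t len = strlen(interp) + 1;
    char *copy = malloc(len);

    if (!copy)
        return -ENOMEM;
    memcpy(copy, interp, len);

    /* Only a previously changed interp was allocated by us. */
    if (bprm->interp != bprm->filename)
        free((char *)bprm->interp);
    bprm->interp = copy;
    return 0;
}

/* Undo any change and return to the unallocated default. */
void toy_reset_interp(struct toy_binprm *bprm)
{
    if (bprm->interp != bprm->filename)
        free((char *)bprm->interp);
    bprm->interp = bprm->filename;
}
```

    Because the copy is made before the old value is released, the new
    string may safely alias the old one, and a restart that revisits the
    handler sees a still-valid interp.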

    For a proof of concept, see DoTest.sh from:

    http://www.halfdog.net/Security/2012/LinuxKernelBinfmtScriptStackDataDisclosure/

    Signed-off-by: Kees Cook
    Cc: halfdog
    Cc: P J P
    Cc: Alexander Viro
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • Where we can pass in LOOKUP_DIRECTORY or LOOKUP_REVAL. Any other flags
    passed in here are currently ignored.

    Signed-off-by: Jeff Layton
    Signed-off-by: Al Viro

    Jeff Layton
     
  • This function is expected to be called from path-based syscalls to help
    them decide whether to try the lookup and call again in the event that
    they got an -ESTALE return back on an earlier try.

    Currently, we only retry the call once on an ESTALE error, but in the
    event that we decide that that's not enough in the future, we should be
    able to change the logic in this helper without too much effort.
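
    A retry-once policy of this kind might look like the sketch below. All
    toy_* names and the flag value are assumptions for illustration; the
    sketch assumes the caller marks the second attempt with a revalidation
    flag so the helper can tell that a retry has already happened.

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_LOOKUP_REVAL 0x0020u /* hypothetical flag value */

/* Retry only on -ESTALE, and only if we have not revalidated yet. */
bool toy_retry_estale(long error, unsigned int flags)
{
    return error == -ESTALE && !(flags & TOY_LOOKUP_REVAL);
}

/* A path-based operation: lookup plus call, reporting 0 or -errno. */
typedef long (*toy_path_op)(const char *path, unsigned int lookup_flags, void *ctx);

/* Shape of a syscall body that consults the helper after a failure. */
long toy_call_with_estale_retry(toy_path_op op, const char *path, void *ctx)
{
    unsigned int lookup_flags = 0;
    long error;

retry:
    error = op(path, lookup_flags, ctx);
    if (toy_retry_estale(error, lookup_flags)) {
        lookup_flags |= TOY_LOOKUP_REVAL;
        goto retry;
    }
    return error;
}

/* Demo operations; ctx points at an int counting the attempts. */
long toy_stale_until_revalidated(const char *path, unsigned int lookup_flags, void *ctx)
{
    (void)path;
    ++*(int *)ctx;
    return (lookup_flags & TOY_LOOKUP_REVAL) ? 0 : -ESTALE;
}

long toy_always_stale(const char *path, unsigned int lookup_flags, void *ctx)
{
    (void)path;
    (void)lookup_flags;
    ++*(int *)ctx;
    return -ESTALE;
}
```

    Keeping the decision in one helper means a later change of policy, such
    as allowing more than one retry, would touch only toy_retry_estale() in
    this sketch rather than every caller.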

    Signed-off-by: Jeff Layton
    Signed-off-by: Al Viro

    Jeff Layton
     
  • …/linux-fs into for-linus

    Al Viro
     
  • Commit 8e22cc88d68ca1a46d7d582938f979eb640ed30f removes the (un)lock_super
    function definitions but forgets to remove their prototypes.

    Signed-off-by: Alessio Igor Bogani
    Signed-off-by: Al Viro

    Alessio Igor Bogani
     
  • Removed vmtruncate

    Signed-off-by: Marco Stornelli
    Signed-off-by: Al Viro

    Marco Stornelli
     
  • Removed vmtruncate

    Signed-off-by: Marco Stornelli
    Signed-off-by: Al Viro

    Marco Stornelli
     
  • Mark as cancelled an operation that is in progress rather than pending at the
    time it is cancelled, and call fscache_complete_op() to cancel an operation so
    that blocked ops can be started.

    Signed-off-by: David Howells

    David Howells
     
  • Convert the fscache_object event IDs from #defines into an enum. Also add an
    extra label to the enum to carry the event count and redefine the event mask
    in terms of that.
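
    The pattern of ending an enum with a count label and deriving the mask
    from it can be sketched as follows. The event names here are invented
    for illustration and are not the fscache event list.

```c
#include <stdbool.h>

/* Hypothetical event IDs; each value doubles as a bit position. */
enum toy_object_event {
    TOY_EV_REQUEUE,
    TOY_EV_UPDATE,
    TOY_EV_CLEARED,
    TOY_EV_ERROR,
    NR_TOY_EVENTS /* extra label: carries the event count */
};

/* The mask follows the enum automatically when events are added. */
#define TOY_EVENTS_MASK ((1UL << NR_TOY_EVENTS) - 1)

/* True if no bit outside the defined events is set. */
bool toy_events_valid(unsigned long events)
{
    return (events & ~TOY_EVENTS_MASK) == 0;
}
```

    Compared with separate #defines, adding an event before the count label
    widens the mask without any further edits.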

    Signed-off-by: David Howells

    David Howells
     
  • Make a more complete truncate operation available to CacheFiles (including
    security checks and suchlike) so that it can use this to clear invalidated
    cache files.

    Signed-off-by: David Howells
    Acked-by: Al Viro

    David Howells
     
  • Pull nfsd update from Bruce Fields:
    "Included this time:

    - more nfsd containerization work from Stanislav Kinsbursky: we're
    not quite there yet, but should be by 3.9.

    - NFSv4.1 progress: implementation of basic backchannel security
    negotiation and the mandatory BACKCHANNEL_CTL operation. See

    http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues

    for remaining TODO's

    - Fixes for some bugs that could be triggered by unusual compounds.
    Our xdr code wasn't designed with v4 compounds in mind, and it
    shows. A more thorough rewrite is still a todo.

    - If you've ever seen "RPC: multiple fragments per record not
    supported" logged while using some sort of odd userland NFS client,
    that should now be fixed.

    - Further work from Jeff Layton on our mechanism for storing
    information about NFSv4 clients across reboots.

    - Further work from Bryan Schumaker on his fault-injection mechanism
    (which allows us to discard selective NFSv4 state, to exercise
    rarely-taken recovery code paths in the client.)

    - The usual mix of miscellaneous bugs and cleanup.

    Thanks to everyone who tested or contributed this cycle."

    * 'for-3.8' of git://linux-nfs.org/~bfields/linux: (111 commits)
    nfsd4: don't leave freed stateid hashed
    nfsd4: free_stateid can use the current stateid
    nfsd4: cleanup: replace rq_resused count by rq_next_page pointer
    nfsd: warn on odd reply state in nfsd_vfs_read
    nfsd4: fix oops on unusual readlike compound
    nfsd4: disable zero-copy on non-final read ops
    svcrpc: fix some printks
    NFSD: Correct the size calculation in fault_inject_write
    NFSD: Pass correct buffer size to rpc_ntop
    nfsd: pass proper net to nfsd_destroy() from NFSd kthreads
    nfsd: simplify service shutdown
    nfsd: replace boolean nfsd_up flag by users counter
    nfsd: simplify NFSv4 state init and shutdown
    nfsd: introduce helpers for generic resources init and shutdown
    nfsd: make NFSd service structure allocated per net
    nfsd: make NFSd service boot time per-net
    nfsd: per-net NFSd up flag introduced
    nfsd: move per-net startup code to separated function
    nfsd: pass net to __write_ports() and down
    nfsd: pass net to nfsd_set_nrthreads()
    ...

    Linus Torvalds
     
  • Provide a proper invalidation method rather than relying on the netfs retiring
    the cookie it has and getting a new one. The problem with this is that it isn't
    easy for the netfs to make sure that it has completed/cancelled all its
    outstanding storage and retrieval operations on the cookie it is retiring.

    Instead, have the cache provide an invalidation method that will cancel or wait
    for all currently outstanding operations before invalidating the cache, and
    will cause new operations to queue up behind that. Whilst invalidation is in
    progress, some requests will be rejected until the cache can stack a barrier on
    the operation queue to cause new operations to be deferred behind it.

    Signed-off-by: David Howells

    David Howells
     
  • Pull Ceph update from Sage Weil:
    "There are a few different groups of commits here. The largest is
    Alex's ongoing work to enable the coming RBD features (cloning,
    striping). There is some cleanup in libceph that goes along with it.

    Cyril and David have fixed some problems with NFS reexport (leaking
    dentries and page locks), and there is a batch of patches from Yan
    fixing problems with the fs client when running against a clustered
    MDS. There are a few bug fixes mixed in for good measure, many of
    which will be going to the stable trees once they're upstream.

    My apologies for the late pull. There is still a gremlin in the rbd
    map/unmap code and I was hoping to include the fix for that as well,
    but we haven't been able to confirm the fix is correct yet; I'll send
    that in a separate pull once it's nailed down."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (68 commits)
    rbd: get rid of rbd_{get,put}_dev()
    libceph: register request before unregister linger
    libceph: don't use rb_init_node() in ceph_osdc_alloc_request()
    libceph: init event->node in ceph_osdc_create_event()
    libceph: init osd->o_node in create_osd()
    libceph: report connection fault with warning
    libceph: socket can close in any connection state
    rbd: don't use ENOTSUPP
    rbd: remove linger unconditionally
    rbd: get rid of RBD_MAX_SEG_NAME_LEN
    libceph: avoid using freed osd in __kick_osd_requests()
    ceph: don't reference req after put
    rbd: do not allow remove of mounted-on image
    libceph: Unlock unprocessed pages in start_read() error path
    ceph: call handle_cap_grant() for cap import message
    ceph: Fix __ceph_do_pending_vmtruncate
    ceph: Don't add dirty inode to dirty list if caps is in migration
    ceph: Fix infinite loop in __wake_requests
    ceph: Don't update i_max_size when handling non-auth cap
    bdi_register: add __printf verification, fix arg mismatch
    ...

    Linus Torvalds
     
  • Fix the state management of internal fscache operations and the accounting of
    what operations are in what states.

    This is done by:

    (1) Give struct fscache_operation an enum variable that directly represents the
    state it's currently in, rather than spreading this knowledge over a bunch
    of flags, who's processing the operation at the moment and whether it is
    queued or not.

    This makes it easier to write assertions to check the state at various
    points and to prevent invalid state transitions.

    (2) Add an 'operation complete' state and supply a function to indicate the
    completion of an operation (fscache_op_complete()) and make things call
    it. The final call to fscache_put_operation() can then check that an op
    is in the appropriate state (complete or cancelled).

    (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better
    govern the state of an object:

    (a) The ->n_ops is now the number of extant operations on the object
    and is now decremented by fscache_put_operation() only.

    (b) The ->n_in_progress is simply the number of operations that have been
    taken off of the object's pending queue for the purposes of being
    run. This is decremented by fscache_op_complete() only.

    (c) The ->n_exclusive is the number of exclusive ops that have been
    submitted and queued or are in progress. It is decremented by
    fscache_op_complete() and by fscache_cancel_op().

    fscache_put_operation() and fscache_operation_gc() now no longer try to
    clean up ->n_exclusive and ->n_in_progress. That was leading to double
    decrements against fscache_cancel_op().

    fscache_cancel_op() now no longer decrements ->n_ops. That was leading to
    double decrements against fscache_put_operation().

    fscache_submit_exclusive_op() now decides whether it has to queue an op
    based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter
    will persist in being true even after all preceding operations have been
    cancelled or completed. Furthermore, if an object is active and there are
    runnable ops against it, there must be at least one op running.

    (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and
    provide a function to record completion of the pages as they complete.

    When n_pages reaches 0, the operation is deemed to be complete and
    fscache_op_complete() is called.

    Add calls to fscache_retrieval_complete() anywhere we've finished with a
    page we've been given to read or allocate for. This includes places where
    we just return pages to the netfs for reading from the server and where
    accessing the cache fails and we discard the proposed netfs page.
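
    Points (1) to (3) above can be illustrated with a small state machine in
    which each counter has exactly one place where it is decremented. This is
    a sketch under assumed names (toy_op, toy_object and the state labels are
    hypothetical), not the fscache code, and it leaves out ->n_exclusive and
    all locking.

```c
#include <assert.h>
#include <stdbool.h>

/* One explicit state per operation instead of scattered flags. */
enum toy_op_state {
    TOY_OP_ST_INITIALISED,
    TOY_OP_ST_PENDING,
    TOY_OP_ST_IN_PROGRESS,
    TOY_OP_ST_COMPLETE,
    TOY_OP_ST_CANCELLED,
    TOY_OP_ST_DEAD
};

struct toy_object {
    int n_ops;         /* extant operations; dropped only in toy_op_put() */
    int n_in_progress; /* running operations; dropped only in toy_op_complete() */
};

struct toy_op {
    enum toy_op_state state;
    struct toy_object *object;
};

void toy_op_submit(struct toy_op *op, struct toy_object *obj)
{
    assert(op->state == TOY_OP_ST_INITIALISED);
    op->object = obj;
    op->state = TOY_OP_ST_PENDING;
    obj->n_ops++;
}

void toy_op_run(struct toy_op *op)
{
    assert(op->state == TOY_OP_ST_PENDING);
    op->state = TOY_OP_ST_IN_PROGRESS;
    op->object->n_in_progress++;
}

void toy_op_complete(struct toy_op *op)
{
    assert(op->state == TOY_OP_ST_IN_PROGRESS);
    op->state = TOY_OP_ST_COMPLETE;
    op->object->n_in_progress--;
}

/* Cancelling a pending op does not touch n_ops; the final put does. */
bool toy_op_cancel(struct toy_op *op)
{
    if (op->state != TOY_OP_ST_PENDING)
        return false;
    op->state = TOY_OP_ST_CANCELLED;
    return true;
}

/* The final put checks that the op ended in a legitimate state. */
void toy_op_put(struct toy_op *op)
{
    assert(op->state == TOY_OP_ST_COMPLETE || op->state == TOY_OP_ST_CANCELLED);
    op->state = TOY_OP_ST_DEAD;
    op->object->n_ops--;
}
```

    With a single owner for each decrement, a cancel followed by a put cannot
    subtract twice from the same counter, which is the class of double
    decrement described above.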

    The bugs in the unfixed state management manifest themselves as oopses like the
    following where the operation completion gets out of sync with return of the
    cookie by the netfs. This is possible because the cache unlocks and returns
    all the netfs pages before recording its completion - which means that there's
    nothing to stop the netfs discarding them and returning the cookie.

    FS-Cache: Cookie 'NFS.fh' still has outstanding reads
    ------------[ cut here ]------------
    kernel BUG at fs/fscache/cookie.c:519!
    invalid opcode: 0000 [#1] SMP
    CPU 1
    Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc

    Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY
    RIP: 0010:[] [] __fscache_relinquish_cookie+0x170/0x343 [fscache]
    RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282
    RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000
    RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c
    RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000
    R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98
    R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370
    FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040)
    Stack:
    ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0
    ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0
    ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91
    Call Trace:
    [] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs]
    [] nfs_clear_inode+0x3c/0x41 [nfs]
    [] nfs4_evict_inode+0x2f/0x33 [nfs]
    [] evict+0xa1/0x15c
    [] dispose_list+0x2c/0x38
    [] prune_icache_sb+0x28c/0x29b
    [] prune_super+0xd5/0x140
    [] shrink_slab+0x102/0x1ab
    [] balance_pgdat+0x2f2/0x595
    [] ? process_timeout+0xb/0xb
    [] kswapd+0x270/0x289
    [] ? __init_waitqueue_head+0x46/0x46
    [] ? balance_pgdat+0x595/0x595
    [] kthread+0x7f/0x87
    [] kernel_thread_helper+0x4/0x10
    [] ? finish_task_switch+0x45/0xc0
    [] ? retint_restore_args+0xe/0xe
    [] ? __init_kthread_worker+0x53/0x53
    [] ? gs_change+0xb/0xb

    Signed-off-by: David Howells

    David Howells
     
  • Make fscache_relinquish_cookie() log a warning and wait if there are any
    outstanding reads left on the cookie it was given.

    Signed-off-by: David Howells

    David Howells
     
  • Pull new F2FS filesystem from Jaegeuk Kim:
    "Introduce a new file system, Flash-Friendly File System (F2FS), to
    Linux 3.8.

    Highlights:
    - Add initial f2fs source codes
    - Fix an endian conversion bug
    - Fix build failures on random configs
    - Fix the power-off-recovery routine
    - Minor cleanup, coding style, and typos patches"

    From the Kconfig help text:

    F2FS is based on Log-structured File System (LFS), which supports
    versatile "flash-friendly" features. The design has been focused on
    addressing the fundamental issues in LFS, which are snowball effect
    of wandering tree and high cleaning overhead.

    Since flash-based storages show different characteristics according to
    the internal geometry or flash memory management schemes aka FTL, F2FS
    and tools support various parameters not only for configuring on-disk
    layout, but also for selecting allocation and cleaning algorithms.

    and there's an article by Neil Brown about it on lwn.net:

    http://lwn.net/Articles/518988/

    * tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (36 commits)
    f2fs: fix tracking parent inode number
    f2fs: cleanup the f2fs_bio_alloc routine
    f2fs: introduce accessor to retrieve number of dentry slots
    f2fs: remove redundant call to f2fs_put_page in delete entry
    f2fs: make use of GFP_F2FS_ZERO for setting gfp_mask
    f2fs: rewrite f2fs_bio_alloc to make it simpler
    f2fs: fix a typo in f2fs documentation
    f2fs: remove unused variable
    f2fs: move error condition for mkdir at proper place
    f2fs: remove unneeded initialization
    f2fs: check read only condition before beginning write out
    f2fs: remove unneeded memset from init_once
    f2fs: show error in case of invalid mount arguments
    f2fs: fix the compiler warning for uninitialized use of variable
    f2fs: resolve build failures
    f2fs: adjust kernel coding style
    f2fs: fix endian conversion bugs reported by sparse
    f2fs: remove unneeded version.h header file from f2fs.h
    f2fs: update the f2fs document
    f2fs: update Kconfig and Makefile
    ...

    Linus Torvalds
     
  • Under some circumstances CacheFiles defers the marking of pages with PG_fscache
    so that it can take advantage of pagevecs to reduce the number of calls to
    fscache_mark_pages_cached() and the netfs's hook to keep track of this.

    There are, however, two problems with this:

    (1) It can lead to the PG_fscache mark being applied _after_ the page is set
    PG_uptodate and unlocked (by the call to fscache_end_io()).

    (2) CacheFiles's ref on the page is dropped immediately following
    fscache_end_io() - and so may not still be held when the mark is applied.
    This can lead to the page being passed back to the allocator before the
    mark is applied.

    Fix this by, where appropriate, marking the page before calling
    fscache_end_io() and releasing the page. This means that we can't take
    advantage of pagevecs and have to make a separate call for each page to the
    marking routines.

    The symptoms of this are Bad Page state errors cropping up under memory
    pressure, for example:

    BUG: Bad page state in process tar pfn:002da
    page:ffffea0000009fb0 count:0 mapcount:0 mapping: (null) index:0x1447
    page flags: 0x1000(private_2)
    Pid: 4574, comm: tar Tainted: G W 3.1.0-rc4-fsdevel+ #1064
    Call Trace:
    ? dump_page+0xb9/0xbe
    bad_page+0xd5/0xea
    get_page_from_freelist+0x35b/0x46a
    __alloc_pages_nodemask+0x362/0x662
    __do_page_cache_readahead+0x13a/0x267
    ? __do_page_cache_readahead+0xa2/0x267
    ra_submit+0x1c/0x20
    ondemand_readahead+0x28b/0x29a
    ? ondemand_readahead+0x163/0x29a
    page_cache_sync_readahead+0x38/0x3a
    generic_file_aio_read+0x2ab/0x67e
    nfs_file_read+0xa4/0xc9 [nfs]
    do_sync_read+0xba/0xfa
    ? security_file_permission+0x7b/0x84
    ? rw_verify_area+0xab/0xc8
    vfs_read+0xaa/0x13a
    sys_read+0x45/0x6c
    system_call_fastpath+0x16/0x1b

    As can be seen, PG_private_2 (== PG_fscache) is set in the page flags.

    Instrumenting fscache_mark_pages_cached() to verify whether page->mapping was
    set appropriately showed that sometimes it wasn't. This led to the discovery
    that sometimes the page has apparently been reclaimed by the time the marker
    got to see it.
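
    The ordering change described above (mark the page first, then end the
    I/O and release the reference) can be illustrated with a small userspace
    model. This is only a sketch under stated assumptions: the struct and
    every model_* / complete_* name below is hypothetical and is not taken
    from CacheFiles or FS-Cache; the model only tracks whether a reference
    is still held at the moment the mark is applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum { FLAG_UPTODATE = 1 << 0, FLAG_CACHE_MARK = 1 << 1 };

struct model_page {
    int refs;            /* references currently held on the page */
    unsigned int flags;  /* FLAG_* bits */
};

/* Apply the cache mark; report whether a reference was still held. */
static bool model_mark_cached(struct model_page *p)
{
    bool safe = p->refs > 0;

    p->flags |= FLAG_CACHE_MARK;
    return safe;
}

static void model_end_io(struct model_page *p)   { p->flags |= FLAG_UPTODATE; }
static void model_put_page(struct model_page *p) { p->refs--; }

/* Deferred marking: end I/O, drop the ref, mark afterwards (problematic). */
bool complete_deferred(struct model_page *p)
{
    model_end_io(p);
    model_put_page(p);
    return model_mark_cached(p);
}

/* Mark first, then end I/O and drop the ref (the order the fix describes). */
bool complete_mark_first(struct model_page *p)
{
    bool safe = model_mark_cached(p);

    model_end_io(p);
    model_put_page(p);
    return safe;
}
```

    In this model, complete_deferred() applies the mark to a page that no
    longer has a reference, which corresponds to problem (2) above, while
    complete_mark_first() applies it while the reference is still held.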

    Reported-by: M. Stevens
    Signed-off-by: David Howells
    Reviewed-by: Jeff Layton

    David Howells
     
  • The code that relied on that flag was ripped out of btrfs quite some
    time ago, and never added back. Josef indicated that he was going to
    take a different approach to the problem in btrfs, and that we
    could just eliminate this flag.

    Cc: Josef Bacik
    Signed-off-by: Jeff Layton
    Signed-off-by: Al Viro

    Jeff Layton
     
  • Pull IOMMU updates from Joerg Roedel:
    "A few new features this merge-window. The most important one is
    probably that dma-debug now warns if a dma-handle is not checked with
    dma_mapping_error by the device driver. This requires minor changes
    to some architectures which make use of dma-debug. Most of these
    changes have the respective Acks by the Arch-Maintainers.

    Besides that, there are updates to the AMD IOMMU driver to refactor
    the IOMMU-Groups support and to make sure it does not trigger a
    hardware erratum.

    The OMAP changes (for which I pulled in a branch from Tony Lindgren's
    tree) have a conflict in linux-next with the arm-soc tree. The
    conflict is in the file arch/arm/mach-omap2/clock44xx_data.c which is
    deleted in the arm-soc tree. It is safe to resolve the conflict by
    deleting the file too. Similar changes are done in the arm-soc tree in
    the common clock framework migration. A missing hunk from the patch
    in the IOMMU tree will be submitted as a separate patch when the
    merge-window is closed."
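
    The check that dma-debug now warns about can be sketched as a driver
    pattern: map a buffer, then test the returned handle before using it.
    The code below is a userspace stand-in, not kernel code. Every mock_*
    name, the MOCK_DMA_ERROR sentinel, and queue_buffer() are assumptions
    made for illustration; they only mimic the shape of a map call followed
    by a dma_mapping_error-style check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins, NOT the kernel API. */
typedef uint64_t mock_dma_addr_t;
#define MOCK_DMA_ERROR ((mock_dma_addr_t)~0ULL)

/* Hypothetical mapper: fails for NULL buffers or zero length. */
mock_dma_addr_t mock_dma_map(const void *buf, size_t len)
{
    if (buf == NULL || len == 0)
        return MOCK_DMA_ERROR;
    return (mock_dma_addr_t)(uintptr_t)buf;
}

/* Hypothetical error test: nonzero if the handle is the error sentinel. */
int mock_dma_mapping_error(mock_dma_addr_t addr)
{
    return addr == MOCK_DMA_ERROR;
}

/*
 * Driver-style helper: map, then check the handle before using it.
 * Returns 0 on success and -1 if the mapping failed.
 */
int queue_buffer(const void *buf, size_t len, mock_dma_addr_t *out)
{
    mock_dma_addr_t handle = mock_dma_map(buf, len);

    if (mock_dma_mapping_error(handle))
        return -1;  /* never hand an unchecked handle to the device */
    *out = handle;
    return 0;
}
```

    The point of the pattern is that the handle is never used before it has
    been checked; a real driver would use the kernel's own mapping and
    error-checking calls rather than these stand-ins.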

    * tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (29 commits)
    ARM: dma-mapping: support debug_dma_mapping_error
    ARM: OMAP4: hwmod data: ipu and dsp to use parent clocks instead of leaf clocks
    iommu/omap: Adapt to runtime pm
    iommu/omap: Migrate to hwmod framework
    iommu/omap: Keep mmu enabled when requested
    iommu/omap: Remove redundant clock handling on ISR
    iommu/amd: Remove obsolete comment
    iommu/amd: Don't use 512GB pages
    iommu/tegra: smmu: Move bus_set_iommu after probe for multi arch
    iommu/tegra: gart: Move bus_set_iommu after probe for multi arch
    iommu/tegra: smmu: Remove unnecessary PTC/TLB flush all
    tile: dma_debug: add debug_dma_mapping_error support
    sh: dma_debug: add debug_dma_mapping_error support
    powerpc: dma_debug: add debug_dma_mapping_error support
    mips: dma_debug: add debug_dma_mapping_error support
    microblaze: dma-mapping: support debug_dma_mapping_error
    ia64: dma_debug: add debug_dma_mapping_error support
    c6x: dma_debug: add debug_dma_mapping_error support
    ARM64: dma_debug: add debug_dma_mapping_error support
    intel-iommu: Prevent devices with RMRRs from being placed into SI Domain
    ...

    Linus Torvalds
     
  • Pull virtio update from Rusty Russell:
    "Some nice cleanups, and even a patch my wife did as a "live" demo for
    Latinoware 2012.

    There's a slightly non-trivial merge in virtio-net, as we cleaned up
    the virtio add_buf interface while DaveM accepted the mq virtio-net
    patches."
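
    The add_buf cleanup mentioned above changes what callers may assume
    about the return value: per the commit titles in this pull, it is 0 on
    success rather than the remaining capacity, and the free count lives in
    the queue structure itself. A minimal userspace sketch of that calling
    convention follows. struct model_vq, model_add_buf(), and model_xmit()
    are hypothetical names for illustration and are not the kernel's
    definitions.

```c
#include <assert.h>

/* Hypothetical model of a virtqueue; not the kernel's struct virtqueue. */
struct model_vq {
    unsigned int num_free;  /* free descriptors left in the ring */
};

/* New-style add: 0 on success, negative on failure, never a capacity. */
int model_add_buf(struct model_vq *vq, unsigned int descs)
{
    if (descs == 0 || descs > vq->num_free)
        return -1;  /* stand-in for a "no space" error code */
    vq->num_free -= descs;
    return 0;
}

/* Caller checks only for an error, then reads capacity from the queue. */
int model_xmit(struct model_vq *vq, unsigned int descs, unsigned int *left)
{
    int err = model_add_buf(vq, descs);

    if (err < 0)
        return err;
    *left = vq->num_free;
    return 0;
}
```

    A caller written this way does not break if the add function stops
    returning capacity, because it never interprets a positive return value.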

    * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (27 commits)
    virtio_console: Add support for remoteproc serial
    virtio_console: Merge struct buffer_token into struct port_buffer
    virtio: add drv_to_virtio to make code clearly
    virtio: use dev_to_virtio wrapper in virtio
    virtio-mmio: Fix irq parsing in command line parameter
    virtio_console: Free buffers from out-queue upon close
    virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>(
    virtio_console: Use kmalloc instead of kzalloc
    virtio_console: Free buffer if splice fails
    virtio: tools: make it clear that virtqueue_add_buf() no longer returns > 0
    virtio: scsi: make it clear that virtqueue_add_buf() no longer returns > 0
    virtio: rpmsg: make it clear that virtqueue_add_buf() no longer returns > 0
    virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0
    virtio: console: make it clear that virtqueue_add_buf() no longer returns > 0
    virtio: make virtqueue_add_buf() returning 0 on success, not capacity.
    virtio: console: don't rely on virtqueue_add_buf() returning capacity.
    virtio_net: don't rely on virtqueue_add_buf() returning capacity.
    virtio-net: remove unused skb_vnet_hdr->num_sg field
    virtio-net: correct capacity math on ring full
    virtio: move queue_index and num_free fields into core struct virtqueue.
    ...

    Linus Torvalds
     

20 Dec, 2012

4 commits

  • Pull sound fixes from Takashi Iwai:
    "Overall, this update contains only driver-specific fixes. Slightly
    large LOC are seen in the usb-audio driver for a couple of new device
    quirks and in the cs42l73 ASoC driver for enhanced features. The others
    are a few small (regression) fixes for HD-audio, and yet other small /
    trivial ASoC fixes."

    * tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    ALSA: usb-audio: Support for Digidesign Mbox 2 USB sound card:
    ALSA: HDA: Fix sound resume hang
    ALSA: hda - bug fix for invalid connection list of Haswell HDMI codec pins
    ALSA: hda - Fix the wrong pincaps set in ALC861VD dallas/hp fixup
    ALSA: hda - Set codec->single_adc_amp flag for Realtek codecs
    ASoC: atmel-ssc: change disable to disabled in dts node
    ASoC: Prevent pop_wait overwrite
    ALSA: usb-audio: ignore-quirk for HP Wireless Audio
    ALSA: hda - Always turn on pins for HDMI/DP
    ALSA: hda - Fix pin configuration of HP Pavilion dv7
    ASoC: core: Fix splitting of log messages
    ASoC: cs42l73: Change VSPIN/VSPOUT to VSPINOUT
    ASoC: cs42l73: Add DAPM events for power down.
    ASoC: cs42l73: Add DMIC's as DAPM inputs.
    ASoC: sigmadsp: Fix endianness conversion issue
    ASoC: tpa6130a2: Use devm_* APIs

    Linus Torvalds
     
  • Pull ARM SoC fixes from Olof Johansson:
    "This is a batch of fixes for arm-soc platforms, most of it is for OMAP
    but there are others too (i.MX, Tegra, ep93xx). Fixes warnings, some
    broken platforms and drivers, etc. A bit all over the map really."

    There was some concern about commit 68136b10 ("ARM: sunxi: Change device
    tree naming scheme for sunxi"), but Tony says:
    "Looks like that's trivial to fix as needed, no need to rebuild the
    branch to fix that AFAIK.

    The fix can be done once Olof is available online again.

    Linus, I suggest that you go ahead and pull this if there are no other
    issues with this branch."

    * tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (32 commits)
    ARM: sunxi: Change device tree naming scheme for sunxi
    ARM: ux500: fix missing include
    ARM: u300: delete custom pin hog code
    ARM: davinci: fix build break due to missing include
    ARM: exynos: Fix warning due to missing 'inline' in stub
    ARM: imx: Move platform-mx2-emma to arch/arm/mach-imx/devices
    ARM i.MX51 clock: Fix regression since enabling MIPI/HSP clocks
    ARM: dts: mx27: Fix the AIPI bus for FEC
    ARM: OMAP2+: common: remove use of vram
    ARM: OMAP3/4: cpuidle: fix sparse and checkpatch warnings
    ARM: OMAP4: clock data: DPLLs are missing bypass clocks in their parent lists
    ARM: OMAP4: clock data: div_iva_hs_clk is a power-of-two divider
    ARM: OMAP4: Fix EMU clock domain always on
    ARM: OMAP4460: Workaround ABE DPLL failing to turn-on
    ARM: OMAP4: Enhance support for DPLLs with 4X multiplier
    ARM: OMAP4: Add function table for non-M4X dplls
    ARM: OMAP4: Update timer clock aliases
    ARM: OMAP: Move plat/omap-serial.h to include/linux/platform_data/serial-omap.h
    ARM: dts: Add build target for omap4-panda-a4
    ARM: dts: OMAP2420: Correct H4 board memory size
    ...

    Linus Torvalds
     
  • Documentation says that code requiring dma-buf should add it to
    select, so inline fallbacks are not going to be used. A link error
    will make it obvious what went wrong, instead of silently doing
    nothing at runtime.

    Signed-off-by: Maarten Lankhorst
    Reviewed-by: Daniel Vetter
    Reviewed-by: Rob Clark
    Signed-off-by: Sumit Semwal

    Maarten Lankhorst
     
  • Pull networking fixes from David Miller:

    1) Really fix tuntap SKB use after free bug, from Eric Dumazet.

    2) Adjust SKB data pointer to point past the transport header before
    calling icmpv6_notify() so that the headers are in the state which
    that function expects. From Duan Jiong.

    3) Fix ambiguities in the new tuntap multi-queue APIs. From Jason
    Wang.

    4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov.

    5) Don't destroy mutex after freeing up device private in mac802154,
    fix also from Konstantin Khlebnikov.

    6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch.

    7) SCTP HMAC kconfig rework, from Neil Horman.

    8) Fix SCTP jprobes function signature, otherwise things explode, from
    Daniel Borkmann.

    9) Fix typo in ipv6-offload Makefile variable reference, from Simon
    Arlott.

    10) Don't fail USBNET open just because remote wakeup isn't supported,
    from Oliver Neukum.

    11) be2net driver bug fixes from Sathya Perla.

    12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David
    Woodhouse.

    13) Fix MTU changing regression in 8139cp driver, from John Greene.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
    solos-pci: ensure all TX packets are aligned to 4 bytes
    solos-pci: add firmware upgrade support for new models
    solos-pci: remove superfluous debug output
    solos-pci: add GPIO support for newer versions on Geos board
    8139cp: Prevent dev_close/cp_interrupt race on MTU change
    net: qmi_wwan: add ZTE MF880
    drivers/net: Use of_match_ptr() macro in smsc911x.c
    drivers/net: Use of_match_ptr() macro in smc91x.c
    ipv6: addrconf.c: remove unnecessary "if"
    bridge: Correctly encode addresses when dumping mdb entries
    bridge: Do not unregister all PF_BRIDGE rtnl operations
    use generic usbnet_manage_power()
    usbnet: generic manage_power()
    usbnet: handle PM failure gracefully
    ksz884x: fix receive polling race condition
    qlcnic: update driver version
    qlcnic: fix unused variable warnings
    net: fec: forbid FEC_PTP on SoCs that do not support
    be2net: fix wrong frag_idx reported by RX CQ
    be2net: fix be_close() to ensure all events are ack'ed
    ...

    Linus Torvalds