09 Apr, 2014

9 commits

  • Pull nfsd updates from Bruce Fields:
    "Highlights:
    - server-side nfs/rdma fixes from Jeff Layton and Tom Tucker
    - xdr fixes (a larger xdr rewrite has been posted but I decided it
    would be better to queue it up for 3.16).
    - miscellaneous fixes and cleanup from all over (thanks especially to
    Kinglong Mee)"

    * 'for-3.15' of git://linux-nfs.org/~bfields/linux: (36 commits)
    nfsd4: don't create unnecessary mask acl
    nfsd: revert v2 half of "nfsd: don't return high mode bits"
    nfsd4: fix memory leak in nfsd4_encode_fattr()
    nfsd: check passed socket's net matches NFSd superblock's one
    SUNRPC: Clear xpt_bc_xprt if xs_setup_bc_tcp failed
    NFSD/SUNRPC: Check rpc_xprt out of xs_setup_bc_tcp
    SUNRPC: New helper for creating client with rpc_xprt
    NFSD: Free backchannel xprt in bc_destroy
    NFSD: Clear wcc data between compound ops
    nfsd: Don't return NFS4ERR_STALE_STATEID for NFSv4.1+
    nfsd4: fix nfs4err_resource in 4.1 case
    nfsd4: fix setclientid encode size
    nfsd4: remove redundant check from nfsd4_check_resp_size
    nfsd4: use more generous NFS4_ACL_MAX
    nfsd4: minor nfsd4_replay_cache_entry cleanup
    nfsd4: nfsd4_replay_cache_entry should be static
    nfsd4: update comments with obsolete function name
    rpc: Allow xdr_buf_subsegment to operate in-place
    NFSD: Using free_conn free connection
    SUNRPC: fix memory leak of peer addresses in XPRT
    ...

    Linus Torvalds
     
  • My static checker suggests adding curly braces here. Probably that was
    the intent, but actually the code works the same either way. I've just
    changed the indenting and left the code as-is.

    Signed-off-by: Dan Carpenter
    Cc: Petr Vandrovec
    Acked-by: Dave Chiluk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Carpenter
     
  • Conversions to ncp_dbg showed some format/argument mismatches so fix
    them.

    Signed-off-by: Joe Perches
    Cc: Petr Vandrovec
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Uses are gone, remove the macro.

    Signed-off-by: Joe Perches
    Cc: Petr Vandrovec
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Use a more current logging style.

    Convert the paranoia debug statement to vdbg.
    Remove the embedded function names as dynamic_debug can do that.

    Signed-off-by: Joe Perches
    Cc: Petr Vandrovec
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Use a more current logging style and enable use of dynamic debugging.

    Remove embedded function names, dynamic debug can add this instead.

    Signed-off-by: Joe Perches
    Cc: Petr Vandrovec
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Convert to a more current logging style.

    Add pr_fmt to prefix with "ncpfs: ".
    Remove the embedded function names and use "%s: ", __func__

    Some previously unprefixed messages now have "ncpfs: "

    Signed-off-by: Joe Perches
    Cc: Petr Vandrovec
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • There wasn't any check of the size passed from userspace before trying
    to allocate the memory required.

    This meant that userspace might request more space than allowed,
    triggering an OOM.

    Signed-off-by: Sasha Levin
    Signed-off-by: Ian Kent
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
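
The missing bounds check described in this commit can be illustrated with a small user-space model. This is only a sketch in Python, not the kernel patch: `MAX_REQUEST_SIZE` and `copy_request()` are hypothetical names, and the cap value is an assumption chosen for illustration, since the commit message does not state the real limit.

```python
import errno

# Hypothetical upper bound on a request; the real limit is not given in the
# commit message.
MAX_REQUEST_SIZE = 4096


def copy_request(declared_size: int, payload: bytes) -> bytearray:
    """Allocate a buffer for a caller-declared size, but only after checking it."""
    # Reject the size before any memory is allocated on the caller's behalf.
    if declared_size < 0 or declared_size > MAX_REQUEST_SIZE:
        raise OSError(errno.EINVAL, "declared size out of bounds")
    buf = bytearray(declared_size)
    n = min(declared_size, len(payload))
    buf[:n] = payload[:n]
    return buf
```

The point of the ordering is that an oversized request fails cheaply, before the allocator is ever asked for the memory.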
     
  • Pull drm updates from Dave Airlie:
    "Highlights:

    - drm:

    Generic display port aux features, primary plane support, drm
    master management fixes, logging cleanups, enforced locking checks
    (instead of docs), documentation improvements, minor number
    handling cleanup, pseudofs for shared inodes.

    - ttm:

    add ability to allocate from both ends

    - i915:

    broadwell features, power domain and runtime pm, per-process
    address space infrastructure (not enabled)

    - msm:

    power management, hdmi audio support

    - nouveau:

    ongoing GPU fault recovery, initial maxwell support, random fixes

    - exynos:

    refactored driver to clean up a lot of abstraction, DP support
    moved into drm, LVDS bridge support added, parallel panel support

    - gma500:

    SGX MMU support, SGX irq handling, asle irq work fixes

    - radeon:

    video engine bringup, ring handling fixes, use dp aux helpers

    - vmwgfx:

    add rendernode support"

    * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (849 commits)
    DRM: armada: fix corruption while loading cursors
    drm/dp_helper: don't return EPROTO for defers (v2)
    drm/bridge: export ptn3460_init function
    drm/exynos: remove MODULE_DEVICE_TABLE definitions
    ARM: dts: exynos4412-trats2: enable exynos/fimd node
    ARM: dts: exynos4210-trats: enable exynos/fimd node
    ARM: dts: exynos4412-trats2: add panel node
    ARM: dts: exynos4210-trats: add panel node
    ARM: dts: exynos4: add MIPI DSI Master node
    drm/panel: add S6E8AA0 driver
    ARM: dts: exynos4210-universal_c210: add proper panel node
    drm/panel: add ld9040 driver
    panel/ld9040: add DT bindings
    panel/s6e8aa0: add DT bindings
    drm/exynos: add DSIM driver
    exynos/dsim: add DT bindings
    drm/exynos: disallow fbdev initialization if no device is connected
    drm/mipi_dsi: create dsi devices only for nodes with reg property
    drm/mipi_dsi: add flags to DSI messages
    Skip intel_crt_init for Dell XPS 8700
    ...

    Linus Torvalds
     

08 Apr, 2014

28 commits

  • Pull ext3 improvements, cleanups, reiserfs fix from Jan Kara:
    "various cleanups for ext2, ext3, udf, isofs, a documentation update
    for quota, and a fix of a race in reiserfs readdir implementation"

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    reiserfs: fix race in readdir
    ext2: acl: remove unneeded include of linux/capability.h
    ext3: explicitly remove inode from orphan list after failed direct io
    fs/isofs/inode.c add __init to init_inodecache()
    ext3: Speedup WB_SYNC_ALL pass
    fs/quota/Kconfig: Update filesystems
    ext3: Update outdated comment before ext3_ordered_writepage()
    ext3: Update PF_MEMALLOC handling in ext3_write_inode()
    ext2/3: use prandom_u32() instead of get_random_bytes()
    ext3: remove an unneeded check in ext3_new_blocks()
    ext3: remove unneeded check in ext3_ordered_writepage()
    fs: Mark function as static in ext3/xattr_security.c
    fs: Mark function as static in ext3/dir.c
    fs: Mark function as static in ext2/xattr_security.c
    ext3: Add __init macro to init_inodecache
    ext2: Add __init macro to init_inodecache
    udf: Add __init macro to init_inodecache
    fs: udf: parse_options: blocksize check

    Linus Torvalds
     
  • Merge second patch-bomb from Andrew Morton:
    - the rest of MM
    - zram updates
    - zswap updates
    - exit
    - procfs
    - exec
    - wait
    - crash dump
    - lib/idr
    - rapidio
    - adfs, affs, bfs, ufs
    - cris
    - Kconfig things
    - initramfs
    - small amount of IPC material
    - percpu enhancements
    - early ioremap support
    - various other misc things

    * emailed patches from Andrew Morton: (156 commits)
    MAINTAINERS: update Intel C600 SAS driver maintainers
    fs/ufs: remove unused ufs_super_block_third pointer
    fs/ufs: remove unused ufs_super_block_second pointer
    fs/ufs: remove unused ufs_super_block_first pointer
    fs/ufs/super.c: add __init to init_inodecache()
    doc/kernel-parameters.txt: add early_ioremap_debug
    arm64: add early_ioremap support
    arm64: initialize pgprot info earlier in boot
    x86: use generic early_ioremap
    mm: create generic early_ioremap() support
    x86/mm: sparse warning fix for early_memremap
    lglock: map to spinlock when !CONFIG_SMP
    percpu: add preemption checks to __this_cpu ops
    vmstat: use raw_cpu_ops to avoid false positives on preemption checks
    slub: use raw_cpu_inc for incrementing statistics
    net: replace __this_cpu_inc in route.c with raw_cpu_inc
    modules: use raw_cpu_write for initialization of per cpu refcount.
    mm: use raw_cpu ops for determining current NUMA node
    percpu: add raw_cpu_ops
    slub: fix leak of 'name' in sysfs_slab_add
    ...

    Linus Torvalds
     
  • Pointer 'usb3' to struct ufs_super_block_third acquired via
    ubh_get_usb_third() is never used in function
    ufs_read_cylinder_structures(). Thus remove it.

    Detected by Coverity: CID 139939.

    Signed-off-by: Christian Engelmayer
    Cc: Evgeniy Dushistov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christian Engelmayer
     
  • Pointer 'usb2' to struct ufs_super_block_second acquired via
    ubh_get_usb_second() is never used in function ufs_statfs(). Thus
    remove it.

    Detected by Coverity: CID 139940.

    Signed-off-by: Christian Engelmayer
    Cc: Evgeniy Dushistov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christian Engelmayer
     
  • Remove occurrences of unused pointers to struct ufs_super_block_first
    that were acquired via ubh_get_usb_first().

    Detected by Coverity: CID 139929 - CID 139936, CID 139940.

    Signed-off-by: Christian Engelmayer
    Cc: Evgeniy Dushistov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christian Engelmayer
     
  • init_inodecache is only called by __init init_ufs_fs.

    Signed-off-by: Fabian Frederick
    Cc: Evgeniy Dushistov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • /proc/self/make-it-fail is a boolean, but accepts any number, including
    negative ones. Change variable to unsigned, and cap upper bound at 1.

    [akpm@linux-foundation.org: don't make make_it_fail unsigned]
    Signed-off-by: Dave Jones
    Reviewed-by: Akinobu Mita
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jones
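
As a rough illustration of the range check this commit describes, here is a user-space sketch in Python. `parse_make_it_fail()` is a hypothetical helper, and rejecting out-of-range input with EINVAL (rather than clamping it) is an assumption; the commit message only says the upper bound is capped at 1.

```python
import errno


def parse_make_it_fail(text: str) -> int:
    """Parse a value written to a boolean knob, accepting only 0 or 1."""
    try:
        value = int(text.strip())
    except ValueError:
        raise OSError(errno.EINVAL, "not an integer") from None
    # Cap the range: negative numbers and anything above 1 are rejected.
    if value < 0 or value > 1:
        raise OSError(errno.EINVAL, "value must be 0 or 1")
    return value
```

Keeping the stored value signed while checking both bounds mirrors the bracketed note that the variable was not made unsigned after all.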
     
  • init_inodecache is only called by __init init_bfs_fs

    Signed-off-by: Fabian Frederick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • Normal behavior for filenames exceeding specific filesystem limits is to
    refuse operation.

    The AFFS standard name length is only 30 characters, against 255 for
    most Linux filesystems. The original implementation truncates filenames
    by default; the AFFS_NO_TRUNCATE define can turn that off, but only by
    recompiling the module.

    This patch adds a 'nofilenametruncate' mount option so that users can
    easily activate that feature and avoid a lot of problems (e.g.
    overwriting files).

    Signed-off-by: Fabian Frederick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
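
The overwrite problem mentioned in this commit comes from two long names collapsing to the same truncated name. The following Python sketch models that; `affs_store_name()` is a hypothetical helper, and returning ENAMETOOLONG in the no-truncate case is an assumption based on the "refuse operation" wording.

```python
import errno

AFFS_NAME_MAX = 30  # name length limit quoted in the commit message


def affs_store_name(name: str, no_truncate: bool) -> str:
    """Return the name as it would be stored, or refuse if it is too long."""
    if len(name) <= AFFS_NAME_MAX:
        return name
    if no_truncate:
        # Behave like most filesystems: refuse names over the limit.
        raise OSError(errno.ENAMETOOLONG, "file name too long")
    # Default behaviour: silently cut the name down to the limit.
    return name[:AFFS_NAME_MAX]
```

With truncation enabled, `"x" * 30 + "_one"` and `"x" * 30 + "_two"` map to the same stored name, so writing the second would hit the first file.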
     
  • Commit 0edf977d2ae3 ("[readdir] convert affs") returns -EIO directly
    without unlocking the dir inode and releasing the dir bh when the second
    affs_bread sequence fails. This patch restores the initial behaviour. It
    also fixes pr_debug and affs_error to fit in 80 columns and removes the
    reference to filldir (replaced by dir_emit in the commit above).

    Signed-off-by: Fabian Frederick
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • init_inodecache is only called by __init init_affs_fs

    Signed-off-by: Fabian Frederick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • init_inodecache is only called by __init init_adfs_fs.

    Signed-off-by: Fabian Frederick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • Currently when an empty PT_NOTE is detected, vmcore initialization
    fails. That is too harsh, because a PT_NOTE can legitimately be empty:
    for example, someone offlined a cpu but never restarted the kdump
    service, so after a crash the PT_NOTE program header is there but
    contains no data. It's better to warn about the empty PT_NOTE and
    continue to initialise vmcore.

    Ultimately the multiple PT_NOTEs are merged into a single one, and all
    empty PT_NOTEs are discarded naturally during the merge. So an empty
    PT_NOTE is not visible to user space and vmcore is as good as expected.

    Signed-off-by: WANG Chao
    Cc: Vivek Goyal
    Cc: HATAYAMA Daisuke
    Cc: Greg Pearson
    Cc: Baoquan He
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    WANG Chao
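
The warn-and-continue behaviour from this commit can be sketched in a few lines of Python. This is a toy model, not the vmcore code: `merge_notes()` is a hypothetical function and each note segment is represented as a plain byte string.

```python
from typing import List, Tuple


def merge_notes(segments: List[bytes]) -> Tuple[bytes, List[str]]:
    """Concatenate note segments, warning about (and skipping) empty ones."""
    merged = bytearray()
    warnings: List[str] = []
    for index, data in enumerate(segments):
        if not data:
            # Warn and carry on instead of failing the whole initialization.
            warnings.append(f"segment {index}: empty PT_NOTE, skipped")
            continue
        merged += data
    return bytes(merged), warnings
```

Because empty segments contribute nothing to the merged result, the consumer of the merged note never sees them.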
     
  • Eliminate the following warning in proc/vmcore.c:

    fs/proc/vmcore.c:1088:6: warning: no previous prototype for `vmcore_cleanup' [-Wmissing-prototypes]

    [akpm@linux-foundation.org: clean up powerpc, remove unneeded EXPORT_SYMBOL]
    Signed-off-by: Rashika Kheria
    Reviewed-by: Josh Triplett
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rashika Kheria
     
  • get_task_state() uses the most significant bit to report the state to
    user-space. This means that the EXIT_ZOMBIE->EXIT_TRACE->EXIT_DEAD
    transition can be noticed via /proc as a Z -> X -> Z change. Note that
    this was possible even before EXIT_TRACE was introduced.

    This is not really bad but imho it makes sense to hide EXIT_TRACE from
    user-space completely. So the patch simply swaps EXIT_ZOMBIE and
    EXIT_DEAD; this way EXIT_TRACE will be seen as EXIT_ZOMBIE by user-space.

    Signed-off-by: Oleg Nesterov
    Cc: Jan Kratochvil
    Cc: Michal Schmidt
    Cc: Al Viro
    Cc: Lennart Poettering
    Cc: Roland McGrath
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
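
A small Python model can show why swapping the two bits changes what user-space sees. It assumes that EXIT_TRACE is represented as both bits set at once, which the message implies but does not spell out; the bit values `LOW` and `HIGH` and both helper functions are illustrative, not taken from kernel headers.

```python
def reported_letter(state_bits: int, letters: dict) -> str:
    """Return the letter for the most significant set bit of state_bits."""
    msb = 1 << (state_bits.bit_length() - 1)
    return letters[msb]


def letters_for(zombie_bit: int, dead_bit: int) -> dict:
    """Map each exit-state bit to the letter shown in /proc."""
    return {zombie_bit: "Z", dead_bit: "X"}


LOW, HIGH = 0x10, 0x20  # illustrative bit values (assumption)

# Before: EXIT_ZOMBIE is the lower bit, so a state with both bits reports "X".
before = reported_letter(LOW | HIGH, letters_for(zombie_bit=LOW, dead_bit=HIGH))
# After the swap: EXIT_ZOMBIE is the higher bit, so the same state reports "Z".
after = reported_letter(LOW | HIGH, letters_for(zombie_bit=HIGH, dead_bit=LOW))
```

Only the assignment of bits to states changes; the most-significant-bit reporting logic stays as it was.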
     
  • Starting from commit c4ad8f98bef7 ("execve: use 'struct filename *' for
    executable name passing") bprm->filename can not go away after
    flush_old_exec(), so we do not need to save the binary name in
    bprm->tcomm[] added by 96e02d158678 ("exec: fix use-after-free bug in
    setup_new_exec()").

    And there was never any need for filename_to_taskname-like code; we can
    simply do set_task_comm(kbasename(filename)).

    This patch has to change set_task_comm() and trace_task_rename() to
    accept "const char *", but I think this change is also good.

    Signed-off-by: Oleg Nesterov
    Cc: Heiko Carstens
    Cc: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
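
The effect of `set_task_comm(kbasename(filename))` can be modelled in user space as "take the part after the last slash, then truncate to the comm buffer". The Python below is a sketch of that idea only; `task_comm_from_filename()` is a hypothetical name, and it works on characters rather than bytes for simplicity.

```python
TASK_COMM_LEN = 16  # size of the kernel's comm buffer, including the trailing NUL


def task_comm_from_filename(filename: str) -> str:
    """Model set_task_comm(kbasename(filename)): basename, then truncate."""
    basename = filename.rsplit("/", 1)[-1]
    return basename[: TASK_COMM_LEN - 1]
```

For example, `task_comm_from_filename("/usr/bin/python3")` yields `"python3"`, and longer basenames are cut to 15 characters.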
     
  • The /proc/*/pagemap files contain sensitive information and currently
    their mode is 0444. Change this to 0400, so the VFS will prevent unprivileged
    processes from getting file descriptors on arbitrary privileged
    /proc/*/pagemap files.

    This reduces the scope of address space leaking and bypasses by protecting
    already running processes.

    Signed-off-by: Djalal Harouni
    Acked-by: Kees Cook
    Acked-by: Andy Lutomirski
    Cc: Eric W. Biederman
    Cc: Al Viro
    Cc: Oleg Nesterov
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Djalal Harouni
     
  • These procfs files contain sensitive information and currently their
    mode is 0444. Change this to 0400, so the VFS will be able to block
    unprivileged processes from getting file descriptors on arbitrary
    privileged /proc/*/{stack,syscall,personality} files.

    This reduces the scope of ASLR leaking and bypasses by protecting already
    running processes.

    Signed-off-by: Djalal Harouni
    Acked-by: Kees Cook
    Acked-by: Andy Lutomirski
    Cc: Eric W. Biederman
    Cc: Al Viro
    Cc: Oleg Nesterov
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Djalal Harouni
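
The difference between the two modes in the procfs commits above can be checked with Python's standard `stat` module. This is only an illustration of the permission bits; `describe()` and `others_may_open_for_read()` are hypothetical helpers and do not model the additional checks procfs itself performs.

```python
import stat


def describe(mode: int) -> str:
    """Render permission bits the way `ls -l` does for a regular file."""
    return stat.filemode(stat.S_IFREG | mode)


def others_may_open_for_read(mode: int) -> bool:
    """True if the 'other' read bit is set in mode."""
    return bool(mode & stat.S_IROTH)
```

With mode 0444 the "other" read bit is set, so the open succeeds for any user and later checks are the only protection; with 0400 the open itself is refused for non-owners.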
     
  • Replace rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL)

    The rcu_assign_pointer() ensures that the initialization of a structure
    is carried out before storing a pointer to that structure. And in the
    case of the NULL pointer, there is no structure to initialize. So,
    rcu_assign_pointer(p, NULL) can be safely converted to
    RCU_INIT_POINTER(p, NULL)

    Signed-off-by: Monam Agarwal
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Monam Agarwal
     
  • Currently we don't have a way to determine from which mount point a
    file has been opened. This information is required for proper dumping
    and restoring of file descriptors due to the presence of mount
    namespaces. It's possible that two file descriptors are opened using the
    same paths, but one fd references a mount point from one namespace while
    the other fd references one from another namespace.

    $ ls -l /proc/1/fd/1
    lrwx------ 1 root root 64 Mar 19 23:54 /proc/1/fd/1 -> /dev/null

    $ cat /proc/1/fdinfo/1
    pos: 0
    flags: 0100002
    mnt_id: 16

    $ cat /proc/1/mountinfo | grep ^16
    16 32 0:4 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,size=1013356k,nr_inodes=253339,mode=755

    Signed-off-by: Andrey Vagin
    Acked-by: Pavel Emelyanov
    Acked-by: Cyrill Gorcunov
    Cc: Rob Landley
    Cc: Al Viro
    Cc: Oleg Nesterov
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Vagin
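
The lookup shown in the shell transcript of this commit (read `mnt_id` from fdinfo, then find the matching line in mountinfo) can be sketched in Python. The two parsing helpers are hypothetical names, and the sample strings are copied from the commit message so the example does not depend on the running kernel.

```python
from typing import Dict, Optional


def parse_fdinfo(text: str) -> Dict[str, str]:
    """Turn 'key: value' lines from /proc/<pid>/fdinfo/<fd> into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def mount_point_for(mountinfo: str, mnt_id: int) -> Optional[str]:
    """Find the mount point whose mount ID (first field) equals mnt_id."""
    for line in mountinfo.splitlines():
        parts = line.split()
        if len(parts) >= 5 and parts[0] == str(mnt_id):
            return parts[4]  # fifth field of mountinfo is the mount point
    return None


FDINFO = "pos: 0\nflags: 0100002\nmnt_id: 16\n"
MOUNTINFO = (
    "16 32 0:4 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs "
    "rw,size=1013356k,nr_inodes=253339,mode=755\n"
)

mnt_id = int(parse_fdinfo(FDINFO)["mnt_id"])
print(mount_point_for(MOUNTINFO, mnt_id))
```

On a kernel that has this change, the same functions could be fed the contents of `/proc/self/fdinfo/<fd>` and `/proc/self/mountinfo` instead of the sample strings.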
     
  • It should read "reclaimable slab" and not "reclaimable swap".

    Signed-off-by: Luiz Capitulino
    Reviewed-by: Rik van Riel
    Acked-by: Rafael Aquini
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Luiz Capitulino
     
  • This patch is a continuation of efforts trying to optimize find_vma(),
    avoiding potentially expensive rbtree walks to locate a vma upon faults.
    The original approach (https://lkml.org/lkml/2013/11/1/410), where the
    largest vma was also cached, ended up being too specific and random,
    thus further comparison with other approaches was needed. There are
    two things to consider when dealing with this: the cache hit rate and
    the latency of find_vma(). Improving the hit-rate does not necessarily
    translate into finding the vma any faster, as the overhead of any fancy
    caching schemes can be too high to consider.

    We currently cache the last used vma for the whole address space, which
    provides a nice optimization, reducing the total cycles in find_vma() by
    up to 250%, for workloads with good locality. On the other hand, this
    simple scheme is pretty much useless for workloads with poor locality.
    Analyzing ebizzy runs shows that, no matter how many threads are
    running, the mmap_cache hit rate is less than 2%, and in many situations
    below 1%.

    The proposed approach is to replace this scheme with a small per-thread
    cache, maximizing hit rates at a very low maintenance cost.
    Invalidations are performed by simply bumping up a 32-bit sequence
    number. The only expensive operation is in the rare case of a seq
    number overflow, where all caches that share the same address space are
    flushed. Upon a miss, the proposed replacement policy is based on the
    page number that contains the virtual address in question. Concretely,
    the following results are seen on an 80 core, 8 socket x86-64 box:

    1) System bootup: Most programs are single threaded, so the per-thread
    scheme improves on the ~50% baseline hit rate just by adding a few more
    slots to the cache.

    +----------------+----------+------------------+
    | caching scheme | hit-rate | cycles (billion) |
    +----------------+----------+------------------+
    | baseline       | 50.61%   | 19.90            |
    | patched        | 73.45%   | 13.58            |
    +----------------+----------+------------------+

    2) Kernel build: This one is already pretty good with the current
    approach as we're dealing with good locality.

    +----------------+----------+------------------+
    | caching scheme | hit-rate | cycles (billion) |
    +----------------+----------+------------------+
    | baseline       | 75.28%   | 11.03            |
    | patched        | 88.09%   | 9.31             |
    +----------------+----------+------------------+

    3) Oracle 11g Data Mining (4k pages): Similar to the kernel build workload.

    +----------------+----------+------------------+
    | caching scheme | hit-rate | cycles (billion) |
    +----------------+----------+------------------+
    | baseline       | 70.66%   | 17.14            |
    | patched        | 91.15%   | 12.57            |
    +----------------+----------+------------------+

    4) Ebizzy: There's a fair amount of variation from run to run, but this
    approach always shows nearly perfect hit rates, while the baseline hit
    rate is just about non-existent. The cycle count can fluctuate
    anywhere from ~60 to ~116 for the baseline scheme, but this approach
    reduces it considerably. For instance, with 80 threads:

    +----------------+----------+------------------+
    | caching scheme | hit-rate | cycles (billion) |
    +----------------+----------+------------------+
    | baseline       | 1.06%    | 91.54            |
    | patched        | 99.97%   | 14.18            |
    +----------------+----------+------------------+

    [akpm@linux-foundation.org: fix nommu build, per Davidlohr]
    [akpm@linux-foundation.org: document vmacache_valid() logic]
    [akpm@linux-foundation.org: attempt to untangle header files]
    [akpm@linux-foundation.org: add vmacache_find() BUG_ON]
    [hughd@google.com: add vmacache_valid_mm() (from Oleg)]
    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: adjust and enhance comments]
    Signed-off-by: Davidlohr Bueso
    Reviewed-by: Rik van Riel
    Acked-by: Linus Torvalds
    Reviewed-by: Michel Lespinasse
    Cc: Oleg Nesterov
    Tested-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davidlohr Bueso
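
The per-thread cache with sequence-number invalidation described in this commit can be modelled in user space. The Python below is a simplified sketch, not the kernel implementation: all class and function names are hypothetical, the slot count of 4 and the 4k page size are assumptions, the slow path is a linear scan instead of an rbtree walk, lookups only match a vma that contains the address, and the 32-bit overflow flush is not modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PAGE_SHIFT = 12   # 4k pages (assumption)
CACHE_SLOTS = 4   # "a few more slots"; the exact count is an assumption


@dataclass(frozen=True)
class Vma:
    start: int
    end: int  # exclusive


@dataclass
class AddressSpace:
    vmas: List[Vma]
    seqnum: int = 0

    def invalidate_caches(self) -> None:
        # Bumping the sequence number invalidates every thread's cache at once.
        self.seqnum += 1

    def slow_lookup(self, addr: int) -> Optional[Vma]:
        for vma in self.vmas:
            if vma.start <= addr < vma.end:
                return vma
        return None


@dataclass
class ThreadCache:
    seqnum: int = -1
    slots: List[Optional[Vma]] = field(default_factory=lambda: [None] * CACHE_SLOTS)
    hits: int = 0
    misses: int = 0


def find_vma(mm: AddressSpace, cache: ThreadCache, addr: int) -> Optional[Vma]:
    # A stale sequence number means the address space changed: drop all slots.
    if cache.seqnum != mm.seqnum:
        cache.slots = [None] * CACHE_SLOTS
        cache.seqnum = mm.seqnum
    for vma in cache.slots:
        if vma is not None and vma.start <= addr < vma.end:
            cache.hits += 1
            return vma
    cache.misses += 1
    vma = mm.slow_lookup(addr)
    if vma is not None:
        # Replacement policy: the slot is chosen by the page number of addr.
        cache.slots[(addr >> PAGE_SHIFT) % CACHE_SLOTS] = vma
    return vma
```

The design keeps maintenance cheap: writers never touch per-thread state, they only increment one counter, and each thread discovers the invalidation lazily on its next lookup.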
     
  • filemap_map_pages() is a generic implementation of ->map_pages() for
    filesystems that use the page cache.

    It should be safe to use filemap_map_pages() for ->map_pages() if the
    filesystem uses filemap_fault() for ->fault().

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Linus Torvalds
    Cc: Mel Gorman
    Cc: Rik van Riel
    Cc: Andi Kleen
    Cc: Matthew Wilcox
    Cc: Dave Hansen
    Cc: Alexander Viro
    Cc: Dave Chinner
    Cc: Ning Qu
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • load_elf_binary() sets current->mm->def_flags = def_flags and def_flags
    is always zero. Not only does this look strange, it is also unnecessary
    because mm_init() has already set ->def_flags = 0.

    Signed-off-by: Alex Thorlton
    Suggested-by: Oleg Nesterov
    Cc: Gerald Schaefer
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Christian Borntraeger
    Cc: Paolo Bonzini
    Cc: "Kirill A. Shutemov"
    Cc: Mel Gorman
    Acked-by: Rik van Riel
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Andrea Arcangeli
    Cc: Oleg Nesterov
    Cc: "Eric W. Biederman"
    Cc: Alexander Viro
    Cc: Johannes Weiner
    Cc: David Rientjes
    Cc: Paolo Bonzini
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Thorlton
     
  • - Convert spinlock/static array to va_format (inspired by Joe Perches'
    help on previous logging patches).

    - Convert printk(KERN_ERR to pr_warn in __ntfs_warning.

    - Convert printk(KERN_ERR to pr_err in __ntfs_error.

    - Convert printk(KERN_DEBUG to pr_debug in __ntfs_debug. (Note that
    __ntfs_debug is still guarded by #if DEBUG)

    - Improve !DEBUG to parse all arguments (Joe Perches).

    - Sparse pr_foo() conversions in super.c

    NTFS, NTFS-fs prefixes as well as 'warning' and 'error' were removed :
    pr_foo() automatically adds module name and error level is already
    specified.

    Signed-off-by: Fabian Frederick
    Cc: Anton Altaparmakov
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • Pull Ceph updates from Sage Weil:
    "The biggest chunk is a series of patches from Ilya that add support
    for new Ceph osd and crush map features, including some new tunables,
    primary affinity, and the new encoding that is needed for erasure
    coding support. This brings things into parity with the server side
    and the looming firefly release. There is also support for allocation
    hints in RBD that help limit fragmentation on the server side.

    There is also a series of patches from Zheng fixing NFS reexport,
    directory fragmentation support, flock vs fcntl behavior, and some
    issues with clustered MDS.

    Finally, there are some miscellaneous fixes from Yunchuan Wen for
    fscache, Fabian Frederick for ACLs, and from me for fsync(dirfd)
    behavior"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (79 commits)
    ceph: skip invalid dentry during dcache readdir
    libceph: dump pool {read,write}_tier to debugfs
    libceph: output primary affinity values on osdmap updates
    ceph: flush cap release queue when trimming session caps
    ceph: don't grabs open file reference for aborted request
    ceph: drop extra open file reference in ceph_atomic_open()
    ceph: preallocate buffer for readdir reply
    libceph: enable PRIMARY_AFFINITY feature bit
    libceph: redo ceph_calc_pg_primary() in terms of ceph_calc_pg_acting()
    libceph: add support for osd primary affinity
    libceph: add support for primary_temp mappings
    libceph: return primary from ceph_calc_pg_acting()
    libceph: switch ceph_calc_pg_acting() to new helpers
    libceph: introduce apply_temps() helper
    libceph: introduce pg_to_raw_osds() and raw_to_up_osds() helpers
    libceph: ceph_can_shift_osds(pool) and pool type defines
    libceph: ceph_osd_{exists,is_up,is_down}(osd) definitions
    libceph: enable OSDMAP_ENC feature bit
    libceph: primary_affinity decode bits
    libceph: primary_affinity infrastructure
    ...

    Linus Torvalds
     
  • Pull f2fs updates from Jaegeuk Kim:
    "This patch-set includes the following major enhancement patches.
    - introduce large directory support
    - introduce f2fs_issue_flush to merge redundant flush commands
    - merge write IOs as much as possible aligned to the segment
    - add sysfs entries to tune the f2fs configuration
    - use radix_tree for the free_nid_list to reduce in-memory operations
    - remove costly bit operations in f2fs_find_entry
    - enhance the readahead flow for CP/NAT/SIT/SSA blocks

    The other bug fixes are as follows:
    - recover xattr node blocks correctly after sudden-power-cut
    - fix to calculate the maximum number of node ids
    - enhance to handle many error cases

    And, there are a bunch of cleanups"

    * tag 'for-f2fs-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (62 commits)
    f2fs: fix wrong statistics of inline data
    f2fs: check the acl's validity before setting
    f2fs: introduce f2fs_issue_flush to avoid redundant flush issue
    f2fs: fix to cover io->bio with io_rwsem
    f2fs: fix error path when fail to read inline data
    f2fs: use list_for_each_entry{_safe} for simplyfying code
    f2fs: avoid free slab cache under spinlock
    f2fs: avoid unneeded lookup when xattr name length is too long
    f2fs: avoid unnecessary bio submit when wait page writeback
    f2fs: return -EIO when node id is not matched
    f2fs: avoid RECLAIM_FS-ON-W warning
    f2fs: skip unnecessary node writes during fsync
    f2fs: introduce fi->i_sem to protect fi's info
    f2fs: change reclaim rate in percentage
    f2fs: add missing documentation for dir_level
    f2fs: remove unnecessary threshold
    f2fs: throttle the memory footprint with a sysfs entry
    f2fs: avoid to drop nat entries due to the negative nr_shrink
    f2fs: call f2fs_wait_on_page_writeback instead of native function
    f2fs: introduce nr_pages_to_write for segment alignment
    ...

    Linus Torvalds
     
  • Pull MTD updates from Brian Norris:
    - A few SPI NOR ID definitions
    - Kill the NAND "max pagesize" restriction
    - Fix some x16 bus-width NAND support
    - Add NAND JEDEC parameter page support
    - DT bindings for NAND ECC
    - GPMI NAND updates (subpage reads)
    - More OMAP NAND refactoring
    - New STMicro SPI NOR driver (now in 40 patches!)
    - A few other random bugfixes

    * tag 'for-linus-20140405' of git://git.infradead.org/linux-mtd: (120 commits)
    Fix index regression in nand_read_subpage
    mtd: diskonchip: mem resource name is not optional
    mtd: nand: fix mention to CONFIG_MTD_NAND_ECC_BCH
    mtd: nand: fix GET/SET_FEATURES address on 16-bit devices
    mtd: omap2: Use devm_ioremap_resource()
    mtd: denali_dt: Use devm_ioremap_resource()
    mtd: devices: elm: update DRIVER_NAME as "omap-elm"
    mtd: devices: elm: configure parallel channels based on ecc_steps
    mtd: devices: elm: clean elm_load_syndrome
    mtd: devices: elm: check for hardware engine's design constraints
    mtd: st_spi_fsm: Succinctly reorganise .remove()
    mtd: st_spi_fsm: Allow loop to run at least once before giving up CPU
    mtd: st_spi_fsm: Correct vendor name spelling issue - missing "M"
    mtd: st_spi_fsm: Avoid duplicating MTD core code
    mtd: st_spi_fsm: Remove useless consts from function arguments
    mtd: st_spi_fsm: Convert ST SPI FSM (NOR) Flash driver to new DT partitions
    mtd: st_spi_fsm: Move runtime configurable msg sequences into device's struct
    mtd: st_spi_fsm: Supply the W25Qxxx chip specific configuration call-back
    mtd: st_spi_fsm: Supply the S25FLxxx chip specific configuration call-back
    mtd: st_spi_fsm: Supply the MX25xxx chip specific configuration call-back
    ...

    Linus Torvalds
     

07 Apr, 2014

3 commits

  • If we remove a file that has inline data after mount, our statistics become
    inaccurate.

    cat /sys/kernel/debug/f2fs/status
    - Inline_data Inode: 4294967295

    Let's add stat_inc_inline_inode() to stat inline info of the file when lookup.

    Change log from v1:
    o stat in f2fs_lookup() instead of in do_read_inode() for excluding wrong stat.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • Before setting the acl, call posix_acl_valid() to check if it is
    valid or not.

    Signed-off-by: zhangzhen
    Signed-off-by: Jaegeuk Kim

    ZhangZhen
     
  • Some storage devices show relatively high latencies to complete cache_flush
    commands, even though their normal IO speed is pretty high. In such
    a case, cache_flush commands need to be merged as much as possible to avoid
    issuing them redundantly.
    So, this patch introduces a mount option, "-o flush_merge", to mitigate such
    overhead.

    If this option is enabled by user, F2FS merges the cache_flush commands and then
    issues just one cache_flush on behalf of them. Once the single command is
    finished, F2FS sends a completion signal to all the pending threads.

    Note that this option can be used under a workload consisting of very intensive
    concurrent fsync calls, when the storage handles cache_flush commands slowly.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
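
The merging behaviour described in this commit (many waiters, one cache_flush, then a completion signal to everyone) can be sketched as a toy model. The Python below is single-threaded and purely illustrative; `FlushMerger` and its methods are hypothetical names and do not reflect how F2FS structures its flush thread.

```python
from typing import List


class FlushMerger:
    """Toy model: many pending fsync callers share one cache flush."""

    def __init__(self) -> None:
        self.pending: List[str] = []
        self.completed: List[str] = []
        self.flushes_issued = 0

    def request_flush(self, caller: str) -> None:
        # Callers only queue up; nothing is sent to the device yet.
        self.pending.append(caller)

    def issue_merged_flush(self) -> None:
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        self.flushes_issued += 1      # one cache_flush on behalf of the batch
        self.completed.extend(batch)  # then every waiter is signalled
```

Three queued callers followed by one `issue_merged_flush()` result in a single flush and three completions, which is where the saving comes from on devices with slow cache_flush handling.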