24 Jan, 2014

14 commits

  • Most of the VM_BUG_ON assertions are performed on a page. Usually, when
    one of these assertions fails we'll get a BUG_ON with a call stack and
    the registers.

    I've recently noticed based on the requests to add a small piece of code
    that dumps the page to various VM_BUG_ON sites that the page dump is
    quite useful to people debugging issues in mm.

    This patch adds a VM_BUG_ON_PAGE(cond, page) which beyond doing what
    VM_BUG_ON() does, also dumps the page before executing the actual
    BUG_ON.
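
    The idea can be sketched in plain userspace C. This is only an
    illustration, not the kernel macro: struct sketch_page,
    sketch_dump_page() and SKETCH_BUG_ON_PAGE() are hypothetical names, and
    abort() stands in for BUG().

    ```c
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical stand-in for the kernel's struct page (illustration only). */
    struct sketch_page {
        unsigned long flags;
        int refcount;
        int mapcount;
    };

    /* Format the page state into buf; returns the number of characters needed. */
    static int sketch_dump_page(const struct sketch_page *page, char *buf, size_t len)
    {
        return snprintf(buf, len, "page:%p flags:%#lx count:%d mapcount:%d",
                        (const void *)page, page->flags, page->refcount,
                        page->mapcount);
    }

    /* Dump the page first, then fail hard; abort() stands in for BUG(). */
    #define SKETCH_BUG_ON_PAGE(cond, page)                          \
        do {                                                        \
            if (cond) {                                             \
                char buf_[128];                                     \
                sketch_dump_page((page), buf_, sizeof(buf_));       \
                fprintf(stderr, "%s\n", buf_);                      \
                abort();                                            \
            }                                                       \
        } while (0)
    ```

    A call such as SKETCH_BUG_ON_PAGE(p->refcount < 0, p) is a no-op while
    the condition is false; when it is true, the page state is printed
    before the process stops.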

    [akpm@linux-foundation.org: fix up includes]
    Signed-off-by: Sasha Levin
    Cc: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
     
  • stable_page_flags() checks !PageHuge && PageTransCompound && PageLRU to
    know whether a specified page is a thp. But sometimes that is not enough
    and we fail to detect a thp when it is on a pagevec. This happens only
    for a few seconds after LRU list operations, but it makes it difficult
    to control our applications depending on this flag.

    So this patch adds another check PageAnon to detect thps on pagevec. It
    might not give the future extensibility for thp pagecache, but it's OK
    at least for now.
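
    A minimal sketch of the two predicates, assuming the new PageAnon test
    is accepted as an alternative to PageLRU (that combination is an
    assumption made here for illustration; the message above does not spell
    it out). The struct and function names are hypothetical and model the
    page tests as plain booleans.

    ```c
    #include <stdbool.h>

    /* Hypothetical snapshot of the page tests named in the commit message. */
    struct sketch_page_state {
        bool huge;            /* PageHuge */
        bool trans_compound;  /* PageTransCompound */
        bool lru;             /* PageLRU */
        bool anon;            /* PageAnon */
    };

    /* Check as described before the patch: the page must be on the LRU. */
    static bool sketch_is_thp_old(const struct sketch_page_state *s)
    {
        return !s->huge && s->trans_compound && s->lru;
    }

    /* With the extra test, an anonymous page that is off the LRU still counts. */
    static bool sketch_is_thp_new(const struct sketch_page_state *s)
    {
        return !s->huge && s->trans_compound && (s->lru || s->anon);
    }
    ```

    For a page that is temporarily on a pagevec (lru false, anon true), the
    old predicate reports false and the new one reports true.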

    Signed-off-by: Naoya Horiguchi
    Cc: David Rientjes
    Cc: KOSAKI Motohiro
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
  • The vmalloc was introduced by 33327948782b ("memcgroup: use vmalloc for
    mem_cgroup allocation"), because at that time MAX_NUMNODES was used for
    defining the per-node array in the mem_cgroup structure so that the
    structure could be huge even if the system had only one NUMA node.

    The situation was significantly improved by commit 45cf7ebd5a03 ("memcg:
    reduce the size of struct memcg 244-fold"), which made the size of the
    mem_cgroup structure calculated dynamically depending on the real number
    of NUMA nodes installed on the system (nr_node_ids), so now there is no
    point in using vmalloc here: the structure is allocated rarely and on
    most systems its size is about 1K.
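
    The sizing argument can be illustrated with a flexible array member
    whose length is chosen at run time. This is a userspace sketch with
    hypothetical names and a hypothetical layout, not the real mem_cgroup
    structure; calloc() stands in for the kernel allocator.

    ```c
    #include <stddef.h>
    #include <stdlib.h>

    #define SKETCH_MAX_NUMNODES 1024  /* hypothetical compile-time maximum */

    struct sketch_per_node {
        unsigned long stats[16];
    };

    struct sketch_memcg {
        unsigned long usage;
        struct sketch_per_node *nodeinfo[];  /* one slot per NUMA node */
    };

    /* Size of the structure when it carries nr_nodes per-node slots. */
    static size_t sketch_memcg_size(size_t nr_nodes)
    {
        return sizeof(struct sketch_memcg) +
               nr_nodes * sizeof(struct sketch_per_node *);
    }

    /* Allocate for the nodes actually present, not for the maximum. */
    static struct sketch_memcg *sketch_memcg_alloc(size_t nr_node_ids)
    {
        return calloc(1, sketch_memcg_size(nr_node_ids));
    }
    ```

    Comparing sketch_memcg_size(1) with
    sketch_memcg_size(SKETCH_MAX_NUMNODES) shows why a structure sized by
    the real node count is small enough for an ordinary allocation.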

    Signed-off-by: Vladimir Davydov
    Acked-by: Michal Hocko
    Cc: Glauber Costa
    Cc: Johannes Weiner
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     
  • Since commit ff6a6da60b89 ("mm: accelerate munlock() treatment of THP
    pages") munlock skips tail pages of a munlocked THP page. There is some
    attempt to prevent bad consequences of racing with a THP page split, but
    code inspection indicates that there are two problems that may lead to a
    non-fatal, yet wrong outcome.

    First, __split_huge_page_refcount() copies flags including PageMlocked
    from the head page to the tail pages. Clearing PageMlocked by
    munlock_vma_page() in the middle of this operation might result in part
    of tail pages left with PageMlocked flag. As the head page still
    appears to be a THP page until all tail pages are processed,
    munlock_vma_page() might think it munlocked the whole THP page and skip
    all the former tail pages. Before ff6a6da60, those pages would be
    cleared in further iterations of munlock_vma_pages_range(), but NR_MLOCK
    would still become undercounted (related to the next point).

    Second, NR_MLOCK accounting is based on a call to hpage_nr_pages() after
    PageMlocked is cleared. The accounting might also become inconsistent
    due to a race with __split_huge_page_refcount():

    - undercount when HUGE_PMD_NR is subtracted, but some tail pages are
    left with PageMlocked set and counted again (only possible before
    ff6a6da60)

    - overcount when hpage_nr_pages() sees a normal page (split has already
    finished), but the parallel split has meanwhile cleared PageMlocked from
    additional tail pages

    This patch prevents both problems via extending the scope of lru_lock in
    munlock_vma_page(). This is convenient because:

    - __split_huge_page_refcount() takes lru_lock for its whole operation

    - munlock_vma_page() typically takes lru_lock anyway for page isolation

    As this becomes a second function where page isolation is done with
    lru_lock already held, factor this out to a new
    __munlock_isolate_lru_page() function and clean up the code around.
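
    The locking idea can be sketched in userspace C11: clear the flag and
    read the page count inside one critical section, so that a split, which
    would take the same lock, cannot run between the two steps. All names
    are hypothetical, a simple spinlock built on atomic_flag stands in for
    lru_lock, and SKETCH_HPAGE_NR is an assumed value.

    ```c
    #include <stdatomic.h>
    #include <stdbool.h>

    #define SKETCH_HPAGE_NR 512  /* assumed number of base pages per huge page */

    struct sketch_hpage {
        bool mlocked;  /* stands in for PageMlocked */
        bool is_huge;  /* false once a split has finished */
    };

    static atomic_flag sketch_lru_lock = ATOMIC_FLAG_INIT;
    static long sketch_nr_mlock;  /* stands in for the NR_MLOCK counter */

    static void sketch_lock(void)
    {
        while (atomic_flag_test_and_set(&sketch_lru_lock))
            ;  /* spin until the lock is free */
    }

    static void sketch_unlock(void)
    {
        atomic_flag_clear(&sketch_lru_lock);
    }

    /* Returns how many pages were subtracted from the counter. */
    static int sketch_munlock(struct sketch_hpage *page)
    {
        int nr = 0;

        sketch_lock();
        if (page->mlocked) {
            page->mlocked = false;
            /* The size is read under the same lock that cleared the flag. */
            nr = page->is_huge ? SKETCH_HPAGE_NR : 1;
            sketch_nr_mlock -= nr;
        }
        sketch_unlock();
        return nr;
    }
    ```

    Because the flag test, the flag clear and the size read share one
    critical section, the amount subtracted always matches the state the
    page had when its flag was cleared.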

    [akpm@linux-foundation.org: avoid a coding-style ugly]
    Signed-off-by: Vlastimil Babka
    Cc: Sasha Levin
    Cc: Michel Lespinasse
    Cc: Andrea Arcangeli
    Cc: Rik van Riel
    Cc: Mel Gorman
    Cc: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     
  • bad_page() is cool in that it prints out a bunch of data about the page.
    But, I can never remember which page flags are good and which are bad,
    or whether ->index or ->mapping is required to be NULL.

    This patch allows bad/dump_page() callers to specify a string about why
    they are dumping the page and adds explanation strings to a number of
    places. It also adds a 'bad_flags' argument to bad_page(), which it
    then dumps out separately from the flags which are actually set.

    This way, the messages will show specifically why the page was bad,
    *specifically* which flags it is complaining about, if it was a page
    flag combination which was the problem.
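
    A userspace sketch of the reporting idea: the caller passes a reason
    string and a mask of flags it considers bad, and the report shows the
    offending subset separately from the flags that are set. The function
    name and output format are hypothetical, not the kernel's.

    ```c
    #include <stdio.h>

    /*
     * Write a one-line report into buf. Only flags that are both set on the
     * page and listed in bad_flags are reported as offending.
     */
    static int sketch_bad_page(char *buf, size_t len, const char *reason,
                               unsigned long flags, unsigned long bad_flags)
    {
        unsigned long offending = flags & bad_flags;

        if (offending)
            return snprintf(buf, len, "reason: %s; flags: %#lx; bad flags: %#lx",
                            reason, flags, offending);
        return snprintf(buf, len, "reason: %s; flags: %#lx", reason, flags);
    }
    ```

    With flags 0x14 and a bad-flags mask of 0x10, the report names 0x10 as
    the offending bit; when no set flag is in the mask, only the reason and
    the full flag word are shown.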

    [akpm@linux-foundation.org: switch to pr_alert]
    Signed-off-by: Dave Hansen
    Reviewed-by: Christoph Lameter
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • The "compressor" and "enabled" params are currently hidden. This changes
    them to read-only, so userspace can tell whether zswap is enabled and
    see which compressor is in use.

    Signed-off-by: Dan Streetman
    Cc: Vladimir Murzin
    Cc: Bob Liu
    Cc: Minchan Kim
    Cc: Weijie Yang
    Acked-by: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
     
  • Documentation/vm/locking is a blast from the past. In the entire git
    history, it has had precisely three modifications. Two of those look to
    be pure renames, and the third was from 2005.

    The doc contains such gems as:

    > The page_table_lock is grabbed while holding the
    > kernel_lock spinning monitor.

    > Page stealers hold kernel_lock to protect against a bunch of
    > races.

    Or this which talks about mmap_sem:

    > 4. The exception to this rule is expand_stack, which just
    > takes the read lock and the page_table_lock, this is ok
    > because it doesn't really modify fields anybody relies on.

    expand_stack() no longer takes any locks directly, and the mmap_sem
    acquisition was long ago moved up into the page fault code itself.

    It could be argued that we need to rewrite this, but it is dangerous to
    leave it as-is. It will confuse more people than it helps.

    Signed-off-by: Dave Hansen
    Cc: Hugh Dickins
    Acked-by: Vlastimil Babka
    Cc: Wanpeng Li
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • Sort the exception table at build-time rather than during boot.

    Microblaze is the same case as AARCH64, which is why the EM_MICROBLAZE
    conditional check was added: it allows cross-compilation on machines
    which are not running the latest libc-dev.

    Inspired by AARCH64 commit adace89562c7 ("arm64: extable: sort the
    exception table at build time").
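
    Why sorting ahead of time helps can be sketched in userspace C: once
    the table is ordered by faulting address, a lookup is a plain binary
    search and nothing has to be sorted at boot. The struct layout and
    function names are hypothetical, and qsort() stands in for the
    build-time sorting step.

    ```c
    #include <stdlib.h>

    /* Hypothetical exception table entry (illustration only). */
    struct sketch_extable_entry {
        unsigned long insn;   /* address of the instruction that may fault */
        unsigned long fixup;  /* address to continue at after a fault */
    };

    static int sketch_cmp_ex(const void *a, const void *b)
    {
        const struct sketch_extable_entry *x = a;
        const struct sketch_extable_entry *y = b;

        return (x->insn > y->insn) - (x->insn < y->insn);
    }

    /* Done once, ahead of time; the commit moves this step to the build. */
    static void sketch_sort_extable(struct sketch_extable_entry *t, size_t n)
    {
        qsort(t, n, sizeof(*t), sketch_cmp_ex);
    }

    /* Binary search; only valid on a table that is already sorted. */
    static const struct sketch_extable_entry *
    sketch_search_extable(const struct sketch_extable_entry *t, size_t n,
                          unsigned long addr)
    {
        struct sketch_extable_entry key = { addr, 0 };

        return bsearch(&key, t, n, sizeof(*t), sketch_cmp_ex);
    }
    ```

    After sketch_sort_extable() has run once, every call to
    sketch_search_extable() returns either the matching entry or NULL.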

    Signed-off-by: Michal Simek
    Acked-by: David Daney
    Cc: Catalin Marinas
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Simek
     
  • drivers/staging/comedi/drivers/das6402.c: In function 'intr_handler':
    drivers/staging/comedi/drivers/das6402.c:164:3: error: implicit declaration of function 'outw_p' [-Werror=implicit-function-declaration]
    drivers/staging/speakup/speakup_dtlk.c: In function 'synth_probe':
    drivers/staging/speakup/speakup_dtlk.c:362:2: error: implicit declaration of function 'inw_p' [-Werror=implicit-function-declaration]

    Signed-off-by: Geert Uytterhoeven
    Cc: Mikael Starvik
    Cc: Jesper Nilsson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     
  • Pull UDF & jbd fixes from Jan Kara:
    "A cleanup of JBD log messages and UDF fix of a lockdep warning"

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    udf: Fix lockdep warning from udf_symlink()
    jbd: Revise KERN_EMERG error messages

    Linus Torvalds
     
  • The associative array code creates an unnecessary and potentially
    problematic global variable 'status'. Remove it since it is never used.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    Stephen Hemminger
     
  • Pull fuse update from Miklos Szeredi:
    "This contains a fix for a potential use-after-module-unload bug
    noticed by Al and caching improvements for read-only fuse filesystems
    by Andrew Gallagher"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
    fuse: support clients that don't implement 'open'
    fuse: don't invalidate attrs when not using atime
    fuse: fix SetPageUptodate() condition in STORE
    fuse: fix pipe_buf_operations

    Linus Torvalds
     
  • Pull f2fs updates from Jaegeuk Kim:
    "In this round, a couple of sysfs entries were introduced to tune the
    f2fs at runtime.

    In addition, f2fs starts to support inline_data and improves the
    read/write performance in some workloads by refactoring bio-related
    flows.

    This patch-set includes the following major enhancement patches.
    - support inline_data
    - refactor bio operations such as merge operations and rw type
    assignment
    - enhance the direct IO path
    - enhance bio operations
    - truncate a node page when it becomes obsolete
    - add sysfs entries: small_discards, max_victim_search, and
    in-place-update
    - add a sysfs entry to control max_victim_search

    The other bug fixes are as follows.
    - fix a bug in truncate_partial_nodes
    - avoid warnings during sparse and build process
    - fix error handling flows
    - fix potential bit overflows

    And, there are a bunch of cleanups"

    * tag 'for-f2fs-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (95 commits)
    f2fs: drop obsolete node page when it is truncated
    f2fs: introduce NODE_MAPPING for code consistency
    f2fs: remove the orphan block page array
    f2fs: add help function META_MAPPING
    f2fs: move a branch for code redability
    f2fs: call mark_inode_dirty to flush dirty pages
    f2fs: clean checkpatch warnings
    f2fs: missing REQ_META and REQ_PRIO when sync_meta_pages(META_FLUSH)
    f2fs: avoid f2fs_balance_fs call during pageout
    f2fs: add delimiter to seperate name and value in debug phrase
    f2fs: use spinlock rather than mutex for better speed
    f2fs: move alloc new orphan node out of lock protection region
    f2fs: move grabing orphan pages out of protection region
    f2fs: remove the needless parameter of f2fs_wait_on_page_writeback
    f2fs: update documents and a MAINTAINERS entry
    f2fs: add a sysfs entry to control max_victim_search
    f2fs: improve write performance under frequent fsync calls
    f2fs: avoid to read inline data except first page
    f2fs: avoid to left uninitialized data in page when read inline data
    f2fs: fix truncate_partial_nodes bug
    ...

    Linus Torvalds
     
  • Pull xfs update from Ben Myers:
    "This is primarily bug fixes, many of which you already have. New
    stuff includes a series to decouple the in-memory and on-disk log
    format, helpers in the area of inode clusters, and i_version handling.

    We decided to try to use more topic branches this release, so there
    are some merge commits in there on account of that. I'm afraid I
    didn't do a good job of putting meaningful comments in the first
    couple of merges. Sorry about that. I think I have the hang of it
    now.

    For 3.14-rc1 there are fixes in the areas of remote attributes,
    discard, growfs, memory leaks in recovery, directory v2, quotas, the
    MAINTAINERS file, allocation alignment, extent list locking, and in
    xfs_bmapi_allocate. There are cleanups in xfs_setsize_buftarg,
    removing unused macros, quotas, setattr, and freeing of inode
    clusters. The in-memory and on-disk log format have been decoupled, a
    common helper to calculate the number of blocks in an inode cluster
    has been added, and handling of i_version has been pulled into the
    filesystems that use it.

    - cleanup in xfs_setsize_buftarg
    - removal of remaining unused flags for vop toss/flush/flushinval
    - fix for memory corruption in xfs_attrlist_by_handle
    - fix for out-of-date comment in xfs_trans_dqlockedjoin
    - fix for discard if range length is less than one block
    - fix for overrun of agfl buffer using growfs on v4 superblock
    filesystems
    - pull i_version handling out into the filesystems that use it
    - don't leak recovery items on error
    - fix for memory leak in xfs_dir2_node_removename
    - several cleanups for quotas
    - fix bad assertion in xfs_qm_vop_create_dqattach
    - cleanup for xfs_setattr_mode, and add xfs_setattr_time
    - fix quota assert in xfs_setattr_nonsize
    - fix an infinite loop when turning off group/project quota before
    user quota
    - fix for temporary buffer allocation failure in xfs_dir2_block_to_sf
    with large directory block sizes
    - fix Dave's email address in MAINTAINERS
    - cleanup calculation of freed inode cluster blocks
    - fix alignment of initial file allocations to match filesystem
    geometry
    - decouple in-memory and on-disk log format
    - introduce a common helper to calculate the number of filesystem
    blocks in an inode cluster
    - fixes for extent list locking
    - fix for off-by-one in xfs_attr3_rmt_verify
    - fix for missing destroy_work_on_stack in xfs_bmapi_allocate"

    * tag 'xfs-for-linus-v3.14-rc1' of git://oss.sgi.com/xfs/xfs: (51 commits)
    xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK()
    xfs: fix off-by-one error in xfs_attr3_rmt_verify
    xfs: assert that we hold the ilock for extent map access
    xfs: use xfs_ilock_attr_map_shared in xfs_attr_list_int
    xfs: use xfs_ilock_attr_map_shared in xfs_attr_get
    xfs: use xfs_ilock_data_map_shared in xfs_qm_dqiterate
    xfs: use xfs_ilock_data_map_shared in xfs_qm_dqtobp
    xfs: take the ilock around xfs_bmapi_read in xfs_zero_remaining_bytes
    xfs: reinstate the ilock in xfs_readdir
    xfs: add xfs_ilock_attr_map_shared
    xfs: rename xfs_ilock_map_shared
    xfs: remove xfs_iunlock_map_shared
    xfs: no need to lock the inode in xfs_find_handle
    xfs: use xfs_icluster_size_fsb in xfs_imap
    xfs: use xfs_icluster_size_fsb in xfs_ifree_cluster
    xfs: use xfs_icluster_size_fsb in xfs_ialloc_inode_init
    xfs: use xfs_icluster_size_fsb in xfs_bulkstat
    xfs: introduce a common helper xfs_icluster_size_fsb
    xfs: get rid of XFS_IALLOC_BLOCKS macros
    xfs: get rid of XFS_INODE_CLUSTER_SIZE macros
    ...

    Linus Torvalds
     

23 Jan, 2014

16 commits

  • Pull module updates from Rusty Russell.

    * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    module: Add missing newline in printk call.
    module: fix coding style
    export: declare ksymtab symbols
    module.h: Remove unnecessary semicolon
    params: improve standard definitions
    Add Documentation/module-signing.txt file

    Linus Torvalds
     
  • Pull virtio update from Rusty Russell:
    "A few simple fixes. Quiet cycle"

    * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    drivers: virtio: Mark function virtballoon_migratepage() as static in virtio_balloon.c
    virtio-scsi: Fix hotcpu_notifier use-after-free with virtscsi_freeze
    virtio: pci: remove unnecessary pci_set_drvdata()

    Linus Torvalds
     
  • Pull Xen updates from Konrad Rzeszutek Wilk:
    "Two major features that Xen community is excited about:

    The first is event channel scalability by David Vrabel - we switch
    over from a two-level per-cpu bitmap of events (IRQs) - to a FIFO
    queue with priorities. This lets us handle more events, with lower
    latency and better scalability. Good stuff.

    The other is PVH by Mukesh Rathor. In short, PV is a mode where the
    kernel lets the hypervisor program page-tables, segments, etc. With
    EPT/NPT capabilities in current processors, the overhead of doing this
    in an HVM (Hardware Virtual Machine) container is much lower than the
    hypervisor doing it for us.

    In short we let a PV guest run without doing page-table, segment,
    syscall, etc updates through the hypervisor - instead it is all done
    within the guest container. It is a "hybrid" PV - hence the 'PVH'
    name - a PV guest within an HVM container.

    The major benefits are less code to deal with - for example we only
    use one function from the pv_mmu_ops (which has 39 function
    calls); faster performance for syscall (no context switches into the
    hypervisor); fewer traps on various operations; etc.

    It is still being baked - the ABI is not yet set in stone. But it is
    pretty awesome and we are excited about it.

    Lastly, there are some changes to ARM code - you should get a simple
    conflict which has been resolved in #linux-next.

    In short, this pull has awesome features.

    Features:
    - FIFO event channels. Key advantages: support for over 100,000
    events (2^17), 16 different event priorities, improved fairness in
    event latency through the use of FIFOs.
    - Xen PVH support. "It’s a fully PV kernel mode, running with
    paravirtualized disk and network, paravirtualized interrupts and
    timers, no emulated devices of any kind (and thus no qemu), no BIOS
    or legacy boot — but instead of requiring PV MMU, it uses the HVM
    hardware extensions to virtualize the pagetables, as well as system
    calls and other privileged operations." (from "The
    Paravirtualization Spectrum, Part 2: From poles to a spectrum")

    Bug-fixes:
    - Fixes in balloon driver (refactor and make it work under ARM)
    - Allow xenfb to be used in HVM guests.
    - Allow xen_platform_pci=0 to work properly.
    - Refactors in event channels"
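
    The FIFO event channel design mentioned in the feature list can be
    pictured as one FIFO queue per priority, drained in priority order.
    The sketch below is a userspace illustration only: the names and queue
    length are hypothetical, and treating priority 0 as the most urgent is
    an assumption made here, not something the pull message states.

    ```c
    #define SKETCH_NR_PRIORITIES 16  /* 16 event priorities, as listed above */
    #define SKETCH_QUEUE_LEN 64      /* hypothetical per-queue capacity */

    struct sketch_fifo {
        unsigned int ports[SKETCH_QUEUE_LEN];
        unsigned int head, tail;  /* free-running indices */
    };

    struct sketch_evtchn {
        struct sketch_fifo q[SKETCH_NR_PRIORITIES];
    };

    /* Queue an event port at the given priority; returns 0 on success. */
    static int sketch_raise(struct sketch_evtchn *e, unsigned int prio,
                            unsigned int port)
    {
        struct sketch_fifo *f;

        if (prio >= SKETCH_NR_PRIORITIES)
            return -1;
        f = &e->q[prio];
        if (f->tail - f->head == SKETCH_QUEUE_LEN)
            return -1;  /* queue full */
        f->ports[f->tail++ % SKETCH_QUEUE_LEN] = port;
        return 0;
    }

    /* Take the oldest event from the most urgent non-empty queue. */
    static int sketch_next(struct sketch_evtchn *e, unsigned int *port)
    {
        for (unsigned int p = 0; p < SKETCH_NR_PRIORITIES; p++) {
            struct sketch_fifo *f = &e->q[p];

            if (f->head != f->tail) {
                *port = f->ports[f->head++ % SKETCH_QUEUE_LEN];
                return 0;
            }
        }
        return -1;  /* nothing pending */
    }
    ```

    Events raised at the same priority come back in arrival order, which is
    the fairness property the feature list attributes to FIFOs.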

    * tag 'stable/for-linus-3.14-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (52 commits)
    xen/pvh: Set X86_CR0_WP and others in CR0 (v2)
    MAINTAINERS: add git repository for Xen
    xen/pvh: Use 'depend' instead of 'select'.
    xen: delete new instances of __cpuinit usage
    xen/fb: allow xenfb initialization for hvm guests
    xen/evtchn_fifo: fix error return code in evtchn_fifo_setup()
    xen-platform: fix error return code in platform_pci_init()
    xen/pvh: remove duplicated include from enlighten.c
    xen/pvh: Fix compile issues with xen_pvh_domain()
    xen: Use dev_is_pci() to check whether it is pci device
    xen/grant-table: Force to use v1 of grants.
    xen/pvh: Support ParaVirtualized Hardware extensions (v3).
    xen/pvh: Piggyback on PVHVM XenBus.
    xen/pvh: Piggyback on PVHVM for grant driver (v4)
    xen/grant: Implement an grant frame array struct (v3).
    xen/grant-table: Refactor gnttab_init
    xen/grants: Remove gnttab_max_grant_frames dependency on gnttab_init.
    xen/pvh: Piggyback on PVHVM for event channels (v2)
    xen/pvh: Update E820 to work with PVH (v2)
    xen/pvh: Secondary VCPU bringup (non-bootup CPUs)
    ...

    Linus Torvalds
     
  • Pull KVM updates from Paolo Bonzini:
    "First round of KVM updates for 3.14; PPC parts will come next week.

    Nothing major here, just bugfixes all over the place. The most
    interesting part is the ARM guys' virtualized interrupt controller
    overhaul, which lets userspace get/set the state and thus enables
    migration of ARM VMs"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits)
    kvm: make KVM_MMU_AUDIT help text more readable
    KVM: s390: Fix memory access error detection
    KVM: nVMX: Update guest activity state field on L2 exits
    KVM: nVMX: Fix nested_run_pending on activity state HLT
    KVM: nVMX: Clean up handling of VMX-related MSRs
    KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject
    KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
    KVM: nVMX: Leave VMX mode on clearing of feature control MSR
    KVM: VMX: Fix DR6 update on #DB exception
    KVM: SVM: Fix reading of DR6
    KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS
    add support for Hyper-V reference time counter
    KVM: remove useless write to vcpu->hv_clock.tsc_timestamp
    KVM: x86: fix tsc catchup issue with tsc scaling
    KVM: x86: limit PIT timer frequency
    KVM: x86: handle invalid root_hpa everywhere
    kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub
    kvm: vfio: silence GCC warning
    KVM: ARM: Remove duplicate include
    arm/arm64: KVM: relax the requirements of VMA alignment for THP
    ...

    Linus Torvalds
     
  • Pull trivial tree updates from Jiri Kosina:
    "Usual rocket science stuff from trivial.git"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
    neighbour.h: fix comment
    sched: Fix warning on make htmldocs caused by wait.h
    slab: struct kmem_cache is protected by slab_mutex
    doc: Fix typo in USB Gadget Documentation
    of/Kconfig: Spelling s/one/once/
    mkregtable: Fix sscanf handling
    lp5523, lp8501: comment improvements
    thermal: rcar: comment spelling
    treewide: fix comments and printk msgs
    IXP4xx: remove '1 &&' from a condition check in ixp4xx_restart()
    Documentation: update /proc/uptime field description
    Documentation: Fix size parameter for snprintf
    arm: fix comment header and macro name
    asm-generic: uaccess: Spelling s/a ny/any/
    mtd: onenand: fix comment header
    doc: driver-model/platform.txt: fix a typo
    drivers: fix typo in DEVTMPFS_MOUNT Kconfig help text
    doc: Fix typo (acces_process_vm -> access_process_vm)
    treewide: Fix typos in printk
    drivers/gpu/drm/qxl/Kconfig: reformat the help text
    ...

    Linus Torvalds
     
  • Pull HID updates from Jiri Kosina:

    - quite some work on hid-sony driver in order to have DualShock 4
    device properly supported, from Frank Praznik

    - fixed support for suspending I2C connected devices, from Mika
    Westerberg

    - regression fix for 0xff05 usage on Microsoft Ergonomy, from Jiri
    Kosina

    - support for Synaptics HD touchscreen, from AceLan Kao

    - workaround for USB 3.0 problem for logitech-dj connected devices,
    from Benjamin Tissoires

    - support for Logitech Dual Action pads, from Vitaly Katraew

    - quite a few other assorted fixes and device ID additions

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (33 commits)
    HID: sony: Use colors for the Dualshock 4 LED names
    HID: sony: Add annotated HID descriptor for the Dualshock 4
    HID: sony: Cache the output report for the Dualshock 4
    HID: sony: Map gyroscopes and accelerometers to axes
    HID: sony: Fix spacing in the device definitions.
    HID: sony: Use standard output reports instead of raw reports to send data to the Dualshock 4.
    HID: sony: Use separate identifiers for USB and Bluetooth connected Dualshock 4 controllers.
    HID: hid-holtek-mouse: add new a070 mouse
    HID: hid-sensor-hub: Fix buggy report descriptors
    HID: logitech-dj: Fix USB 3.0 issue
    HID: sony: Rename worker function
    HID: sony: Add LED controls for the Dualshock 4
    HID: sony: Add force-feedback support for the Dualshock 4
    HID: hidraw: make comment more accurate and nicer
    HID: sony: fix error return code
    HID: input: fix input sysfs path for hid devices
    HID: debug: add labels for some new buttons
    HID: remove SIS entries from hid_have_special_driver[]
    HID: microsoft: no fallthrough in MS ergonomy 0xff05 usage
    HID: add support for SiS multitouch panel in the touch monitor LG 23ET83V
    ...

    Linus Torvalds
     
  • Pull device-mapper changes from Mike Snitzer:
    "A lot of attention was paid to improving the thin-provisioning
    target's handling of metadata operation failures and running out of
    space. A new 'error_if_no_space' feature was added to allow users to
    error IOs rather than queue them when either the data or metadata
    space is exhausted.

    Additional fixes/features include:
    - a few fixes to properly support thin metadata device resizing
    - a solution for reliably waiting for a DM device's embedded kobject
    to be released before destroying the device
    - old dm-snapshot is updated to use the dm-bufio interface to take
    advantage of readahead capabilities that improve snapshot
    activation
    - new dm-cache target tunables to control how quickly data is
    promoted to the cache (fast) device
    - improved write efficiency of cluster mirror target by combining
    userspace flush and mark requests"

    * tag 'dm-3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (35 commits)
    dm log userspace: allow mark requests to piggyback on flush requests
    dm space map metadata: fix bug in resizing of thin metadata
    dm cache: add policy name to status output
    dm thin: fix pool feature parsing
    dm sysfs: fix a module unload race
    dm snapshot: use dm-bufio prefetch
    dm snapshot: use dm-bufio
    dm snapshot: prepare for switch to using dm-bufio
    dm snapshot: use GFP_KERNEL when initializing exceptions
    dm cache: add block sizes and total cache blocks to status output
    dm btree: add dm_btree_find_lowest_key
    dm space map metadata: fix extending the space map
    dm space map common: make sure new space is used during extend
    dm: wait until embedded kobject is released before destroying a device
    dm: remove pointless kobject comparison in dm_get_from_kobject
    dm snapshot: call destroy_work_on_stack() to pair with INIT_WORK_ONSTACK()
    dm cache policy mq: introduce three promotion threshold tunables
    dm cache policy mq: use list_del_init instead of list_del + INIT_LIST_HEAD
    dm thin: fix set_pool_mode exposed pool operation races
    dm thin: eliminate the no_free_space flag
    ...

    Linus Torvalds
     
  • Pull SCSI updates from James Bottomley:
    "This patch set is a lot of driver updates for qla4xxx, bfa, hpsa,
    qla2xxx. It also removes the aic7xxx_old driver (which has been
    deprecated for nearly a decade) and adds support for deadlines in
    error handling"

    * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (75 commits)
    [SCSI] hpsa: allow SCSI mid layer to handle unit attention
    [SCSI] hpsa: do not require board "not ready" status after hard reset
    [SCSI] hpsa: enable unit attention reporting
    [SCSI] hpsa: rename scsi prefetch field
    [SCSI] hpsa: use workqueue instead of kernel thread for lockup detection
    [SCSI] ipr: increase dump size in ipr driver
    [SCSI] mac_scsi: Fix crash on out of memory
    [SCSI] st: fix enlarge_buffer
    [SCSI] qla1280: Annotate timer on stack so object debug does not complain
    [SCSI] qla4xxx: Update driver version to 5.04.00-k3
    [SCSI] qla4xxx: Recreate chap data list during get chap operation
    [SCSI] qla4xxx: Add support for ISCSI_PARAM_LOCAL_IPADDR sysfs attr
    [SCSI] libiscsi: Add local_ipaddr parameter in iscsi_conn struct
    [SCSI] scsi_transport_iscsi: Export ISCSI_PARAM_LOCAL_IPADDR attr for iscsi_connection
    [SCSI] qla4xxx: Add host statistics support
    [SCSI] scsi_transport_iscsi: Add host statistics support
    [SCSI] qla4xxx: Added support for Diagnostics MBOX command
    [SCSI] bfa: Driver version upgrade to 3.2.23.0
    [SCSI] bfa: change FC_ELS_TOV to 20sec
    [SCSI] bfa: Observed auto D-port mode instead of manual
    ...

    Linus Torvalds
     
  • Pull PCI updates from Bjorn Helgaas:
    "PCI changes for the v3.14 merge window:

    Resource management
    - Change pci_bus_region addresses to dma_addr_t (Bjorn Helgaas)
    - Support 64-bit AGP BARs (Bjorn Helgaas, Yinghai Lu)
    - Add pci_bus_address() to get bus address of a BAR (Bjorn Helgaas)
    - Use pci_resource_start() for CPU address of AGP BARs (Bjorn Helgaas)
    - Enforce bus address limits in resource allocation (Yinghai Lu)
    - Allocate 64-bit BARs above 4G when possible (Yinghai Lu)
    - Convert pcibios_resource_to_bus() to take pci_bus, not pci_dev (Yinghai Lu)

    PCI device hotplug
    - Major rescan/remove locking update (Rafael J. Wysocki)
    - Make ioapic builtin only (not modular) (Yinghai Lu)
    - Fix release/free issues (Yinghai Lu)
    - Clean up pciehp (Bjorn Helgaas)
    - Announce pciehp slot info during enumeration (Bjorn Helgaas)

    MSI
    - Add pci_msi_vec_count(), pci_msix_vec_count() (Alexander Gordeev)
    - Add pci_enable_msi_range(), pci_enable_msix_range() (Alexander Gordeev)
    - Deprecate "tri-state" interfaces: fail/success/fail+info (Alexander Gordeev)
    - Export MSI mode using attributes, not kobjects (Greg Kroah-Hartman)
    - Drop "irq" param from *_restore_msi_irqs() (DuanZhenzhong)

    SR-IOV
    - Clear NumVFs when disabling SR-IOV in sriov_init() (ethan.zhao)

    Virtualization
    - Add support for save/restore of extended capabilities (Alex Williamson)
    - Add Virtual Channel to save/restore support (Alex Williamson)
    - Never treat a VF as a multifunction device (Alex Williamson)
    - Add pci_try_reset_function(), et al (Alex Williamson)

    AER
    - Ignore non-PCIe error sources (Betty Dall)
    - Support ACPI HEST error sources for domains other than 0 (Betty Dall)
    - Consolidate HEST error source parsers (Bjorn Helgaas)
    - Add a TLP header print helper (Borislav Petkov)

    Freescale i.MX6
    - Remove unnecessary code (Fabio Estevam)
    - Make reset-gpio optional (Marek Vasut)
    - Report "link up" only after link training completes (Marek Vasut)
    - Start link in Gen1 before negotiating for Gen2 mode (Marek Vasut)
    - Fix PCIe startup code (Richard Zhu)

    Marvell MVEBU
    - Remove duplicate of_clk_get_by_name() call (Andrew Lunn)
    - Drop writes to bridge Secondary Status register (Jason Gunthorpe)
    - Obey bridge PCI_COMMAND_MEM and PCI_COMMAND_IO bits (Jason Gunthorpe)
    - Support a bridge with no IO port window (Jason Gunthorpe)
    - Use max_t() instead of max(resource_size_t,) (Jingoo Han)
    - Remove redundant of_match_ptr (Sachin Kamat)
    - Call pci_ioremap_io() at startup instead of dynamically (Thomas Petazzoni)

    NVIDIA Tegra
    - Disable Gen2 for Tegra20 and Tegra30 (Eric Brower)

    Renesas R-Car
    - Add runtime PM support (Valentine Barshak)
    - Fix rcar_pci_probe() return value check (Wei Yongjun)

    Synopsys DesignWare
    - Fix crash in dw_msi_teardown_irq() (Bjørn Erik Nilsen)
    - Remove redundant call to pci_write_config_word() (Bjørn Erik Nilsen)
    - Fix missing MSI IRQs (Harro Haan)
    - Add dw_pcie prefix before cfg_read/write (Pratyush Anand)
    - Fix I/O transfers by using CPU (not realio) address (Pratyush Anand)
    - Whitespace cleanup (Jingoo Han)

    EISA
    - Call put_device() if device_register() fails (Levente Kurusa)
    - Revert EISA initialization breakage (Bjorn Helgaas)

    Miscellaneous
    - Remove unused code, including PCIe 3.0 interfaces (Stephen Hemminger)
    - Prevent bus conflicts while checking for bridge apertures (Bjorn Helgaas)
    - Stop clearing bridge Secondary Status when setting up I/O aperture (Bjorn Helgaas)
    - Use dev_is_pci() to identify PCI devices (Yijing Wang)
    - Deprecate DEFINE_PCI_DEVICE_TABLE (Joe Perches)
    - Update documentation 00-INDEX (Erik Ekman)"

    * tag 'pci-v3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (119 commits)
    Revert "EISA: Initialize device before its resources"
    Revert "EISA: Log device resources in dmesg"
    vfio-pci: Use pci "try" reset interface
    PCI: Check parent kobject in pci_destroy_dev()
    xen/pcifront: Use global PCI rescan-remove locking
    powerpc/eeh: Use global PCI rescan-remove locking
    PCI: Fix pci_check_and_unmask_intx() comment typos
    PCI: Add pci_try_reset_function(), pci_try_reset_slot(), pci_try_reset_bus()
    MPT / PCI: Use pci_stop_and_remove_bus_device_locked()
    platform / x86: Use global PCI rescan-remove locking
    PCI: hotplug: Use global PCI rescan-remove locking
    pcmcia: Use global PCI rescan-remove locking
    ACPI / hotplug / PCI: Use global PCI rescan-remove locking
    ACPI / PCI: Use global PCI rescan-remove locking in PCI root hotplug
    PCI: Add global pci_lock_rescan_remove()
    PCI: Cleanup pci.h whitespace
    PCI: Reorder so actual code comes before stubs
    PCI/AER: Support ACPI HEST AER error sources for PCI domains other than 0
    ACPICA: Add helper macros to extract bus/segment numbers from HEST table.
    PCI: Make local functions static
    ...

    Linus Torvalds
     
  • Pull tracing updates from Steven Rostedt:
    "This pull request has a new feature to ftrace, namely the trace event
    triggers by Tom Zanussi. A trigger is a way to enable an action when
    an event is hit. The actions are:

    o trace on/off - enable or disable tracing
    o snapshot - save the current trace buffer in the snapshot
    o stacktrace - dump the current stack trace to the ringbuffer
    o enable/disable events - enable or disable another event

    Namhyung Kim added updates to the tracing uprobes code, giving
    uprobes support for fetch methods.

    The rest are various bug fixes with the new code, and minor ones for
    the old code"

    * tag 'trace-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (38 commits)
    tracing: Fix buggered tee(2) on tracing_pipe
    tracing: Have trace buffer point back to trace_array
    ftrace: Fix synchronization location disabling and freeing ftrace_ops
    ftrace: Have function graph only trace based on global_ops filters
    ftrace: Synchronize setting function_trace_op with ftrace_trace_function
    tracing: Show available event triggers when no trigger is set
    tracing: Consolidate event trigger code
    tracing: Fix counter for traceon/off event triggers
    tracing: Remove double-underscore naming in syscall trigger invocations
    tracing/kprobes: Add trace event trigger invocations
    tracing/probes: Fix build break on !CONFIG_KPROBE_EVENT
    tracing/uprobes: Add @+file_offset fetch method
    uprobes: Allocate ->utask before handler_chain() for tracing handlers
    tracing/uprobes: Add support for full argument access methods
    tracing/uprobes: Fetch args before reserving a ring buffer
    tracing/uprobes: Pass 'is_return' to traceprobe_parse_probe_arg()
    tracing/probes: Implement 'memory' fetch method for uprobes
    tracing/probes: Add fetch{,_size} member into deref fetch method
    tracing/probes: Move 'symbol' fetch method to kprobes
    tracing/probes: Implement 'stack' fetch method for uprobes
    ...

    Linus Torvalds
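The trigger actions listed in the tracing pull request above are set by writing a short command string to an event's `trigger` file. The following is a minimal sketch of composing such a string, assuming the `command[:count] [if filter]` form described in the kernel's trace event documentation. The helper names `format_trigger()` and `write_trigger()` are hypothetical and are not part of ftrace.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build a trigger command such as
 * "stacktrace:5 if bytes_req >= 65536".
 * count <= 0 means "no count"; filter may be NULL.
 * Returns the string length, or -1 if buf is too small.
 */
static int format_trigger(char *buf, size_t len, const char *action,
                          int count, const char *filter)
{
    int n;

    if (count > 0)
        n = snprintf(buf, len, "%s:%d", action, count);
    else
        n = snprintf(buf, len, "%s", action);
    if (n < 0 || (size_t)n >= len)
        return -1;

    if (filter) {
        int m = snprintf(buf + n, len - (size_t)n, " if %s", filter);

        if (m < 0 || (size_t)m >= len - (size_t)n)
            return -1;
        n += m;
    }
    return n;
}

/*
 * Hypothetical helper: write the command to an event's trigger file,
 * e.g. <tracing dir>/events/kmem/kmalloc/trigger (usually needs root).
 * Returns 0 on success, -1 on failure.
 */
static int write_trigger(const char *path, const char *cmd)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fputs(cmd, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

With these helpers, a caller would format a command such as `traceoff` or `snapshot` and pass it to `write_trigger()` together with the path of the chosen event's `trigger` file.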
     
  • If a node page is truncated, we'd better drop the page in the node_inode's page
    cache for a smaller memory footprint.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • open/release operations require userspace transitions to keep track
    of the open count and to perform any FS-specific setup. However,
    for some purely read-only FSs which don't need to perform any setup
    at open/release time, we can avoid the performance overhead of
    calling into userspace for open/release calls.

    This patch adds the necessary support to the fuse kernel module to keep
    open/release operations from reaching userspace. When the client returns
    ENOSYS, we avoid sending the subsequent release to userspace, and also
    remember this so that future opens also don't trigger a userspace
    operation.

    Signed-off-by: Miklos Szeredi

    Andrew Gallagher
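The ENOSYS handling described in the open/release commit above can be sketched in plain C. This is a simplified userspace model, not the fuse code itself: `struct conn_state`, `do_open()`, `do_release()` and the two server callbacks are hypothetical names chosen for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-connection state (not the kernel's real structure). */
struct conn_state {
    bool no_open;         /* set once the server answers open with ENOSYS */
    int  userspace_calls; /* round trips to userspace, for illustration */
};

/* Hypothetical server callback: returns 0 or a negative errno. */
typedef int (*server_op)(void);

/* A server that does not implement open. */
static int server_without_open(void) { return -ENOSYS; }

/* A server whose operation succeeds. */
static int server_ok(void) { return 0; }

static int do_open(struct conn_state *c, server_op open_op)
{
    int err;

    if (c->no_open)
        return 0;              /* remembered: skip the userspace call */

    c->userspace_calls++;
    err = open_op();
    if (err == -ENOSYS) {
        c->no_open = true;     /* remember for all future opens */
        return 0;              /* the open itself still succeeds */
    }
    return err;
}

static int do_release(struct conn_state *c, server_op release_op)
{
    if (c->no_open)
        return 0;              /* no release is sent to userspace */

    c->userspace_calls++;
    return release_op();
}
```

In this model only the first open reaches the server; once ENOSYS has been seen, later opens and releases return immediately without a userspace transition.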
     
  • Various read operations (e.g. readlink, readdir) invalidate the cached
    attrs for atime changes. This patch adds a new function
    'fuse_invalidate_atime', which checks for a read-only super block and
    avoids the attr invalidation in that case.

    Signed-off-by: Andrew Gallagher
    Signed-off-by: Miklos Szeredi

    Andrew Gallagher
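The read-only check described in the atime commit above amounts to a guard in front of the existing invalidation. A minimal sketch, assuming stub types in place of the kernel's super block and inode (`struct sb_stub`, `struct inode_stub` and both function names below are hypothetical):

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's super block and inode. */
struct sb_stub {
    bool read_only;
};

struct inode_stub {
    const struct sb_stub *sb;
    bool attr_valid;      /* true while cached attributes may be used */
};

/* Unconditional invalidation of the cached attributes. */
static void invalidate_attr(struct inode_stub *inode)
{
    inode->attr_valid = false;
}

/*
 * atime-only invalidation: on a read-only super block atime cannot
 * change, so the cached attributes stay valid.
 */
static void invalidate_atime(struct inode_stub *inode)
{
    if (!inode->sb->read_only)
        invalidate_attr(inode);
}
```

Read paths such as readlink or readdir would call the atime variant, so a read-only mount keeps its attribute cache across reads.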
     
  • As noticed by Coverity the "num != 0" condition never triggers. Instead it
    should check for a complete page.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Having this struct in module memory could Oops if the module is
    unloaded while the buffer still persists in a pipe.

    Since sock_pipe_buf_ops is essentially the same as fuse_dev_pipe_buf_steal,
    merge them into nosteal_pipe_buf_ops (this is the same as
    default_pipe_buf_ops except stealing the page from the buffer is not
    allowed).

    Reported-by: Al Viro
    Signed-off-by: Miklos Szeredi
    Cc: stable@vger.kernel.org

    Miklos Szeredi
     
  • James Bottomley
     

22 Jan, 2014

10 commits

  • …ub', 'for-3.14/sony' and 'for-3.14/upstream' into for-linus

    Jiri Kosina
     
  • This patch adds NODE_MAPPING, which is similar to META_MAPPING introduced by
    Gu Zheng.

    Cc: Gu Zheng
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • Since orphan_blocks may be as large as 504, it is neither safe nor
    rigorous to store such a large array on the kernel stack,
    as Dan Carpenter pointed out.
    In fact, grab_meta_page has locked the page in the page cache,
    and we can use find_get_page() to fetch the page safely later on,
    so we can remove the page array directly.

    Reported-by: Dan Carpenter
    Signed-off-by: Gu Zheng
    Signed-off-by: Jaegeuk Kim

    Gu Zheng
     
  • Introduce helper function META_MAPPING() to get the address space of
    the cached meta blocks.

    Signed-off-by: Gu Zheng
    Signed-off-by: Jaegeuk Kim

    Gu Zheng
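A helper of the kind introduced by the META_MAPPING() commit above is typically a one-line accessor. The sketch below uses stub types so that it compiles on its own; the field names `meta_inode`, `node_inode` and `i_mapping` are assumptions made for illustration, as are all the `*_stub` type names.

```c
/* Hypothetical stand-ins for the kernel's address_space and inode. */
struct mapping_stub {
    int id;
};

struct f2fs_inode_stub {
    struct mapping_stub *i_mapping;
};

/* Hypothetical stand-in for the f2fs super block info. */
struct f2fs_sbi_stub {
    struct f2fs_inode_stub *meta_inode;
    struct f2fs_inode_stub *node_inode;
};

/* Address space that caches meta blocks. */
static inline struct mapping_stub *META_MAPPING(struct f2fs_sbi_stub *sbi)
{
    return sbi->meta_inode->i_mapping;
}

/* Address space that caches node pages, analogous to META_MAPPING(). */
static inline struct mapping_stub *NODE_MAPPING(struct f2fs_sbi_stub *sbi)
{
    return sbi->node_inode->i_mapping;
}
```

Callers can then write `META_MAPPING(sbi)` instead of spelling out the pointer chain at each use.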
     
  • This patch moves a function in f2fs_delete_entry for code readability.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • If a dentry page is updated, we should call mark_inode_dirty to add the inode
    into the dirty list, so that its dentry pages are flushed to the disk.
    Otherwise, the inode can be evicted without being flushed.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • Signed-off-by: Li Zhong
    Signed-off-by: Jiri Kosina

    Li Zhong
     
  • A missing "@" in include/linux/wait.h causes "make htmldocs" to fail
    with the following warning messages.

    Warning(/home/iida/Repo/linux-next//include/linux/wait.h:304):
    No description found for parameter 'cmd1'
    Warning(/home/iida/Repo/linux-next//include/linux/wait.h:304):
    No description found for parameter 'cmd2'

    Signed-off-by: Masanari Iida
    Signed-off-by: Jiri Kosina

    Masanari Iida
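The warnings quoted in the wait.h commit above are what the kernel-doc tooling reports when a parameter line lacks its leading "@". A minimal sketch of a well-formed kernel-doc comment follows, using a hypothetical macro `run_with_hooks()` rather than the actual code in wait.h:

```c
/**
 * run_with_hooks - run a statement between two hook commands
 * @cmd1: the command executed before @body
 * @body: the statement to run
 * @cmd2: the command executed after @body
 *
 * Each parameter line starts with "@" followed by the parameter name
 * and a colon. If the "@" is dropped, the parameter is reported as
 * having no description.
 */
#define run_with_hooks(cmd1, body, cmd2) \
    do {                                 \
        cmd1;                            \
        body;                            \
        cmd2;                            \
    } while (0)
```

The macro exists only to give the comment something to document; the point of the example is the `@name:` form of each parameter line.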
     
  • In a cluster environment, cluster writes perform poorly because
    userspace_flush() has to contact a userspace program (cmirrord) for
    clear/mark/flush requests. Both mark and flush requests require
    cmirrord to communicate the message to all the cluster nodes for each
    flush call, which is very slow.

    To address this we now merge mark and flush requests together to reduce
    the kernel-userspace-kernel time. We allow a new directive,
    "integrated_flush" that can be used to instruct the kernel log code to
    combine flush and mark requests when directed by userspace. If not
    directed by userspace (due to an older version of the userspace code
    perhaps), the kernel will function as it did previously - preserving
    backwards compatibility. Additionally, flush requests are performed
    lazily when only clear requests exist.

    Signed-off-by: Dongmao Zhang
    Signed-off-by: Jonathan Brassow
    Signed-off-by: Alasdair G Kergon
    Signed-off-by: Mike Snitzer

    Dongmao Zhang
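The effect of "integrated_flush" described in the commit above can be illustrated with a small counting model. This is a deliberately simplified sketch, assuming one request per non-empty list and ignoring batching and error handling; `struct flush_plan` and `plan_flush()` are hypothetical names, not the device-mapper log code.

```c
#include <stdbool.h>

/* Hypothetical summary of what one flush call sends to userspace. */
struct flush_plan {
    int  requests_now;    /* kernel-to-userspace requests sent right away */
    bool flush_deferred;  /* true if the flush is postponed (lazy) */
};

static struct flush_plan plan_flush(bool integrated_flush,
                                    bool have_clear, bool have_mark)
{
    struct flush_plan p = { 0, false };

    if (have_clear)
        p.requests_now++;             /* clear request */

    if (integrated_flush) {
        if (have_mark)
            p.requests_now++;         /* marks are carried by the flush */
        else if (have_clear)
            p.flush_deferred = true;  /* only clears: flush lazily */
    } else {
        if (have_mark)
            p.requests_now++;         /* mark request */
        if (have_clear || have_mark)
            p.requests_now++;         /* separate flush request */
    }
    return p;
}
```

Under these assumptions a flush call with both clear and mark work costs three requests in the old scheme and two with the integrated flush, and a call with only clear work defers its flush.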
     
  • Merge first patch-bomb from Andrew Morton:

    - a couple of misc things

    - inotify/fsnotify work from Jan

    - ocfs2 updates (partial)

    - about half of MM

    * emailed patches from Andrew Morton: (117 commits)
    mm/migrate: remove unused function, fail_migrate_page()
    mm/migrate: remove putback_lru_pages, fix comment on putback_movable_pages
    mm/migrate: correct failure handling if !hugepage_migration_support()
    mm/migrate: add comment about permanent failure path
    mm, page_alloc: warn for non-blockable __GFP_NOFAIL allocation failure
    mm: compaction: reset scanner positions immediately when they meet
    mm: compaction: do not mark unmovable pageblocks as skipped in async compaction
    mm: compaction: detect when scanners meet in isolate_freepages
    mm: compaction: reset cached scanner pfn's before reading them
    mm: compaction: encapsulate defer reset logic
    mm: compaction: trace compaction begin and end
    memcg, oom: lock mem_cgroup_print_oom_info
    sched: add tracepoints related to NUMA task migration
    mm: numa: do not automatically migrate KSM pages
    mm: numa: trace tasks that fail migration due to rate limiting
    mm: numa: limit scope of lock for NUMA migrate rate limiting
    mm: numa: make NUMA-migrate related functions static
    lib/show_mem.c: show num_poisoned_pages when oom
    mm/hwpoison: add '#' to hwpoison_inject
    mm/memblock: use WARN_ONCE when MAX_NUMNODES passed as input parameter
    ...

    Linus Torvalds