09 Oct, 2014

1 commit

  • Pull f2fs updates from Jaegeuk Kim:
    "This patch-set introduces a couple of new features such as large
    sector size, FITRIM, and atomic/volatile writes.

    Several patches enhance power-off recovery and checkpoint routines.

    The fsck.f2fs starts to support fixing corrupted partitions with
    recovery hints provided by this patch-set.

    Summary:
    - retain some recovery information for fsck.f2fs
    - enhance checkpoint speed
    - enhance flush command management
    - bug fix for lseek
    - tune in-place-update policies
    - enhance roll-forward speed
    - revisit all the roll-forward and fsync rules
    - support large sector size
    - support FITRIM
    - support atomic and volatile writes

    And several clean-ups and bug fixes are included"

    * tag 'f2fs-for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (42 commits)
    f2fs: support volatile operations for transient data
    f2fs: support atomic writes
    f2fs: remove unused return value
    f2fs: clean up f2fs_ioctl functions
    f2fs: potential shift wrapping buf in f2fs_trim_fs()
    f2fs: call f2fs_unlock_op after error was handled
    f2fs: check the use of macros on block counts and addresses
    f2fs: refactor flush_nat_entries to remove costly reorganizing ops
    f2fs: introduce FITRIM in f2fs_ioctl
    f2fs: introduce cp_control structure
    f2fs: use more free segments until SSR is activated
    f2fs: change the ipu_policy option to enable combinations
    f2fs: fix to search whole dirty segmap when get_victim
    f2fs: fix to clean previous mount option when remount_fs
    f2fs: skip punching hole in special condition
    f2fs: support large sector size
    f2fs: fix to truncate blocks past EOF in ->setattr
    f2fs: update i_size when __allocate_data_block
    f2fs: use MAX_BIO_BLOCKS(sbi)
    f2fs: remove redundant operation during roll-forward recovery
    ...

    Linus Torvalds
     

08 Oct, 2014

1 commit

  • Pull KVM updates from Paolo Bonzini:
    "Fixes and features for 3.18.

    Apart from the usual cleanups, here is the summary of new features:

    - s390 moves closer towards host large page support

    - PowerPC has improved support for debugging (both inside the guest
    and via gdbstub) and support for e6500 processors

    - ARM/ARM64 support read-only memory (which is necessary to put
    firmware in emulated NOR flash)

    - x86 has the usual emulator fixes and nested virtualization
    improvements (including improved Windows support on Intel and
    Jailhouse hypervisor support on AMD), adaptive PLE which helps
    overcommitting of huge guests. Also included are some patches that
    make KVM more friendly to memory hot-unplug, and fixes for rare
    caching bugs.

    Two patches have trivial mm/ parts that were acked by Rik and Andrew.

    Note: I will soon switch to a subkey for signing purposes"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (157 commits)
    kvm: do not handle APIC access page if in-kernel irqchip is not in use
    KVM: s390: count vcpu wakeups in stat.halt_wakeup
    KVM: s390/facilities: allow TOD-CLOCK steering facility bit
    KVM: PPC: BOOK3S: HV: CMA: Reserve cma region only in hypervisor mode
    arm/arm64: KVM: Report correct FSC for unsupported fault types
    arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc
    kvm: Fix kvm_get_page_retry_io __gup retval check
    arm/arm64: KVM: Fix set_clear_sgi_pend_reg offset
    kvm: x86: Unpin and remove kvm_arch->apic_access_page
    kvm: vmx: Implement set_apic_access_page_addr
    kvm: x86: Add request bit to reload APIC access page address
    kvm: Add arch specific mmu notifier for page invalidation
    kvm: Rename make_all_cpus_request() to kvm_make_all_cpus_request() and make it non-static
    kvm: Fix page ageing bugs
    kvm/x86/mmu: Pass gfn and level to rmapp callback.
    x86: kvm: use alternatives for VMCALL vs. VMMCALL if kernel text is read-only
    kvm: x86: use macros to compute bank MSRs
    KVM: x86: Remove debug assertion of non-PAE reserved bits
    kvm: don't take vcpu mutex for obviously invalid vcpu ioctls
    kvm: Faults which trigger IO release the mmap_sem
    ...

    Linus Torvalds
     

01 Oct, 2014

2 commits


24 Sep, 2014

1 commit


16 Sep, 2014

1 commit

  • Currently, we call ioapic_service() immediately when we find the irq is still
    active during eoi broadcast. But for real hardware, there's some delay between
    the EOI writing and irq delivery. If we do not emulate this behavior, and
    re-inject the interrupt immediately after the guest sends an EOI and re-enables
    interrupts, a guest might spend all its time in the ISR if it has a broken
    handler for a level-triggered interrupt.

    Such livelock actually happens with Windows guests when resuming from
    hibernation.

    As there's no way to distinguish a broken handler from newly raised interrupts, this patch
    delays an interrupt if 10,000 consecutive EOIs found that the interrupt was
    still high. The guest can then make a little forward progress, until a proper
    IRQ handler is set or until some detection routine in the guest (such as
    Linux's note_interrupt()) recognizes the situation.
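
    The counting rule described above can be sketched in userspace C. This is
    only an illustration under assumptions: the names toy_ioapic_pin,
    toy_handle_eoi and EOI_STORM_THRESHOLD are hypothetical, and the real
    patch works on KVM's in-kernel IOAPIC state rather than on a struct like
    this one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold, taken from the 10,000 figure in the message. */
#define EOI_STORM_THRESHOLD 10000

/* Toy model of one level-triggered IOAPIC pin (not the kernel's struct). */
struct toy_ioapic_pin {
    bool line_high;          /* the interrupt line is still asserted */
    unsigned int eoi_count;  /* consecutive EOIs that found the line high */
};

enum eoi_action { EOI_IDLE, EOI_REINJECT_NOW, EOI_REINJECT_DELAYED };

/* Decide what to do when the guest sends an EOI for this pin. */
enum eoi_action toy_handle_eoi(struct toy_ioapic_pin *pin)
{
    assert(pin != NULL);

    if (!pin->line_high) {
        pin->eoi_count = 0;   /* the handler cleared the source; reset */
        return EOI_IDLE;
    }
    if (++pin->eoi_count >= EOI_STORM_THRESHOLD) {
        pin->eoi_count = 0;   /* back off once, then start a new streak */
        return EOI_REINJECT_DELAYED;
    }
    return EOI_REINJECT_NOW;
}
```

    A caller would re-inject immediately on EOI_REINJECT_NOW and schedule a
    deferred injection on EOI_REINJECT_DELAYED, which is what gives the guest
    room to make forward progress.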

    Cc: Michael S. Tsirkin
    Signed-off-by: Jason Wang
    Signed-off-by: Zhang Haoyu
    Signed-off-by: Paolo Bonzini

    Zhang Haoyu
     

06 Sep, 2014

1 commit


15 Aug, 2014

1 commit

  • Pull more powerpc updates from Ben Herrenschmidt:
    "Here are some more powerpc bits for 3.17, essentially fixes.

    The biggest series, also aimed at -stable, is from Aneesh and is the
    result of weeks and weeks of debugging to find out why the heck our THP
    implementation was occasionally triggering multi-hit errors in our
    level 1 TLB. It ended up being a combination of issues including
    subtleties as to how we should invalidate those special 'MPSS' pages
    we use to allow the use of 16M pages inside 4K/64K "base page size"
    segments (you really have to love our MMU !)

    Another interesting one in the "OMG" category is the series from
    Michael adding memory barriers to spin_is_locked(). That's also the
    result of many days of debugging to figure out why the semaphore code
    would occasionally crash in ways that made no sense. It ended up
    being some creative lock stacking that was defeated by the fact that
    our locks allow a load inside the locked section to be re-ordered with
    the load of the lock value itself (I'm still of two minds about whether
    to kill that once and for all by putting a heavier barrier back into
    our lock implementation...). The fixes come with a long explanation
    in the cset comments, feel free to read it if you feel like having a
    headache today"

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits)
    powerpc/thp: Add tracepoints to track hugepage invalidate
    powerpc/mm: Use read barrier when creating real_pte
    powerpc/thp: Use ACCESS_ONCE when loading pmdp
    powerpc/thp: Invalidate with vpn in loop
    powerpc/thp: Handle combo pages in invalidate
    powerpc/thp: Invalidate old 64K based hash page mapping before insert of 4k pte
    powerpc/thp: Don't recompute vsid and ssize in loop on invalidate
    powerpc/thp: Add write barrier after updating the valid bit
    powerpc: reorder per-cpu NUMA information's initialization
    powerpc/perf/hv-24x7: Use kmem_cache_free
    powerpc/pseries/hvcserver: Fix endian issue in hvcs_get_partner_info
    powerpc: Hard disable interrupts in xmon
    powerpc: remove duplicate definition of TEXASR_FS
    powerpc/pseries: Avoid deadlock on removing ddw
    powerpc/pseries: Failure on removing device node
    powerpc/boot: Use correct zlib types for comparison
    powerpc/powernv: Interface to register/unregister opal dump region
    printk: Add function to return log buffer address and size
    powerpc: Add POWER8 features to CPU_FTRS_POSSIBLE/ALWAYS
    powerpc/ppc476: Disable BTAC
    ...

    Linus Torvalds
     

14 Aug, 2014

1 commit

  • Pull block driver changes from Jens Axboe:
    "Nothing out of the ordinary here, this pull request contains:

    - A big round of fixes for bcache from Kent Overstreet, Slava Pestov,
    and Surbhi Palande. No new features, just a lot of fixes.

    - The usual round of drbd updates from Andreas Gruenbacher, Lars
    Ellenberg, and Philipp Reisner.

    - virtio_blk was converted to blk-mq back in 3.13, but now Ming Lei
    has taken it one step further and added support for actually using
    more than one queue.

    - Addition of an explicit SG_FLAG_Q_AT_HEAD for block/bsg, to
    complement the default behavior of adding to the tail of the
    queue. From Douglas Gilbert"

    * 'for-3.17/drivers' of git://git.kernel.dk/linux-block: (86 commits)
    bcache: Drop unneeded blk_sync_queue() calls
    bcache: add mutex lock for bch_is_open
    bcache: Correct printing of btree_gc_max_duration_ms
    bcache: try to set b->parent properly
    bcache: fix memory corruption in init error path
    bcache: fix crash with incomplete cache set
    bcache: Fix more early shutdown bugs
    bcache: fix use-after-free in btree_gc_coalesce()
    bcache: Fix an infinite loop in journal replay
    bcache: fix crash in bcache_btree_node_alloc_fail tracepoint
    bcache: bcache_write tracepoint was crashing
    bcache: fix typo in bch_bkey_equal_header
    bcache: Allocate bounce buffers with GFP_NOWAIT
    bcache: Make sure to pass GFP_WAIT to mempool_alloc()
    bcache: fix uninterruptible sleep in writeback thread
    bcache: wait for buckets when allocating new btree root
    bcache: fix crash on shutdown in passthrough mode
    bcache: fix lockdep warnings on shutdown
    bcache allocator: send discards with correct size
    bcache: Fix to remove the rcu_sched stalls.
    ...

    Linus Torvalds
     

13 Aug, 2014

1 commit


10 Aug, 2014

1 commit

  • …it/rostedt/linux-trace

    Pull IPI tracepoints for ARM from Steven Rostedt:
    "Nicolas Pitre added generic tracepoints for tracing IPIs and updated
    the arm and arm64 architectures. It required some minor updates to
    the generic tracepoint system, so it had to wait for me to implement
    them"

    * tag 'trace-ipi-tracepoints' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    ARM64: add IPI tracepoints
    ARM: add IPI tracepoints
    tracepoint: add generic tracepoint definitions for IPI tracing
    tracing: Do not do anything special with tracepoint_string when tracing is disabled

    Linus Torvalds
     

08 Aug, 2014

2 commits

  • The Inter Processor Interrupt is used to make another processor perform a
    specific action such as rescheduling tasks, signaling a timer event or
    executing something in another CPU's context. IRQs are already traceable
    but IPIs were not. Tracing them is useful for monitoring IPI latency,
    or to verify when they are the source of CPU wake-ups with power
    management implications.

    Three trace hooks are defined: ipi_raise, ipi_entry and ipi_exit. To make
    them portable, a string is used to identify them and correlate related
    events. Additionally, ipi_raise records a bitmask representing targeted
    CPUs.
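
    As a rough illustration of the three hooks, the userspace sketch below
    formats one line per event. Everything here is an assumption made for the
    example: the toy_ipi_* helpers are hypothetical, the output layout is
    invented, and the target CPUs are modelled as a plain unsigned long
    rather than a kernel cpumask.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format an ipi_raise-style event: target CPU bitmask plus reason string. */
int toy_ipi_raise(char *buf, size_t len, unsigned long target_mask,
                  const char *reason)
{
    assert(buf != NULL && reason != NULL);
    return snprintf(buf, len, "ipi_raise: target_mask=%#lx (%s)",
                    target_mask, reason);
}

/* Format an ipi_entry-style event; the string correlates it with the raise. */
int toy_ipi_entry(char *buf, size_t len, const char *reason)
{
    assert(buf != NULL && reason != NULL);
    return snprintf(buf, len, "ipi_entry: (%s)", reason);
}

/* Format an ipi_exit-style event for the same reason string. */
int toy_ipi_exit(char *buf, size_t len, const char *reason)
{
    assert(buf != NULL && reason != NULL);
    return snprintf(buf, len, "ipi_exit: (%s)", reason);
}
```

    Because all three events carry the same string, a trace consumer can pair
    a raise with the matching entry and exit without knowing any
    architecture-specific IPI numbering.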

    Link: http://lkml.kernel.org/p/1406318733-26754-3-git-send-email-nicolas.pitre@linaro.org

    Acked-by: Daniel Lezcano
    Signed-off-by: Nicolas Pitre
    Signed-off-by: Steven Rostedt

    Nicolas Pitre
     
  • Pull second round of KVM changes from Paolo Bonzini:
    "Here are the PPC and ARM changes for KVM, which I separated because
    they had small conflicts (respectively within KVM documentation, and
    with 3.16-rc changes). Since they were all within the subsystem, I
    took care of them.

    Stephen Rothwell reported some snags in PPC builds, but they are all
    fixed now; the latest linux-next report was clean.

    New features for ARM include:
    - KVM VGIC v2 emulation on GICv3 hardware
    - Big-Endian support for arm/arm64 (guest and host)
    - Debug Architecture support for arm64 (arm32 is on Christoffer's todo list)

    And for PPC:
    - Book3S: Good number of LE host fixes, enable HV on LE
    - Book3S HV: Add in-guest debug support

    This release drops support for KVM on the PPC440. As a result, the
    PPC merge removes more lines than it adds. :)

    I also included an x86 change, since Davidlohr tied it to an
    independent bug report and the reporter quickly provided a Tested-by;
    there was no reason to wait for -rc2"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (122 commits)
    KVM: Move more code under CONFIG_HAVE_KVM_IRQFD
    KVM: nVMX: fix "acknowledge interrupt on exit" when APICv is in use
    KVM: nVMX: Fix nested vmexit ack intr before load vmcs01
    KVM: PPC: Enable IRQFD support for the XICS interrupt controller
    KVM: Give IRQFD its own separate enabling Kconfig option
    KVM: Move irq notifier implementation into eventfd.c
    KVM: Move all accesses to kvm::irq_routing into irqchip.c
    KVM: irqchip: Provide and use accessors for irq routing table
    KVM: Don't keep reference to irq routing table in irqfd struct
    KVM: PPC: drop duplicate tracepoint
    arm64: KVM: fix 64bit CP15 VM access for 32bit guests
    KVM: arm64: GICv3: mandate page-aligned GICV region
    arm64: KVM: GICv3: move system register access to msr_s/mrs_s
    KVM: PPC: PR: Handle FSCR feature deselects
    KVM: PPC: HV: Remove generic instruction emulation
    KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr
    KVM: PPC: Remove DCR handling
    KVM: PPC: Expose helper functions for data/inst faults
    KVM: PPC: Separate loadstore emulation from priv emulation
    KVM: PPC: Handle magic page in kvmppc_ld/st
    ...

    Linus Torvalds
     

07 Aug, 2014

4 commits

  • Merge incoming from Andrew Morton:
    - Various misc things.
    - arch/sh updates.
    - Part of ocfs2. Review is slow.
    - Slab updates.
    - Most of -mm.
    - printk updates.
    - lib/ updates.
    - checkpatch updates.

    * emailed patches from Andrew Morton: (226 commits)
    checkpatch: update $declaration_macros, add uninitialized_var
    checkpatch: warn on missing spaces in broken up quoted
    checkpatch: fix false positives for --strict "space after cast" test
    checkpatch: fix false positive MISSING_BREAK warnings with --file
    checkpatch: add test for native c90 types in unusual order
    checkpatch: add signed generic types
    checkpatch: add short int to c variable types
    checkpatch: add for_each tests to indentation and brace tests
    checkpatch: fix brace style misuses of else and while
    checkpatch: add --fix option for a couple OPEN_BRACE misuses
    checkpatch: use the correct indentation for which()
    checkpatch: add fix_insert_line and fix_delete_line helpers
    checkpatch: add ability to insert and delete lines to patch/file
    checkpatch: add an index variable for fixed lines
    checkpatch: warn on break after goto or return with same tab indentation
    checkpatch: emit a warning on file add/move/delete
    checkpatch: add test for commit id formatting style in commit log
    checkpatch: emit fewer kmalloc_array/kcalloc conversion warnings
    checkpatch: improve "no space after cast" test
    checkpatch: allow multiple const * types
    ...

    Linus Torvalds
     
  • Pull sound updates from Takashi Iwai:
    "There've been many updates in ASoC side at this time, especially the
    framework enhancement for multiple CODECs on a single DAI and more
    componentization works.

    The only major change in ALSA core is the addition of a timestamp type
    to the sw_params field. This should behave in a backward-compatible way.

    Other than that, there are lots of small changes and new drivers in
    wide range, including a large code cut in HD-audio driver for
    deprecated static quirks. Some highlights are below:

    ALSA Core:
    - Add the new timestamp type field to sw_params to choose
    MONOTONIC_RAW type

    HD-audio:
    - Continued conversion to standard printk macros, generic code
    cleanups
    - Removal of obsoleted static quirk codes for Conexant and C-Media
    codecs
    - Fixups for HP Envy TS, Dell XPS 15, HP and Dell mute/mic LED,
    Gigabyte BXBT-2807 mobo
    - Intel Braswell support

    ASoC:
    - Support for multiple CODECs attached to a single DAI, enabling
    systems with for example multiple DAC/speaker drivers on a single
    link, contributed by Benoit Cousson based on work from Misael Lopez
    Cruz
    - Support for byte controls larger than 256 bytes based on the use of
    TLVs contributed by Omair Mohammed Abdullah
    - More componentisation work from Lars-Peter Clausen
    - The remainder of the conversions of CODEC drivers to params_width()
    by Mark Brown
    - Drivers for Cirrus Logic CS4265, Freescale i.MX ASRC blocks,
    Realtek RT286 and RT5670, Rockchip RK3xxx I2S controllers and Texas
    Instruments TAS2552
    - Lots of updates and fixes, especially to the DaVinci, Intel,
    Freescale, Realtek, and rcar drivers"

    * tag 'sound-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (402 commits)
    ALSA: usb-audio: Whitespace cleanups for sound/usb/midi.*
    ALSA: usb-audio: Respond to suspend and resume callbacks for MIDI input
    sound/oss/pss: Remove typedefs pss_mixerdata and pss_confdata
    sound/oss/opl3: Remove typedef opl_devinfo
    ALSA: fireworks: fix specifiers in format strings for propper output
    ASoC: imx-audmux: Use uintptr_t for port numbers
    ASoC: davinci: Enable menuconfig entry for McASP
    ASoC: fsl_asrc: Don't access members of config before checking it
    ASoC: fsl_sarc_dma: Check pair before using it
    ASoC: adau1977: Fix truncation warning on 64 bit architectures
    ALSA: virtuoso: add Xonar Essence STX II support
    ALSA: riptide: fix %d confusingly prefixed with 0x in format strings
    ALSA: fireworks: fix %d confusingly prefixed with 0x in format strings
    ALSA: hda - add codec ID for Braswell display audio codec
    ALSA: hda - add PCI IDs for Intel Braswell
    ALSA: usb-audio: Adjust Gamecom 780 volume level
    ALSA: usb-audio: improve dmesg source grepability
    ASoC: rt5670: Fix duplicate const warnings
    ASoC: rt5670: Staticise non-exported symbols
    ASoC: Intel: update stream only on stream IPC msgs
    ...

    Linus Torvalds
     
  • This was formerly the series "Improve sequential read throughput" which
    noted some major differences in performance of tiobench since 3.0.
    While there are a number of factors, two that dominated were the
    introduction of the fair zone allocation policy and changes to CFQ.

    The behaviour of fair zone allocation policy makes more sense than
    tiobench as a benchmark and CFQ defaults were not changed due to
    insufficient benchmarking.

    This series is what's left. It's one functional fix to the fair zone
    allocation policy when used on NUMA machines and a reduction of overhead
    in general. tiobench was used for the comparison despite its flaws as
    an IO benchmark as in this case we are primarily interested in the
    overhead of page allocator and page reclaim activity.

    On UMA, it makes little difference to overhead

              3.16.0-rc3    3.16.0-rc3
                 vanilla  lowercost-v5
    User          383.61        386.77
    System        403.83        401.74
    Elapsed      5411.50       5413.11

    On a 4-socket NUMA machine it's a bit more noticeable

              3.16.0-rc3    3.16.0-rc3
                 vanilla  lowercost-v5
    User          746.94        802.00
    System      65336.22      40852.33
    Elapsed     27553.52      27368.46

    This patch (of 6):

    The LRU insertion and activate tracepoints take PFN as a parameter
    forcing the overhead to the caller. Move the overhead to the tracepoint
    fast-assign method to ensure the cost is only incurred when the
    tracepoint is active.
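
    The idea of moving the conversion into the fast-assign step can be
    sketched in userspace C. This is a simplified model under assumptions:
    toy_page, toy_mem_map, trace_enabled and the two toy_trace_insertion_*
    functions are hypothetical stand-ins, and the counter exists only to make
    the cost visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page { int unused; };

struct toy_page toy_mem_map[16];   /* stand-in for the page array */
unsigned long pfn_conversions;     /* how many times the cost was paid */
bool trace_enabled;                /* stand-in for the tracepoint state */
unsigned long last_traced_pfn;     /* what the "tracepoint" recorded */

/* Stand-in for a page-to-PFN conversion; counts each invocation. */
unsigned long toy_page_to_pfn(const struct toy_page *page)
{
    pfn_conversions++;
    return (unsigned long)(page - toy_mem_map);
}

/* Before: the caller converts unconditionally, even with tracing off. */
void toy_trace_insertion_eager(const struct toy_page *page)
{
    unsigned long pfn = toy_page_to_pfn(page);

    if (trace_enabled)
        last_traced_pfn = pfn;
}

/* After: the caller passes the page; conversion runs only when enabled. */
void toy_trace_insertion_lazy(const struct toy_page *page)
{
    assert(page != NULL);
    if (trace_enabled)
        last_traced_pfn = toy_page_to_pfn(page);
}
```

    With tracing disabled, the lazy variant never calls the conversion, which
    mirrors the stated goal that the cost is only incurred when the
    tracepoint is active.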

    Signed-off-by: Mel Gorman
    Acked-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • The mm_migrate_pages trace event reports a reason for the migration,
    typically as a symbolic string. The exception is the reason
    MR_NUMA_MISPLACED for which it just displays the numeric value:

      mm_migrate_pages: nr_succeeded=1 nr_failed=0 mode=MIGRATE_ASYNC reason=0x5

    This patch makes the output consistent by introducing a string value for
    MR_NUMA_MISPLACED. The event is then reported as:

      mm_migrate_pages: nr_succeeded=1 nr_failed=0 mode=MIGRATE_ASYNC reason=numa_misplaced
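
    The behaviour change can be modelled with a small value-to-name table
    that falls back to hex when no name is registered. This is a sketch under
    assumptions: toy_print_symbolic and the table are hypothetical, only the
    value 5 for MR_NUMA_MISPLACED is implied by the "reason=0x5" output
    above, and the other enum entries and names are illustrative.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed ordering; only MR_NUMA_MISPLACED == 5 follows from the message. */
enum toy_migrate_reason {
    MR_COMPACTION, MR_MEMORY_FAILURE, MR_MEMORY_HOTPLUG,
    MR_SYSCALL, MR_MEMPOLICY_MBIND, MR_NUMA_MISPLACED
};

struct toy_symbol { int value; const char *name; };

const struct toy_symbol toy_reasons[] = {
    { MR_COMPACTION,     "compaction" },      /* illustrative name */
    { MR_MEMORY_FAILURE, "memory_failure" },  /* illustrative name */
    { MR_NUMA_MISPLACED, "numa_misplaced" },  /* the kind of entry added */
};

/* Return the symbolic name if one is registered, else the value in hex. */
const char *toy_print_symbolic(int value, char *buf, size_t len)
{
    size_t n = sizeof(toy_reasons) / sizeof(toy_reasons[0]);

    assert(buf != NULL && len > 0);
    for (size_t i = 0; i < n; i++)
        if (toy_reasons[i].value == value)
            return toy_reasons[i].name;
    snprintf(buf, len, "0x%x", (unsigned int)value);
    return buf;
}
```

    Removing the numa_misplaced row from the table reproduces the old output,
    where the reason is printed as a bare number.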

    Signed-off-by: Max Asbock
    Acked-by: Steven Rostedt
    Cc: Ingo Molnar
    Acked-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Max Asbock
     

06 Aug, 2014

1 commit

  • Commit e4d57e1ee1ab (KVM: Move irq notifier implementation into
    eventfd.c, 2014-06-30) included the irq notifier code unconditionally
    in eventfd.c, while it was under CONFIG_HAVE_KVM_IRQCHIP before.

    Similarly, commit 297e21053a52 (KVM: Give IRQFD its own separate enabling
    Kconfig option, 2014-06-30) moved code from CONFIG_HAVE_IRQ_ROUTING
    to CONFIG_HAVE_KVM_IRQFD but forgot to move the pieces that used to be
    under CONFIG_HAVE_KVM_IRQCHIP.

    Together, this broke compilation without CONFIG_KVM_XICS. Fix by adding
    or changing the #ifdefs so that they point at CONFIG_HAVE_KVM_IRQFD.

    Signed-off-by: Paolo Bonzini

    Paolo Bonzini
     

05 Aug, 2014

6 commits

  • Pull f2fs updates from Jaegeuk Kim:
    "This series includes patches to:
    - add nobarrier mount option
    - support tmpfile and rename2
    - enhance the fdatasync behavior
    - fix the error path
    - fix the recovery routine
    - refactor a part of the checkpoint procedure
    - reduce some lock contentions"

    * tag 'for-f2fs-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (40 commits)
    f2fs: use for_each_set_bit to simplify the code
    f2fs: add f2fs_balance_fs for expand_inode_data
    f2fs: invalidate xattr node page when evict inode
    f2fs: avoid skipping recover_inline_xattr after recover_inline_data
    f2fs: add tracepoint for f2fs_direct_IO
    f2fs: reduce competition among node page writes
    f2fs: fix coding style
    f2fs: remove redundant lines in allocate_data_block
    f2fs: add tracepoint for f2fs_issue_flush
    f2fs: avoid retrying wrong recovery routine when error was occurred
    f2fs: test before set/clear bits
    f2fs: fix wrong condition for unlikely
    f2fs: enable in-place-update for fdatasync
    f2fs: skip unnecessary data writes during fsync
    f2fs: add info of appended or updated data writes
    f2fs: use radix_tree for ino management
    f2fs: add infra for ino management
    f2fs: punch the core function for inode management
    f2fs: add nobarrier mount option
    f2fs: fix to put root inode in error path of fill_super
    ...

    Linus Torvalds
     
  • Pull driver core updates from Greg KH:
    "Here's the big driver-core pull request for 3.17-rc1.

    Largest thing in here is the dma-buf rework and fence code, that
    touched many different subsystems so it was agreed it should go
    through this tree to handle merge issues. There's also some firmware
    loading updates, as well as tests added, and a few other tiny changes,
    the changelog has the details.

    All have been in linux-next for a long time"

    * tag 'driver-core-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits)
    ARM: imx: Remove references to platform_bus in mxc code
    firmware loader: Fix _request_firmware_load() return val for fw load abort
    platform: Remove most references to platform_bus device
    test: add firmware_class loader test
    doc: fix minor typos in firmware_class README
    staging: android: Cleanup style issues
    Documentation: devres: Sort managed interfaces
    Documentation: devres: Add devm_kmalloc() et al
    fs: debugfs: remove trailing whitespace
    kernfs: kernel-doc warning fix
    debugfs: Fix corrupted loop in debugfs_remove_recursive
    stable_kernel_rules: Add pointer to netdev-FAQ for network patches
    driver core: platform: add device binding path 'driver_override'
    driver core/platform: remove unused implicit padding in platform_object
    firmware loader: inform direct failure when udev loader is disabled
    firmware: replace ALIGN(PAGE_SIZE) by PAGE_ALIGN
    firmware: read firmware size using i_size_read()
    firmware loader: allow disabling of udev as firmware loader
    reservation: add suppport for read-only access using rcu
    reservation: update api and add some helpers
    ...

    Conflicts:
    drivers/base/platform.c

    Linus Torvalds
     
  • Pull RAS updates from Ingo Molnar:
    "The main changes in this cycle are:

    - RAS tracing/events infrastructure, by Gong Chen.

    - Various generalizations of the APEI code to make it available to
    non-x86 architectures, by Tomasz Nowicki"

    * 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/ras: Fix build warnings in
    acpi, apei, ghes: Factor out ioremap virtual memory for IRQ and NMI context.
    acpi, apei, ghes: Make NMI error notification to be GHES architecture extension.
    apei, mce: Factor out APEI architecture specific MCE calls.
    RAS, extlog: Adjust init flow
    trace, eMCA: Add a knob to adjust where to save event log
    trace, RAS: Add eMCA trace event interface
    RAS, debugfs: Add debugfs interface for RAS subsystem
    CPER: Adjust code flow of some functions
    x86, MCE: Robustify mcheck_init_device
    trace, AER: Move trace into unified interface
    trace, RAS: Add basic RAS trace event
    x86, MCE: Kill CPU_POST_DEAD

    Linus Torvalds
     
  • Pull x86 mm changes from Ingo Molnar:
    "The main change in this cycle is the rework of the TLB range flushing
    code, to simplify, fix and consolidate the code. By Dave Hansen"

    * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/mm: Set TLB flush tunable to sane value (33)
    x86/mm: New tunable for single vs full TLB flush
    x86/mm: Add tracepoints for TLB flushes
    x86/mm: Unify remote INVLPG code
    x86/mm: Fix missed global TLB flush stat
    x86/mm: Rip out complicated, out-of-date, buggy TLB flushing
    x86/mm: Clean up the TLB flushing code
    x86/smep: Be more informative when signalling an SMEP fault

    Linus Torvalds
     
  • 'b' was NULL.

    Change-Id: Icac0fd04afa2d23f213d96d51afd53374e6dd0c0

    Slava Pestov
     
  • Signed-off-by: Kent Overstreet

    Slava Pestov
     

04 Aug, 2014

1 commit


02 Aug, 2014

1 commit


31 Jul, 2014

2 commits

  • We don't have any good way to figure out what kinds of flushes
    are being attempted. Right now, we can try to use the vm
    counters, but those only tell us what we actually did with the
    hardware (one-by-one vs full) and don't tell us what was actually
    _requested_.

    This allows us to select out "interesting" TLB flushes that we
    might want to optimize (like the ranged ones) and ignore the ones
    that we have very little control over (the ones at context
    switch).

    Signed-off-by: Dave Hansen
    Link: http://lkml.kernel.org/r/20140731154059.4C96CBA5@viggo.jf.intel.com
    Acked-by: Rik van Riel
    Cc: Mel Gorman
    Signed-off-by: H. Peter Anvin

    Dave Hansen
     
  • This patch adds a tracepoint for f2fs_issue_flush.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     

09 Jul, 2014

1 commit

  • A fence can be attached to a buffer which is being filled or consumed
    by hw, to allow userspace to pass the buffer without waiting to another
    device. For example, userspace can call page_flip ioctl to display the
    next frame of graphics after kicking the GPU but while the GPU is still
    rendering. The display device sharing the buffer with the GPU would
    attach a callback to get notified when the GPU's rendering-complete IRQ
    fires, to update the scan-out address of the display, without having to
    wake up userspace.

    A driver must allocate a fence context for each execution ring that can
    run in parallel. The function for this takes an argument with how many
    contexts to allocate:
    + fence_context_alloc()

    A fence is a transient, one-shot deal. It is allocated and attached
    to one or more dma-buf's. When the one that attached it is done with
    the pending operation, it can signal the fence:
    + fence_signal()

    To get a rough indication of whether a fence has fired, call:
    + fence_is_signaled()

    The dma-buf-mgr handles tracking, and waiting on, the fences associated
    with a dma-buf.

    The one pending on the fence can add an async callback:
    + fence_add_callback()

    The callback can optionally be cancelled with:
    + fence_remove_callback()

    To wait synchronously, optionally with a timeout:
    + fence_wait()
    + fence_wait_timeout()

    When emitting a fence, call:
    + trace_fence_emit()

    To annotate that a fence is blocking on another fence, call:
    + trace_fence_annotate_wait_on(fence, on_fence)
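
    The one-shot signal and callback semantics listed above can be modelled
    in a few lines of userspace C. This is a toy sketch, not the kernel API:
    every toy_* name is hypothetical, the callback table is a fixed-size
    array, and there is no locking, reference counting or waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CALLBACKS 4

struct toy_fence;
typedef void (*toy_fence_func_t)(struct toy_fence *fence, void *data);

struct toy_fence {
    bool signaled;   /* set once, never cleared */
    size_t nr_cbs;   /* number of pending callbacks */
    struct { toy_fence_func_t func; void *data; } cbs[TOY_MAX_CALLBACKS];
};

void toy_fence_init(struct toy_fence *f)
{
    f->signaled = false;
    f->nr_cbs = 0;
}

bool toy_fence_is_signaled(const struct toy_fence *f)
{
    return f->signaled;
}

/* Returns 0 on success, -1 if already signaled or the table is full. */
int toy_fence_add_callback(struct toy_fence *f, toy_fence_func_t func,
                           void *data)
{
    assert(f != NULL && func != NULL);
    if (f->signaled || f->nr_cbs == TOY_MAX_CALLBACKS)
        return -1;
    f->cbs[f->nr_cbs].func = func;
    f->cbs[f->nr_cbs].data = data;
    f->nr_cbs++;
    return 0;
}

/* One-shot: the first call runs the callbacks; later calls return -1. */
int toy_fence_signal(struct toy_fence *f)
{
    if (f->signaled)
        return -1;
    f->signaled = true;
    for (size_t i = 0; i < f->nr_cbs; i++)
        f->cbs[i].func(f, f->cbs[i].data);
    f->nr_cbs = 0;
    return 0;
}

/* Example callback: bumps a counter, standing in for a scan-out update. */
void toy_count_cb(struct toy_fence *fence, void *data)
{
    (void)fence;
    ++*(int *)data;
}
```

    In the display example from the first paragraph, the display driver would
    play the role of the callback owner and the GPU driver would be the one
    calling the signal function when rendering completes.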

    A default software-only implementation is provided, which can be used
    by drivers attaching a fence to a buffer when they have no other means
    for hw sync. But a memory backed fence is also envisioned, because it
    is common that GPU's can write to, or poll on some memory location for
    synchronization. For example:

    fence = custom_get_fence(...);
    if ((seqno_fence = to_seqno_fence(fence)) != NULL) {
        dma_buf *fence_buf = seqno_fence->sync_buf;
        get_dma_buf(fence_buf);

        ... tell the hw the memory location to wait ...
        custom_wait_on(fence_buf, seqno_fence->seqno_ofs, fence->seqno);
    } else {
        /* fall-back to sw sync */
        fence_add_callback(fence, my_cb);
    }

    On SoC platforms, if some other hw mechanism is provided for synchronizing
    between IP blocks, it could be supported as an alternate implementation
    with its own fence ops in a similar way.

    enable_signaling callback is used to provide sw signaling in case a cpu
    waiter is requested or no compatible hardware signaling could be used.

    The intention is to provide a userspace interface (presumably via eventfd)
    later, to be used in conjunction with dma-buf's mmap support for sw access
    to buffers (or for userspace apps that would prefer to do their own
    synchronization).

    v1: Original
    v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided
    that dma-fence didn't need to care about the sw->hw signaling path
    (it can be handled same as sw->sw case), and therefore the fence->ops
    can be simplified and more handled in the core. So remove the signal,
    add_callback, cancel_callback, and wait ops, and replace with a simple
    enable_signaling() op which can be used to inform a fence supporting
    hw->hw signaling that one or more devices which do not support hw
    signaling are waiting (and therefore it should enable an irq or do
    whatever is necessary in order that the CPU is notified when the
    fence is passed).
    v3: Fix locking fail in attach_fence() and get_fence()
    v4: Remove tie-in w/ dma-buf.. after discussion w/ danvet and mlankhorst
    we decided that we need to be able to attach one fence to N dma-buf's,
    so using the list_head in dma-fence struct would be problematic.
    v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager.
    v6: [ Maarten Lankhorst ] I removed dma_fence_cancel_callback and some comments
    about checking if fence fired or not. This is broken by design.
    waitqueue_active during destruction is now fatal, since the signaller
    should be holding a reference in enable_signalling until it signalled
    the fence. Pass the original dma_fence_cb along, and call __remove_wait
    in the dma_fence_callback handler, so that no cleanup needs to be
    performed.
    v7: [ Maarten Lankhorst ] Set cb->func and only enable sw signaling if
    fence wasn't signaled yet, for example for hardware fences that may
    choose to signal blindly.
    v8: [ Maarten Lankhorst ] Tons of tiny fixes, moved __dma_fence_init to
    header and fixed include mess. dma-fence.h now includes dma-buf.h
    All members are now initialized, so kmalloc can be used for
    allocating a dma-fence. More documentation added.
    v9: Change compiler bitfields to flags, change return type of
    enable_signaling to bool. Rework dma_fence_wait. Added
    dma_fence_is_signaled and dma_fence_wait_timeout.
    s/dma// and change exports to non GPL. Added fence_is_signaled and
    fence_enable_sw_signaling calls, add ability to override default
    wait operation.
    v10: remove event_queue, use a custom list, export try_to_wake_up from
    scheduler. Remove fence lock and use a global spinlock instead,
    this should hopefully remove all the locking headaches I was having
    on trying to implement this. enable_signaling is called with this
    lock held.
    v11:
    Use atomic ops for flags, lifting the need for some spin_lock_irqsaves.
    However I kept the guarantee that after fence_signal returns, it is
    guaranteed that enable_signaling has either been called to completion,
    or will not be called any more.

    Add contexts and seqno to base fence implementation. This allows you
    to wait for fewer fences, by testing for seqno + signaled, and then only
    wait on the later fence.

    Add FENCE_TRACE, FENCE_WARN, and FENCE_ERR. This makes debugging easier.
    A CONFIG_DEBUG_FENCE will be added to turn off the FENCE_TRACE
    spam, and another runtime option can turn it off at runtime.
    v12:
    Add CONFIG_FENCE_TRACE. Add missing documentation for the fence->context
    and fence->seqno members.
    v13:
    Fixup CONFIG_FENCE_TRACE kconfig description.
    Move fence_context_alloc to fence.
    Simplify fence_later.
    Kill priv member to fence_cb.
    v14:
    Remove priv argument from fence_add_callback, oops!
    v15:
    Remove priv from documentation.
    Explicitly include linux/atomic.h.
    v16:
    Add trace events.
    Import changes required by android syncpoints.
    v17:
    Use wake_up_state instead of try_to_wake_up. (Colin Cross)
    Fix up commit description for seqno_fence. (Rob Clark)
    v18:
    Rename release_fence to fence_release.
    Move to drivers/dma-buf/.
    Rename __fence_is_signaled and __fence_signal to *_locked.
    Rename __fence_init to fence_init.
    Make fence_default_wait return a signed long, and fix wait ops too.
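
    The context and seqno idea from v11 can be illustrated with a small
    userspace sketch. It is not the kernel's fence code: struct toy_fence
    and toy_fence_later() are hypothetical names, and the sketch assumes
    a 32-bit seqno that may wrap. Given two fences on the same timeline,
    it returns the one that still needs to be waited on, if any.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a fence; not the kernel's struct. */
struct toy_fence {
	unsigned int context;	/* timeline the fence belongs to */
	uint32_t seqno;		/* position on that timeline, may wrap */
	bool signaled;		/* true once the fence has fired */
};

/*
 * Return the later of two fences if it is still unsignaled.
 * Returns NULL when the later fence has already signaled, or when the
 * fences are on different timelines and cannot be ordered.
 */
static inline struct toy_fence *toy_fence_later(struct toy_fence *f1,
						struct toy_fence *f2)
{
	struct toy_fence *later;

	assert(f1 != NULL && f2 != NULL);

	if (f1->context != f2->context)
		return NULL;

	/* Unsigned subtraction keeps the ordering correct across wraparound. */
	later = ((uint32_t)(f2->seqno - f1->seqno) <= INT32_MAX) ? f2 : f1;

	return later->signaled ? NULL : later;
}
```

    A caller holding several fences from one context would only need to
    wait on the fence this helper returns, since an earlier fence on the
    same timeline is assumed to signal no later than it.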

    Signed-off-by: Maarten Lankhorst
    Signed-off-by: Thierry Reding #use smp_mb__before_atomic()
    Acked-by: Sumit Semwal
    Acked-by: Daniel Vetter
    Reviewed-by: Rob Clark
    Signed-off-by: Greg Kroah-Hartman

    Maarten Lankhorst
     

24 Jun, 2014

1 commit


22 Jun, 2014

1 commit


21 Jun, 2014

2 commits

  • Currently the __field() macro in TRACE_EVENT is only good for primitive
    values, such as integers and pointers, but it fails on complex data types
    such as structures or unions. This is because the __field() macro
    determines if the variable is signed or not with the test of:

    (((type)(-1)) < (type)1)

    Unfortunately, that fails when type is a structure.

    Since trace events should support structures as fields a new macro
    is created for such a case called __field_struct() which acts exactly
    the same as __field() does but it does not do the signed type check
    and just uses a constant false for that answer.
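
    The signedness test quoted above can be tried outside the kernel. The
    sketch below is only an illustration: FIELD(), FIELD_STRUCT(),
    struct field_desc and struct sample are hypothetical names, not the
    TRACE_EVENT macros themselves. It shows why the cast-based check
    works for integer types and why a struct needs a variant that skips
    the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The test from the commit message: cast -1 to the type, compare with 1. */
#define is_signed_type(type) (((type)(-1)) < (type)1)

/* Hypothetical structure used as a field type. */
struct sample {
	int a;
	long b;
};

/* Hypothetical field descriptor, loosely modelled on event field metadata. */
struct field_desc {
	const char *name;
	size_t size;
	bool is_signed;
};

/* Works for integer types: the signedness is computed from the type. */
#define FIELD(type, item)        { #item, sizeof(type), is_signed_type(type) }

/* For structs and unions: skip the check and record a constant false. */
#define FIELD_STRUCT(type, item) { #item, sizeof(type), false }

static_assert(is_signed_type(int), "int is signed");
static_assert(!is_signed_type(unsigned long), "unsigned long is unsigned");

/*
 * FIELD(struct sample, payload) would not compile here, because C does
 * not allow casting -1 to a structure type.
 */
static const struct field_desc sample_fields[] = {
	FIELD(int, count),
	FIELD(unsigned long, flags),
	FIELD_STRUCT(struct sample, payload),
};
```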

    Cc: Tony Luck
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • syscall_regfunc() and syscall_unregfunc() should set/clear
    TIF_SYSCALL_TRACEPOINT system-wide, but do_each_thread() can race
    with copy_process() and miss the new child which was not added to
    the process/thread lists yet.

    Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT
    under tasklist.

    Link: http://lkml.kernel.org/p/20140413185854.GB20668@redhat.com

    Cc: stable@vger.kernel.org # 2.6.33
    Fixes: a871bd33a6c0 "tracing: Add syscall tracepoints"
    Acked-by: Frederic Weisbecker
    Acked-by: Paul E. McKenney
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Steven Rostedt

    Oleg Nesterov
     

13 Jun, 2014

2 commits

  • Pull more scheduler updates from Ingo Molnar:
    "Second round of scheduler changes:
    - try-to-wakeup and IPI reduction speedups, from Andy Lutomirski
    - continued power scheduling cleanups and refactorings, from Nicolas
    Pitre
    - misc fixes and enhancements"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/deadline: Delete extraneous extern for to_ratio()
    sched/idle: Optimize try-to-wake-up IPI
    sched/idle: Simplify wake_up_idle_cpu()
    sched/idle: Clear polling before descheduling the idle thread
    sched, trace: Add a tracepoint for IPI-less remote wakeups
    cpuidle: Set polling in poll_idle
    sched: Remove redundant assignment to "rt_rq" in update_curr_rt(...)
    sched: Rename capacity related flags
    sched: Final power vs. capacity cleanups
    sched: Remove remaining dubious usage of "power"
    sched: Let 'struct sched_group_power' care about CPU capacity
    sched/fair: Disambiguate existing/remaining "capacity" usage
    sched/fair: Change "has_capacity" to "has_free_capacity"
    sched/fair: Remove "power" from 'struct numa_stats'
    sched: Fix signedness bug in yield_to()
    sched/fair: Use time_after() in record_wakee()
    sched/balancing: Reduce the rate of needless idle load balancing
    sched/fair: Fix unlocked reads of some cfs_b->quota/period

    Linus Torvalds
     
  • Pull more ACPI and power management updates from Rafael Wysocki:
    "These are fixups on top of the previous PM+ACPI pull request,
    regression fixes (ACPI hotplug, cpufreq ppc-corenet), other bug fixes
    (ACPI reset, cpufreq), new PM trace points for system suspend
    profiling and a copyright notice update.

    Specifics:

    - I didn't remember correctly that Hans de Goede's ACPI video
    patches actually didn't flip the video.use_native_backlight
    default, although we had discussed that and decided to do that.
    Since I said we would do that in the previous PM+ACPI pull request,
    make that change for real now.

    - ACPI bus check notifications for PCI host bridges don't cause the
    bus below the host bridge to be checked for changes as they should
    because of a mistake in the ACPI-based PCI hotplug (ACPIPHP)
    subsystem that forgets to add hotplug contexts to PCI host bridge
    ACPI device objects. Create hotplug contexts for PCI host bridges
    too as appropriate.

    - Revert recent cpufreq commit related to the big.LITTLE cpufreq
    driver that breaks arm64 builds.

    - Fix for a regression in the ppc-corenet cpufreq driver introduced
    during the 3.15 cycle and causing the driver to use the remainder
    from do_div instead of the quotient. From Ed Swarthout.

    - Resets triggered by panic activate a BUG_ON() in vmalloc.c on
    systems where the ACPI reset register is located in memory address
    space. Fix from Randy Wright.

    - Fix for a problem with cpufreq governors, whose decisions may be
    suboptimal because they use deferrable timers for CPU load
    sampling. From Srivatsa S Bhat.

    - Fix for a problem with the Tegra cpufreq driver where the CPU
    frequency is temporarily switched to a "stable" level that is
    different from both the initial and target frequencies during
    transitions which causes udelay() to expire earlier than it should
    sometimes. From Viresh Kumar.

    - New trace points and rework of some existing trace points for
    system suspend/resume profiling from Todd Brandt.

    - Assorted cpufreq fixes and cleanups from Stratos Karafotis and
    Viresh Kumar.

    - Copyright notice update for suspend-and-cpuhotplug.txt from
    Srivatsa S Bhat"
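
    The ppc-corenet item above comes down to how do_div() reports its
    result: the dividend is replaced by the quotient in place, and the
    return value is the remainder. The sketch below is a userspace
    stand-in, not the driver code; toy_do_div(), scale_buggy() and
    scale_fixed() are hypothetical names used to contrast the two
    patterns.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in with do_div()-like semantics:
 * *n is replaced by the quotient and the remainder is returned.
 */
static inline uint32_t toy_do_div(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);
	rem = (uint32_t)(*n % base);
	*n /= base;
	return rem;
}

/* Buggy pattern: treats the returned remainder as if it were the quotient. */
static inline uint64_t scale_buggy(uint64_t value, uint32_t divisor)
{
	return toy_do_div(&value, divisor);
}

/* Fixed pattern: ignores the remainder and uses the updated dividend. */
static inline uint64_t scale_fixed(uint64_t value, uint32_t divisor)
{
	(void)toy_do_div(&value, divisor);
	return value;
}
```

    With a value of 1000 and a divisor of 300, the buggy variant yields
    the remainder 100 while the fixed variant yields the quotient 3.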

    * tag 'pm+acpi-3.16-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    ACPI / hotplug / PCI: Add hotplug contexts to PCI host bridges
    PM / sleep: trace events for device PM callbacks
    cpufreq: cpufreq-cpu0: remove dependency on THERMAL and REGULATOR
    cpufreq: tegra: update comment for clarity
    cpufreq: intel_pstate: Remove duplicate CPU ID check
    cpufreq: Mark CPU0 driver with CPUFREQ_NEED_INITIAL_FREQ_CHECK flag
    PM / Documentation: Update copyright in suspend-and-cpuhotplug.txt
    cpufreq: governor: remove copy_prev_load from 'struct cpu_dbs_common_info'
    cpufreq: governor: Be friendly towards latency-sensitive bursty workloads
    PM / sleep: trace events for suspend/resume
    cpufreq: ppc-corenet-cpu-freq: do_div use quotient
    Revert "cpufreq: Enable big.LITTLE cpufreq driver on arm64"
    cpufreq: Tegra: implement intermediate frequency callbacks
    cpufreq: add support for intermediate (stable) frequencies
    ACPI / video: Change the default for video.use_native_backlight to 1
    ACPI: Fix bug when ACPI reset register is implemented in system memory

    Linus Torvalds
     

12 Jun, 2014

1 commit


11 Jun, 2014

1 commit

  • Adds two trace events which supply the same info that initcall_debug
    provides, but via ftrace instead of dmesg. The existing initcall_debug
    calls require the pm_print_times_enabled var to be set (either via
    sysfs or via the kernel cmd line). The new trace events provide all the
    same info as the initcall_debug prints but with less overhead, and also
    with coverage of device prepare and complete device callbacks.

    These events replace the device_pm_report_time event (which has been
    removed). device_pm_callback_start is called first and provides the device
    and callback info. device_pm_callback_end is called after with the
    device name and error info. The time and pid are gathered from the trace
    data headers.

    Signed-off-by: Todd Brandt
    Signed-off-by: Rafael J. Wysocki

    Todd E Brandt
     

10 Jun, 2014

2 commits

  • Pull f2fs updates from Jaegeuk Kim:
    "In this round, there is no special interesting feature, but we've
    investigated a couple of tuning points with respect to the I/O flow.
    Several major bug fixes and a bunch of clean-ups also have been made.

    This patch-set includes the following major enhancement patches:
    - enhance wait_on_page_writeback
    - support SEEK_DATA and SEEK_HOLE
    - enhance readahead flows
    - enhance IO flushes
    - support fiemap
    - add some tracepoints

    The other bug fixes are as follows:
    - fix to support a large volume > 2TB correctly
    - recovery bug fix wrt fallocated space
    - fix recursive lock on xattr operations
    - fix some cases on the remount flow

    And, there are a bunch of cleanups"

    * tag 'for-f2fs-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (52 commits)
    f2fs: support f2fs_fiemap
    f2fs: avoid not to call remove_dirty_inode
    f2fs: recover fallocated space
    f2fs: fix to recover data written by dio
    f2fs: large volume support
    f2fs: avoid crash when trace f2fs_submit_page_mbio event in ra_sum_pages
    f2fs: avoid overflow when large directory feathure is enabled
    f2fs: fix recursive lock by f2fs_setxattr
    MAINTAINERS: add a co-maintainer from samsung for F2FS
    MAINTAINERS: change the email address for f2fs
    f2fs: use inode_init_owner() to simplify codes
    f2fs: avoid to use slab memory in f2fs_issue_flush for efficiency
    f2fs: add a tracepoint for f2fs_read_data_page
    f2fs: add a tracepoint for f2fs_write_{meta,node,data}_pages
    f2fs: add a tracepoint for f2fs_write_{meta,node,data}_page
    f2fs: add a tracepoint for f2fs_write_end
    f2fs: add a tracepoint for f2fs_write_begin
    f2fs: fix checkpatch warning
    f2fs: deactivate inode page if the inode is evicted
    f2fs: decrease the lock granularity during write_begin
    ...

    Linus Torvalds
     
  • Pull tracing updates from Steven Rostedt:
    "Lots of tweaks, small fixes, optimizations, and some helper functions
    to help out the rest of the kernel to ease their use of trace events.

    The big change for this release is the allowing of other tracers, such
    as the latency tracers, to be used in the trace instances and allow
    for function or function graph tracing to be in the top level
    simultaneously"

    * tag 'trace-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits)
    tracing: Fix memory leak on instance deletion
    tracing: Fix leak of ring buffer data when new instances creation fails
    tracing/kprobes: Avoid self tests if tracing is disabled on boot up
    tracing: Return error if ftrace_trace_arrays list is empty
    tracing: Only calculate stats of tracepoint benchmarks for 2^32 times
    tracing: Convert stddev into u64 in tracepoint benchmark
    tracing: Introduce saved_cmdlines_size file
    tracing: Add __get_dynamic_array_len() macro for trace events
    tracing: Remove unused variable in trace_benchmark
    tracing: Eliminate double free on failure of allocation on boot up
    ftrace/x86: Call text_ip_addr() instead of the duplicated code
    tracing: Print max callstack on stacktrace bug
    tracing: Move locking of trace_cmdline_lock into start/stop seq calls
    tracing: Try again for saved cmdline if failed due to locking
    tracing: Have saved_cmdlines use the seq_read infrastructure
    tracing: Add tracepoint benchmark tracepoint
    tracing: Print nasty banner when trace_printk() is in use
    tracing: Add funcgraph_tail option to print function name after closing braces
    tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines
    tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks
    ...

    Linus Torvalds
     

09 Jun, 2014

1 commit

  • Pull ext4 updates from Ted Ts'o:
    "Clean ups and miscellaneous bug fixes, in particular for the new
    collapse_range and zero_range fallocate functions. In addition,
    improve the scalability of adding and removing inodes from the orphan
    list"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (25 commits)
    ext4: handle symlink properly with inline_data
    ext4: fix wrong assert in ext4_mb_normalize_request()
    ext4: fix zeroing of page during writeback
    ext4: remove unused local variable "stored" from ext4_readdir(...)
    ext4: fix ZERO_RANGE test failure in data journalling
    ext4: reduce contention on s_orphan_lock
    ext4: use sbi in ext4_orphan_{add|del}()
    ext4: use EXT_MAX_BLOCKS in ext4_es_can_be_merged()
    ext4: add missing BUFFER_TRACE before ext4_journal_get_write_access
    ext4: remove unnecessary double parentheses
    ext4: do not destroy ext4_groupinfo_caches if ext4_mb_init() fails
    ext4: make local functions static
    ext4: fix block bitmap validation when bigalloc, ^flex_bg
    ext4: fix block bitmap initialization under sparse_super2
    ext4: find the group descriptors on a 1k-block bigalloc,meta_bg filesystem
    ext4: avoid unneeded lookup when xattr name is invalid
    ext4: fix data integrity sync in ordered mode
    ext4: remove obsoleted check
    ext4: add a new spinlock i_raw_lock to protect the ext4's raw inode
    ext4: fix locking for O_APPEND writes
    ...

    Linus Torvalds