30 Jan, 2016

1 commit

  • Pull s390 updates from Martin Schwidefsky:
    "An optimization for irq-restore, the SSM instruction is quite a bit
    slower than an if-statement and a STOSM.

    The copy_file_range system call is added.

    Cleanup for PCI and CIO.

    And a couple of bug fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    s390/cio: update measurement characteristics
    s390/cio: ensure consistent measurement state
    s390/cio: fix measurement characteristics memleak
    s390/zcrypt: Fix cryptographic device id in kernel messages
    s390/pci: remove iomap sanity checks
    s390/pci: set error state for unusable functions
    s390/pci: fix bar check
    s390/pci: resize iomap
    s390/pci: improve ZPCI_* macros
    s390/pci: provide ZPCI_ADDR macro
    s390/pci: adjust IOMAP_MAX_ENTRIES
    s390/numa: move numa_init_late() from device to arch_initcall
    s390: remove all usages of PSW_ADDR_INSN
    s390: remove all usages of PSW_ADDR_AMODE
    s390: wire up copy_file_range syscall
    s390: remove superfluous memblock_alloc() return value checks
    s390/numa: allocate memory with correct alignment
    s390/irqflags: optimize irq restore
    s390/mm: use TASK_MAX_SIZE where applicable

    Linus Torvalds
     

28 Jan, 2016

1 commit

  • Pull KVM fixes from Paolo Bonzini:
    "s390 and POWER bug fixes, plus enabling the KVM-VFIO interface on
    s390"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    KVM doc: Fix KVM_SMI chapter number
    KVM: s390: fix memory overwrites when vx is disabled
    KVM: s390: Enable the KVM-VFIO device
    KVM: s390: fix guest fprs memory leak
    KVM: PPC: Fix ONE_REG AltiVec support
    KVM: PPC: Increase memslots to 512
    KVM: PPC: Book3S PR: Remove unused variable 'vcpu_book3s'
    KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8
    KVM: PPC: Book3S HV: Handle unexpected traps in guest entry/exit code better

    Linus Torvalds
     

26 Jan, 2016

11 commits

  • The kernel now always uses vector registers when available, however KVM
    has special logic if support is really enabled for a guest. If support
    is disabled, guest_fpregs.fregs will only contain memory for the fpu.
    The kernel, however, will store vector registers into that area,
    resulting in crazy memory overwrites.

    Simply extending that area is not enough, because the format of the
    registers also changes. We would have to do additional conversions, making
    the code even more complex. Therefore let's directly use one place for
    the vector/fpu registers + fpc (in kvm_run). We just have to convert the
    data properly when accessing it. This makes current code much easier.

    Please note that vector/fpu registers are now always stored to
    vcpu->run->s.regs.vrs. Although this data is visible to QEMU and
    used for migration, we only guarantee valid values to user space when
    KVM_SYNC_VRS is set. As that is only the case when we have vector
    register support, we are on the safe side.

    Fixes: b5510d9b68c3 ("s390/fpu: always enable the vector facility if it is available")
    Cc: stable@vger.kernel.org # v4.4 d9a3a09af54d s390/kvm: remove dependency on struct save_area definition
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger
    [adapt to d9a3a09af54d]

    David Hildenbrand
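
    The conversion mentioned above can be sketched as follows. This is a
    minimal user-space model, not the kernel code: it assumes that floating
    point register n overlays the leftmost 64 bits of vector register n
    (for n = 0..15), and the names struct vreg, vx_to_fp() and fp_to_vx()
    are hypothetical.

```c
#include <stdint.h>

#define NUM_FPRS 16   /* floating point registers 0..15 */
#define NUM_VXRS 32   /* vector registers 0..31 */

/* Hypothetical stand-in for a 128 bit vector register. */
struct vreg {
	uint64_t high;  /* bits 0-63: overlaid by the floating point register */
	uint64_t low;   /* bits 64-127: only reachable via vector instructions */
};

/* Extract the fpu view from the vector register save area. */
static void vx_to_fp(uint64_t fprs[NUM_FPRS], const struct vreg vrs[NUM_VXRS])
{
	for (int i = 0; i < NUM_FPRS; i++)
		fprs[i] = vrs[i].high;
}

/* Store fpu registers into the vector save area, keeping the low halves. */
static void fp_to_vx(struct vreg vrs[NUM_VXRS], const uint64_t fprs[NUM_FPRS])
{
	for (int i = 0; i < NUM_FPRS; i++)
		vrs[i].high = fprs[i];
}
```

    With a single save area in vector format, a guest without vector support
    only needs these two conversions at the points where the fpu format is
    read or written.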
     
  • The KVM-VFIO device is used by the QEMU VFIO device. It is used to
    record the list of in-use VFIO groups so that KVM can manipulate
    them.
    While we don't need this on s390 currently, let's try to be like
    everyone else.

    Signed-off-by: Dong Jia Shi
    Acked-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Dong Jia Shi
     
  • fprs is never freed, therefore resulting in a memory leak if
    kvm_vcpu_init() fails or the vcpu is destroyed.

    Fixes: 9977e886cbbc ("s390/kernel: lazy restore fpu registers")
    Cc: stable@vger.kernel.org # v4.3+
    Reported-by: Eric Farman
    Signed-off-by: David Hildenbrand
    Reviewed-by: Eric Farman
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     
  • Since each iomap_entry handles only one bar of one pci function
    (even when disjunct ranges of a bar are mapped) the sanity check
    in pci_iomap_range is not needed and can be removed.

    Also convert the remaining BUG_ONs to WARN_ONs.

    Signed-off-by: Sebastian Ott
    Reviewed-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • We receive special notifications from firmware when an error was detected
    and a pci function became unusable. Set the error_state accordingly to give
    device drivers a hint that they don't need to try error recovery.

    Suggested-by: Alexander Schmidt
    Signed-off-by: Sebastian Ott
    Reviewed-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Fix the check which bar space we should map to allow available bars only.

    Signed-off-by: Sebastian Ott
    Reviewed-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • On s390 we need to maintain a mapping between iomem addresses
    and arch specific function identifiers. Currently the mapping
    table is created such that we could span the whole iomem
    address space. Since we can only map each bar space from each
    possible function we have an upper bound for the number of
    mapping entries.

    This reduces the size of the iomap from 256K to less than 4K
    (using the defconfig).

    Signed-off-by: Sebastian Ott
    Reviewed-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
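
    The size reduction can be checked with some quick arithmetic. The sketch
    below is illustrative only: the function limit of 64, the 6 BARs per
    function and the 8 byte entry size used in the note are assumptions, not
    values taken from the patch, and the helper names are hypothetical.

```c
#include <stddef.h>

/* Assumed values; the real ones depend on the kernel configuration. */
#define NR_FUNCTIONS 64   /* assumed maximum number of pci functions */
#define BAR_COUNT    6    /* assumed number of BARs per pci function */

/* Upper bound on mapping entries: one per BAR of each possible function. */
static size_t iomap_entries(size_t nr_functions, size_t bars)
{
	return nr_functions * bars;
}

/* Table size in bytes for a given number of entries and entry size. */
static size_t iomap_bytes(size_t entries, size_t entry_size)
{
	return entries * entry_size;
}
```

    Under these assumptions the bound is 384 entries. With an assumed 8 byte
    entry that is 3072 bytes, compared to 262144 bytes (256K) for a table of
    32768 entries.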
     
  • Most of the constants defined in pci_io.h depend on each other
    and thus can be calculated.

    Signed-off-by: Sebastian Ott
    Reviewed-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Provide and use a ZPCI_ADDR macro as the complement of ZPCI_IDX
    to get rid of some constants in the code.

    Signed-off-by: Sebastian Ott
    Reviewed-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
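
    How such constants can be derived from each other, and how an index
    macro and an address macro complement each other, is sketched below.
    The IOMAP_* names, the base address and the shift are assumptions chosen
    for illustration; they are not copied from pci_io.h.

```c
#include <stdint.h>

/* Assumed layout: iomap addresses live in the upper half of the address
 * space, with the table index stored above a 48 bit offset. */
#define IOMAP_SHIFT         48
#define IOMAP_ADDR_BASE     0x8000000000000000ULL

/* Everything else is calculated from the two values above. */
#define IOMAP_ADDR_OFF_MASK ((1ULL << IOMAP_SHIFT) - 1)
#define IOMAP_ADDR_IDX_MASK (~IOMAP_ADDR_OFF_MASK - IOMAP_ADDR_BASE)
#define IOMAP_MAX_ENTRIES   ((UINT64_MAX - IOMAP_ADDR_BASE + 1) >> IOMAP_SHIFT)

/* Index of the mapping entry encoded in an iomem address. */
#define IOMAP_IDX(addr) (((uint64_t)(addr) & IOMAP_ADDR_IDX_MASK) >> IOMAP_SHIFT)
/* Complement: base iomem address for a given mapping entry. */
#define IOMAP_ADDR(idx) (IOMAP_ADDR_BASE | ((uint64_t)(idx) << IOMAP_SHIFT))
```

    Because the index macro undoes the address macro, callers no longer need
    to open-code the base address or the shift.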
     
  • ZPCI_IOMAP_MAX_ENTRIES is off by one. Let's adjust this
    for the sake of correctness.

    Signed-off-by: Sebastian Ott
    Reviewed-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Commit 3e89e1c5ea ("hugetlb: make mm and fs code explicitly non-modular")
    moves hugetlb_init() from module_init to subsys_initcall.

    The hugetlb_init()->hugetlb_register_node() code accesses "node->dev.kobj"
    which is initialized in numa_init_late().

    Since numa_init_late() is a device_initcall which is called *after*
    subsys_initcall the above mentioned patch breaks NUMA on s390.

    So fix this and move numa_init_late() to arch_initcall.

    Fixes: 3e89e1c5ea ("hugetlb: make mm and fs code explicitly non-modular")
    Reviewed-by: Heiko Carstens
    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky

    Michael Holzheu
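
    The ordering problem can be modelled in a few lines. This is a sketch
    only: the level numbers follow the usual initcall order (arch before
    subsys before device), while struct initcall and runs_before() are
    hypothetical helpers.

```c
/* Subset of the kernel's initcall levels, in the order they run. */
enum initcall_level {
	LEVEL_ARCH = 3,    /* arch_initcall */
	LEVEL_SUBSYS = 4,  /* subsys_initcall */
	LEVEL_DEVICE = 6,  /* device_initcall */
};

struct initcall {
	const char *name;
	enum initcall_level level;
};

/* Return nonzero if 'first' runs at an earlier level than 'second'. */
static int runs_before(const struct initcall *first,
		       const struct initcall *second)
{
	return first->level < second->level;
}
```

    As a device_initcall, numa_init_late() does not run before a
    subsys_initcall such as hugetlb_init(); as an arch_initcall it does.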
     

23 Jan, 2016

1 commit

  • parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
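
    The shape of these wrappers can be sketched as below. The toy
    single-threaded struct mutex exists only to make the example
    self-contained and is not how the kernel mutex works; the point is that
    callers go through inode_foo() and never touch ->i_mutex directly.

```c
#include <stdbool.h>

/* Toy single-threaded mutex, only here to make the sketch self-contained. */
struct mutex { bool locked; };

static void mutex_lock(struct mutex *m)   { m->locked = true; }
static void mutex_unlock(struct mutex *m) { m->locked = false; }
static bool mutex_trylock(struct mutex *m)
{
	if (m->locked)
		return false;
	m->locked = true;
	return true;
}
static bool mutex_is_locked(struct mutex *m) { return m->locked; }

struct inode { struct mutex i_mutex; };

/* The wrappers hide which lock type protects the inode. */
static void inode_lock(struct inode *inode)
{
	mutex_lock(&inode->i_mutex);
}
static void inode_unlock(struct inode *inode)
{
	mutex_unlock(&inode->i_mutex);
}
static bool inode_trylock(struct inode *inode)
{
	return mutex_trylock(&inode->i_mutex);
}
static bool inode_is_locked(struct inode *inode)
{
	return mutex_is_locked(&inode->i_mutex);
}
```

    Once every caller uses the wrappers, changing the lock type only
    requires changing the wrapper bodies.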
     

21 Jan, 2016

1 commit

  • Move the generic implementation to now that all
    architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now
    that everyone supports them.

    [valentinrothberg@gmail.com: remove leftovers in Kconfig]
    Signed-off-by: Christoph Hellwig
    Cc: "David S. Miller"
    Cc: Aurelien Jacquiot
    Cc: Chris Metcalf
    Cc: David Howells
    Cc: Geert Uytterhoeven
    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Helge Deller
    Cc: James Hogan
    Cc: Jesper Nilsson
    Cc: Koichi Yasutake
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Mikael Starvik
    Cc: Steven Miao
    Cc: Vineet Gupta
    Cc: Christian Borntraeger
    Cc: Joerg Roedel
    Cc: Sebastian Ott
    Signed-off-by: Valentin Rothberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

19 Jan, 2016

8 commits

  • Yet another leftover from the 31 bit era. The usual operation
    "y = x & PSW_ADDR_INSN" with the PSW_ADDR_INSN mask is a nop for
    CONFIG_64BIT.

    Therefore remove all usages and hope the code is a bit less confusing.

    Signed-off-by: Heiko Carstens
    Reviewed-by: David Hildenbrand

    Heiko Carstens
     
  • This is a leftover from the 31 bit era. For CONFIG_64BIT the usual
    operation "y = x | PSW_ADDR_AMODE" is a nop. Therefore remove all
    usages of PSW_ADDR_AMODE and make the code a bit less confusing.

    Signed-off-by: Heiko Carstens
    Reviewed-by: David Hildenbrand

    Heiko Carstens
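
    Why both operations are nops on 64 bit can be seen from the mask values.
    The constants below are assumptions for illustration (an all-ones
    instruction address mask and an empty addressing mode bit for 64 bit,
    and the 31 bit split for comparison); the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed values of the former 31 bit kernel. */
#define PSW32_ADDR_AMODE 0x80000000U
#define PSW32_ADDR_INSN  0x7fffffffU

/* Assumed values for CONFIG_64BIT: the mask covers all bits and the
 * addressing mode bit is empty. */
#define PSW64_ADDR_AMODE 0x0000000000000000ULL
#define PSW64_ADDR_INSN  0xffffffffffffffffULL

/* "y = x & PSW_ADDR_INSN" on 64 bit: returns x unchanged. */
static uint64_t strip_amode_64(uint64_t addr)
{
	return addr & PSW64_ADDR_INSN;
}

/* "y = x | PSW_ADDR_AMODE" on 64 bit: returns x unchanged. */
static uint64_t add_amode_64(uint64_t addr)
{
	return addr | PSW64_ADDR_AMODE;
}

/* On 31 bit the mask really removed the addressing mode bit. */
static uint32_t strip_amode_31(uint32_t addr)
{
	return addr & PSW32_ADDR_INSN;
}
```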
     
  • Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • memblock_alloc() and memblock_alloc_base() will panic on their own if
    they can't find free memory. Therefore remove some pointless checks.

    Signed-off-by: Heiko Carstens
    Acked-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • Allocating memory with a requested minimum alignment of 1 is wrong
    since pg_data_t contains a spinlock which requires an alignment of 4
    bytes.

    Therefore fix this and ask for an alignment of 8 bytes like it is
    guaranteed for all kmalloc requests.

    Signed-off-by: Heiko Carstens
    Acked-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
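
    The effect of the requested alignment can be illustrated with a plain
    round-up helper. This is a generic sketch with hypothetical helper names
    and an arbitrary example address; it does not reproduce the memblock
    allocator.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* Nonzero if addr satisfies the given power-of-two alignment. */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
	return (addr & (align - 1)) == 0;
}
```

    With an alignment of 1 an allocator may hand out an address such as
    0x1001, which is not suitable for a member that needs 4 byte alignment.
    Asking for 8 bytes rounds the same address up to 0x1008.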
     
  • The ssm instruction takes longer than stnsm/stosm as it is often
    used to modify DAT and PER. We know that irqsave/irqrestore only
    deals with external and I/O interrupts and we know that irqrestore
    can transition only from disabled->disabled or disabled->enabled,
    so we can use the faster stosm.

    Signed-off-by: Christian Borntraeger

    Christian Borntraeger
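
    The idea can be modelled in user space as follows. This is a sketch
    under stated assumptions: the system mask is simulated by a plain
    variable, the I/O and external mask bits are taken to be 0x03 of the
    system mask byte, and stnsm()/stosm() are hypothetical C models of the
    instructions (store the old mask, then AND or OR an immediate).

```c
#include <stdint.h>

/* Assumed I/O and external interrupt bits in the system mask byte. */
#define MASK_IO_EXT 0x03

/* Simulated system mask; a stand-in for the real PSW. */
static uint8_t system_mask;

/* Model of stnsm: AND the mask with an immediate, return the old mask. */
static uint8_t stnsm(uint8_t and_bits)
{
	uint8_t old = system_mask;

	system_mask &= and_bits;
	return old;
}

/* Model of stosm: OR an immediate into the mask, return the old mask. */
static uint8_t stosm(uint8_t or_bits)
{
	uint8_t old = system_mask;

	system_mask |= or_bits;
	return old;
}

/* Disable I/O and external interrupts, returning the previous state. */
static uint8_t irq_save(void)
{
	return stnsm((uint8_t)~MASK_IO_EXT);
}

/* Interrupts are disabled on entry, so only the enable case needs work. */
static void irq_restore(uint8_t flags)
{
	if (flags & MASK_IO_EXT)
		stosm(MASK_IO_EXT);
}
```

    Other bits of the mask are left alone by irq_restore(), which is why
    rewriting the complete mask is not needed here.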
     
  • To improve readability we can use TASK_MAX_SIZE when we just check for the
    upper limit. All places explicitly dealing with 3 vs 4 level pgtables
    were left unchanged.

    Signed-off-by: Dominik Dingel
    Reviewed-By: Sascha Silbe

    Dominik Dingel
     
  • Pull virtio barrier rework+fixes from Michael Tsirkin:
    "This adds a new kind of barrier, and reworks virtio and xen to use it.

    Plus some fixes here and there"

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (44 commits)
    checkpatch: add virt barriers
    checkpatch: check for __smp outside barrier.h
    checkpatch.pl: add missing memory barriers
    virtio: make find_vqs() checkpatch.pl-friendly
    virtio_balloon: fix race between migration and ballooning
    virtio_balloon: fix race by fill and leak
    s390: more efficient smp barriers
    s390: use generic memory barriers
    xen/events: use virt_xxx barriers
    xen/io: use virt_xxx barriers
    xenbus: use virt_xxx barriers
    virtio_ring: use virt_store_mb
    sh: move xchg_cmpxchg to a header by itself
    sh: support 1 and 2 byte xchg
    virtio_ring: update weak barriers to use virt_xxx
    Revert "virtio_ring: Update weak barriers to use dma_wmb/rmb"
    asm-generic: implement virt_xxx memory barriers
    x86: define __smp_xxx
    xtensa: define __smp_xxx
    tile: define __smp_xxx
    ...

    Linus Torvalds
     

17 Jan, 2016

1 commit

  • As illustrated by commit a3afe70b83fd ("[S390] latencytop s390
    support."), HAVE_LATENCYTOP_SUPPORT is defined by an architecture to
    advertise an implementation of save_stack_trace_tsk.

    However, as of 9212ddb5eada ("stacktrace: provide save_stack_trace_tsk()
    weak alias") a dummy implementation is provided if STACKTRACE=y. Given
    that LATENCYTOP already depends on STACKTRACE_SUPPORT and selects
    STACKTRACE, we can remove HAVE_LATENCYTOP_SUPPORT altogether.

    Signed-off-by: Will Deacon
    Acked-by: Heiko Carstens
    Cc: Vineet Gupta
    Cc: Russell King
    Cc: James Hogan
    Cc: Michal Simek
    Cc: Helge Deller
    Acked-by: Michael Ellerman
    Cc: "David S. Miller"
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Will Deacon
     

16 Jan, 2016

5 commits

  • By passing a non-null flag we allow fixup_user_fault to retry, which
    enables userfaultfd. As during these retries we might drop the mmap_sem
    we need to check if that happened and redo the complete chain of
    actions.

    Signed-off-by: Dominik Dingel
    Reviewed-by: Andrea Arcangeli
    Cc: "Kirill A. Shutemov"
    Cc: Martin Schwidefsky
    Cc: Christian Borntraeger
    Cc: "Jason J. Herne"
    Cc: David Rientjes
    Cc: Eric B Munson
    Cc: Naoya Horiguchi
    Cc: Mel Gorman
    Cc: Heiko Carstens
    Cc: Dominik Dingel
    Cc: Paolo Bonzini
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dominik Dingel
     
  • During Jason's work with postcopy migration support for s390 a problem
    regarding gmap faults was discovered.

    The gmap code will call fixup_user_fault which will end up always in
    handle_mm_fault. Until now we never cared about retries, but as the
    userfaultfd code kind of relies on them, this needs some fix.

    This patchset does not take care of the futex code. I will now look
    closer at this.

    This patch (of 2):

    With the introduction of userfaultfd, kvm on s390 needs fixup_user_fault
    to pass in FAULT_FLAG_ALLOW_RETRY and give feedback if during the
    faulting we ever unlocked mmap_sem.

    This patch brings in the logic to handle retries and cleans up the
    current documentation. fixup_user_fault was not having the same
    semantics as filemap_fault. It never indicated if a retry happened and
    so a caller wasn't able to handle that case. So we now changed the
    behaviour to always retry a locked mmap_sem.

    Signed-off-by: Dominik Dingel
    Reviewed-by: Andrea Arcangeli
    Cc: "Kirill A. Shutemov"
    Cc: Martin Schwidefsky
    Cc: Christian Borntraeger
    Cc: "Jason J. Herne"
    Cc: David Rientjes
    Cc: Eric B Munson
    Cc: Naoya Horiguchi
    Cc: Mel Gorman
    Cc: Heiko Carstens
    Cc: Dominik Dingel
    Cc: Paolo Bonzini
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dominik Dingel
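
    The caller-side pattern described in these two patches can be sketched
    as below. This is a simplified model, not the gmap code: struct
    fault_sim, fixup_fault() and resolve() are hypothetical, and the only
    property modelled is that a fault may report through an "unlocked" flag
    that the lock was dropped, after which earlier lookups are stale.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a fault path that may drop the lock. */
struct fault_sim {
	int drops_left;   /* calls that will still drop the lock */
	int lookups;      /* how often the caller did its lookup */
};

/* Resolve a fault; sets *unlocked if the lock was dropped and retaken. */
static int fixup_fault(struct fault_sim *sim, bool *unlocked)
{
	if (sim->drops_left > 0) {
		sim->drops_left--;
		*unlocked = true;
	}
	return 0;
}

/* Caller: redo the complete chain whenever the lock was dropped. */
static int resolve(struct fault_sim *sim)
{
	bool unlocked;
	int rc;

	do {
		unlocked = false;
		sim->lookups++;          /* (re)do the lookup under the lock */
		rc = fixup_fault(sim, &unlocked);
	} while (rc == 0 && unlocked);   /* lock was dropped: start over */

	return rc;
}
```

    A caller that ignored the flag would keep using state it looked up
    before the lock was released.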
     
  • With new refcounting we don't need to mark PMDs splitting. Let's drop
    code to handle this.

    pmdp_splitting_flush() is not needed too: on splitting PMD we will do
    pmdp_clear_flush() + set_pte_at(). pmdp_clear_flush() will do IPI as
    needed for fast_gup.

    Signed-off-by: Kirill A. Shutemov
    Cc: Sasha Levin
    Cc: Aneesh Kumar K.V
    Cc: Jerome Marchand
    Cc: Vlastimil Babka
    Cc: Andrea Arcangeli
    Cc: Hugh Dickins
    Cc: Dave Hansen
    Cc: Mel Gorman
    Cc: Rik van Riel
    Cc: Naoya Horiguchi
    Cc: Steve Capper
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Christoph Lameter
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • Tail page refcounting is utterly complicated and painful to support.

    It uses ->_mapcount on tail pages to store how many times this page is
    pinned. get_page() bumps ->_mapcount on tail page in addition to
    ->_count on head. This information is required by split_huge_page() to
    be able to distribute pins from head of compound page to tails during
    the split.

    We will need ->_mapcount to account PTE mappings of subpages of the
    compound page. We eliminate need in current meaning of ->_mapcount in
    tail pages by forbidding split entirely if the page is pinned.

    The only user of tail page refcounting is THP which is marked BROKEN for
    now.

    Let's drop all this mess. It makes get_page() and put_page() much
    simpler.

    Signed-off-by: Kirill A. Shutemov
    Tested-by: Sasha Levin
    Tested-by: Aneesh Kumar K.V
    Acked-by: Vlastimil Babka
    Acked-by: Jerome Marchand
    Cc: Andrea Arcangeli
    Cc: Hugh Dickins
    Cc: Dave Hansen
    Cc: Mel Gorman
    Cc: Rik van Riel
    Cc: Naoya Horiguchi
    Cc: Steve Capper
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Christoph Lameter
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • Merge first patch-bomb from Andrew Morton:

    - A few hotfixes which missed 4.4 because I was asleep. cc'ed to
    -stable

    - A few misc fixes

    - OCFS2 updates

    - Part of MM. Including pretty large changes to page-flags handling
    and to thp management which have been buffered up for 2-3 cycles now.

    I have a lot of MM material this time.

    [ It turns out the THP part wasn't quite ready, so that got dropped from
    this series - Linus ]

    * emailed patches from Andrew Morton: (117 commits)
    zsmalloc: reorganize struct size_class to pack 4 bytes hole
    mm/zbud.c: use list_last_entry() instead of list_tail_entry()
    zram/zcomp: do not zero out zcomp private pages
    zram: pass gfp from zcomp frontend to backend
    zram: try vmalloc() after kmalloc()
    zram/zcomp: use GFP_NOIO to allocate streams
    mm: add tracepoint for scanning pages
    drivers/base/memory.c: fix kernel warning during memory hotplug on ppc64
    mm/page_isolation: use macro to judge the alignment
    mm: fix noisy sparse warning in LIBCFS_ALLOC_PRE()
    mm: rework virtual memory accounting
    include/linux/memblock.h: fix ordering of 'flags' argument in comments
    mm: move lru_to_page to mm_inline.h
    Documentation/filesystems: describe the shared memory usage/accounting
    memory-hotplug: don't BUG() in register_memory_resource()
    hugetlb: make mm and fs code explicitly non-modular
    mm/swapfile.c: use list_for_each_entry_safe in free_swap_count_continuations
    mm: /proc/pid/clear_refs: no need to clear VM_SOFTDIRTY in clear_soft_dirty_pmd()
    mm: make sure isolate_lru_page() is never called for tail page
    vmstat: make vmstat_updater deferrable again and shut down on idle
    ...

    Linus Torvalds
     

15 Jan, 2016

2 commits

  • Pull livepatching updates from Jiri Kosina:

    - RO/NX attribute fixes for patch module relocations from Josh
    Poimboeuf. As part of this effort, module.c has been cleaned up as
    well and livepatching is piggy-backing on this cleanup. Rusty is OK
    with this whole lot going through livepatching tree.

    - symbol disambiguation support from Chris J Arges. That series is
    also

    Reviewed-by: Miroslav Benes

    but this came in only after I've already pushed out. Didn't want to
    rebase because of that, hence I am mentioning it here.

    - symbol lookup fix from Miroslav Benes

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
    livepatch: Cleanup module page permission changes
    module: keep percpu symbols in module's symtab
    module: clean up RO/NX handling.
    module: use a structure to encapsulate layout.
    gcov: use within_module() helper.
    module: Use the same logic for setting and unsetting RO/NX
    livepatch: function,sympos scheme in livepatch sysfs directory
    livepatch: add sympos as disambiguator field to klp_reloc
    livepatch: add old_sympos as disambiguator field to klp_func

    Linus Torvalds
     
  • Currently looking at /proc/<pid>/status or statm, there is no way to
    distinguish shmem pages from pages mapped to a regular file (shmem pages
    are mapped to /dev/zero), even though their implication in actual memory
    use is quite different.

    The internal accounting currently counts shmem pages together with
    regular files. As a preparation to extend the userspace interfaces,
    this patch adds MM_SHMEMPAGES counter to mm_rss_stat to account for
    shmem pages separately from MM_FILEPAGES. The next patch will expose it
    to userspace - this patch doesn't change the exported values yet, by
    adding up MM_SHMEMPAGES to MM_FILEPAGES at places where MM_FILEPAGES was
    used before. The only user-visible change after this patch is the OOM
    killer message that separates the reported "shmem-rss" from "file-rss".

    [vbabka@suse.cz: forward-porting, tweak changelog]
    Signed-off-by: Jerome Marchand
    Signed-off-by: Vlastimil Babka
    Acked-by: Konstantin Khlebnikov
    Acked-by: Michal Hocko
    Acked-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jerome Marchand
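
    The accounting change can be sketched as follows. The counter names
    follow the commit message; struct rss_stat and exported_file_rss() are
    hypothetical stand-ins used only to show how the exported value stays
    unchanged while the internal counters are split.

```c
/* Per-mm RSS counters; MM_SHMEMPAGES is the newly added one. */
enum {
	MM_FILEPAGES,
	MM_ANONPAGES,
	MM_SWAPENTS,
	MM_SHMEMPAGES,
	NR_MM_COUNTERS
};

/* Hypothetical container for the counters. */
struct rss_stat {
	long count[NR_MM_COUNTERS];
};

/* Exported "file" value stays the same: shmem pages are added back in. */
static long exported_file_rss(const struct rss_stat *stat)
{
	return stat->count[MM_FILEPAGES] + stat->count[MM_SHMEMPAGES];
}
```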
     

14 Jan, 2016

2 commits

  • Pull libnvdimm updates from Dan Williams:
    "The bulk of this has appeared in -next and independently received a
    build success notification from the kbuild robot. The 'for-4.5/block-
    dax' topic branch was rebased over the weekend to drop the "block
    device end-of-life" rework that Al would like to see re-implemented
    with a notifier, and to address bug reports against the badblocks
    integration.

    There is pending feedback against "libnvdimm: Add a poison list and
    export badblocks" received last week. Linda identified some localized
    fixups that we will handle incrementally.

    Summary:

    - Media error handling: The 'badblocks' implementation that
    originated in md-raid is up-levelled to a generic capability of a
    block device. This initial implementation is limited to being
    consulted in the pmem block-i/o path. Later, 'badblocks' will be
    consulted when creating dax mappings.

    - Raw block device dax: For virtualization and other cases that want
    large contiguous mappings of persistent memory, add the capability
    to dax-mmap a block device directly.

    - Increased /dev/mem restrictions: Add an option to treat all
    io-memory as IORESOURCE_EXCLUSIVE, i.e. disable /dev/mem access
    while a driver is actively using an address range. This behavior
    is controlled via the new CONFIG_IO_STRICT_DEVMEM option and can be
    overridden by the existing "iomem=relaxed" kernel command line
    option.

    - Miscellaneous fixes include a 'pfn'-device huge page alignment fix,
    block device shutdown crash fix, and other small libnvdimm fixes"

    * tag 'libnvdimm-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (32 commits)
    block: kill disk_{check|set|clear|alloc}_badblocks
    libnvdimm, pmem: nvdimm_read_bytes() badblocks support
    pmem, dax: disable dax in the presence of bad blocks
    pmem: fail io-requests to known bad blocks
    libnvdimm: convert to statically allocated badblocks
    libnvdimm: don't fail init for full badblocks list
    block, badblocks: introduce devm_init_badblocks
    block: clarify badblocks lifetime
    badblocks: rename badblocks_free to badblocks_exit
    libnvdimm, pmem: move definition of nvdimm_namespace_add_poison to nd.h
    libnvdimm: Add a poison list and export badblocks
    nfit_test: Enable DSMs for all test NFITs
    md: convert to use the generic badblocks code
    block: Add badblock management for gendisks
    badblocks: Add core badblock management code
    block: fix del_gendisk() vs blkdev_ioctl crash
    block: enable dax for raw block devices
    block: introduce bdev_file_inode()
    restrict /dev/mem to idle io memory ranges
    arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug
    ...

    Linus Torvalds
     
  • Pull s390 updates from Martin Schwidefsky:
    "Among the traditional bug fixes and cleanups are some improvements:

    - A tool to generate the facility lists, generating the bit fields
    by hand has been a source of bugs in the past

    - The spinlock loop is reordered to avoid bursts of hypervisor calls

    - Add support for the open-for-business interface to the service
    element

    - The get_cpu call is added to the vdso

    - A set of tracepoints is defined for the common I/O layer

    - The deprecated sclp_cpi module is removed

    - Update default configuration"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (56 commits)
    s390/sclp: fix possible control register corruption
    s390: fix normalization bug in exception table sorting
    s390/configs: update default configurations
    s390/vdso: optimize getcpu system call
    s390: drop smp_mb in vdso_init
    s390: rename struct _lowcore to struct lowcore
    s390/mem_detect: use unsigned longs
    s390/ptrace: get rid of long longs in psw_bits
    s390/sysinfo: add missing SYSIB 1.2.2 multithreading fields
    s390: get rid of CONFIG_SCHED_MC and CONFIG_SCHED_BOOK
    s390/Kconfig: remove pointless 64 bit dependencies
    s390/dasd: fix failfast for disconnected devices
    s390/con3270: testing return kzalloc retval
    s390/hmcdrv: constify hmcdrv_ftp_ops structs
    s390/cio: add NULL test
    s390/cio: Change I/O instructions from inline to normal functions
    s390/cio: Introduce common I/O layer tracepoints
    s390/cio: Consolidate inline assemblies and related data definitions
    s390/cio: Fix incorrect xsch opcode specification
    s390/cio: Remove unused inline assemblies
    ...

    Linus Torvalds
     

13 Jan, 2016

7 commits

  • Pull networking updates from David Miller:

    1) Support busy polling generically, for all NAPI drivers. From Eric
    Dumazet.

    2) Add byte/packet counter support to nft_ct, from Florian Westphal.

    3) Add RSS/XPS support to mvneta driver, from Gregory Clement.

    4) Implement IPV6_HDRINCL socket option for raw sockets, from Hannes
    Frederic Sowa.

    5) Add support for T6 adapter to cxgb4 driver, from Hariprasad Shenai.

    6) Add support for VLAN device bridging to mlxsw switch driver, from
    Ido Schimmel.

    7) Add driver for Netronome NFP4000/NFP6000, from Jakub Kicinski.

    8) Provide hwmon interface to mlxsw switch driver, from Jiri Pirko.

    9) Reorganize wireless drivers into per-vendor directories just like we
    do for ethernet drivers. From Kalle Valo.

    10) Provide a way for administrators to "destroy" connected sockets via the
    SOCK_DESTROY socket netlink diag operation. From Lorenzo Colitti.

    11) Add support to add/remove multicast routes via netlink, from Nikolay
    Aleksandrov.

    12) Make TCP keepalive settings per-namespace, from Nikolay Borisov.

    13) Add forwarding and packet duplication facilities to nf_tables, from
    Pablo Neira Ayuso.

    14) Dead route support in MPLS, from Roopa Prabhu.

    15) TSO support for thunderx chips, from Sunil Goutham.

    16) Add driver for IBM's System i/p VNIC protocol, from Thomas Falcon.

    17) Rationalize, consolidate, and more completely document the checksum
    offloading facilities in the networking stack. From Tom Herbert.

    18) Support aborting an ongoing scan in mac80211/cfg80211, from
    Vidyullatha Kanchanapally.

    19) Use per-bucket spinlock for bpf hash facility, from Tom Leiming.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1375 commits)
    net: bnxt: always return values from _bnxt_get_max_rings
    net: bpf: reject invalid shifts
    phonet: properly unshare skbs in phonet_rcv()
    dwc_eth_qos: Fix dma address for multi-fragment skbs
    phy: remove an unneeded condition
    mdio: remove an unneed condition
    mdio_bus: NULL dereference on allocation error
    net: Fix typo in netdev_intersect_features
    net: freescale: mac-fec: Fix build error from phy_device API change
    net: freescale: ucc_geth: Fix build error from phy_device API change
    bonding: Prevent IPv6 link local address on enslaved devices
    IB/mlx5: Add flow steering support
    net/mlx5_core: Export flow steering API
    net/mlx5_core: Make ipv4/ipv6 location more clear
    net/mlx5_core: Enable flow steering support for the IB driver
    net/mlx5_core: Initialize namespaces only when supported by device
    net/mlx5_core: Set priority attributes
    net/mlx5_core: Connect flow tables
    net/mlx5_core: Introduce modify flow table command
    net/mlx5_core: Managing root flow table
    ...

    Linus Torvalds
     
  • Pull misc vfs updates from Al Viro:
    "All kinds of stuff. That probably should've been 5 or 6 separate
    branches, but by the time I'd realized how large and mixed that bag
    had become it had been too close to -final to play with rebasing.

    Some fs/namei.c cleanups there, memdup_user_nul() introduction and
    switching open-coded instances, burying long-dead code, whack-a-mole
    of various kinds, several new helpers for ->llseek(), assorted
    cleanups and fixes from various people, etc.

    One piece probably deserves special mention - Neil's
    lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets
    called without ->i_mutex and tries to avoid ever taking it. That, of
    course, means that it's not useful for any directory modifications,
    but things like getting inode attributes in nfsd readdirplus are fine
    with that. I really should've asked for moratorium on lookup-related
    changes this cycle, but since I hadn't done that early enough... I
    *am* asking for that for the coming cycle, though - I'm going to try
    and get conversion of i_mutex to rwsem with ->lookup() done under lock
    taken shared.

    There will be a patch closer to the end of the window, along the lines
    of the one Linus had posted last May - mechanical conversion of
    ->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/
    inode_is_locked()/inode_lock_nested(). To quote Linus back then:

    -----
    | This is an automated patch using
    |
    | sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/'
    | sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/'
    | sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/'
    | sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/'
    | sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/'
    |
    | with a very few manual fixups
    -----

    I'm going to send that once the ->i_mutex-affecting stuff in -next
    gets mostly merged (or when Linus says he's about to stop taking
    merges)"

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    nfsd: don't hold i_mutex over userspace upcalls
    fs:affs:Replace time_t with time64_t
    fs/9p: use fscache mutex rather than spinlock
    proc: add a reschedule point in proc_readfd_common()
    logfs: constify logfs_block_ops structures
    fcntl: allow to set O_DIRECT flag on pipe
    fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE
    fs: xattr: Use kvfree()
    [s390] page_to_phys() always returns a multiple of PAGE_SIZE
    nbd: use ->compat_ioctl()
    fs: use block_device name vsprintf helper
    lib/vsprintf: add %*pg format specifier
    fs: use gendisk->disk_name where possible
    poll: plug an unused argument to do_poll
    amdkfd: don't open-code memdup_user()
    cdrom: don't open-code memdup_user()
    rsxx: don't open-code memdup_user()
    mtip32xx: don't open-code memdup_user()
    [um] mconsole: don't open-code memdup_user_nul()
    [um] hostaudio: don't open-code memdup_user()
    ...

    Linus Torvalds
     
  • Pull KVM updates from Paolo Bonzini:
    "PPC changes will come next week.

    - s390: Support for runtime instrumentation within guests, support of
    248 VCPUs.

    - ARM: rewrite of the arm64 world switch in C, support for 16-bit VM
    identifiers. Performance counter virtualization missed the boat.

    - x86: Support for more Hyper-V features (synthetic interrupt
    controller), MMU cleanups"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (115 commits)
    kvm: x86: Fix vmwrite to SECONDARY_VM_EXEC_CONTROL
    kvm/x86: Hyper-V SynIC timers tracepoints
    kvm/x86: Hyper-V SynIC tracepoints
    kvm/x86: Update SynIC timers on guest entry only
    kvm/x86: Skip SynIC vector check for QEMU side
    kvm/x86: Hyper-V fix SynIC timer disabling condition
    kvm/x86: Reorg stimer_expiration() to better control timer restart
    kvm/x86: Hyper-V unify stimer_start() and stimer_restart()
    kvm/x86: Drop stimer_stop() function
    kvm/x86: Hyper-V timers fix incorrect logical operation
    KVM: move architecture-dependent requests to arch/
    KVM: renumber vcpu->request bits
    KVM: document which architecture uses each request bit
    KVM: Remove unused KVM_REQ_KICK to save a bit in vcpu->requests
    kvm: x86: Check kvm_write_guest return value in kvm_write_wall_clock
    KVM: s390: implement the RI support of guest
    kvm/s390: drop unpaired smp_mb
    kvm: x86: fix comment about {mmu,nested_mmu}.gva_to_gpa
    KVM: x86: MMU: Use clear_page() instead of init_shadow_page_table()
    arm/arm64: KVM: Detect vGIC presence at runtime
    ...

    Linus Torvalds
     
  • As per: lkml.kernel.org/r/20150921112252.3c2937e1@mschwide
    atomics imply a barrier on s390, so s390 should change
    smp_mb__before_atomic and smp_mb__after_atomic to barrier() instead of
    smp_mb() and hence should not use the generic versions.

    Suggested-by: Peter Zijlstra
    Suggested-by: Martin Schwidefsky
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Martin Schwidefsky
    Acked-by: Peter Zijlstra (Intel)

    Michael S. Tsirkin
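
    The difference between a compiler barrier and a full barrier around an
    atomic can be sketched like this. It is a user-space model that relies
    on GCC/Clang extensions (inline asm and the __atomic builtins), and it
    simply assumes, as the commit message states for s390, that the atomic
    operation itself already orders memory accesses. publish_and_count() is
    a hypothetical example function.

```c
/* Compiler-only barrier: forbids compiler reordering, emits no instruction. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Assumed model: the atomic itself orders memory accesses, so the
 * helpers around it only need to constrain the compiler. */
#define smp_mb__before_atomic() barrier()
#define smp_mb__after_atomic()  barrier()

static int counter;

/* Store a value, then atomically bump a counter others may watch. */
static int publish_and_count(int *data, int value)
{
	int seen;

	*data = value;
	smp_mb__before_atomic();
	/* GCC/Clang builtin; sequentially consistent read-modify-write */
	seen = __atomic_add_fetch(&counter, 1, __ATOMIC_SEQ_CST);
	smp_mb__after_atomic();

	return seen;
}
```

    On an architecture where atomics do not imply a barrier, the two helper
    macros would have to expand to a real memory barrier instead.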
     
  • The s390 kernel is SMP to 99.99%, we just didn't bother with a
    non-smp variant for the memory-barriers. If the generic header
    is used we'd get the non-smp version for free. It will save a
    small amount of text space for CONFIG_SMP=n.

    Suggested-by: Martin Schwidefsky
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Peter Zijlstra (Intel)

    Michael S. Tsirkin
     
  • This defines __smp_xxx barriers for s390,
    for use by virtualization.

    Some smp_xxx barriers are removed as they are
    defined correctly by asm-generic/barrier.h

    Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers
    unconditionally on this architecture.

    Signed-off-by: Michael S. Tsirkin
    Acked-by: Arnd Bergmann
    Acked-by: Martin Schwidefsky
    Acked-by: Peter Zijlstra (Intel)

    Michael S. Tsirkin
     
  • On s390 read_barrier_depends, smp_read_barrier_depends,
    smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the
    asm-generic variants exactly. Drop the local definitions and pull in
    asm-generic/barrier.h instead.

    This is in preparation to refactoring this code area.

    Signed-off-by: Michael S. Tsirkin
    Acked-by: Arnd Bergmann
    Acked-by: Peter Zijlstra (Intel)

    Michael S. Tsirkin