29 Apr, 2013

1 commit

  • Pull s390 update from Martin Schwidefsky:
    "This is the first batch of s390 patches for the 3.10 merge window.

    Included are some performance enhancements: storage key
    initialization, zero page cache synonyms, system call micro
    optimization and the speedup patches for dasdfmt. Sebastian managed
    to get rid of the special casing for the console device in the cio
    layer. And the usual bunch of bug fixes."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (59 commits)
    s390/pci: use pci_scan_root_bus
    s390/scm_blk: fix memleak in init function
    s390/scm_blk: allow more cluster size values
    s390/cio: fix irq statistics
    s390/memory hotplug: prevent offline of active memory increments
    s390: remove small stack config option
    s390: system call path micro optimization
    s390: lowcore stack pointer offsets
    s390/uapi: change struct statfs[64] member types to unsigned values
    s390/pci: return correct dma address for offset > PAGE_SIZE
    s390/ptrace: remove empty ifdefs
    s390/compat: remove ptrace compat definitions from uapi header file
    s390/compat: fix compile error for !COMPAT
    s390/compat: fix compat_sys_statfs() memory corruption
    s390/zcore: Fix HSA copy length for last block
    s390/mm,gmap: segment mapping race
    s390/mm,gmap: implement gmap_translate()
    s390/pci: remove disable_device implementation
    s390/pci: disable per default
    s390/pci: return error after failed pci ops
    ...

    Linus Torvalds
     

26 Apr, 2013

4 commits

  • The pci config space accessors on s390 are (now) smart enough to
    figure out if a pci function is available. So instead of calling
    pci_create_root_bus and then pci_scan_single_device for each
    available function just call pci_scan_root_bus and let the pci core
    do the scanning (via config reads on all possible functions) and
    device creation.

    Reviewed-by: Gerald Schaefer
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • We've seen repeatedly that 8KB stack size on 64 bit kernels
    is not sufficient.
    So simply remove the config option.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • Add a pointer to the system call table to the thread_info structure.
    The TIF_31BIT bit is set or cleared by SET_PERSONALITY exactly once
    for the lifetime of a process. With the pointer to the correct system
    call table in thread_info, the system call path in entry64.S can
    drop the check for TIF_31BIT, which saves a couple of instructions.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Store the stack pointers in the lowcore for the kernel stack, the async
    stack and the panic stack with the offset required for the first user.
    This avoids an unnecessary add instruction on the system call path.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
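
The system call table change in this group can be illustrated in plain C. This is only a user-space sketch under stated assumptions: the real code is s390 assembly in entry64.S, and every name below (demo_thread_info, DEMO_TIF_31BIT, the two one-entry tables) is hypothetical. It shows the idea of choosing the table once when the personality is set, so that the dispatch path no longer tests the flag.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical stand-ins, not the kernel's definitions. */
typedef long (*syscall_fn)(long arg);

static long sys_demo_64(long arg) { return arg + 64; }
static long sys_demo_31(long arg) { return arg + 31; }

#define NR_DEMO_SYSCALLS 1
static const syscall_fn table_64[NR_DEMO_SYSCALLS] = { sys_demo_64 };
static const syscall_fn table_31[NR_DEMO_SYSCALLS] = { sys_demo_31 };

#define DEMO_TIF_31BIT 0x1UL

struct demo_thread_info {
    unsigned long flags;
    const syscall_fn *sys_call_table; /* cached once per personality change */
};

/* Sets the flag and, in the same place, caches the matching table. */
static void demo_set_personality(struct demo_thread_info *ti, int is_31bit)
{
    if (is_31bit) {
        ti->flags |= DEMO_TIF_31BIT;
        ti->sys_call_table = table_31;
    } else {
        ti->flags &= ~DEMO_TIF_31BIT;
        ti->sys_call_table = table_64;
    }
}

/* Before: every system call tests the flag to pick a table. */
static long dispatch_with_flag_check(const struct demo_thread_info *ti,
                                     size_t nr, long arg)
{
    const syscall_fn *table;

    assert(nr < NR_DEMO_SYSCALLS);
    table = (ti->flags & DEMO_TIF_31BIT) ? table_31 : table_64;
    return table[nr](arg);
}

/* After: the table pointer is simply loaded from thread_info. */
static long dispatch_with_cached_table(const struct demo_thread_info *ti,
                                       size_t nr, long arg)
{
    assert(nr < NR_DEMO_SYSCALLS);
    return ti->sys_call_table[nr](arg);
}
```

Both dispatch functions return the same result for a given personality; the cached variant just drops the conditional from the hot path.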
     

23 Apr, 2013

8 commits

  • Kay Sievers reported that coreutils' stat tool has a problem with
    s390's statfs[64] definition:

    > The definition of struct statfs::f_type needs a fix. s390 is the only
    > architecture in the kernel that uses an int and expects magic
    > constants larger than INT_MAX to fit into.
    >
    > A fix is needed to make Fedora boot on s390, it currently fails to do
    > so. Userspace does not want to add code to paper-over this issue.

    [...]

    > Even coreutils cannot handle it:
    > #define RAMFS_MAGIC 0x858458f6
    > # stat -f -c%t /
    > ffffffff858458f6
    >
    > #define BTRFS_SUPER_MAGIC 0x9123683E
    > # stat -f -c%t /mnt
    > ffffffff9123683e

    The bug is caused by an implicit sign extension within the stat tool:

    out_uint_x (pformat, prefix_len, statfsbuf->f_type);

    where the format finally will be "%lx".
    A similar problem can be found in the 'tail' tool.
    s390 is the only architecture which has an int type f_type member in
    struct statfs[64]. Other architectures have either unsigned ints or
    long values, so that the problem doesn't occur there.

    Therefore change the type of the f_type member to unsigned int, so
    that we get zero extension instead of sign extension when assignment to
    a long value happens.

    This patch changes the s390 uapi struct statfs[64] definition in the kernel
    to contain only unsigned values.
    This was true for 32 bit builds anyway, since we use the generic uapi
    header file in that case. So let's not conditionally include the generic
    uapi header file but have the s390 implementation completely independent.

    Also fix the types of struct compat_statfs to match reality and move the
    definition of struct compat_statfs64 to asm/compat.h since it is not part
    of the api.

    Reported-by: Kay Sievers
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • For offset > PAGE_SIZE, s390_dma_map_pages() will issue a warning
    and return a wrong dma address.

    This patch removes the warning and fixes the dma return address
    calculation.

    Signed-off-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Gerald Schaefer
     
  • Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • The compat definitions are not part of the uapi. So move them to
    s390's private compat header file.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • Fix this one for !COMPAT:

    compat.h: In function ‘arch_compat_alloc_user_space’:
    compat.h:292:2: error: implicit declaration of function ‘is_compat_task’

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • The f_spare field within struct compat_statfs is four bytes larger
    than within the native 31 bit struct statfs.
    compat_sys_statfs() clears the f_spare field in user space which
    means that in compat mode four bytes that are behind the user space
    supplied struct compat_statfs will be corrupted (zeroed).

    According to Thomas Gleixner's Linux 2.6 history tree this bug has been
    present since v2.5.74 87880da124 "[PATCH] s390: 31 bit compat.".
    So it gets fixed shortly before its 10th anniversary. Tough luck.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • The gmap_map_segment function creates a special invalid segment table
    entry with the address of the requested target location in the process
    address space. The first access will create the connection between the
    gmap segment table and the target page table of the main process.
    If two threads do this concurrently both will walk the page tables and
    allocate a gmap_rmap structure for the same segment table entry.
    To avoid the race, recheck the segment table entry after taking the page
    table lock.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Implement gmap_translate() function which translates a guest absolute address
    to a user space process address without establishing the guest page table
    entries.

    This is useful for kvm guest address translations where no memory access
    is expected to happen soon (e.g. tprot exception handler).

    Signed-off-by: Heiko Carstens
    Reviewed-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
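
The sign extension behind the statfs[64] f_type report in this group can be reproduced in a few lines of C. This is a sketch rather than the stat tool's code: the helper names are made up for illustration, and the magic values are the ones quoted in the report. It assumes the usual two's complement wrap when an out-of-range value is converted to a signed 32-bit type, which is implementation-defined in older C standards.

```c
#include <stdint.h>

/* Magic numbers as quoted in the bug report. */
#define RAMFS_MAGIC       0x858458f6U
#define BTRFS_SUPER_MAGIC 0x9123683EU

/* Old s390 layout: f_type is a signed 32-bit int, so widening it to a
 * 64-bit value (as happens for a "%lx" format) sign-extends. */
static uint64_t widen_signed_f_type(uint32_t magic)
{
    int32_t f_type = (int32_t)magic; /* wraps to a negative value */

    return (uint64_t)(int64_t)f_type;
}

/* Fixed layout: f_type is unsigned, so widening zero-extends. */
static uint64_t widen_unsigned_f_type(uint32_t magic)
{
    uint32_t f_type = magic;

    return (uint64_t)f_type;
}
```

With the signed member, BTRFS_SUPER_MAGIC widens to 0xffffffff9123683e, matching the stat output quoted in the report; with the unsigned member it stays 0x9123683e.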
     

17 Apr, 2013

27 commits

  • Commit b4cbb197c7e7 ("vm: add vm_iomap_memory() helper function") added
    a helper function wrapper around io_remap_pfn_range(), and every other
    architecture defined it in .

    The s390 choice of may make sense, but is not very convenient
    for this case, and gratuitous differences like that cause unexpected errors like this:

    mm/memory.c: In function 'vm_iomap_memory':
    mm/memory.c:2439:2: error: implicit declaration of function 'io_remap_pfn_range' [-Werror=implicit-function-declaration]

    Glory be the kbuild test robot who noticed this, bisected it, and
    reported it to the guilty parties (ie me).

    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • pci_disable_device is called by a driver after it stops using the pci
    function - e.g. during the removal of the driver. The current
    implementation removes the architecture specific information of this
    function such that even after a call to pci_enable_device the pci
    function is no longer usable. Just remove pcibios_disable_device.

    Reviewed-by: Gerald Schaefer
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Disable pci on s390. Enable with pci=on.

    Suggested-by: Heiko Carstens
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Access to pci config space via pci_ops should not fail silently.

    Reviewed-by: Gerald Schaefer
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • If a pci load instruction fails the content of the register where the
    data is stored is possibly unchanged. Fix the inline assembly wrapper
    __pcilg to not return stale data. Additionally fix the callers of this
    function who access uninitialized variables.

    Reviewed-by: Gerald Schaefer
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Don't let pci_load and friends crash the kernel when called with
    e.g. an invalid offset. Return -ENXIO instead.

    Reviewed-by: Gerald Schaefer
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Use distinct (and hopefully sane) names for the pci instruction
    wrappers.

    Reviewed-by: Gerald Schaefer
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Uninline pci related instruction wrappers to de-bloat the code:
    add/remove: 15/0 grow/shrink: 2/24 up/down: 1326/-12628 (-11302)

    This is especially useful for the inlined pci read and write functions
    which are used all over the kernel. Also remove the unused __stpcifc
    while at it.

    Reviewed-by: Gerald Schaefer
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Use pcibios_add_device to do arch specific device initialization.
    This function will be called during pci_bus_add_device.

    Reviewed-by: Gerald Schaefer
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
    Don't modify function handles to get a disabled handle - call
    clp_disable_fh. With this change we also no longer deconfigure
    enabled functions.

    Reviewed-by: Gerald Schaefer
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Use the debugfs to keep track of a pci function's status changes.

    Reviewed-by: Gerald Schaefer
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • The hash used for mapping irq numbers to msi descriptors does not
    utilize all buckets that were allocated. Fix this by using the same
    value (computed by the number of bits used for the hash function) at
    relevant places.

    Reviewed-by: Gerald Schaefer
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • This patch adds the last breaking event address as parameter
    for 31 bit compat program signal handlers as it is already
    done for 64 bit programs.

    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky

    Michael Holzheu
     
  • force_console is used to wake up the CCW based console device to
    print a panic message in case something goes wrong in a suspend
    or resume cycle. Stop using the static console_subchannel and add
    a parameter to this function to specify which ccw device we have
    to wake up.

    Reviewed-by: Peter Oberparleiter
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • wait_cons_dev is used to busy wait for an interrupt on the console
    ccw device. Stop using the static console_subchannel and add a
    parameter to this function to specify on which ccw device/subchannel
    we have to do the polling.

    While at it rename the function to ccw_device_wait_idle and
    move it to device.c

    Reviewed-by: Peter Oberparleiter
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Since commit 5f954c34 ([S390] hibernation: fix lowcore handling)
    the absolute zero lowcore is lost during suspend/resume.
    For example, this leads to the problem that the re-IPL device
    for kdump is no longer set after resume.

    With this patch during suspend a buffer is allocated in the new PM
    notifier "suspend_pm_cb" and then the absolute zero lowcore is saved
    to that buffer. The resume code then copies back this buffer to
    absolute zero and afterwards the PM notifier releases the memory.

    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky

    Michael Holzheu
     
  • Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • Remove unused __BITOPS_ALIGN, and replace __BITOPS_WORDSIZE with
    BITS_PER_LONG.

    Signed-off-by: Akinobu Mita
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Akinobu Mita
     
  • Use sske with multiple block control to initialize storage keys within
    a 1 MB frame at once.
    It turned out that the sske with mb=1 is an order of magnitude faster
    than pfmf. This is only an issue for very large systems (several 100GB)
    where storage key initialization could last more than a minute.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • dumpstack() did not always print a sane callchain when being called.
    The reason is that show_trace() accessed register 15 directly to get
    the current stack pointer and passed that pointer to __show_trace()
    which expects a valid stack frame pointer as argument.
    However due to tail call optimization the stack frame may not exist
    anymore when __show_trace() gets called and therefore an invalid
    stack frame pointer gets passed.
    To prevent that, disable tail call optimization for call chain walking
    functions.
    So move all the show_* functions to a dumpstack.c file, as other
    architectures already do, and add a -fno-optimize-sibling-calls
    compile flag to both dumpstack.c and stacktrace.c to prevent tail
    call optimization.

    Fixes callchains that looked e.g. like this:

    [ 12.868258] Call Trace:
    [ 12.868262] ([] 0x8000)

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • Used PTR_RET function instead of IS_ERR and PTR_ERR.
    Patch found using coccinelle.

    Signed-off-by: Alexandru Gheorghiu
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Alexandru Gheorghiu
     
  • Rewrote conditional statement and eliminated the out_kthread label.

    Signed-off-by: Alexandru Gheorghiu
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Alexandru Gheorghiu
     
  • Signed-off-by: Stelian Nirlu
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Stelian Nirlu
     
  • Pass buffer length in extra parameter.

    Signed-off-by: Stefan Raspl
    Reviewed-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Stefan Raspl
     
  • Using kmem_cache_zalloc() instead of kmem_cache_alloc() and memset().

    Signed-off-by: Wei Yongjun
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Wei Yongjun
     
    To avoid cache synonyms on System zEC12, 32 independent zero pages are
    required, one for each combination of bits 2**12 to 2**16 of the virtual
    address. To avoid wasting too much memory on small virtual systems the
    number of zero pages is limited to 4 if the memory size is less than or
    equal to 64MB.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
    Protection exceptions are usually suppressing, and the fault handler
    needs to rewind the PSW by the instruction length to get the correct
    fault address. The exception is a protection exception that occurs
    while the CPU is in the middle of a transaction: the CPU stores the
    transaction abort PSW at the start of the transaction, so if the
    transaction is aborted the PSW is already correct and must not be
    modified by the fault handler.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
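
The zero page selection described in the zEC12 cache synonym commit in this group comes down to a small piece of address arithmetic. The sketch below is an illustration, not the kernel's ZERO_PAGE implementation: the function names and DEMO_PAGE_SHIFT are hypothetical, and it assumes 4KB pages and a power-of-two number of zero pages. Bits 2**12 to 2**16 of the virtual address are five bits, which gives the 32 combinations the commit mentions.

```c
/* Hypothetical name; 4KB pages are assumed. */
#define DEMO_PAGE_SHIFT 12

/* Number of zero pages to allocate: 4 for memory sizes up to and
 * including 64MB, otherwise 32 (one per combination of five bits). */
static unsigned int demo_nr_zero_pages(unsigned long long mem_bytes)
{
    return mem_bytes <= (64ULL << 20) ? 4U : 32U;
}

/* Picks the zero page whose index matches the low page-number bits of
 * the virtual address. nr_pages must be a power of two. */
static unsigned int demo_zero_page_index(unsigned long long vaddr,
                                         unsigned int nr_pages)
{
    return (unsigned int)((vaddr >> DEMO_PAGE_SHIFT) & (nr_pages - 1U));
}
```

With 32 pages, two virtual addresses map to the same zero page only if they agree in bits 2**12 to 2**16; with 4 pages only bits 2**12 and 2**13 are distinguished, which trades synonym avoidance for memory on small systems.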