07 Jan, 2017

1 commit

  • Pull KVM fixes from Radim Krčmář:
    "MIPS:
    - fix host kernel crashes when receiving a signal with 64-bit
    userspace

    - flush instruction cache on all vcpus after generating entry code

    (both for stable)

    x86:
    - fix NULL dereference in MMU caused by SMM transitions (for stable)

    - correct guest instruction pointer after emulating some VMX errors

    - minor cleanup"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    KVM: VMX: remove duplicated declaration
    KVM: MIPS: Flush KVM entry code from icache globally
    KVM: MIPS: Don't clobber CP0_Status.UX
    KVM: x86: reset MMU on KVM_SET_VCPU_EVENTS
    KVM: nVMX: fix instruction skipping during emulated vm-entry

    Linus Torvalds
     

05 Jan, 2017

2 commits

  • Flush the KVM entry code from the icache on all CPUs, not just the one
    that built the entry code.

    Signed-off-by: James Hogan
    Cc: Paolo Bonzini
    Cc: "Radim Krčmář"
    Cc: Ralf Baechle
    Cc: linux-mips@linux-mips.org
    Cc: kvm@vger.kernel.org
    Cc: stable@vger.kernel.org # 3.16.x-
    Signed-off-by: Radim Krčmář

    James Hogan
     
  • On 64-bit kernels, MIPS KVM will clear CP0_Status.UX to prevent the
    guest (running in user mode) from accessing the 64-bit memory segments.
    However the previous value of CP0_Status.UX is never restored when
    exiting from the guest.

    If the user process uses 64-bit addressing (the n64 ABI) this can result
    in address error exceptions from the kernel if it needs to deliver a
    signal before returning to user mode, as the kernel will need to write a
    sigframe to high user addresses on the user stack which are disallowed
    by CP0_Status.UX=0.

    This is fixed by explicitly setting SX and UX again when exiting from
    the guest, and explicitly clearing those bits when returning to the
    guest. Having the SX and UX bits set when handling guest exits (rather
    than only when exiting to userland) will be helpful when we support VZ,
    since we shouldn't need to directly read or write guest memory, so it
    will be valid for cache management IPIs to access host user addresses.
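
    The bit handling described above can be sketched in plain C as
    follows. This is only an illustration, not the actual KVM entry/exit
    code: the helper names are hypothetical, and the sketch assumes the
    usual MIPS64 CP0_Status layout with UX at bit 5 and SX at bit 6.

```c
#include <stdint.h>

/* 64-bit segment enable bits in CP0_Status (assumed MIPS64 layout). */
#define ST0_UX 0x00000020u /* user-mode access to 64-bit segments */
#define ST0_SX 0x00000040u /* supervisor-mode access to 64-bit segments */

/*
 * Hypothetical helper: Status value to use when returning to the guest.
 * UX and SX are cleared so the guest cannot reach the 64-bit segments.
 */
static uint32_t status_for_guest_entry(uint32_t status)
{
    return status & ~(ST0_UX | ST0_SX);
}

/*
 * Hypothetical helper: Status value to use when handling a guest exit.
 * UX and SX are set again so host accesses to high user addresses
 * (for example when writing a sigframe) do not fault.
 */
static uint32_t status_for_guest_exit(uint32_t status)
{
    return status | ST0_UX | ST0_SX;
}
```

    All other Status bits pass through unchanged in both directions.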

    Signed-off-by: James Hogan
    Cc: Paolo Bonzini
    Cc: "Radim Krčmář"
    Cc: Ralf Baechle
    Cc: linux-mips@linux-mips.org
    Cc: kvm@vger.kernel.org
    Cc: stable@vger.kernel.org # 4.8.x-
    Signed-off-by: Radim Krčmář

    James Hogan
     

26 Dec, 2016

2 commits

  • Pull timer type cleanups from Thomas Gleixner:
    "This series does a tree wide cleanup of types related to
    timers/timekeeping.

    - Get rid of cycles_t and use a plain u64. The type is not really
    helpful and caused more confusion than clarity

    - Get rid of the ktime union. The union has become useless as we use
    the scalar nanoseconds storage unconditionally now. The 32bit
    timespec alike storage got removed due to the Y2038 limitations
    some time ago.

    That leaves the odd union access around for no reason. Clean it up.

    Both changes have been done with coccinelle and a small amount of
    manual mopping up"

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    ktime: Get rid of ktime_equal()
    ktime: Cleanup ktime_set() usage
    ktime: Get rid of the union
    clocksource: Use a plain u64 instead of cycle_t

    Linus Torvalds
     
  • Pull SMP hotplug notifier removal from Thomas Gleixner:
    "This is the final cleanup of the hotplug notifier infrastructure. The
    series has been reintegrated in the last two days because there came a
    new driver using the old infrastructure via the SCSI tree.

    Summary:

    - convert the last leftover drivers utilizing notifiers

    - fixup for a completely broken hotplug user

    - prevent setup of already used states

    - removal of the notifiers

    - treewide cleanup of hotplug state names

    - consolidation of state space

    There is a sphinx based documentation pending, but that needs review
    from the documentation folks"

    * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    irqchip/armada-xp: Consolidate hotplug state space
    irqchip/gic: Consolidate hotplug state space
    coresight/etm3/4x: Consolidate hotplug state space
    cpu/hotplug: Cleanup state names
    cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
    staging/lustre/libcfs: Convert to hotplug state machine
    scsi/bnx2i: Convert to hotplug state machine
    scsi/bnx2fc: Convert to hotplug state machine
    cpu/hotplug: Prevent overwriting of callbacks
    x86/msr: Remove bogus cleanup from the error path
    bus: arm-ccn: Prevent hotplug callback leak
    perf/x86/intel/cstate: Prevent hotplug callback leak
    ARM/imx/mmcd: Fix broken cpu hotplug handling
    scsi: qedi: Convert to hotplug state machine

    Linus Torvalds
     

25 Dec, 2016

3 commits

  • There is no point in having an extra type for extra confusion. u64 is
    unambiguous.

    Conversion was done with the following coccinelle script:

    @rem@
    @@
    -typedef u64 cycle_t;

    @fix@
    typedef cycle_t;
    @@
    -cycle_t
    +u64

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: John Stultz

    Thomas Gleixner
     
  • When the state names got added a script was used to add the extra argument
    to the calls. The script basically converted the state constant to a
    string, but the cleanup to convert these strings into meaningful ones did
    not happen.

    Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
    are used in all the other places already.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • This was entirely automated, using the script by Al:

    PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
    sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
    $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

    to do the replacement at the end of the merge window.

    Requested-by: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

19 Dec, 2016

1 commit

  • Pull RTC updates from Alexandre Belloni:
    "Subsystem:
    - non-modular drivers are now explicitly non-modular

    New driver:
    - Epson Toyocom rtc-7301sf/dg

    Drivers:
    - cmos: reject unsupported alarm values wrt the RTC capabilities
    - ds1307: ACPI support
    - jz4740: DT support, jz4780 handling, can now be used as a system
    power controller
    - mcp795: many fixes, in particular proper month handling
    - twl: driver is now DT only"

    * tag 'rtc-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (31 commits)
    rtc: mcp795: Fix whitespace and indentation.
    rtc: mcp795: Prefer using the BIT() macro.
    rtc: mcp795: fix month write resetting date to 1.
    rtc: mcp795: fix time range difference between linux and RTC chip.
    rtc: mcp795: fix bitmask value for leap year (LP).
    rtc: mcp795: use bcd2bin/bin2bcd.
    rtc: add support for EPSON TOYOCOM RTC-7301SF/DG
    rtc: ds1307: Add ACPI support
    rtc: imxdi: (trivial) fix a typo
    rtc: ds1374: Merge conditional + WARN_ON()
    rtc: twl: make driver DT only
    rtc: twl: kill static variables
    rtc: fix typos in Kconfig
    rtc: jz4740: make the driver builtin only
    rtc: jz4740: remove unused EXPORT_SYMBOL
    Documentation: bindings: fix twl-rtc documentation
    rtc: Enable compile testing for Maxim and Samsung drivers
    MIPS: jz4740: Remove obsolete code
    MIPS: qi_lb60: Probe RTC driver from DT and use it as power controller
    MIPS: jz4740: DTS: Probe the jz4740-rtc driver from devicetree
    ...

    Linus Torvalds
     

15 Dec, 2016

4 commits

  • Merge more updates from Andrew Morton:

    - a few misc things

    - kexec updates

    - DMA-mapping updates to better support networking DMA operations

    - IPC updates

    - various MM changes to improve DAX fault handling

    - lots of radix-tree changes, mainly to the test suite. All leading up
    to reimplementing the IDA/IDR code to be a wrapper layer over the
    radix-tree. However the final trigger-pulling patch is held off for
    4.11.

    * emailed patches from Andrew Morton : (114 commits)
    radix tree test suite: delete unused rcupdate.c
    radix tree test suite: add new tag check
    radix-tree: ensure counts are initialised
    radix tree test suite: cache recently freed objects
    radix tree test suite: add some more functionality
    idr: reduce the number of bits per level from 8 to 6
    rxrpc: abstract away knowledge of IDR internals
    tpm: use idr_find(), not idr_find_slowpath()
    idr: add ida_is_empty
    radix tree test suite: check multiorder iteration
    radix-tree: fix replacement for multiorder entries
    radix-tree: add radix_tree_split_preload()
    radix-tree: add radix_tree_split
    radix-tree: add radix_tree_join
    radix-tree: delete radix_tree_range_tag_if_tagged()
    radix-tree: delete radix_tree_locate_item()
    radix-tree: improve multiorder iterators
    btrfs: fix race in btrfs_free_dummy_fs_info()
    radix-tree: improve dump output
    radix-tree: make radix_tree_find_next_bit more useful
    ...

    Linus Torvalds
     
  • This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
    avoid invoking cache line invalidation if the driver will just handle it
    via a sync_for_cpu or sync_for_device call.

    Link: http://lkml.kernel.org/r/20161110113513.76501.32321.stgit@ahduyck-blue-test.jf.intel.com
    Signed-off-by: Alexander Duyck
    Cc: Ralf Baechle
    Cc: Keguang Zhang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Duyck
     
  • Pull namespace updates from Eric Biederman:
    "After a lot of discussion and work we have finally reached a basic
    understanding of what is necessary to make unprivileged mounts safe in
    the presence of EVM and IMA xattrs which the last commit in this
    series reflects. While technically it is a revert the comments it adds
    are important for people not getting confused in the future. Clearing
    up that confusion allows us to seriously work on unprivileged mounts
    of fuse in the next development cycle.

    The rest of the fixes in this set are in the intersection of user
    namespaces, ptrace, and exec. I started with the first fix which
    started a feedback cycle of finding additional issues during review
    and fixing them. Culminating in a fix for a bug that has been present
    since at least Linux v1.0.

    Potentially these fixes were candidates for being merged during the rc
    cycle, and are certainly backport candidates but enough little things
    turned up during review and testing that I decided they should be
    handled as part of the normal development process just to be certain
    there were not any great surprises when it came time to backport some
    of these fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    Revert "evm: Translate user/group ids relative to s_user_ns when computing HMAC"
    exec: Ensure mm->user_ns contains the execed files
    ptrace: Don't allow accessing an undumpable mm
    ptrace: Capture the ptracer's creds not PT_PTRACE_CAP
    mm: Add a user_ns owner to mm_struct and fix ptrace permission checks

    Linus Torvalds
     
  • Pull trivial updates from Jiri Kosina.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
    NTB: correct ntb_spad_count comment typo
    misc: ibmasm: fix typo in error message
    Remove references to dead make variable LINUX_INCLUDE
    Remove last traces of ikconfig.h
    treewide: Fix printk() message errors
    Documentation/device-mapper: s/getsize/getsz/

    Linus Torvalds
     

14 Dec, 2016

1 commit


13 Dec, 2016

1 commit

  • Pull locking updates from Ingo Molnar:
    "The tree got pretty big in this development cycle, but the net effect
    is pretty good:

    115 files changed, 673 insertions(+), 1522 deletions(-)

    The main changes were:

    - Rework and generalize the mutex code to remove per arch mutex
    primitives. (Peter Zijlstra)

    - Add vCPU preemption support: add an interface to query the
    preemption status of vCPUs and use it in locking primitives - this
    optimizes paravirt performance. (Pan Xinhui, Juergen Gross,
    Christian Borntraeger)

    - Introduce cpu_relax_yield() and remove cpu_relax_lowlatency() to
    clean up and improve the s390 lock yielding machinery and its core
    kernel impact. (Christian Borntraeger)

    - Micro-optimize mutexes some more. (Waiman Long)

    - Reluctantly add the to-be-deprecated mutex_trylock_recursive()
    interface on a temporary basis, to give the DRM code more time to
    get rid of its locking hacks. Any other users will be NAK-ed on
    sight. (We turned off the deprecation warning for the time being to
    not pollute the build log.) (Peter Zijlstra)

    - Improve the rtmutex code a bit, in light of recent long lived
    bugs/races. (Thomas Gleixner)

    - Misc fixes, cleanups"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
    x86/paravirt: Fix bool return type for PVOP_CALL()
    x86/paravirt: Fix native_patch()
    locking/ww_mutex: Use relaxed atomics
    locking/rtmutex: Explain locking rules for rt_mutex_proxy_unlock()/init_proxy_locked()
    locking/rtmutex: Get rid of RT_MUTEX_OWNER_MASKALL
    x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()
    locking/mutex: Break out of expensive busy-loop on {mutex,rwsem}_spin_on_owner() when owner vCPU is preempted
    locking/osq: Break out of spin-wait busy waiting loop for a preempted vCPU in osq_lock()
    Documentation/virtual/kvm: Support the vCPU preemption check
    x86/xen: Support the vCPU preemption check
    x86/kvm: Support the vCPU preemption check
    x86/kvm: Support the vCPU preemption check
    kvm: Introduce kvm_write_guest_offset_cached()
    locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests
    locking/spinlocks, s390: Implement vcpu_is_preempted(cpu)
    locking/core, powerpc: Implement vcpu_is_preempted(cpu)
    sched/core: Introduce the vcpu_is_preempted(cpu) interface
    sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q
    locking/core: Provide common cpu_relax_yield() definition
    locking/mutex: Don't mark mutex_trylock_recursive() as deprecated, temporarily
    ...

    Linus Torvalds
     

12 Dec, 2016

1 commit

  • Pull networking updates from David Miller:

    1) Platform regulatory domain support for ath10k, from Bartosz
    Markowski.

    2) Centralize min/max MTU checking, thus removing tons of duplicated
    code in all of the various drivers. From Jarod Wilson.

    3) Support ingress actions in act_mirred, from Shmulik Ladkani.

    4) Improve device adjacency tracking, from David Ahern.

    5) Add support for LED triggers on PHY link state changes, from Zach
    Brown.

    6) Improve UDP socket memory accounting, from Paolo Abeni.

    7) Set SK_MEM_QUANTUM to a fixed size of 4096, instead of PAGE_SIZE.
    From Eric Dumazet.

    8) Collapse TCP SKBs at retransmit time even if the right side SKB has
    frags. Also from Eric Dumazet.

    9) Add IP_RECVFRAGSIZE and IPV6_RECVFRAGSIZE cmsgs, from Willem de
    Bruijn.

    10) Support routing by UID, from Lorenzo Colitti.

    11) Handle L3 domain binding (ie. VRF) for RAW sockets, from David
    Ahern.

    12) tcp_get_info() can run lockless, from Eric Dumazet.

    13) 4-tuple UDP hashing in SFC driver, from Edward Cree.

    14) Avoid reorders in GRO code, from Eric Dumazet.

    15) IPV6 Segment Routing support, from David Lebrun.

    16) Support MPLS push and pop for L3 packets in openvswitch, from Jiri
    Benc.

    17) Add LRU datastructure support for BPF, Martin KaFai Lau.

    18) VF support in liquidio driver, from Raghu Vatsavayi.

    19) Multiqueue support in alx driver, from Tobias Regnery.

    20) Networking cgroup BPF support, from Daniel Mack.

    21) TCP chronograph measurements, from Francis Yan.

    22) XDP support for qed driver, from Yuval Mintz.

    23) BPF based lwtunnels, from Thomas Graf.

    24) Consistent FIB dumping to offloading drivers, from Ido Schimmel.

    25) Many optimizations for UDP under high load, from Eric Dumazet.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
    netfilter: nft_counter: rework atomic dump and reset
    e1000: use disable_hardirq() for e1000_netpoll()
    i40e: don't truncate match_method assignment
    net: ethernet: ti: netcp: add support of cpts
    net: phy: phy drivers should not set SUPPORTED_[Asym_]Pause
    net: l2tp: ppp: change PPPOL2TP_MSG_* => L2TP_MSG_*
    net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*
    net: l2tp: export debug flags to UAPI
    net: ethernet: stmmac: remove private tx queue lock
    net: ethernet: sxgbe: remove private tx queue lock
    net: bridge: shorten ageing time on topology change
    net: bridge: add helper to set topology change
    net: bridge: add helper to offload ageing time
    net: nicvf: use new api ethtool_{get|set}_link_ksettings
    net: ethernet: ti: cpsw: sync rates for channels in dual emac mode
    net: ethernet: ti: cpsw: re-split res only when speed is changed
    net: ethernet: ti: cpsw: combine budget and weight split and check
    net: ethernet: ti: cpsw: don't start queue twice
    net: ethernet: ti: cpsw: use same macros to get active slave
    net: mvneta: select GENERIC_ALLOCATOR
    ...

    Linus Torvalds
     

11 Dec, 2016

3 commits

  • Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The hardware documentation says bit 11:10 are used for the GPE
    frequency selection. Fix the mask in the define to match these bits.
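
    For illustration, a two-bit field occupying bits 11:10 can be
    expressed as below. The macro and function names are hypothetical and
    are not the defines touched by this patch; only the bit positions
    come from the description above.

```c
#include <stdint.h>

/* Hypothetical names for a 2-bit field at bits 11:10 of a register. */
#define GPE_FREQ_SHIFT 10
#define GPE_FREQ_MASK  (0x3u << GPE_FREQ_SHIFT) /* evaluates to 0xC00 */

/* Extract the frequency selection (0..3) from a register value. */
static uint32_t gpe_freq_get(uint32_t reg)
{
    return (reg & GPE_FREQ_MASK) >> GPE_FREQ_SHIFT;
}

/* Return reg with the frequency selection replaced by sel (0..3). */
static uint32_t gpe_freq_set(uint32_t reg, uint32_t sel)
{
    return (reg & ~GPE_FREQ_MASK) | ((sel << GPE_FREQ_SHIFT) & GPE_FREQ_MASK);
}
```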

    Signed-off-by: Hauke Mehrtens
    Reported-by: Dan Carpenter
    Reviewed-by: Thomas Langer
    Cc: linux-mips@linux-mips.org
    Cc: john@phrozen.org
    Patchwork: https://patchwork.linux-mips.org/patch/14648/
    Signed-off-by: Ralf Baechle

    Hauke Mehrtens
     
  • The sync_cmos_clock function in kernel/time/ntp.c first tries to update
    the internal clock of the cpu by calling the "update_persistent_clock64"
    architecture specific function. If this returns -ENODEV, it then tries
    to update an external RTC using "rtc_set_ntp_time".

    On the mips architecture, the weak implementation of the underlying
    function would return 0 if it wasn't overridden. This meant that the
    sync_cmos_clock function would never try to update an external RTC
    (if both CONFIG_GENERIC_CMOS_UPDATE and CONFIG_RTC_SYSTOHC are
    configured).

    Returning -ENODEV instead means that an external RTC will be tried.
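
    The fallback logic described above can be modelled in userspace as
    follows. This is a simplified sketch, not the code in
    kernel/time/ntp.c: the function-pointer interface and all names
    ending in _sketch or starting with fake_ are assumptions made for
    the example.

```c
#include <errno.h>

typedef int (*clock_update_fn)(void);

/* Models the old MIPS weak default: claims success without doing anything. */
static int old_default_update_persistent_clock(void)
{
    return 0;
}

/* Models the fixed weak default: reports that no persistent clock exists. */
static int new_default_update_persistent_clock(void)
{
    return -ENODEV;
}

/* Stand-in for the external RTC path; counts how often it is reached. */
static int fake_rtc_calls;
static int fake_rtc_set_ntp_time(void)
{
    fake_rtc_calls++;
    return 0;
}

/* Try the architecture hook first, fall back to the RTC only on -ENODEV. */
static int sync_clock_sketch(clock_update_fn arch_update,
                             clock_update_fn rtc_update)
{
    int ret = arch_update();

    if (ret == -ENODEV)
        ret = rtc_update();
    return ret;
}
```

    With the old default the RTC callback is never reached; with the new
    default it is called once per sync.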

    Signed-off-by: Luuk Paulussen
    Reviewed-by: Richard Laing
    Reviewed-by: Scott Parlane
    Reviewed-by: Chris Packham
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14649/
    Signed-off-by: Ralf Baechle

    Luuk Paulussen
     

04 Dec, 2016

1 commit

  • Couple conflicts resolved here:

    1) In the MACB driver, a bug fix to properly initialize the
    RX tail pointer overlapped with some changes
    to support variable sized rings.

    2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix
    overlapping with a reorganization of the driver to support
    ACPI, OF, as well as PCI variants of the chip.

    3) In 'net' we had several probe error path bug fixes to the
    stmmac driver, meanwhile a lot of this code was cleaned up
    and reorganized in 'net-next'.

    4) The cls_flower classifier obtained a helper function in
    'net-next' called __fl_delete() and this overlapped with
    Daniel Borkmann's bug fix to use RCU for object destruction
    in 'net'. It also overlapped with Jiri's change to guard
    the rhashtable_remove_fast() call with a check against
    tc_skip_sw().

    5) In mlx4, a revert bug fix in 'net' overlapped with some
    unrelated changes in 'net-next'.

    6) In geneve, a stale header pointer after pskb_expand_head()
    bug fix in 'net' overlapped with a large reorganization of
    the same code in 'net-next'. Since the 'net-next' code no
    longer had the bug in question, there was nothing to do
    other than to simply take the 'net-next' hunks.

    Signed-off-by: David S. Miller

    David S. Miller
     

30 Nov, 2016

1 commit

  • This patch exports the sender chronograph stats via the socket
    SO_TIMESTAMPING channel. Currently we can instrument how long a
    particular application unit of data was queued in TCP by tracking
    SOF_TIMESTAMPING_TX_SOFTWARE and SOF_TIMESTAMPING_TX_SCHED. Having
    these sender chronograph stats exported simultaneously along with
    these timestamps allows further breaking down the various sender
    limitations. For example, a video server can tell if a particular
    chunk of video on a connection takes a long time to deliver because
    TCP was experiencing a small receive window. Before this patch it was
    not possible to tell this without packet traces.

    To prepare these stats, the user needs to set
    SOF_TIMESTAMPING_OPT_STATS and SOF_TIMESTAMPING_OPT_TSONLY flags
    while requesting other SOF_TIMESTAMPING TX timestamps. When the
    timestamps are available in the error queue, the stats are returned
    in a separate control message of type SCM_TIMESTAMPING_OPT_STATS,
    in a list of TLVs (struct nlattr) of types: TCP_NLA_BUSY_TIME,
    TCP_NLA_RWND_LIMITED, TCP_NLA_SNDBUF_LIMITED. Unit is microsecond.
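
    As a rough illustration of walking such a TLV list, the sketch below
    parses a buffer whose entries follow the struct nlattr layout: a
    16-bit length that includes the header, a 16-bit type, then the
    payload padded to a 4-byte boundary. The struct, enum values and
    function names are hypothetical stand-ins defined locally, and the
    64-bit payload size is an assumption; real code should use the
    kernel UAPI headers instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in mirroring the struct nlattr header layout. */
struct tlv_hdr {
    uint16_t len;  /* header plus payload, excluding padding */
    uint16_t type;
};

#define TLV_ALIGN(n) (((size_t)(n) + 3u) & ~(size_t)3u)

/* Hypothetical type values for the three stats named above. */
enum { STAT_BUSY_TIME = 1, STAT_RWND_LIMITED = 2, STAT_SNDBUF_LIMITED = 3 };

/* Append one 64-bit attribute at offset off; returns the new offset. */
static size_t tlv_put_u64(unsigned char *buf, size_t off,
                          uint16_t type, uint64_t value)
{
    struct tlv_hdr h = { (uint16_t)(sizeof(h) + sizeof(value)), type };

    memcpy(buf + off, &h, sizeof(h));
    memcpy(buf + off + sizeof(h), &value, sizeof(value));
    return off + TLV_ALIGN(h.len);
}

/* Find a 64-bit attribute by type; returns 0 on success, -1 otherwise. */
static int tlv_find_u64(const unsigned char *buf, size_t buflen,
                        uint16_t type, uint64_t *out)
{
    size_t off = 0;

    while (buflen - off >= sizeof(struct tlv_hdr)) {
        struct tlv_hdr h;
        size_t step;

        memcpy(&h, buf + off, sizeof(h));
        if (h.len < sizeof(h) || h.len > buflen - off)
            return -1; /* malformed entry */
        if (h.type == type && h.len - sizeof(h) == sizeof(*out)) {
            memcpy(out, buf + off + sizeof(h), sizeof(*out));
            return 0;
        }
        step = TLV_ALIGN(h.len);
        if (step > buflen - off)
            break; /* trailing padding missing, nothing left to read */
        off += step;
    }
    return -1;
}
```

    The length checks come before any payload access so that a
    truncated control message is rejected rather than read past its end.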

    Signed-off-by: Francis Yan
    Signed-off-by: Yuchung Cheng
    Signed-off-by: Soheil Hassas Yeganeh
    Acked-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Francis Yan
     

25 Nov, 2016

1 commit

  • Since commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing
    continuation lines") the output from __do_page_fault on MIPS has been
    pretty unreadable due to the lack of KERN_CONT markers. Use pr_cont
    to provide the appropriate markers & restore the expected output.

    Signed-off-by: Matt Redfearn
    Cc: Paul Gortmaker
    Cc: Kirill A. Shutemov
    Cc: Andrew Morton
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/14544/
    Signed-off-by: Ralf Baechle

    Matt Redfearn
     

24 Nov, 2016

1 commit

  • Since MIPSr6 the Wired register is split into 2 fields, with the upper
    16 bits of the register indicating a limit on the value that the wired
    entry count in the bottom 16 bits of the register can take. This means
    that simply reading the wired register doesn't get us a valid TLB entry
    index any longer, and we instead need to retrieve only the lower 16 bits
    of the register. Introduce a new num_wired_entries() function which does
    this on MIPSr6 or higher and simply returns the value of the wired
    register on older architecture revisions, and make use of it when
    reading the number of wired entries.
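
    A userspace model of that helper might look like the sketch below.
    It is not the kernel implementation: the register value and the
    architecture revision are passed in as plain parameters, and the
    mask name is made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the wired entry count field (lower 16 bits). */
#define WIRED_COUNT_MASK 0xFFFFu

/*
 * On MIPSr6 and later only the lower 16 bits hold the wired entry count;
 * the upper 16 bits hold the limit. Older revisions use the whole value.
 */
static unsigned int num_wired_entries_sketch(uint32_t wired_reg, bool is_r6)
{
    if (is_r6)
        return wired_reg & WIRED_COUNT_MASK;
    return wired_reg;
}
```

    For a register value of 0x003F0002 this yields 2 on MIPSr6, whereas
    using the raw value would give a bogus index of 0x003F0002.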

    Since commit e710d6668309 ("MIPS: tlb-r4k: If there are wired entries,
    don't use TLBINVF") we have been using a non-zero number of wired
    entries to determine whether we should avoid use of the tlbinvf
    instruction (which would invalidate wired entries) and instead loop over
    TLB entries in local_flush_tlb_all(). This loop begins with the number
    of wired entries, or before this patch some large bogus TLB index on
    MIPSr6 systems. Thus since the aforementioned commit some MIPSr6 systems
    with FTLBs have been prone to leaving stale address translations in the
    FTLB & crashing in various weird & wonderful ways when we later observe
    the wrong memory.

    Signed-off-by: Paul Burton
    Cc: Matt Redfearn
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14557/
    Signed-off-by: Ralf Baechle

    Paul Burton
     

23 Nov, 2016

1 commit

  • It is the reasonable expectation that if an executable file is not
    readable there will be no way for a user without special privileges to
    read the file. This is enforced in ptrace_attach but if ptrace
    is already attached before exec there is no enforcement for read-only
    executables.

    As the only way to read such an mm is through access_process_vm,
    spin a variant called ptrace_access_vm that will fail if the
    target process is not being ptraced by the current process, or
    the current process did not have sufficient privileges when ptracing
    began to read the target process's mm.

    In the ptrace implementations replace access_process_vm by
    ptrace_access_vm. There remain several ptrace sites that still use
    access_process_vm as they are reading the target executable's
    instructions (for kernel consumption) or register stacks. As such it
    does not appear necessary to add a permission check to those calls.

    This bug has always existed in Linux.

    Fixes: v1.0
    Cc: stable@vger.kernel.org
    Reported-by: Andy Lutomirski
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

17 Nov, 2016

1 commit

  • No need to duplicate the same define everywhere. Since
    the only user is stop-machine and the only provider is
    s390, we can use a default implementation of cpu_relax_yield()
    in sched.h.

    Suggested-by: Russell King
    Signed-off-by: Christian Borntraeger
    Reviewed-by: David Hildenbrand
    Acked-by: Russell King
    Cc: Andrew Morton
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Cc: Nicholas Piggin
    Cc: Noam Camus
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: kvm@vger.kernel.org
    Cc: linux-arch@vger.kernel.org
    Cc: linux-s390
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: sparclinux@vger.kernel.org
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1479298985-191589-1-git-send-email-borntraeger@de.ibm.com
    Signed-off-by: Ingo Molnar

    Christian Borntraeger
     

16 Nov, 2016

2 commits

  • As there are no users left, we can remove cpu_relax_lowlatency()
    implementations from every architecture.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Cc: Nicholas Piggin
    Cc: Noam Camus
    Cc: Peter Zijlstra
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.com
    Signed-off-by: Ingo Molnar

    Christian Borntraeger
     
  • For spinning loops people do often use barrier() or cpu_relax().
    For most architectures cpu_relax and barrier are the same, but on
    some architectures cpu_relax can add some latency.
    For example on power, sparc64 and arc, cpu_relax can shift the CPU
    towards other hardware threads in an SMT environment.
    On s390 cpu_relax does even more: it uses a hypercall to the
    hypervisor to give up the timeslice.
    In contrast to the SMT yielding this can result in larger latencies.
    In some places this latency is unwanted, so another variant
    "cpu_relax_lowlatency" was introduced. Before this is used in more
    and more places, let's invert the logic and provide a cpu_relax_yield
    that can be called in places where yielding is more important than
    latency. By default this is the same as cpu_relax on all architectures.
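
    The "default unless the architecture overrides it" pattern can be
    sketched like this. It is a userspace model under stated
    assumptions: cpu_relax() here is a counting stand-in rather than a
    real CPU hint, and spin_until_sketch() is a hypothetical caller.

```c
/* Stand-in for the architecture's cpu_relax(); counts invocations. */
static int relax_calls;
static void cpu_relax(void)
{
    relax_calls++;
}

/*
 * Generic fallback: an architecture that wants real yielding defines
 * cpu_relax_yield itself before this point, everyone else gets cpu_relax().
 */
#ifndef cpu_relax_yield
#define cpu_relax_yield() cpu_relax()
#endif

/* Hypothetical caller: spin until *flag is set or max_spins is reached. */
static int spin_until_sketch(const volatile int *flag, int max_spins)
{
    int spins = 0;

    while (!*flag && spins < max_spins) {
        cpu_relax_yield();
        spins++;
    }
    return spins;
}
```

    Callers that prefer yielding over latency use cpu_relax_yield(),
    and only architectures with a meaningful yield need to provide
    their own definition.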

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Cc: Nicholas Piggin
    Cc: Noam Camus
    Cc: Peter Zijlstra
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.com
    Signed-off-by: Ingo Molnar

    Christian Borntraeger
     

11 Nov, 2016

1 commit


05 Nov, 2016

4 commits

  • This commit removes two things:
    - The platform_device that corresponds to the RTC driver, since we now
    probe this driver from devicetree;
    - The platform power-off code, since all the jz4740-based platforms are
    now using the jz4740-rtc driver as the system power controller.

    Signed-off-by: Paul Cercueil
    Acked-by: Maarten ter Huurne
    Signed-off-by: Alexandre Belloni

    Paul Cercueil
     
  • Since we already have a devicetree node for the jz4740-rtc driver, we
    don't have to probe it from platform code.

    Besides, using the jz4740-rtc driver as the power controller for the
    qi_lb60 platform allows us to remove the jz4740 platform power-off code,
    since this is the only jz4740-based board upstream.

    Signed-off-by: Paul Cercueil
    Acked-by: Maarten ter Huurne
    Signed-off-by: Alexandre Belloni

    Paul Cercueil
     
  • Now that the jz4740-rtc driver supports devicetree, we can add a
    devicetree node for it.

    Signed-off-by: Paul Cercueil
    Acked-by: Maarten ter Huurne
    Signed-off-by: Alexandre Belloni

    Paul Cercueil
     
  • Pull KVM updates from Paolo Bonzini:
    "One NULL pointer dereference, and two fixes for regressions introduced
    during the merge window.

    The rest are fixes for MIPS, s390 and nested VMX"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    kvm: x86: Check memopp before dereference (CVE-2016-8630)
    kvm: nVMX: VMCLEAR an active shadow VMCS after last use
    KVM: x86: drop TSC offsetting kvm_x86_ops to fix KVM_GET/SET_CLOCK
    KVM: x86: fix wbinvd_dirty_mask use-after-free
    kvm/x86: Show WRMSR data is in hex
    kvm: nVMX: Fix kernel panics induced by illegal INVEPT/INVVPID types
    KVM: document lock orders
    KVM: fix OOPS on flush_work
    KVM: s390: Fix STHYI buffer alignment for diag224
    KVM: MIPS: Precalculate MMIO load resume PC
    KVM: MIPS: Make ERET handle ERL before EXL
    KVM: MIPS: Fix lazy user ASID regenerate for SMP

    Linus Torvalds
     

04 Nov, 2016

8 commits

  • When low memory doesn't reach HIGHMEM_START (e.g. up to 256MB at PA=0 is
    common) and highmem is present above HIGHMEM_START (e.g. on Malta the
    RAM overlaid by the IO region is aliased at PA=0x90000000), max_low_pfn
    will be initially calculated very large and then clipped down to
    HIGHMEM_START.

    This causes crashes when reading /sys/kernel/mm/page_idle/bitmap
    (i.e. CONFIG_IDLE_PAGE_TRACKING=y) when highmem is disabled. pfn_valid()
    will compare against max_mapnr which is derived from max_low_pfn when
    there is no highend_pfn set up, and will return true for PFNs right up
    to HIGHMEM_START, even though they are beyond the end of low memory and
    no page structs will actually exist for these PFNs.

    This is fixed by skipping high memory regions when initially calculating
    max_low_pfn if highmem is disabled, so it doesn't get clipped too high.
    We also clip regions which overlap the highmem boundary when highmem is
    disabled, so that max_pfn doesn't extend into highmem either.
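
    The effect of the fix can be modelled with the simplified sketch
    below. It does not reproduce the MIPS boot memory setup code: the
    region structure, the function name and the PFN values in the note
    are assumptions chosen only to show the skip-and-clip behaviour.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical memory region, covering PFNs [start_pfn, end_pfn). */
struct mem_region {
    unsigned long start_pfn;
    unsigned long end_pfn;
};

/*
 * Compute the highest low-memory PFN. With highmem disabled, regions that
 * lie entirely in high memory are skipped and regions that straddle the
 * boundary are clipped, so the result reflects memory that really exists.
 */
static unsigned long max_low_pfn_sketch(const struct mem_region *regions,
                                        size_t count,
                                        unsigned long highstart_pfn,
                                        bool highmem_enabled)
{
    unsigned long max_low = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        unsigned long end = regions[i].end_pfn;

        if (!highmem_enabled) {
            if (regions[i].start_pfn >= highstart_pfn)
                continue;            /* skip pure highmem regions */
            if (end > highstart_pfn)
                end = highstart_pfn; /* clip straddling regions */
        }
        if (end > max_low)
            max_low = end;
    }

    /* Low memory can never extend past the highmem boundary. */
    if (max_low > highstart_pfn)
        max_low = highstart_pfn;
    return max_low;
}
```

    With low memory at PFNs 0 to 0x10000, a high region at 0x90000 to
    0xa0000 and a boundary at 0x20000, the sketch returns 0x10000 when
    highmem is disabled, rather than the boundary value 0x20000.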

    Signed-off-by: James Hogan
    Cc: Paul Burton
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14490/
    Signed-off-by: Ralf Baechle

    James Hogan
     
  • Complement commit 80cbfad79096 ("MIPS: Correct MIPS I FP context
    layout") and correct the way Floating Point General registers are stored
    in a signal context with MIPS I hardware.

    Use the S.D and L.D assembly macros to have pairs of SWC1 instructions
    and pairs of LWC1 instructions produced, respectively, in an arrangement
    which makes the memory representation of floating-point data passed
    compatible with that used by hardware SDC1 and LDC1 instructions, where
    available, regardless of the hardware endianness used. This matches the
    layout used by r4k_fpu.S, ensuring run-time compatibility for MIPS I
    software across all o32 hardware platforms.
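
    The word ordering this implies can be sketched in plain C. This is an
    illustration only, assuming the usual MIPS I convention that the
    even-numbered register of a pair holds the low-order word of a double
    and the odd-numbered one the high-order word; store_fp_pair() and its
    parameters are hypothetical names, not part of the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of storing one double held in an even/odd FP register pair as two
 * 32-bit words, laid out the way a single 64-bit store would place them:
 * the high-order word comes first on big-endian, last on little-endian.
 */
void store_fp_pair(uint32_t even_reg, uint32_t odd_reg,
		   uint32_t slot[2], bool big_endian)
{
	if (big_endian) {
		slot[0] = odd_reg;	/* high-order word at lower address */
		slot[1] = even_reg;	/* low-order word at higher address */
	} else {
		slot[0] = even_reg;	/* low-order word at lower address */
		slot[1] = odd_reg;	/* high-order word at higher address */
	}
}
```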

    Define an EX2 macro to handle exceptions from both hardware instructions
    implicitly produced from S.D and L.D assembly macros.

    Signed-off-by: Maciej W. Rozycki
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/14477/
    Signed-off-by: Ralf Baechle

    Maciej W. Rozycki
     
  • Fix a regression introduced with commit 2db9ca0a3551 ("MIPS: Use struct
    mips_abi offsets to save FP context") for MIPS I/II FP signal contexts,
    by converting save/restore code to the updated internal API. Start FGR
    offsets from 0 rather than SC_FPREGS from $a0 and use $a1 rather than
    the offset of SC_FPC_CSR from $a0 for the Floating Point Control/Status
    Register (FCSR).

    Document the new internal API and adjust assembly code formatting for
    consistency.

    Signed-off-by: Maciej W. Rozycki
    Cc: Paul Burton
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/14476/
    Signed-off-by: Ralf Baechle

    Maciej W. Rozycki
     
  • Complement commit e50c0a8fa60d ("Support the MIPS32 / MIPS64 DSP ASE.")
    and remove the Floating Point Implementation Register (FIR) from the FP
    register set recorded in a signal context with MIPS I processors too, in
    line with the change applied to r4k_fpu.S.

    The `sc_fpc_eir' slot is unused according to our current ABI and the FIR
    register is read-only and always directly accessible from user software.

    [ralf@linux-mips.org: This is also required because the next commit depends
    on it.]

    Signed-off-by: Maciej W. Rozycki
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/14475/
    Signed-off-by: Ralf Baechle

    Maciej W. Rozycki
     
  • Complement commit 0ae8dceaebe3 ("Merge with 2.3.10.") and use the local
    `fault' handler to recover from FP sigcontext access violation faults,
    like corresponding code does in r4k_fpu.S. The `bad_stack' handler is
    in syscall.c and is not suitable here as we want to propagate the error
    condition up through the caller rather than killing the thread outright.

    Signed-off-by: Maciej W. Rozycki
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/14474/
    Signed-off-by: Ralf Baechle

    Maciej W. Rozycki
     
  • Sanitize FCSR Cause bit handling, following a trail of past attempts:

    * commit 4249548454f7 ("MIPS: ptrace: Fix FP context restoration FCSR
    regression"),

    * commit 443c44032a54 ("MIPS: Always clear FCSR cause bits after
    emulation"),

    * commit 64bedffe4968 ("MIPS: Clear [MSA]FPE CSR.Cause after
    notify_die()"),

    * commit b1442d39fac2 ("MIPS: Prevent user from setting FCSR cause
    bits"),

    * commit b54d2901517d ("Properly handle branch delay slots in connection
    with signals.").

    Specifically do not mask these bits out in ptrace(2) processing and send
    a SIGFPE signal instead whenever a matching pair of an FCSR Cause and
    Enable bit is seen as execution of an affected context is about to
    resume. Only then clear Cause bits, and even then do not clear any bits
    that are set but masked with the respective Enable bits. Adjust Cause
    bit clearing throughout code likewise, except within the FPU emulator
    proper, where they are set according to the IEEE 754 exceptions raised
    as the emulated operation executed. This way any IEEE 754 exceptions
    subject to their default handling are recorded as they are with
    operations executed by FPU hardware.
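
    The matching of Cause and Enable bits can be sketched as follows. This
    is a minimal standalone illustration, not the kernel implementation: it
    assumes the standard FCSR layout (Cause in bits 17:12, Enables in bits
    11:7, with the Unimplemented Operation cause in bit 17 having no Enable
    bit), and the macro and function names are made up for the example.

```c
#include <stdint.h>

#define FCSR_CAUSE_SHIFT	12
#define FCSR_ENABLE_SHIFT	7
#define FCSR_ENABLE_MASK	(UINT32_C(0x1f) << FCSR_ENABLE_SHIFT)
#define FCSR_CAUSE_UNIMPL	(UINT32_C(1) << 17)	/* always traps */

/* Return the Cause bits that have a matching Enable bit set. */
static inline uint32_t fcsr_trapping_causes(uint32_t fcsr)
{
	/* Move the Enable bits up so they line up with their Cause bits. */
	uint32_t enabled = (fcsr & FCSR_ENABLE_MASK)
			   << (FCSR_CAUSE_SHIFT - FCSR_ENABLE_SHIFT);

	return fcsr & (FCSR_CAUSE_UNIMPL | enabled);
}

/*
 * Clear only the Cause bits that would trap; Cause bits whose Enable bit
 * is clear are left in place.
 */
static inline uint32_t fcsr_clear_trapping_causes(uint32_t fcsr)
{
	return fcsr & ~fcsr_trapping_causes(fcsr);
}
```

    A caller would send SIGFPE when fcsr_trapping_causes() returns non-zero
    and only then apply fcsr_clear_trapping_causes() before the context
    resumes.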

    Signed-off-by: Maciej W. Rozycki
    Cc: Paul Burton
    Cc: James Hogan
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/14460/
    Signed-off-by: Ralf Baechle

    Maciej W. Rozycki
     
  • Complement commit ac9ad83bc318 ("MIPS: prevent FP context set via ptrace
    being discarded") and also initialize the FP context whenever FCSR alone
    is written with a PTRACE_POKEUSR request addressing FPC_CSR, rather than
    along with the full FPU register set in the case of the PTRACE_SETFPREGS
    request.

    Signed-off-by: Maciej W. Rozycki
    Cc: Paul Burton
    Cc: James Hogan
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/14459/
    Signed-off-by: Ralf Baechle

    Maciej W. Rozycki
     
  • Since commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing
    continuation lines") the output from TLB dumps on MIPS has been
    pretty unreadable due to the lack of KERN_CONT markers. Use pr_cont to
    provide the appropriate markers & restore the expected output.

    Continuation is also used for the second line of each TLB entry printed
    in dump_tlb.c even though it has a newline, since it is a continuation
    of the interpretation of the same TLB entry. For example:

    [ 46.371884] Index: 0 pgmask=16kb va=77654000 asid=73 gid=00
    [ri=0 xi=0 pa=ffc18000 c=5 d=0 v=1 g=0] [ri=0 xi=0 pa=ffc1c000 c=5 d=0 v=1 g=0]
    [ 46.385380] Index: 12 pgmask=16kb va=004b4000 asid=73 gid=00
    [ri=0 xi=0 pa=00000000 c=0 d=0 v=0 g=0] [ri=0 xi=0 pa=ffb00000 c=5 d=1 v=1 g=0]

    Signed-off-by: James Hogan
    Cc: Maciej W. Rozycki
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14444/
    Signed-off-by: Ralf Baechle

    James Hogan