05 Aug, 2016

2 commits

  • Pull RTC updates from Alexandre Belloni:
    "RTC for 4.8

    Cleanups:
    - huge cleanup of rtc-generic and char/genrtc; this allowed cleaning up
    rtc-cmos, rtc-sh, rtc-m68k, rtc-powerpc and rtc-parisc
    - move mn10300 to rtc-cmos

    Subsystem:
    - fix wakealarms after hibernate
    - multiple fixes for rtctest
    - simplify implementations of .read_alarm

    New drivers:
    - Maxim MAX6916

    Drivers:
    - ds1307: fix weekday
    - m41t80: add wakeup support
    - pcf85063: add support for PCF85063A variant
    - rv8803: extend i2c fix and other fixes
    - s35390a: fix alarm reading, this fixes instant reboot after
    shutdown for QNAP TS-41x
    - s3c: clock fixes"

    * tag 'rtc-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (65 commits)
    rtc: rv8803: Clear V1F when setting the time
    rtc: rv8803: Stop the clock while setting the time
    rtc: rv8803: Always apply the I²C workaround
    rtc: rv8803: Fix read day of week
    rtc: rv8803: Remove the check for valid time
    rtc: rv8803: Kconfig: Indicate rx8900 support
    rtc: asm9260: remove .owner field for driver
    rtc: at91sam9: Fix missing spin_lock_init()
    rtc: m41t80: add suspend handlers for alarm IRQ
    rtc: m41t80: make it a real error message
    rtc: pcf85063: Add support for the PCF85063A device
    rtc: pcf85063: fix year range
    rtc: hym8563: in .read_alarm set .tm_sec to 0 to signal minute accuracy
    rtc: explicitly set tm_sec = 0 for drivers with minute accurancy
    rtc: s3c: Add s3c_rtc_{enable/disable}_clk in s3c_rtc_setfreq()
    rtc: s3c: Remove unnecessary call to disable already disabled clock
    rtc: abx80x: use devm_add_action_or_reset()
    rtc: m41t80: use devm_add_action_or_reset()
    rtc: fix a typo and reduce three empty lines to one
    rtc: s35390a: improve two comments in .set_alarm
    ...

    Linus Torvalds
     
  • Pull parisc updates from Helge Deller:

    - added an optimized hash implementation for parisc (George Spelvin)

    - C99 style cleanups in iomap.c (Amitoj Kaur Chawla)

    - added breaks to switch statement in PDC function (noticed by Dan
    Carpenter)

    * 'parisc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
    parisc: Change structure intialisation to C99 style in iomap.c
    parisc: Add break statements to pdc_pat_io_pci_cfg_read()
    parisc: Add

    Linus Torvalds
     

04 Aug, 2016

1 commit

  • The dma-mapping core and the implementations do not change the DMA
    attributes passed by pointer. Thus the pointer can point to const data.
    However the attributes do not have to be a bitfield. Instead unsigned
    long will do fine:

    1. This is just simpler. Both in terms of reading the code and setting
    attributes. Instead of initializing local attributes on the stack
    and passing pointer to it to dma_set_attr(), just set the bits.

    2. It brings safeness and checking for const correctness because the
    attributes are passed by value.
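
    The two calling styles can be contrasted in a small user-space sketch.
    This is not kernel code: struct demo_attrs, the DEMO_ATTR_* bits and
    the demo_* functions are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical attribute bits, for illustration only. */
#define DEMO_ATTR_WRITE_COMBINE     (1UL << 0)
#define DEMO_ATTR_NO_KERNEL_MAPPING (1UL << 1)

/* Old style: attributes live in a struct that is passed by pointer. */
struct demo_attrs {
    unsigned long flags;
};

static void demo_set_attr(unsigned long attr, struct demo_attrs *attrs)
{
    attrs->flags |= attr;
}

/* NULL means "no attributes". Returns 1 if write-combine was requested. */
static int demo_alloc_old(size_t size, struct demo_attrs *attrs)
{
    (void)size;
    return attrs && (attrs->flags & DEMO_ATTR_WRITE_COMBINE);
}

/* New style: attributes are passed by value; 0 means "no attributes". */
static int demo_alloc_new(size_t size, unsigned long attrs)
{
    (void)size;
    return (attrs & DEMO_ATTR_WRITE_COMBINE) != 0;
}
```

    With the by-value form the caller writes the bits directly into the
    argument, and the callee cannot modify the caller's copy.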

    Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;

    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
    )

    and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;

    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
    )

    Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
    Signed-off-by: Krzysztof Kozlowski
    Acked-by: Vineet Gupta
    Acked-by: Robin Murphy
    Acked-by: Hans-Christian Noren Egtvedt
    Acked-by: Mark Salter [c6x]
    Acked-by: Jesper Nilsson [cris]
    Acked-by: Daniel Vetter [drm]
    Reviewed-by: Bart Van Assche
    Acked-by: Joerg Roedel [iommu]
    Acked-by: Fabien Dessenne [bdisp]
    Reviewed-by: Marek Szyprowski [vb2-core]
    Acked-by: David Vrabel [xen]
    Acked-by: Konrad Rzeszutek Wilk [xen swiotlb]
    Acked-by: Joerg Roedel [iommu]
    Acked-by: Richard Kuo [hexagon]
    Acked-by: Geert Uytterhoeven [m68k]
    Acked-by: Gerald Schaefer [s390]
    Acked-by: Bjorn Andersson
    Acked-by: Hans-Christian Noren Egtvedt [avr32]
    Acked-by: Vineet Gupta [arc]
    Acked-by: Robin Murphy [arm64 and dma-iommu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Krzysztof Kozlowski
     

02 Aug, 2016

3 commits

  • Replace the in order struct initialisation style with explicit field
    style.
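
    As an illustration of the two initialisation styles (struct demo_ops
    and its members are hypothetical, not the actual iomap.c definitions):

```c
struct demo_ops {
    unsigned int (*read8)(const void *addr);
    unsigned int (*read16)(const void *addr);
};

static unsigned int demo_read8(const void *addr)
{
    return *(const unsigned char *)addr;
}

/* Little-endian 16-bit read built from two byte reads. */
static unsigned int demo_read16(const void *addr)
{
    const unsigned char *p = addr;

    return p[0] | ((unsigned int)p[1] << 8);
}

/* In-order style: correctness depends on the member order of the struct. */
static const struct demo_ops ops_positional = {
    demo_read8,
    demo_read16,
};

/* C99 style: each member is named, so the order here does not matter. */
static const struct demo_ops ops_designated = {
    .read16 = demo_read16,
    .read8  = demo_read8,
};
```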

    Signed-off-by: Amitoj Kaur Chawla
    Signed-off-by: Helge Deller

    Amitoj Kaur Chawla
     
  • Dan Carpenter noticed that pdc_pat_io_pci_cfg_read() is problematic
    because it's missing some break statements so it copies 4 bytes
    regardless of whether you asked for only 1 or 2.
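
    The effect of the missing break statements can be sketched as follows.
    The demo_* functions are hypothetical stand-ins, not the real
    pdc_pat_io_pci_cfg_read() code.

```c
#include <stdint.h>

/* Buggy variant: every case falls through to the 4-byte assignment. */
static uint32_t demo_cfg_read_buggy(uint32_t raw, int size)
{
    uint32_t val = 0;

    switch (size) {
    case 1: val = (uint8_t)raw;   /* falls through */
    case 2: val = (uint16_t)raw;  /* falls through */
    case 4: val = raw;
    }
    return val;
}

/* Fixed variant: each case ends with a break. */
static uint32_t demo_cfg_read_fixed(uint32_t raw, int size)
{
    uint32_t val = 0;

    switch (size) {
    case 1: val = (uint8_t)raw;  break;
    case 2: val = (uint16_t)raw; break;
    case 4: val = raw;           break;
    }
    return val;
}
```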

    Reported-by: Dan Carpenter
    Signed-off-by: Helge Deller

    Helge Deller
     
  • PA-RISC is interesting; integer multiplies are implemented in the
    FPU, so are painful in the kernel. But it tries to be friendly to
    shift-and-add sequences for constant multiplies.

    __hash_32 is implemented using the same shift-and-add sequence as
    Microblaze, just scheduled for the PA7100. (It's 2-way superscalar
    but in-order, like the Pentium.)

    hash_64 was tricky, but a suggestion from Jason Thong allowed a
    good solution by breaking up the multiplier. After a lot of manual
    optimization, I found a 19-instruction sequence for the multiply that
    can be executed in 10 cycles using only 4 temporaries.

    (The PA8xxx can issue 4 instructions per cycle, but 2 must be ALU ops
    and 2 must be loads/stores. And the final add can't be paired.)

    An alternative considered, but ultimately not used, was Thomas Wang's
    64-to-32-bit integer hash. At 12 instructions, it's smaller, but they're
    all sequentially dependent, so it has longer latency.

    https://web.archive.org/web/2011/http://www.concentric.net/~Ttwang/tech/inthash.htm
    http://burtleburtle.net/bob/hash/integer.html
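
    To illustrate the idea of a shift-and-add constant multiply, here is a
    naive sketch that adds one shifted copy of x per set bit of the
    constant. It is not the scheduled sequence from this patch, which
    reuses intermediate sums and needs far fewer operations. The demo_*
    names are hypothetical, and the multiplier value is assumed to be the
    32-bit golden-ratio constant used by the generic hash code.

```c
#include <stdint.h>

#define DEMO_GOLDEN_RATIO_32 0x61C88647u  /* assumed multiplier */

/* Multiply x by k using only shifts and adds (result is mod 2^32). */
static uint32_t demo_mul_shift_add(uint32_t x, uint32_t k)
{
    uint32_t sum = 0;

    for (int bit = 0; bit < 32; bit++) {
        if (k & (UINT32_C(1) << bit))
            sum += x << bit;
    }
    return sum;
}

static uint32_t demo_hash_32(uint32_t x)
{
    return demo_mul_shift_add(x, DEMO_GOLDEN_RATIO_32);
}
```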

    Signed-off-by: George Spelvin
    Cc: Helge Deller
    Cc: linux-parisc@vger.kernel.org
    Signed-off-by: Helge Deller

    George Spelvin
     

30 Jul, 2016

1 commit

  • Pull security subsystem updates from James Morris:
    "Highlights:

    - TPM core and driver updates/fixes
    - IPv6 security labeling (CALIPSO)
    - Lots of Apparmor fixes
    - Seccomp: remove 2-phase API, close hole where ptrace can change
    syscall #"

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
    apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
    tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
    tpm: Factor out common startup code
    tpm: use devm_add_action_or_reset
    tpm2_i2c_nuvoton: add irq validity check
    tpm: read burstcount from TPM_STS in one 32-bit transaction
    tpm: fix byte-order for the value read by tpm2_get_tpm_pt
    tpm_tis_core: convert max timeouts from msec to jiffies
    apparmor: fix arg_size computation for when setprocattr is null terminated
    apparmor: fix oops, validate buffer size in apparmor_setprocattr()
    apparmor: do not expose kernel stack
    apparmor: fix module parameters can be changed after policy is locked
    apparmor: fix oops in profile_unpack() when policy_db is not present
    apparmor: don't check for vmalloc_addr if kvzalloc() failed
    apparmor: add missing id bounds check on dfa verification
    apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
    apparmor: use list_next_entry instead of list_entry_next
    apparmor: fix refcount race when finding a child profile
    apparmor: fix ref count leak when profile sha1 hash is read
    apparmor: check that xindex is in trans_table bounds
    ...

    Linus Torvalds
     

28 Jul, 2016

1 commit

  • Pull LED updates from Jacek Anaszewski:
    "New LED class driver:
    - LED driver for TI LP3952 6-Channel Color LED

    LED core improvements:
    - Only descend into leds directory when CONFIG_NEW_LEDS is set
    - Add no-op gpio_led_register_device when LED subsystem is disabled
    - MAINTAINERS: Add file patterns for led device tree bindings

    LED Trigger core improvements:
    - return error if invalid trigger name is provided via sysfs

    LED class drivers improvements
    - is31fl32xx: define complete i2c_device_id table
    - is31fl32xx: fix typo in id and match table names
    - leds-gpio: Set of_node for created LED devices
    - pca9532: Add device tree support

    Conversion of IDE trigger to common disk trigger:
    - leds: convert IDE trigger to common disk trigger
    - leds: documentation: 'ide-disk' to 'disk-activity'
    - unicore32: use the new LED disk activity trigger
    - parisc: use the new LED disk activity trigger
    - mips: use the new LED disk activity trigger
    - arm: use the new LED disk activity trigger
    - powerpc: use the new LED disk activity trigger"

    * tag 'leds_for_4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
    leds: is31fl32xx: define complete i2c_device_id table
    leds: is31fl32xx: fix typo in id and match table names
    leds: LED driver for TI LP3952 6-Channel Color LED
    leds: leds-gpio: Set of_node for created LED devices
    leds: triggers: return error if invalid trigger name is provided via sysfs
    leds: Only descend into leds directory when CONFIG_NEW_LEDS is set
    leds: Add no-op gpio_led_register_device when LED subsystem is disabled
    unicore32: use the new LED disk activity trigger
    parisc: use the new LED disk activity trigger
    mips: use the new LED disk activity trigger
    arm: use the new LED disk activity trigger
    powerpc: use the new LED disk activity trigger
    leds: documentation: 'ide-disk' to 'disk-activity'
    leds: convert IDE trigger to common disk trigger
    leds: pca9532: Add device tree support
    MAINTAINERS: Add file patterns for led device tree bindings

    Linus Torvalds
     

27 Jul, 2016

1 commit


26 Jul, 2016

1 commit

  • Pull locking updates from Ingo Molnar:
    "The locking tree was busier in this cycle than the usual pattern - a
    couple of major projects happened to coincide.

    The main changes are:

    - implement the atomic_fetch_{add,sub,and,or,xor}() API natively
    across all SMP architectures (Peter Zijlstra)

    - add atomic_fetch_{inc/dec}() as well, using the generic primitives
    (Davidlohr Bueso)

    - optimize various aspects of rwsems (Jason Low, Davidlohr Bueso,
    Waiman Long)

    - optimize smp_cond_load_acquire() on arm64 and implement LSE based
    atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
    on arm64 (Will Deacon)

    - introduce smp_acquire__after_ctrl_dep() and fix various barrier
    mis-uses and bugs (Peter Zijlstra)

    - after discovering ancient spin_unlock_wait() barrier bugs in its
    implementation and usage, strengthen its semantics and update/fix
    usage sites (Peter Zijlstra)

    - optimize mutex_trylock() fastpath (Peter Zijlstra)

    - ... misc fixes and cleanups"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits)
    locking/atomic: Introduce inc/dec variants for the atomic_fetch_$op() API
    locking/barriers, arch/arm64: Implement LDXR+WFE based smp_cond_load_acquire()
    locking/static_keys: Fix non static symbol Sparse warning
    locking/qspinlock: Use __this_cpu_dec() instead of full-blown this_cpu_dec()
    locking/atomic, arch/tile: Fix tilepro build
    locking/atomic, arch/m68k: Remove comment
    locking/atomic, arch/arc: Fix build
    locking/Documentation: Clarify limited control-dependency scope
    locking/atomic, arch/rwsem: Employ atomic_long_fetch_add()
    locking/atomic, arch/qrwlock: Employ atomic_fetch_add_acquire()
    locking/atomic, arch/mips: Convert to _relaxed atomics
    locking/atomic, arch/alpha: Convert to _relaxed atomics
    locking/atomic: Remove the deprecated atomic_{set,clear}_mask() functions
    locking/atomic: Remove linux/atomic.h:atomic_fetch_or()
    locking/atomic: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
    locking/atomic: Fix atomic64_relaxed() bits
    locking/atomic, arch/xtensa: Implement atomic_fetch_{add,sub,and,or,xor}()
    locking/atomic, arch/x86: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
    locking/atomic, arch/tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
    locking/atomic, arch/sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
    ...

    Linus Torvalds
     

27 Jun, 2016

1 commit


25 Jun, 2016

2 commits

  • __GFP_REPEAT has a rather weak semantic but since it has been introduced
    around 2.6.12 it has been ignored for low order allocations.

    pmd_alloc_one allocates with PMD_ORDER, which is 1. This means that this
    flag has never actually been useful here because it has always been used
    only for PAGE_ALLOC_COSTLY requests.

    Link: http://lkml.kernel.org/r/1464599699-30131-10-git-send-email-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • This is the third version of the patchset previously sent [1]. I have
    basically only rebased it on top of 4.7-rc1 tree and dropped "dm: get
    rid of superfluous gfp flags" which went through dm tree. I am sending
    it now because it is tree wide and chances for conflicts are reduced
    considerably when we want to target rc2. I plan to send the next step
    and rename the flag and move to a better semantic later during this
    release cycle so we will have a new semantic ready for 4.8 merge window
    hopefully.

    Motivation:

    While working on something unrelated I've checked the current usage of
    __GFP_REPEAT in the tree. It seems that a majority of the usage is and
    always has been bogus because __GFP_REPEAT has always been about costly
    high order allocations while we are using it for order-0 or very small
    orders very often. It seems that a big pile of them is just a
    copy&paste when a code has been adopted from one arch to another.

    I think it makes some sense to get rid of them because they are just
    making the semantic more unclear. Please note that GFP_REPEAT is
    documented as

    * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
    * _might_ fail. This depends upon the particular VM implementation.

    while !costly requests have basically nofail semantic. So one could
    reasonably expect that order-0 request with __GFP_REPEAT will not loop
    for ever. This is not implemented right now though.

    I would like to move on with __GFP_REPEAT and define a better semantic
    for it.

    $ git grep __GFP_REPEAT origin/master | wc -l
    111
    $ git grep __GFP_REPEAT | wc -l
    36

    So we are down to a third after this patch series. The remaining
    places really seem to be relying on __GFP_REPEAT due to large allocation
    requests. This still needs some double checking which I will do later
    after all the simple ones are sorted out.

    I am touching a lot of arch specific code here and I hope I got it right
    but as a matter of fact I didn't even compile-test for some archs as I
    do not have a cross compiler for them. Patches should be quite trivial to
    review for stupid compile mistakes though. The tricky parts are usually
    hidden by macro definitions and that's where I would appreciate help from
    arch maintainers.

    [1] http://lkml.kernel.org/r/1461849846-27209-1-git-send-email-mhocko@kernel.org

    This patch (of 19):

    __GFP_REPEAT has a rather weak semantic but since it has been introduced
    around 2.6.12 it has been ignored for low order allocations. Yet we
    have the full kernel tree with its usage for apparently order-0
    allocations. This is really confusing because __GFP_REPEAT is
    explicitly documented to allow allocation failures which is a weaker
    semantic than the current order-0 has (basically nofail).

    Let's simply drop __GFP_REPEAT from those places. This would allow us to
    identify places which really need the allocator to retry harder and to
    formulate a more specific semantic for what the flag is supposed to do.

    Link: http://lkml.kernel.org/r/1464599699-30131-2-git-send-email-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Cc: "David S. Miller"
    Cc: "H. Peter Anvin"
    Cc: "James E.J. Bottomley"
    Cc: "Theodore Ts'o"
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Chen Liqin
    Cc: Chris Metcalf [for tile]
    Cc: Guan Xuetao
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Jan Kara
    Cc: John Crispin
    Cc: Lennox Wu
    Cc: Ley Foon Tan
    Cc: Martin Schwidefsky
    Cc: Matt Fleming
    Cc: Ralf Baechle
    Cc: Rich Felker
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

16 Jun, 2016

2 commits

  • Since all architectures have this implemented now natively, remove this
    dead code.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-arch@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Implement FETCH-OP atomic primitives, these are very similar to the
    existing OP-RETURN primitives we already have, except they return the
    value of the atomic variable _before_ modification.

    This is especially useful for irreversible operations -- such as
    bitops (because it becomes impossible to reconstruct the state prior
    to modification).
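
    The difference between the two flavours can be sketched in user space
    with C11 atomics. This is only an analogy to the kernel API, and the
    demo_* helpers are hypothetical names.

```c
#include <stdatomic.h>

/* OP-RETURN flavour: yields the value after the modification. */
static int demo_or_return(atomic_int *v, int mask)
{
    return atomic_fetch_or(v, mask) | mask;
}

/* FETCH-OP flavour: yields the value before the modification. */
static int demo_fetch_or(atomic_int *v, int mask)
{
    return atomic_fetch_or(v, mask);
}

/* Returns 1 if this call set the bit, 0 if it was already set. */
static int demo_test_and_set_bit(atomic_int *v, int bit)
{
    int mask = 1 << bit;

    return !(demo_fetch_or(v, mask) & mask);
}
```

    After an OR the old state of the bit cannot be derived from the new
    value, which is why demo_test_and_set_bit() needs the FETCH-OP form.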

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Helge Deller
    Cc: James E.J. Bottomley
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-arch@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-parisc@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

15 Jun, 2016

2 commits


14 Jun, 2016

1 commit

  • This patch updates/fixes all spin_unlock_wait() implementations.

    The update is in semantics; where it previously was only a control
    dependency, we now upgrade to a full load-acquire to match the
    store-release from the spin_unlock() we waited on. This ensures that
    when spin_unlock_wait() returns, we're guaranteed to observe the full
    critical section we waited on.

    This fixes a number of spin_unlock_wait() users that (not
    unreasonably) rely on this.

    I also fixed a number of ticket lock versions to only wait on the
    current lock holder, instead of for a full unlock, as this is
    sufficient.

    Furthermore; again for ticket locks; I added an smp_rmb() in between
    the initial ticket load and the spin loop testing the current value
    because I could not convince myself the address dependency is
    sufficient, esp. if the loads are of different sizes.

    I'm more than happy to remove this smp_rmb() again if people are
    certain the address dependency does indeed work as expected.
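
    The acquire/release pairing described above can be sketched with C11
    atomics. This is a simple test-and-set lock in user space, not a
    ticket lock and not any architecture's kernel implementation; all
    demo_* names are hypothetical.

```c
#include <stdatomic.h>

struct demo_spinlock {
    atomic_int locked;  /* 0 = free, 1 = held */
};

static void demo_spin_lock(struct demo_spinlock *l)
{
    /* Acquire on success so the critical section cannot move up. */
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
        ;
}

static void demo_spin_unlock(struct demo_spinlock *l)
{
    /* Store-release: publishes everything written while holding the lock. */
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

static void demo_spin_unlock_wait(struct demo_spinlock *l)
{
    /* Load-acquire pairs with the store-release in demo_spin_unlock(). */
    while (atomic_load_explicit(&l->locked, memory_order_acquire))
        ;
}
```

    A plain relaxed load in the wait loop would leave only a control
    dependency, which does not order later loads against the critical
    section that was waited on.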

    Note: PPC32 will be fixed independently

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: chris@zankel.net
    Cc: cmetcalf@mellanox.com
    Cc: davem@davemloft.net
    Cc: dhowells@redhat.com
    Cc: james.hogan@imgtec.com
    Cc: jejb@parisc-linux.org
    Cc: linux@armlinux.org.uk
    Cc: mpe@ellerman.id.au
    Cc: ralf@linux-mips.org
    Cc: realmz6@gmail.com
    Cc: rkuo@codeaurora.org
    Cc: rth@twiddle.net
    Cc: schwidefsky@de.ibm.com
    Cc: tony.luck@intel.com
    Cc: vgupta@synopsys.com
    Cc: ysato@users.sourceforge.jp
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

05 Jun, 2016

4 commits

  • Signed-off-by: Helge Deller

    Helge Deller
     
  • One of the debian buildd servers had this crash in the syslog without
    any other information:

    Unaligned handler failed, ret = -2
    clock_adjtime (pid 22578): Unaligned data reference (code 28)
    CPU: 1 PID: 22578 Comm: clock_adjtime Tainted: G E 4.5.0-2-parisc64-smp #1 Debian 4.5.4-1
    task: 000000007d9960f8 ti: 00000001bde7c000 task.ti: 00000001bde7c000

    YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
    PSW: 00001000000001001111100000001111 Tainted: G E
    r00-03 000000ff0804f80f 00000001bde7c2b0 00000000402d2be8 00000001bde7c2b0
    r04-07 00000000409e1fd0 00000000fa6f7fff 00000001bde7c148 00000000fa6f7fff
    r08-11 0000000000000000 00000000ffffffff 00000000fac9bb7b 000000000002b4d4
    r12-15 000000000015241c 000000000015242c 000000000000002d 00000000fac9bb7b
    r16-19 0000000000028800 0000000000000001 0000000000000070 00000001bde7c218
    r20-23 0000000000000000 00000001bde7c210 0000000000000002 0000000000000000
    r24-27 0000000000000000 0000000000000000 00000001bde7c148 00000000409e1fd0
    r28-31 0000000000000001 00000001bde7c320 00000001bde7c350 00000001bde7c218
    sr00-03 0000000001200000 0000000001200000 0000000000000000 0000000001200000
    sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000

    IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402d2e84 00000000402d2e88
    IIR: 0ca0d089 ISR: 0000000001200000 IOR: 00000000fa6f7fff
    CPU: 1 CR30: 00000001bde7c000 CR31: ffffffffffffffff
    ORIG_R28: 00000002369fe628
    IAOQ[0]: compat_get_timex+0x2dc/0x3c0
    IAOQ[1]: compat_get_timex+0x2e0/0x3c0
    RP(r2): compat_get_timex+0x40/0x3c0
    Backtrace:
    [] compat_SyS_clock_adjtime+0x40/0xc0
    [] syscall_exit+0x0/0x14

    This means the userspace program clock_adjtime called the clock_adjtime()
    syscall and then crashed inside the compat_get_timex() function.
    Syscalls should never crash programs, but instead return EFAULT.

    The IIR register contains the executed instruction, which disassembles
    into "ldw 0(sr3,r5),r9".
    This load-word instruction is part of __get_user() which tried to read the word
    at %r5/IOR (0xfa6f7fff). This means the unaligned handler jumped in. The
    unaligned handler is able to emulate all ldw instructions, but it fails if it
    cannot read the source, e.g. because of a page fault.

    The following program reproduces the problem:

    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/syscall.h>

    int main(void) {
        /* allocate 8k */
        char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
        /* free second half (upper 4k) and make it invalid. */
        munmap(ptr+4096, 4096);
        /* syscall where first int is unaligned and clobbers into invalid memory region */
        /* syscall should return EFAULT */
        return syscall(__NR_clock_adjtime, 0, ptr+4095);
    }

    To fix this issue we simply need to check if the faulting instruction address
    is in the exception fixup table when the unaligned handler failed. If it
    is, call the fixup routine instead of crashing.

    While looking at the unaligned handler I found another issue as well: The
    target register should not be modified if the handler was unsuccessful.
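
    The fixup decision can be sketched as a table lookup. The structure
    layout and the demo_* names below are hypothetical and only illustrate
    the idea; they are not the parisc exception-table code.

```c
#include <stddef.h>

struct demo_extable_entry {
    unsigned long insn;   /* address of an instruction that may fault */
    unsigned long fixup;  /* where to resume if it does */
};

static const struct demo_extable_entry *
demo_search_extable(const struct demo_extable_entry *tbl, size_t n,
                    unsigned long addr)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].insn == addr)
            return &tbl[i];
    return NULL;
}

/*
 * Called when emulating the unaligned access failed. Returns the address
 * to resume at, or 0 if the faulting instruction has no fixup entry.
 */
static unsigned long
demo_unaligned_failed(const struct demo_extable_entry *tbl, size_t n,
                      unsigned long fault_addr)
{
    const struct demo_extable_entry *e =
        demo_search_extable(tbl, n, fault_addr);

    return e ? e->fixup : 0;
}
```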

    Signed-off-by: Helge Deller
    Cc: stable@vger.kernel.org

    Helge Deller
     
  • Avoid showing invalid printk time stamps during boot.

    Signed-off-by: Helge Deller
    Reviewed-by: Aaro Koskinen

    Helge Deller
     
  • This patch fixes backtrace on PA-RISC

    There were several problems:

    1) The code that decodes instructions handles instructions that subtract
    from the stack pointer incorrectly. If the instruction subtracts the
    number X from the stack pointer the code increases the frame size by
    (0x100000000-X). This results in invalid accesses to memory and
    recursive page faults.

    2) Because gcc reorders blocks, handling instructions that subtract from
    the frame pointer is incorrect. For example, this function
        int f(int a)
        {
            if (__builtin_expect(a, 1))
                return a;
            g();
            return a;
        }
    is compiled in such a way, that the code that decreases the stack
    pointer for the first "return a" is placed before the code for "g" call.
    If we recognize this decrement, we mistakenly believe that the frame
    size for the "g" call is zero.

    To fix problems 1) and 2), the patch doesn't recognize instructions that
    decrease the stack pointer at all. To further safeguard the unwind code
    against nonsense values, we don't allow frame size larger than
    Total_frame_size.

    3) The backtrace is not locked. If stack dump races with module unload,
    invalid table can be accessed.

    This patch adds a spinlock when processing module tables.
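
    The arithmetic behind problem 1) can be reproduced in a few lines. This
    sketch shows one way such an error can arise (a negative displacement
    treated as an unsigned 32-bit value) and the approach of ignoring
    decrements; the demo_* functions are hypothetical and not the unwinder
    code.

```c
#include <stdint.h>

/* Buggy: a negative displacement is added as a 32-bit unsigned value. */
static uint64_t demo_frame_size_buggy(uint64_t frame_size, int32_t disp)
{
    return frame_size + (uint32_t)disp;
}

/* Fixed, as described above: decrements are not recognized at all. */
static uint64_t demo_frame_size_fixed(uint64_t frame_size, int32_t disp)
{
    if (disp < 0)
        return frame_size;
    return frame_size + (uint64_t)disp;
}
```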

    Note, that for correct backtrace, you need recent binutils.
    Binutils 2.18 from Debian 5 produce garbage unwind tables.
    Binutils 2.21 work better (it sometimes forgets function frames, but at
    least it doesn't generate garbage).

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Helge Deller

    Mikulas Patocka
     

04 Jun, 2016

3 commits

  • This architecture selects RTC_CLASS unconditionally, so the GEN_RTC
    has not worked here for a long time.

    Now we can remove both the asm/rtc.h header and the Kconfig dependency
    for CONFIG_GEN_RTC.

    Signed-off-by: Arnd Bergmann
    Acked-by: Geert Uytterhoeven
    Signed-off-by: Alexandre Belloni

    Arnd Bergmann
     
  • The rtc-generic driver provides an architecture specific
    wrapper on top of the generic rtc_class_ops abstraction,
    and on pa-risc, that is implemented using an open-coded
    version of rtc_time_to_tm/rtc_tm_to_time.

    This changes the parisc rtc-generic device to provide its
    rtc_class_ops directly, using the normal helper functions,
    which makes this y2038 safe (on 32-bit) and simplifies
    the implementation.

    Signed-off-by: Arnd Bergmann
    Acked-by: Geert Uytterhoeven
    Signed-off-by: Alexandre Belloni

    Arnd Bergmann
     
  • Nothing on these architectures ever includes the asm/mc146818rtc.h
    file, the drivers that used to do this have been fixed long ago,
    and the remaining users are all PC-specific.

    This removes the files for good.

    Signed-off-by: Arnd Bergmann
    Acked-by: Geert Uytterhoeven
    Signed-off-by: Alexandre Belloni

    Arnd Bergmann
     

25 May, 2016

1 commit


24 May, 2016

1 commit


23 May, 2016

11 commits

  • Signed-off-by: Andrea Gelmini
    Signed-off-by: Helge Deller

    Andrea Gelmini
     
  • Signed-off-by: Andrea Gelmini
    Signed-off-by: Helge Deller

    Andrea Gelmini
     
  • Signed-off-by: Andrea Gelmini
    Signed-off-by: Helge Deller

    Andrea Gelmini
     
  • Signed-off-by: Andrea Gelmini
    Signed-off-by: Helge Deller

    Andrea Gelmini
     
  • The attached patch updates the parisc version of futex.h to match the
    current generic implementation except for the spinlock code.

    Signed-off-by: John David Anglin
    Signed-off-by: Helge Deller

    John David Anglin
     
  • When enabling all-branch ftrace support (CONFIG_PROFILE_ALL_BRANCHES)
    the kernel gets really huge and some ftrace assembler functions like
    mcount can't reach the ftrace helper functions which are written in C.
    Avoid this problem of too distant branches by moving the ftrace C-helper
    functions into the .text.hot section which is put in front of the
    standard .text section by the linker.

    Signed-off-by: Helge Deller

    Helge Deller
     
  • Add a native implementation for the sched_clock() function which utilizes the
    processor-internal cycle counter (Control Register 16) as high-resolution time
    source.

    With this patch we now get much more fine-grained resolutions in various
    in-kernel time measurements (e.g. when viewing the function tracing logs), and
    probably a more accurate scheduling on SMP systems.

    There are a few specific implementation details in this patch:

    1. On a 32bit kernel we emulate the higher 32bits of the required 64-bit
    resolution of sched_clock() by increasing a per-cpu counter at every
    wrap-around of the 32bit cycle counter.

    2. In an SMP system, the cycle counters of the various CPUs are not synchronized
    (similar to the TSC in an x86_64 system). To cope with this we define
    HAVE_UNSTABLE_SCHED_CLOCK and let the upper layers do the adjustment work.

    3. Since we need HAVE_UNSTABLE_SCHED_CLOCK, we need to provide a cmpxchg64()
    function even on a 32-bit kernel.

    4. A 64-bit SMP kernel which is started on a UP system will mark the
    sched_clock() implementation as "stable", which means that we don't expect any
    jumps in the returned counter. This is true because we then run only on one
    CPU.
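
    The wrap-around handling from point 1 can be sketched as follows. This
    is a simplified single-CPU illustration with hypothetical names; it
    assumes the function is called at least once per wrap period of the
    32-bit counter.

```c
#include <stdint.h>

struct demo_cycle_state {
    uint32_t last_low;  /* last value read from the 32-bit counter */
    uint32_t high;      /* number of wrap-arounds seen so far */
};

/* Extend a 32-bit cycle counter reading to a 64-bit value. */
static uint64_t demo_extend_cycles(struct demo_cycle_state *s, uint32_t now)
{
    if (now < s->last_low)  /* counter wrapped since the last read */
        s->high++;
    s->last_low = now;
    return ((uint64_t)s->high << 32) | now;
}
```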

    Signed-off-by: Helge Deller

    Helge Deller
     
  • By adding TRACEHOOK support we now get a clean user interface to access
    registers via PTRACE_GETREGS, PTRACE_SETREGS, PTRACE_GETFPREGS and
    PTRACE_SETFPREGS.

    The user-visible regset structs user_regs_struct and user_fp_struct are
    modelled similarly to x86 and can be accessed via PTRACE_GETREGSET.

    Signed-off-by: Helge Deller

    Helge Deller
     
  • Allow accessing 64-bit values in userspace from a 32-bit kernel.
    The access is not atomic.

    Signed-off-by: Helge Deller

    Helge Deller
     
  • This patch simplifies the code for get_user() and put_user() a lot.

    Instead of accessing kernel memory (%sr0) and userspace memory (%sr3)
    hard-coded in the assembler instruction, we now preload %sr2 with either
    %sr0 (for accessing KERNEL_DS) or with sr3 (to access USER_DS) and
    use %sr2 in the load directly.

    The generated code avoids a branch and speeds up execution by generating
    less assembler instructions.

    Signed-off-by: Helge Deller
    Tested-by: Rolf Eike Beer

    Helge Deller
     
  • This patch adds support for the TIF_SYSCALL_TRACEPOINT on the parisc
    architecture. Basically, it calls the appropriate tracepoints on syscall
    entry and exit.

    Signed-off-by: Helge Deller

    Helge Deller
     

21 May, 2016

2 commits

  • The binary GCD algorithm is based on the following facts:
    1. If a and b are both even, then gcd(a,b) = 2 * gcd(a/2, b/2)
    2. If a is even and b is odd, then gcd(a,b) = gcd(a/2, b)
    3. If a and b are both odd, then gcd(a,b) = gcd((a-b)/2, b) = gcd((a+b)/2, b)

    Even on x86 machines with reasonable division hardware, the binary
    algorithm runs about 25% faster (80% the execution time) than the
    division-based Euclidean algorithm.

    On platforms like Alpha and ARMv6 where division is a function call to
    emulation code, it's even more significant.

    There are two variants of the code here, depending on whether a fast
    __ffs (find least significant set bit) instruction is available. This
    allows the unpredictable branches in the bit-at-a-time shifting loop to
    be eliminated.

    If fast __ffs is not available, the "even/odd" GCD variant is used.

    I use the following code to benchmark:

    #include
    #include
    #include
    #include
    #include
    #include

    #define swap(a, b) \
        do { \
            a ^= b; \
            b ^= a; \
            a ^= b; \
        } while (0)

    unsigned long gcd0(unsigned long a, unsigned long b)
    {
        unsigned long r;

        if (a < b) {
            swap(a, b);
        }

        if (b == 0)
            return a;

        while ((r = a % b) != 0) {
            a = b;
            b = r;
        }

        return b;
    }

    unsigned long gcd1(unsigned long a, unsigned long b)
    {
    unsigned long r = a | b;

    if (!a || !b)
    return r;

    b >>= __builtin_ctzl(b);

    for (;;) {
    a >>= __builtin_ctzl(a);
    if (a == b)
    return a << __builtin_ctzl(r);

    if (a < b)
    swap(a, b);
    a -= b;
    }
    }

    unsigned long gcd2(unsigned long a, unsigned long b)
    {
    unsigned long r = a | b;

    if (!a || !b)
    return r;

    r &= -r;

    while (!(b & r))
    b >>= 1;

    for (;;) {
    while (!(a & r))
    a >>= 1;
    if (a == b)
    return a;

    if (a < b)
    swap(a, b);
    a -= b;
    a >>= 1;
    if (a & r)
    a += b;
    a >>= 1;
    }
    }

    unsigned long gcd3(unsigned long a, unsigned long b)
    {
    unsigned long r = a | b;

    if (!a || !b)
    return r;

    b >>= __builtin_ctzl(b);
    if (b == 1)
    return r & -r;

    for (;;) {
    a >>= __builtin_ctzl(a);
    if (a == 1)
    return r & -r;
    if (a == b)
    return a << __builtin_ctzl(r);

    if (a < b)
    swap(a, b);
    a -= b;
    }
    }

    unsigned long gcd4(unsigned long a, unsigned long b)
    {
    unsigned long r = a | b;

    if (!a || !b)
    return r;

    r &= -r;

    while (!(b & r))
    b >>= 1;
    if (b == r)
    return r;

    for (;;) {
    while (!(a & r))
    a >>= 1;
    if (a == r)
    return r;
    if (a == b)
    return a;

    if (a < b)
    swap(a, b);
    a -= b;
    a >>= 1;
    if (a & r)
    a += b;
    a >>= 1;
    }
    }

    static unsigned long (*gcd_func[])(unsigned long a, unsigned long b) = {
    gcd0, gcd1, gcd2, gcd3, gcd4,
    };

    #define TEST_ENTRIES (sizeof(gcd_func) / sizeof(gcd_func[0]))

    #if defined(__x86_64__)

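    #define rdtscll(val) do { \
    unsigned long __a,__d; \
    __asm__ __volatile__("rdtsc" : "=a" (__a), "=d" (__d)); \
    (val) = ((unsigned long long)__a) | (((unsigned long long)__d)<<32); \
    } while(0)

    static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
    unsigned long a, unsigned long b, unsigned long *res)
    {
    unsigned long long start, end;
    unsigned long long ret;
    unsigned long gcd_res;

    rdtscll(start);
    gcd_res = gcd(a, b);
    rdtscll(end);

    if (end >= start)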
    ret = end - start;
    else
    ret = ~0ULL - start + 1 + end;

    *res = gcd_res;
    return ret;
    }

    #else

    static inline struct timespec read_time(void)
    {
    struct timespec time;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time);
    return time;
    }

    static inline unsigned long long diff_time(struct timespec start, struct timespec end)
    {
    struct timespec temp;

    if ((end.tv_nsec - start.tv_nsec) < 0) {
    temp.tv_sec = end.tv_sec - start.tv_sec - 1;
    temp.tv_nsec = 1000000000ULL + end.tv_nsec - start.tv_nsec;
    } else {
    temp.tv_sec = end.tv_sec - start.tv_sec;
    temp.tv_nsec = end.tv_nsec - start.tv_nsec;
    }

    return temp.tv_sec * 1000000000ULL + temp.tv_nsec;
    }

    static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
    unsigned long a, unsigned long b, unsigned long *res)
    {
    struct timespec start, end;
    unsigned long gcd_res;

    start = read_time();
    gcd_res = gcd(a, b);
    end = read_time();

    *res = gcd_res;
    return diff_time(start, end);
    }

    #endif

    static inline unsigned long get_rand(void)
    {
    if (sizeof(long) == 8)
    return (unsigned long)rand() << 32 | rand();
    else
    return rand();
    }

    int main(int argc, char **argv)
    {
    unsigned int seed = time(0);
    int loops = 100;
    int repeats = 1000;
    unsigned long (*res)[TEST_ENTRIES];
    unsigned long long elapsed[TEST_ENTRIES];
    int i, j, k;

    for (;;) {
    int opt = getopt(argc, argv, "n:r:s:");
    /* End condition always first */
    if (opt == -1)
    break;

    switch (opt) {
    case 'n':
    loops = atoi(optarg);
    break;
    case 'r':
    repeats = atoi(optarg);
    break;
    case 's':
    seed = strtoul(optarg, NULL, 10);
    break;
    default:
    /* Unknown options are ignored. */
    break;
    }
    }

    res = malloc(sizeof(*res) * loops);
    if (!res) {
    perror("malloc");
    return 1;
    }
    memset(elapsed, 0, sizeof(elapsed));

    srand(seed);
    for (j = 0; j < loops; j++) {
    unsigned long a = get_rand();
    /* Do we have args? */
    unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
    unsigned long long min_elapsed[TEST_ENTRIES];
    for (k = 0; k < repeats; k++) {
    for (i = 0; i < TEST_ENTRIES; i++) {
    unsigned long long tmp = benchmark_gcd_func(gcd_func[i], a, b, &res[j][i]);
    if (k == 0 || min_elapsed[i] > tmp)
    min_elapsed[i] = tmp;
    }
    }
    for (i = 0; i < TEST_ENTRIES; i++)
    elapsed[i] += min_elapsed[i];
    }

    for (i = 0; i < TEST_ENTRIES; i++)
    printf("gcd%d: elapsed %llu\n", i, elapsed[i]);

    k = 0;
    srand(seed);
    for (j = 0; j < loops; j++) {
    unsigned long a = get_rand();
    unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
    for (i = 1; i < TEST_ENTRIES; i++) {
    if (res[j][i] != res[j][0])
    break;
    }
    if (i < TEST_ENTRIES) {
    if (k == 0) {
    k = 1;
    fprintf(stderr, "Error:\n");
    }
    fprintf(stderr, "gcd(%lu, %lu): ", a, b);
    for (i = 0; i < TEST_ENTRIES; i++)
    fprintf(stderr, "%lu%s", res[j][i], i < TEST_ENTRIES - 1 ? ", " : "\n");
    }
    }

    if (k == 0)
    fprintf(stderr, "PASS\n");

    free(res);

    return 0;
    }

    Compiled with "-O2" on "VirtualBox 4.4.0-22-generic #38-Ubuntu x86_64", I got:

    zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
    gcd0: elapsed 10174
    gcd1: elapsed 2120
    gcd2: elapsed 2902
    gcd3: elapsed 2039
    gcd4: elapsed 2812
    PASS
    zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
    gcd0: elapsed 9309
    gcd1: elapsed 2280
    gcd2: elapsed 2822
    gcd3: elapsed 2217
    gcd4: elapsed 2710
    PASS
    zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
    gcd0: elapsed 9589
    gcd1: elapsed 2098
    gcd2: elapsed 2815
    gcd3: elapsed 2030
    gcd4: elapsed 2718
    PASS
    zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
    gcd0: elapsed 9914
    gcd1: elapsed 2309
    gcd2: elapsed 2779
    gcd3: elapsed 2228
    gcd4: elapsed 2709
    PASS

    [akpm@linux-foundation.org: avoid #defining a CONFIG_ variable]
    Signed-off-by: Zhaoxiu Zeng
    Signed-off-by: George Spelvin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zhaoxiu Zeng
     
  • Define HAVE_EXIT_THREAD for archs which want to do something in
    exit_thread. For others, let's define exit_thread as an empty inline.

    This is a cleanup before we change the prototype of exit_thread to
    accept a task parameter.

    [akpm@linux-foundation.org: fix mips]
    Signed-off-by: Jiri Slaby
    Cc: "David S. Miller"
    Cc: "H. Peter Anvin"
    Cc: "James E.J. Bottomley"
    Cc: Aurelien Jacquiot
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Chen Liqin
    Cc: Chris Metcalf
    Cc: Chris Zankel
    Cc: David Howells
    Cc: Fenghua Yu
    Cc: Geert Uytterhoeven
    Cc: Guan Xuetao
    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Ivan Kokshaysky
    Cc: James Hogan
    Cc: Jeff Dike
    Cc: Jesper Nilsson
    Cc: Jiri Slaby
    Cc: Jonas Bonn
    Cc: Koichi Yasutake
    Cc: Lennox Wu
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Mikael Starvik
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Ralf Baechle
    Cc: Rich Felker
    Cc: Richard Henderson
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Russell King
    Cc: Steven Miao
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Slaby