12 Oct, 2013

1 commit

  • ARCompact TRAP_S insn used for breakpoints, commits before exception is
    taken (updating architectural PC). So ptregs->ret contains next-PC and
    not the breakpoint PC itself. This is different from other restartable
    exceptions such as TLB Miss where ptregs->ret has exact faulting PC.
    gdb needs to know exact-PC hence ARC ptrace GETREGSET provides for
    @stop_pc which returns ptregs->ret vs. EFA depending on the
    situation.

    However, writing stop_pc (SETREGSET request), which updates ptregs->ret,
    doesn't make sense, since stop_pc doesn't always correspond to that reg,
    as described above.

    This was not an issue so far since user_regs->ret / user_regs->stop_pc
    had the same value, so writing both to ptregs->ret was needless but NOT
    broken, hence not observed.

    With gdb "jump", they diverge, and the user_regs->ret update of
    ptregs->ret is overwritten immediately with stop_pc, which this patch
    fixes.

    Reported-by: Anton Kolesov
    Signed-off-by: Vineet Gupta

    Vineet Gupta
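
    The asymmetry described above can be sketched in C. This is a simplified
    model, not the kernel code: struct model_regs, in_brkpt_trap,
    model_get_stop_pc() and model_set_regs() are hypothetical names, and the
    use of EFA for the breakpoint case is inferred from the description.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the saved register state. */
struct model_regs {
    unsigned long ret;   /* return address saved at exception entry */
    unsigned long efa;   /* exception fault address */
    bool in_brkpt_trap;  /* true if stopped by a TRAP_S breakpoint */
};

/* GETREGSET side: report the exact PC the debugger should see. */
unsigned long model_get_stop_pc(const struct model_regs *regs)
{
    /* After TRAP_S, ret already points past the breakpoint, so use EFA. */
    return regs->in_brkpt_trap ? regs->efa : regs->ret;
}

/* SETREGSET side: only ret is written back; stop_pc is read-only here. */
void model_set_regs(struct model_regs *regs, unsigned long new_ret,
                    unsigned long new_stop_pc)
{
    (void)new_stop_pc;   /* deliberately ignored */
    regs->ret = new_ret;
}
```

    In this model a debugger "jump" that changes ret keeps its new value,
    because the stale stop_pc supplied in the same request is never written
    over it.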
     

03 Oct, 2013

1 commit

  • Previously, when a signal was registered with SA_SIGINFO, parameters 2
    and 3 of the signal handler were written to registers r1 and r2 before
    the register set was saved. This led to corruption of these two
    registers after returning from the signal handler (the wrong values were
    restored).
    With this patch, registers are now saved before any parameters are
    passed, thus maintaining the processor state from before signal entry.

    Signed-off-by: Christian Ruppert
    Signed-off-by: Vineet Gupta

    Christian Ruppert
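
    The ordering problem can be illustrated with a small C model. It is only
    a sketch: struct model_cpu, struct model_sigframe and the setup_frame_*
    functions are hypothetical and stand in for the real signal frame setup.

```c
/* Hypothetical model of the three registers involved. */
struct model_cpu { unsigned long r0, r1, r2; };

/* Hypothetical signal frame holding the interrupted context. */
struct model_sigframe { struct model_cpu saved; };

/* Old ordering: r1/r2 are overwritten before the context is saved. */
void setup_frame_old(struct model_cpu *regs, struct model_sigframe *frame,
                     unsigned long sig, unsigned long info, unsigned long uc)
{
    regs->r1 = info;
    regs->r2 = uc;
    frame->saved = *regs;   /* saves the already clobbered r1/r2 */
    regs->r0 = sig;
}

/* New ordering: save first, then load the handler arguments. */
void setup_frame_new(struct model_cpu *regs, struct model_sigframe *frame,
                     unsigned long sig, unsigned long info, unsigned long uc)
{
    frame->saved = *regs;   /* state from before signal entry */
    regs->r0 = sig;
    regs->r1 = info;
    regs->r2 = uc;
}

/* Returning from the handler restores whatever was saved. */
void model_sigreturn(struct model_cpu *regs, const struct model_sigframe *frame)
{
    *regs = frame->saved;
}
```

    With the old ordering, r1 and r2 come back from the handler holding the
    handler arguments; with the new ordering they hold their pre-signal
    values.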
     

27 Sep, 2013

4 commits

  • clockevents_config_and_register is more clever and correct than doing it
    by hand; so use it.

    [vgupta: fixed build failure due to missing ; in patch]

    Signed-off-by: Uwe Kleine-König
    Signed-off-by: Vineet Gupta

    Uwe Kleine-König
     
  • Some ARC SMP systems lack native atomic R-M-W (LLOCK/SCOND) insns and
    can only use atomic EX insn (reg with mem) to build higher level R-M-W
    primitives. This includes a SystemC based SMP simulation model.

    So rwlocks need to use a protecting spinlock for atomic cmp-n-exchange
    operation to update reader(s)/writer count.

    The spinlock operation itself looks as follows:

    mov reg, 1 ; 1=locked, 0=unlocked
    retry:
    EX reg, [lock] ; load existing, store 1, atomically
    BREQ reg, 1, retry ; if already locked, retry

    In single-threaded simulation, SystemC alternates between the 2 cores,
    scheduling "N" insn on each in turn. Additionally, for an insn with a
    global side effect, such as EX writing to shared mem, a core switch is
    enforced too.

    Given that, with 2 cores doing a repeated EX on the same location, Linux
    often got into a livelock, e.g. when both cores were fiddling with
    tasklist lock (gdbserver / hackbench) for read/write respectively, as
    the sequence diagram below shows:

             core1                              core2
           --------                           --------
    1.  spin lock [EX r=0, w=1] - LOCKED
    2.  rwlock(Read)            - LOCKED
    3.  spin unlock [ST 0]      - UNLOCKED
    4.                                     spin lock [EX r=0, w=1] - LOCKED
                      -- resched core 1 --

    5.  spin lock [EX r=1] - ALREADY-LOCKED

                      -- resched core 2 --
    6.                                     rwlock(Write) - READER-LOCKED
    7.                                     spin unlock [ST 0]
    8.                                     rwlock failed, retry again

    9.                                     spin lock [EX r=0, w=1]
                      -- resched core 1 --

    10. spinlock locked in #9, retry #5
    11. spin lock [EX gets 1]
                      -- resched core 2 --
    ...
    ...

    The fix was to unlock using the EX insn too (step 7), to trigger another
    SystemC scheduling pass which would let core1 proceed, avoiding the
    livelock.

    Signed-off-by: Vineet Gupta

    Vineet Gupta
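
    For readers more familiar with C than ARC assembly, the lock and the
    modified unlock can be approximated with C11 atomics. This is a sketch
    under assumptions: atomic_exchange() stands in for the EX insn, and the
    ex_spin_* names are hypothetical. It does not reproduce the SystemC
    scheduling side effect that motivated the change.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* 1 = locked, 0 = unlocked, as in the asm above. */
typedef atomic_uint ex_spinlock_t;

/* One EX attempt: atomically store 1 and look at what was there before. */
bool ex_spin_trylock(ex_spinlock_t *lock)
{
    return atomic_exchange(lock, 1u) == 0u;
}

/* Spin until an attempt finds the lock previously unlocked. */
void ex_spin_lock(ex_spinlock_t *lock)
{
    while (!ex_spin_trylock(lock))
        ;   /* already locked, retry */
}

/* Unlock with an exchange as well, rather than a plain store of 0. */
void ex_spin_unlock(ex_spinlock_t *lock)
{
    (void)atomic_exchange(lock, 0u);
}
```

    On real hardware a plain store would be enough to release the lock; the
    exchange in ex_spin_unlock() only mirrors the workaround described for
    the simulator.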
     
  • Anton reported

    | LTP tests syscalls/process_vm_readv01 and process_vm_writev01 fail
    | similarly in one testcase test_iov_invalid -> lvec->iov_base.
    | Testcase expects errno EFAULT and return code -1,
    | but it gets return code 1 and ERRNO is 0 what means success.

    Essentially the test case was passing a pointer of -1, which access_ok()
    was not catching. It was doing [@addr + @sz

    Signed-off-by: Vineet Gupta

    Vineet Gupta
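
    The sentence above is cut off in this log, so the exact expression used
    by the old access_ok() is not shown here. As a general illustration
    only, the sketch below shows one common way a range check built on
    addr + sz can miss a pointer of -1 (the sum wraps around), and a form
    that avoids the addition. MODEL_TASK_SIZE and both function names are
    hypothetical, and this is not claimed to be the actual patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical upper bound of the user address space (32-bit model). */
#define MODEL_TASK_SIZE UINT32_C(0x60000000)

/* Naive form: addr + sz can wrap around and look small. */
bool range_ok_naive(uint32_t addr, uint32_t sz)
{
    return (uint32_t)(addr + sz) <= MODEL_TASK_SIZE;
}

/* Wrap-safe form: never adds, so nothing can overflow. */
bool range_ok_safe(uint32_t addr, uint32_t sz)
{
    return sz <= MODEL_TASK_SIZE && addr <= MODEL_TASK_SIZE - sz;
}
```

    For addr = 0xFFFFFFFF (that is, -1) and a small sz, the naive form
    accepts the range while the wrap-safe form rejects it.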
     
  • If a load or store is the last instruction in a zero-overhead-loop, and
    it's misaligned, the loop would execute only once.

    This fixes that problem.

    Signed-off-by: Mischa Jonker
    Signed-off-by: Vineet Gupta

    Mischa Jonker
     

13 Sep, 2013

4 commits

  • After the last architecture switched to generic hard irqs the config
    options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code
    for !CONFIG_GENERIC_HARDIRQS can be removed.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Merge more patches from Andrew Morton:
    "The rest of MM. Plus one misc cleanup"

    * emailed patches from Andrew Morton: (35 commits)
    mm/Kconfig: add MMU dependency for MIGRATION.
    kernel: replace strict_strto*() with kstrto*()
    mm, thp: count thp_fault_fallback anytime thp fault fails
    thp: consolidate code between handle_mm_fault() and do_huge_pmd_anonymous_page()
    thp: do_huge_pmd_anonymous_page() cleanup
    thp: move maybe_pmd_mkwrite() out of mk_huge_pmd()
    mm: cleanup add_to_page_cache_locked()
    thp: account anon transparent huge pages into NR_ANON_PAGES
    truncate: drop 'oldsize' truncate_pagecache() parameter
    mm: make lru_add_drain_all() selective
    memcg: document cgroup dirty/writeback memory statistics
    memcg: add per cgroup writeback pages accounting
    memcg: check for proper lock held in mem_cgroup_update_page_stat
    memcg: remove MEMCG_NR_FILE_MAPPED
    memcg: reduce function dereference
    memcg: avoid overflow caused by PAGE_ALIGN
    memcg: rename RESOURCE_MAX to RES_COUNTER_MAX
    memcg: correct RESOURCE_MAX to ULLONG_MAX
    mm: memcg: do not trap chargers with full callstack on OOM
    mm: memcg: rework and document OOM waiting and wakeup
    ...

    Linus Torvalds
     
  • Unlike global OOM handling, memory cgroup code will invoke the OOM killer
    in any OOM situation because it has no way of telling faults occurring in
    kernel context - which could be handled more gracefully - from
    user-triggered faults.

    Pass a flag that identifies faults originating in user space from the
    architecture-specific fault handlers to generic code so that memcg OOM
    handling can be improved.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Michal Hocko
    Cc: David Rientjes
    Cc: KAMEZAWA Hiroyuki
    Cc: azurIt
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • The memcg code can trap tasks in the context of the failing allocation
    until an OOM situation is resolved. They can hold all kinds of locks
    (fs, mm) at this point, which makes it prone to deadlocking.

    This series converts memcg OOM handling into a two step process that is
    started in the charge context, but any waiting is done after the fault
    stack is fully unwound.

    Patches 1-4 prepare architecture handlers to support the new memcg
    requirements, but in doing so they also remove old cruft and unify
    out-of-memory behavior across architectures.

    Patch 5 disables the memcg OOM handling for syscalls, readahead, kernel
    faults, because they can gracefully unwind the stack with -ENOMEM. OOM
    handling is restricted to user triggered faults that have no other
    option.

    Patch 6 reworks memcg's hierarchical OOM locking to make it a little
    more obvious wth is going on in there: reduce locked regions, rename
    locking functions, reorder and document.

    Patch 7 implements the two-part OOM handling such that tasks are never
    trapped with the full charge stack in an OOM situation.

    This patch:

    Back before smart OOM killing, when faulting tasks were killed directly on
    allocation failures, the arch-specific fault handlers needed special
    protection for the init process.

    Now that all fault handlers call into the generic OOM killer (see commit
    609838cfed97: "mm: invoke oom-killer from remaining unconverted page
    fault handlers"), which already provides init protection, the
    arch-specific leftovers can be removed.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Michal Hocko
    Acked-by: KOSAKI Motohiro
    Cc: David Rientjes
    Cc: KAMEZAWA Hiroyuki
    Cc: azurIt
    Acked-by: Vineet Gupta [arch/arc bits]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

12 Sep, 2013

1 commit

  • Commit 05b016ecf5e7a "ARC: Setup Vector Table Base in early boot" moved
    the Interrupt vector Table setup out of arc_init_IRQ() which is called
    for all CPUs, to entry point of boot cpu only, breaking booting of others.

    Fix by adding the same to entry point of non-boot CPUs too.

    read_arc_build_cfg_regs() printing IVT Base Register didn't help the
    cause since it prints a synthetic value if zero, which is totally bogus,
    so fix that to print the exact Register.

    [vgupta: Remove the now stale comment from header of arc_init_IRQ and
    also added the commentary for halt-on-reset]

    Cc: Gilad Ben-Yossef
    Cc: #3.11
    Signed-off-by: Noam Camus
    Signed-off-by: Vineet Gupta
    Signed-off-by: Linus Torvalds

    Noam Camus
     

11 Sep, 2013

1 commit

  • Pull device tree core updates from Grant Likely:
    "Generally minor changes. A bunch of bug fixes, particularly for
    initialization and some refactoring. Most notable change is feeding
    the entire flattened tree into the random pool at boot. May not be
    significant, but shouldn't hurt either"

    Tim Bird questions whether the boot time cost of the random feeding may
    be noticeable. And "add_device_randomness()" is definitely not some
    speed demon of a function.

    * tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
    of/platform: add error reporting to of_amba_device_create()
    irq/of: Fix comment typo for irq_of_parse_and_map
    of: Feed entire flattened device tree into the random pool
    of/fdt: Clean up casting in unflattening path
    of/fdt: Remove duplicate memory clearing on FDT unflattening
    gpio: implement gpio-ranges binding document fix
    of: call __of_parse_phandle_with_args from of_parse_phandle
    of: introduce of_parse_phandle_with_fixed_args
    of: move of_parse_phandle()
    of: move documentation of of_parse_phandle_with_args
    of: Fix missing memory initialization on FDT unflattening
    of: consolidate definition of early_init_dt_alloc_memory_arch()
    of: Make of_get_phy_mode() return int i.s.o. const int
    include: dt-binding: input: create a DT header defining key codes.
    of/platform: Staticize of_platform_device_create_pdata()
    of: Specify initrd location using 64-bit
    dt: Typo fix
    OF: make of_property_for_each_{u32|string}() use parameters if OF is not enabled

    Linus Torvalds
     

05 Sep, 2013

5 commits


31 Aug, 2013

5 commits

  • This helps remove asid-to-mm reverse map

    While mm->context.id contains the ASID assigned to a process, our ASID
    allocator also used asid_mm_map[] reverse map. In a new allocation
    cycle (mm->ASID >= @asid_cache), the Round Robin ASID allocator used this
    to check if new @asid_cache belonged to some mm2 (from prev cycle).
    If so, it could locate that mm using the ASID reverse map, and mark that
    mm as unallocated ASID, to force it to refresh at the time of switch_mm()

    However, for SMP, the reverse map has to be maintained per CPU, so
    becomes 2 dimensional, hence got rid of it.

    With the reverse map gone, it is NOT possible to reach out to the
    current assignee. So we track the ASID allocation generation/cycle and
    on every switch_mm(), check if the current generation of the CPU ASID is
    the same as that of mm's ASID; if not, it is refreshed.

    (Based loosely on arch/sh implementation)

    Signed-off-by: Vineet Gupta

    Vineet Gupta
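
    The generation check described above can be sketched as a small C model.
    Everything here is an assumption made for illustration: the 8-bit ASID
    width, the names (struct model_mm, cpu_asid_cache, model_switch_mm) and
    the single-CPU setup. TLB flushing on rollover and the MMU PID register
    write are only indicated in comments.

```c
#define ASID_BITS   8u                     /* hypothetical hardware ASID width */
#define ASID_MASK   ((1u << ASID_BITS) - 1u)
#define CYCLE_MASK  (~ASID_MASK)           /* upper bits: generation/cycle */
#define FIRST_CYCLE (1u << ASID_BITS)      /* generation 1, so 0 means "none" */

struct model_mm { unsigned int asid; };    /* 0 = no ASID allocated */

/* Last value handed out on this CPU: generation in the upper bits. */
static unsigned int cpu_asid_cache = FIRST_CYCLE;

/* Hand out the next ASID; a real allocator would flush the TLB when the
 * low bits roll over into a new generation. */
static void model_get_new_asid(struct model_mm *mm)
{
    cpu_asid_cache++;
    if (cpu_asid_cache == 0u)              /* generation counter wrapped */
        cpu_asid_cache = FIRST_CYCLE;
    mm->asid = cpu_asid_cache;
}

/* Called on every context switch to mm. */
void model_switch_mm(struct model_mm *mm)
{
    /* Refresh if mm has no ASID or its generation differs from the CPU's. */
    if (mm->asid == 0u || ((mm->asid ^ cpu_asid_cache) & CYCLE_MASK) != 0u)
        model_get_new_asid(mm);
    /* A real implementation would now program (mm->asid & ASID_MASK)
     * into the MMU PID register. */
}
```

    No reverse map is needed in this scheme: an mm left over from an older
    generation is detected lazily, the next time it is switched to.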
     
  • ASID allocation changes/2

    Use the fact that switch_mm() and activate_mm() are exactly same code
    now while acknowledging the semantical difference in comment

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • ASID allocation changes/1

    This patch does 2 things:

    (1) get_new_mmu_context() NOW moves mm->ASID to a new value ONLY if it
    was from a prev allocation cycle/generation OR if mm had no ASID
    allocated (vs. before, when it would unconditionally move to a new ASID)

    Callers desiring unconditional update of ASID, e.g. local_flush_tlb_mm()
    (for parent's address space invalidation at fork) need to first force
    the parent to an unallocated ASID.

    (2) get_new_mmu_context() always sets the MMU PID reg with unchanged/new
    ASID value.

    The gains are:
    - consolidation of all asid alloc logic into get_new_mmu_context()
    - avoiding code duplication in switch_mm() for PID reg setting
    - Enables future change to fold activate_mm() into switch_mm()

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • Asm code already has the SW and HW ASID values, so they can be
    passed to the printing routine.

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • Signed-off-by: Vineet Gupta

    Vineet Gupta
     

30 Aug, 2013

3 commits


29 Aug, 2013

4 commits

  • The current ARC VM code has 13 flags in Page Table entry: some software
    (accessed/dirty/non-linear-maps) and rest hardware specific. With 8k MMU
    page, we need 19 bits for addressing page frame so remaining 13 bits is
    just about enough to accommodate the current flags.

    In MMUv4 there are 2 additional flags, SZ (normal or super page) and WT
    (cache access mode write-thru) - and additionally PFN is 20 bits (vs. 19
    before for 8k). Thus these can't be held in current PTE w/o making each
    entry 64bit wide.

    It seems there is some scope of compressing the current PTE flags (and
    freeing up a few bits). Currently PTE contains fully orthogonal distinct
    access permissions for kernel and user mode (Kr, Kw, Kx; Ur, Uw, Ux)
    which can be folded into one set (R, W, X). The translation of 3 PTE
    bits into 6 TLB bits (when programming the MMU) can be done based on
    following prerequisites/assumptions:

    1. For kernel-mode-only translations (vmalloc: 0x7000_0000 to
    0x7FFF_FFFF), PTE additionally has PAGE_GLOBAL flag set (and user
    space entries can never be global). Thus such a PTE can translate
    to Kr, Kw, Kx (as appropriate) and zero for User mode counterparts.

    2. For non global entries, the PTE flags can be used to create mirrored
    K and U TLB bits. This is true after commit a950549c675f2c8c504
    "ARC: copy_(to|from)_user() to honor usermode-access permissions"
    which ensured that user-space translations _MUST_ have same access
    permissions for both U/K mode accesses so that copy_{to,from}_user()
    play fair with fault based CoW break and such...

    There is no such thing as a free lunch - the cost is slightly inflated
    TLB-Miss Handlers.

    Signed-off-by: Vineet Gupta

    Vineet Gupta
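
    The 3-to-6 bit translation described above can be sketched in C. The bit
    positions and names below (PTE_*, TLB_*, model_pte_to_tlb_perms) are
    hypothetical and chosen for readability; the real PTE and TLB layouts
    are not given in this message.

```c
/* Hypothetical PTE flag bits (software view: one R/W/X set). */
#define PTE_R      (1u << 0)
#define PTE_W      (1u << 1)
#define PTE_X      (1u << 2)
#define PTE_GLOBAL (1u << 3)

/* Hypothetical TLB permission bits (hardware view: separate U and K sets). */
#define TLB_UR (1u << 0)
#define TLB_UW (1u << 1)
#define TLB_UX (1u << 2)
#define TLB_KR (1u << 3)
#define TLB_KW (1u << 4)
#define TLB_KX (1u << 5)

/* Expand the folded R/W/X bits of a PTE into the six TLB permission bits.
 * Relies on the layout above: user bits sit at 0..2, kernel bits at 3..5. */
unsigned int model_pte_to_tlb_perms(unsigned int pte)
{
    unsigned int rwx = pte & (PTE_R | PTE_W | PTE_X);
    unsigned int kernel = rwx << 3;        /* R,W,X -> Kr,Kw,Kx */
    /* Global entries are kernel-only; others mirror the bits for user mode. */
    unsigned int user = (pte & PTE_GLOBAL) ? 0u : rwx;

    return kernel | user;
}
```

    For example, a non-global PTE with R and W set yields Kr, Kw, Ur and Uw,
    while a global PTE with R and X set yields only Kr and Kx.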
     
  • * reduce editor lines taken by pt_regs
    * ARCompact ISA specific part of TLB Miss handlers clubbed together
    * cleanup some comments

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • Most architectures use the same implementation. Collapse the common ones
    into a single weak function that can be overridden.

    Signed-off-by: Grant Likely

    Grant Likely
     
  • Linux 3.11-rc7

    Grant Likely
     

26 Aug, 2013

3 commits

  • In the exception return path, for both U/K cases, intr are already
    disabled (for various existing reasons). So when we drop down to
    @restore_regs, we need not redo that.

    There was a subtle issue - intr were NOT being disabled for the
    ret-to-kernel-but-no-preemption case - now fixed by moving the
    IRQ_DISABLE further up in @resume_kernel_mode.

    So what do we gain:

    * Shaves off a few insn in return path.

    * Eliminates the need for IRQ_DISABLE_SAVE assembler macro for ARCv2
    hence allows for entry code sharing.

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • After the recent cleanups, all the exception handlers now have same
    boilerplate prologue code. Move that into common macro.

    This reduces readability but helps greatly with sharing / duplicating
    entry code with ARCv2 ISA where the handlers are pretty much the same,
    just the entry prologue is different (due to hardware assist).

    Also while at it, add the missing FAKE_RET_FROM_EXCPN calls in a couple
    of places to drop down to pure kernel mode (from exception mode) before
    jumping off into "C" code.

    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • Signed-off-by: Vineet Gupta

    Vineet Gupta
     

25 Aug, 2013

1 commit

  • For a search buffer, 2 byte aligned, strchr() was returning pointer
    outside of buffer (buf - 1)

    ------------->8----------------
    // Input buffer (4-byte aligned by default)
    char *buffer = "1AA_";

    // Actual search start (to mimic 2-byte alignment)
    char *current_line = &(buffer[2]);

    // Character to search for
    char c = 'A';

    char *c_pos = strchr(current_line, c);

    printf("%s\n", c_pos); // printed 'AA_' as opposed to the expected 'A_'
    ------------->8----------------

    Reported-by: Anton Kolesov
    Debugged-by: Anton Kolesov
    Cc: # [3.9 and 3.10]
    Cc: Noam Camus
    Signed-off-by: Joern Rennecke
    Signed-off-by: Vineet Gupta
    Signed-off-by: Linus Torvalds

    Joern Rennecke
     

27 Jul, 2013

1 commit


24 Jul, 2013

1 commit

  • On some PAE architectures, the entire range of physical memory could reside
    outside the 32-bit limit. These systems need the ability to specify the
    initrd location using 64-bit numbers.

    This patch globally modifies the early_init_dt_setup_initrd_arch() function to
    use 64-bit numbers instead of the current unsigned long.

    There has been quite a bit of debate about whether to use u64 or phys_addr_t.
    It was concluded to stick to u64 to be consistent with rest of the device
    tree code. As summarized by Geert, "The address to load the initrd is decided
    by the bootloader/user and set at that point later in time. The dtb should not
    be tied to the kernel you are booting"

    More details on the discussion can be found here:
    https://lkml.org/lkml/2013/6/20/690
    https://lkml.org/lkml/2012/9/13/544

    Signed-off-by: Santosh Shilimkar
    Acked-by: Rob Herring
    Acked-by: Vineet Gupta
    Acked-by: Jean-Christophe PLAGNIOL-VILLARD
    Signed-off-by: Grant Likely

    Santosh Shilimkar
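
    Why the parameter width matters can be shown with a small C model. It is
    a sketch under assumptions: model_ulong32 stands in for unsigned long on
    a 32-bit kernel, and the model_setup_initrd_* functions are hypothetical,
    not the real early_init_dt_setup_initrd_arch() implementations.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit "unsigned long" parameter. */
typedef uint32_t model_ulong32;

/* Where the model records the initrd location. */
static uint64_t initrd_start, initrd_end;

/* Old style: addresses passed through a 32-bit type lose their top bits. */
void model_setup_initrd_32(model_ulong32 start, model_ulong32 end)
{
    initrd_start = start;
    initrd_end = end;
}

/* New style: addresses carried as 64-bit values end to end. */
void model_setup_initrd_64(uint64_t start, uint64_t end)
{
    initrd_start = start;
    initrd_end = end;
}
```

    An initrd placed above the 32-bit limit keeps its full address only when
    it is passed through the 64-bit variant.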
     

11 Jul, 2013

1 commit

  • Pull second set of ARC architecture updates from Vineet Gupta:
    "Couple of Platform updates (Device Tree files primarily) given that
    the corresponding drivers (net/ethernet/arc/*, irqctl/irq-tb10x.c)
    have now been merged into your tree.

    Ideally these shd have been part of same submissions, oh well..."

    * tag 'arc-v3.11-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
    ARC: [TB10x] Updates for irqchip driver
    ARC: [plat-arcfpga] Enable arc_emac for ARCAngle4 Board

    Linus Torvalds
     

10 Jul, 2013

1 commit

  • A few remaining architectures directly kill the page faulting task in an
    out of memory situation. This is usually not a good idea since that
    task might not even use a significant amount of memory and so may not be
    the optimal victim to resolve the situation.

    Since 2.6.29's 1c0fe6e ("mm: invoke oom-killer from page fault") there
    is a hook that architecture page fault handlers are supposed to call to
    invoke the OOM killer and let it pick the right task to kill. Convert
    the remaining architectures over to this hook.

    To have the previous behavior of simply taking out the faulting task the
    vm.oom_kill_allocating_task sysctl can be set to 1.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Michal Hocko
    Cc: KAMEZAWA Hiroyuki
    Acked-by: David Rientjes
    Acked-by: Vineet Gupta [arch/arc bits]
    Cc: James Hogan
    Cc: David Howells
    Cc: Jonas Bonn
    Cc: Chen Liqin
    Cc: Lennox Wu
    Cc: Chris Metcalf
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

05 Jul, 2013

3 commits

  • Pull i2c updates from Wolfram Sang:
    - new drivers: Kontron PLD, Wondermedia VT
    - mv64xxx driver gained sun4i support and a bigger cleanup
    - duplicate driver 'intel-mid' removed
    - added generic device tree binding for sda holding time (and
    designware driver already uses it)
    - we tried to allow driver probing with only device tree and no i2c
    ids, but I had to revert it because of side effects. Needs some
    rethinking.
    - driver bugfixes, cleanups...

    * 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (34 commits)
    i2c-designware: use div_u64 to fix link
    i2c: Kontron PLD i2c bus driver
    i2c: iop3xxx: fix build failure after waitqueue changes
    i2c-designware: make SDA hold time configurable
    i2c: mv64xxx: Set bus frequency to 100kHz if clock-frequency is not provided
    i2c: imx: allow autoloading on dt ids
    i2c: mv64xxx: Fix transfer error code
    i2c: i801: SMBus patch for Intel Coleto Creek DeviceIDs
    i2c: omap: correct usage of the interrupt enable register
    i2c-pxa: prepare clock before use
    Revert "i2c: core: make it possible to match a pure device tree driver"
    i2c: nomadik: allocate adapter number dynamically
    i2c: nomadik: support elder Nomadiks
    i2c: mv64xxx: Add Allwinner sun4i compatible
    i2c: mv64xxx: make the registers offset configurable
    i2c: mv64xxx: Add macros to access parts of registers
    i2c: vt8500: Add support for I2C bus on Wondermedia SoCs
    i2c: designware: fix race between subsequent xfers
    i2c: bfin-twi: Read and write the FIFO in loop
    i2c: core: make it possible to match a pure device tree driver
    ...

    Linus Torvalds
     
  • Merge Kconfig menu diet patches from Dave Hansen:
    "I think the "Kernel Hacking" menu has gotten a bit out of hand. It is
    over 120 lines long on my system with everything enabled and options
    are scattered around it haphazardly.

    http://sr71.net/~dave/linux/kconfig-horror.png

    Let's try to introduce some sanity. This set takes that 120 lines
    down to 55 and makes it vastly easier to find some things. It's a
    start.

    This set stands on its own, but there is plenty of room for follow-up
    patches. The arch-specific debug options still end up getting stuck
    in the top-level "kernel hacking" menu. OPTIMIZE_INLINING, for
    instance, could obviously go in to the "compiler options" menu, but
    the fact that it is defined in arch/ in a separate Kconfig file keeps
    it on its own for the moment.

    The Signed-off-by's in here look funky. I changed employers while
    working on this set, so I have signoffs from both email addresses"

    * emailed patches from Dave Hansen:
    hang and lockup detection menu
    kconfig: consolidate printk options
    group locking debugging options
    consolidate compilation option configs
    consolidate runtime testing configs
    order memory debugging Kconfig options
    consolidate per-arch stack overflow debugging options

    Linus Torvalds
     
  • Original posting:

    http://lkml.kernel.org/r/20121214184202.F54094D9@kernel.stglabs.ibm.com

    Several architectures have similar stack debugging config options.
    They all pretty much do the same thing, some with slightly
    differing help text.

    This patch changes the architectures to instead enable a Kconfig
    boolean, and then use that boolean in the generic Kconfig.debug
    to present the actual menu option. This removes a bunch of
    duplication and adds consistency across arches.

    Signed-off-by: Dave Hansen
    Reviewed-by: H. Peter Anvin
    Reviewed-by: James Hogan
    Acked-by: Chris Metcalf [for tile]
    Signed-off-by: Dave Hansen
    Signed-off-by: Linus Torvalds

    Dave Hansen