09 Mar, 2015

1 commit

  • Pull tty/serial fixes from Greg KH:
    "Here are some tty and serial driver fixes for 4.0-rc3.

    Along with the atime fix that you know about, here are some other
    serial driver bugfixes as well. Most notable is a wait_until_sent
    bugfix that was traced back to being around since before 2.6.12 that
    Johan has fixed up.

    All have been in linux-next successfully"

    * tag 'tty-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
    TTY: fix tty_wait_until_sent maximum timeout
    TTY: fix tty_wait_until_sent on 64-bit machines
    USB: serial: fix infinite wait_until_sent timeout
    TTY: bfin_jtag_comm: remove incorrect wait_until_sent operation
    net: irda: fix wait_until_sent poll timeout
    serial: uapi: Declare all userspace-visible io types
    serial: core: Fix iotype userspace breakage
    serial: sprd: Fix missing spin_unlock in sprd_handle_irq()
    console: Fix console name size mismatch
    tty: fix up atime/mtime mess, take four
    serial: 8250_dw: Fix get_mctrl behaviour
    serial:8250:8250_pci: delete unneeded quirk entries
    serial:8250:8250_pci: fix redundant entry report for WCH_CH352_2S
    Change email address for 8250_pci
    serial: 8250: Revert "tty: serial: 8250_core: read only RX if there is something in the FIFO"
    Revert "tty/serial: of_serial: add DT alias ID handling"

    Linus Torvalds
     

08 Mar, 2015

1 commit


07 Mar, 2015

2 commits


06 Mar, 2015

4 commits

  • When CONFIG_DEBUG_SET_MODULE_RONX is enabled, the sizes of
    module sections are aligned up so appropriate permissions can
    be applied. Adjusting for the symbol table may cause them to
    become unaligned. Make sure to re-align the sizes afterward.
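
    The re-alignment step can be pictured with a small user-space sketch.
    The page size constant and both helper names are assumptions made for
    illustration; they are not taken from the module loader.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, illustration only */

/* Round size up to the next multiple of align (a power of two). */
static size_t align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/*
 * Hypothetical model: a section size that was already page aligned grows
 * by symtab_bytes, so it has to be aligned again afterward.
 */
static size_t section_size_with_symtab(size_t aligned_size, size_t symtab_bytes)
{
    size_t size = aligned_size + symtab_bytes;  /* may no longer be aligned */

    return align_up(size, SKETCH_PAGE_SIZE);    /* re-align afterward */
}
```

    With these assumptions, growing a 4096-byte section by 100 bytes yields
    8192 bytes rather than an unaligned 4196.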

    Signed-off-by: Laura Abbott
    Acked-by: Rusty Russell
    Signed-off-by: Catalin Marinas

    Laura Abbott
     
  • * irq-pm:
    genirq / PM: describe IRQF_COND_SUSPEND
    tty: serial: atmel: rework interrupt and wakeup handling
    watchdog: at91sam9: request the irq with IRQF_NO_SUSPEND
    clk: at91: implement suspend/resume for the PMC irqchip
    rtc: at91rm9200: rework wakeup and interrupt handling
    rtc: at91sam9: rework wakeup and interrupt handling
    PM / wakeup: export pm_system_wakeup symbol
    genirq / PM: Add flag for shared NO_SUSPEND interrupt lines
    genirq / PM: better describe IRQF_NO_SUSPEND semantics

    Rafael J. Wysocki
     
  • * suspend-to-idle:
    cpuidle / sleep: Use broadcast timer for states that stop local timer
    cpuidle: Clean up fallback handling in cpuidle_idle_call()
    cpuidle / sleep: Do sanity checks in cpuidle_enter_freeze() too
    idle / sleep: Avoid excessive disabling and enabling interrupts

    Rafael J. Wysocki
     
  • Commit 381063133246 (PM / sleep: Re-implement suspend-to-idle handling)
    overlooked the fact that entering some sufficiently deep idle states
    by CPUs may cause their local timers to stop and in those cases it
    is necessary to switch over to a broadcast timer prior to entering
    the idle state. If the cpuidle driver in use does not provide
    the new ->enter_freeze callback for any of the idle states, that
    problem affects suspend-to-idle too, but it is not taken into account
    after the changes made by commit 381063133246.

    Fix that by changing the definition of cpuidle_enter_freeze() and
    re-arranging the code in cpuidle_idle_call(), so the former does
    not call cpuidle_enter() any more and the fallback case is handled
    by cpuidle_idle_call() directly.

    Fixes: 381063133246 (PM / sleep: Re-implement suspend-to-idle handling)
    Reported-and-tested-by: Lorenzo Pieralisi
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Peter Zijlstra (Intel)

    Rafael J. Wysocki
     

05 Mar, 2015

1 commit

  • It currently is required that all users of NO_SUSPEND interrupt
    lines pass the IRQF_NO_SUSPEND flag when requesting the IRQ or the
    WARN_ON_ONCE() in irq_pm_install_action() will trigger. That is
    done to warn about situations in which unprepared interrupt handlers
    may be run unnecessarily for suspended devices and may attempt to
    access those devices by mistake. However, it may cause drivers
    that have no technical reasons for using IRQF_NO_SUSPEND to set
    that flag just because they happen to share the interrupt line
    with something like a timer.

    Moreover, the generic handling of wakeup interrupts introduced by
    commit 9ce7a25849e8 (genirq: Simplify wakeup mechanism) only works
    for IRQs without any NO_SUSPEND users, so the drivers of wakeup
    devices needing to use shared NO_SUSPEND interrupt lines for
    signaling system wakeup generally have to detect wakeup in their
    interrupt handlers. Thus if they happen to share an interrupt line
    with a NO_SUSPEND user, they also need to request that their
    interrupt handlers be run after suspend_device_irqs().

    In both cases the reason for using IRQF_NO_SUSPEND is not because
    the driver in question has a genuine need to run its interrupt
    handler after suspend_device_irqs(), but because it happens to
    share the line with some other NO_SUSPEND user. Otherwise, the
    driver would do without IRQF_NO_SUSPEND just fine.

    To make it possible to specify that condition explicitly, introduce
    a new IRQ action handler flag for shared IRQs, IRQF_COND_SUSPEND,
    that, when set, will indicate to the IRQ core that the interrupt
    user is generally fine with suspending the IRQ, but it also can
    tolerate handler invocations after suspend_device_irqs() and, in
    particular, it is capable of detecting system wakeup and triggering
    it as appropriate from its interrupt handler.
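
    The rule described above can be modelled in user space as a check over
    the flags of all users of one shared line. This is only a sketch: the
    flag names and values below are local stand-ins, and the function is a
    hypothetical simplification of what the IRQ core would verify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits; names and values are local to this sketch. */
#define SKETCH_NO_SUSPEND   0x1u
#define SKETCH_COND_SUSPEND 0x2u

/*
 * Returns true when the mix of users on a shared line is acceptable:
 * either nobody asked for NO_SUSPEND, or every other user has declared
 * (via COND_SUSPEND) that it tolerates running after device IRQs are
 * suspended.
 */
static bool shared_line_ok(const unsigned int *flags, size_t n)
{
    size_t no_suspend = 0, cond_suspend = 0;

    assert(flags != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (flags[i] & SKETCH_NO_SUSPEND)
            no_suspend++;
        else if (flags[i] & SKETCH_COND_SUSPEND)
            cond_suspend++;
    }
    return no_suspend == 0 || no_suspend + cond_suspend == n;
}
```

    In this model a timer using NO_SUSPEND next to a plain user is flagged,
    while the same timer next to a COND_SUSPEND user is accepted.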

    That will allow us to work around a problem with a shared timer
    interrupt line on at91 platforms.

    Link: http://marc.info/?l=linux-kernel&m=142252777602084&w=2
    Link: http://marc.info/?t=142252775300011&r=1&w=2
    Link: https://lkml.org/lkml/2014/12/15/552
    Reported-by: Boris Brezillon
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Mark Rutland

    Rafael J. Wysocki
     

03 Mar, 2015

2 commits

  • While one must hold RCU-sched (aka. preempt_disable) for find_symbol()
    one must equally hold it over the use of the object returned.

    The moment you release the RCU-sched read lock, the object can be dead
    and gone.

    [jkosina@suse.cz: change subject line to be aligned with other patches]
    Cc: Seth Jennings
    Cc: Josh Poimboeuf
    Cc: Masami Hiramatsu
    Cc: Miroslav Benes
    Cc: Petr Mladek
    Cc: Jiri Kosina
    Cc: "Paul E. McKenney"
    Cc: Rusty Russell
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Masami Hiramatsu
    Acked-by: Paul E. McKenney
    Acked-by: Josh Poimboeuf
    Signed-off-by: Jiri Kosina

    Peter Zijlstra
     
  • Move the fallback code path in cpuidle_idle_call() to the end of the
    function to avoid jumping to a label in an if () branch.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

02 Mar, 2015

1 commit


01 Mar, 2015

3 commits

  • The "usual" path is:

    - rt_mutex_slowlock()
    - set_current_state()
    - task_blocks_on_rt_mutex() (ret 0)
    - __rt_mutex_slowlock()
    - sleep or not but do return with __set_current_state(TASK_RUNNING)
    - back to caller.

    In the early error case where task_blocks_on_rt_mutex() returns
    -EDEADLK we never change the task's state back to RUNNING. I
    assume this is intended. Without this change after ww_mutex
    using rt_mutex the selftest passes but later I get plenty of:

    | bad: scheduling from the idle thread!

    backtraces.
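
    A toy model of the two paths, assuming the fix is to put the task back
    into the running state on the early error path (the patch itself is
    not shown in this log). All names here are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum sketch_state { SKETCH_RUNNING, SKETCH_UNINTERRUPTIBLE };

struct sketch_task {
    enum sketch_state state;
};

/* block_ret stands in for the result of task_blocks_on_rt_mutex(). */
static int slowlock_sketch(struct sketch_task *t, int block_ret)
{
    int ret = block_ret;

    assert(t != NULL);
    t->state = SKETCH_UNINTERRUPTIBLE;  /* models set_current_state() */

    if (ret == 0) {
        /* Usual path: the wait loop returns with the task running. */
        t->state = SKETCH_RUNNING;
    } else {
        /* Early error (e.g. -EDEADLK): restore the state explicitly. */
        t->state = SKETCH_RUNNING;
    }
    return ret;
}
```

    Without the else branch, the -EDEADLK case would return to the caller
    with the task still marked as sleeping.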

    Signed-off-by: Sebastian Andrzej Siewior
    Acked-by: Mike Galbraith
    Cc: Linus Torvalds
    Cc: Maarten Lankhorst
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: afffc6c1805d ("locking/rtmutex: Optimize setting task running after being blocked")
    Link: http://lkml.kernel.org/r/1425056229-22326-4-git-send-email-bigeasy@linutronix.de
    Signed-off-by: Ingo Molnar

    Sebastian Andrzej Siewior
     
  • Disabling interrupts at the end of cpuidle_enter_freeze() is not
    useful, because its caller, cpuidle_idle_call(), re-enables them
    right away after invoking it.

    To avoid that unnecessary back and forth dance with interrupts,
    make cpuidle_enter_freeze() enable interrupts after calling
    enter_freeze_proper() and drop the local_irq_disable() at its
    end, so that all of the code paths in it end up with interrupts
    enabled. Then, cpuidle_idle_call() will not need to re-enable
    interrupts after calling cpuidle_enter_freeze() any more, because
    the latter will return with interrupts enabled, in analogy with
    cpuidle_enter().

    Reported-by: Lorenzo Pieralisi
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Peter Zijlstra (Intel)

    Rafael J. Wysocki
     
  • There's a uname workaround for broken userspace which can't handle kernel
    versions of 3.x. Update it for 4.x.

    Signed-off-by: Jon DeVree
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jon DeVree
     

25 Feb, 2015

1 commit

  • Pull livepatching fixes from Jiri Kosina:
    "Two tiny fixes for livepatching infrastructure:

    - extending RCU critical section to cover all accesses to
    RCU-protected variable, by Petr Mladek

    - proper format string passing to kobject_init_and_add(), by Jiri
    Kosina"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
    livepatch: RCU protect struct klp_func all the time when used in klp_ftrace_handler()
    livepatch: fix format string in kobject_init_and_add()

    Linus Torvalds
     

23 Feb, 2015

1 commit


22 Feb, 2015

5 commits

  • Pull MIPS updates from Ralf Baechle:
    "This is the main pull request for MIPS:

    - a number of fixes that didn't make the 3.19 release.

    - a number of cleanups.

    - preliminary support for Cavium's Octeon 3 SOCs which feature up to
    48 MIPS64 R3 cores with FPU and hardware virtualization.

    - support for MIPS R6 processors.

    Revision 6 of the MIPS architecture is a major revision of the MIPS
    architecture which does away with many of the original sins of the
    architecture such as branch delay slots. This and other changes in
    R6 require major changes throughout the entire MIPS core
    architecture code and make up for the lion's share of this pull
    request.

    - finally some preparatory work for eXtended Physical Address
    support, which allows support of up to 40 bit of physical address
    space on 32 bit processors"

    [ Ahh, MIPS can't leave the PAE brain damage alone. It's like
    every CPU architect has to make that mistake, but pee in the snow
    by changing the TLA. But whether it's called PAE, LPAE or XPA,
    it's horrid crud - Linus ]

    * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (114 commits)
    MIPS: sead3: Corrected get_c0_perfcount_int
    MIPS: mm: Remove dead macro definitions
    MIPS: OCTEON: irq: add CIB and other fixes
    MIPS: OCTEON: Don't do acknowledge operations for level triggered irqs.
    MIPS: OCTEON: More OCTEONIII support
    MIPS: OCTEON: Remove setting of processor specific CVMCTL icache bits.
    MIPS: OCTEON: Core-15169 Workaround and general CVMSEG cleanup.
    MIPS: OCTEON: Update octeon-model.h code for new SoCs.
    MIPS: OCTEON: Implement DCache errata workaround for all CN6XXX
    MIPS: OCTEON: Add little-endian support to asm/octeon/octeon.h
    MIPS: OCTEON: Implement the core-16057 workaround
    MIPS: OCTEON: Delete unused COP2 saving code
    MIPS: OCTEON: Use correct instruction to read 64-bit COP0 register
    MIPS: OCTEON: Save and restore CP2 SHA3 state
    MIPS: OCTEON: Fix FP context save.
    MIPS: OCTEON: Save/Restore wider multiply registers in OCTEON III CPUs
    MIPS: boot: Provide more uImage options
    MIPS: Remove unneeded #ifdef __KERNEL__ from asm/processor.h
    MIPS: ip22-gio: Remove legacy suspend/resume support
    mips: pci: Add ifdef around pci_proc_domain
    ...

    Linus Torvalds
     
  • Pull ntp fix from Ingo Molnar:
    "An adjtimex interface regression fix for 32-bit systems"

    [ A check that was added in a previous commit is really only a concern
    for 64bit systems, but was applied to both 32 and 64bit systems, which
    results in breaking 32bit systems.

    Thus the fix here is to make the check only apply to 64bit systems ]

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    ntp: Fixup adjtimex freq validation on 32-bit systems

    Linus Torvalds
     
  • Pull locking fixes from Ingo Molnar:
    "Two fixes: the paravirt spin_unlock() corruption/crash fix, and an
    rtmutex NULL dereference crash fix"

    * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/spinlocks/paravirt: Fix memory corruption on unlock
    locking/rtmutex: Avoid a NULL pointer dereference on deadlock

    Linus Torvalds
     
  • Pull scheduler fixes from Ingo Molnar:
    "This contains misc fixes: preempt_schedule_common() and io_schedule()
    recursion fixes, sched/dl fixes, a completion_done() revert, two
    sched/rt fixes and a comment update patch"

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/rt: Avoid obvious configuration fail
    sched/autogroup: Fix failure to set cpu.rt_runtime_us
    sched/dl: Do update_rq_clock() in yield_task_dl()
    sched: Prevent recursion in io_schedule()
    sched/completion: Serialize completion_done() with complete()
    sched: Fix preempt_schedule_common() triggering tracing recursion
    sched/dl: Prevent enqueue of a sleeping task in dl_task_timer()
    sched: Make dl_task_time() use task_rq_lock()
    sched: Clarify ordering between task_rq_lock() and move_queued_task()

    Linus Torvalds
     
  • …ernel.org/pub/scm/linux/kernel/git/tip/tip

    Pull rcu fix and x86 irq fix from Ingo Molnar:

    - Fix a bug that caused an RCU warning splat.

    - Two x86 irq related fixes: a hotplug crash fix and an ACPI IRQ
    registry fix.

    * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    rcu: Clear need_qs flag to prevent splat

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/irq: Check for valid irq descriptor in check_irq_vectors_for_cpu_disable()
    x86/irq: Fix regression caused by commit b568b8601f05

    Linus Torvalds
     

21 Feb, 2015

1 commit

  • Pull kgdb/kdb updates from Jason Wessel:
    "KGDB/KDB New:
    - KDB: improved searching
    - No longer enter debug core on panic if panic timeout is set

    KGDB/KDB regressions / cleanups
    - fix pdf doc build errors
    - prevent junk characters on kdb console from printk levels"

    * tag 'for_linux-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
    kgdb, docs: Fix pdfdocs build errors
    debug: prevent entering debug mode on panic/exception.
    kdb: Const qualifier for kdb_getstr's prompt argument
    kdb: Provide forward search at more prompt
    kdb: Fix a prompt management bug when using | grep
    kdb: Remove stack dump when entering kgdb due to NMI
    kdb: Avoid printing KERN_ levels to consoles
    kdb: Fix off by one error in kdb_cpu()
    kdb: fix incorrect counts in KDB summary command output

    Linus Torvalds
     

20 Feb, 2015

9 commits

  • On non-developer devices, kgdb prevents the device from rebooting
    after a panic.

    In case of panics and exceptions, to allow the device to reboot, prevent
    entering debug mode to avoid getting stuck waiting for the user to
    interact with debugger.

    To avoid entering the debugger on panic/exception without any extra
    configuration, panic_timeout is being used which can be set via
    /proc/sys/kernel/panic at run time and CONFIG_PANIC_TIMEOUT sets the
    default value.

    Setting panic_timeout indicates that the user requested the machine to
    perform an unattended reboot after panic. We don't want to get stuck
    waiting for user input in case of panic.

    Cc: Andrew Morton
    Cc: kgdb-bugreport@lists.sourceforge.net
    Cc: linux-kernel@vger.kernel.org
    Cc: Android Kernel Team
    Cc: John Stultz
    Cc: Sumit Semwal
    Signed-off-by: Colin Cross
    [Kiran: Added context to commit message.
    panic_timeout is used instead of break_on_panic and
    break_on_exception to honor CONFIG_PANIC_TIMEOUT
    Modified the commit as per community feedback]
    Signed-off-by: Kiran Raparthy
    Signed-off-by: Daniel Thompson
    Signed-off-by: Jason Wessel

    Colin Cross
     
  • All current callers of kdb_getstr() can pass constant pointers via the
    prompt argument. This patch adds a const qualification to make explicit
    the fact that this is safe.

    Signed-off-by: Daniel Thompson
    Signed-off-by: Jason Wessel

    Daniel Thompson
     
  • Currently kdb allows the output of commands to be filtered using the
    | grep feature. This is useful but does not permit the output emitted
    shortly after a string match to be examined without wading through the
    entire unfiltered output of the command. Such a feature is particularly
    useful to navigate function traces because these traces often have a
    useful trigger string *before* the point of interest.

    This patch reuses the existing filtering logic to introduce a simple
    forward search to kdb that can be triggered from the more prompt.

    Signed-off-by: Daniel Thompson
    Signed-off-by: Jason Wessel

    Daniel Thompson
     
  • Currently when the "| grep" feature is used to filter the output of a
    command then the prompt is not displayed for the subsequent command.
    Likewise any characters typed by the user are also not echoed to the
    display. This rather disconcerting problem eventually corrects itself
    when the user presses Enter and the kdb_grepping_flag is cleared as
    kdb_parse() tries to make sense of whatever they typed.

    This patch resolves the problem by moving the clearing of this flag
    from the middle of command processing to the beginning.

    Signed-off-by: Daniel Thompson
    Signed-off-by: Jason Wessel

    Daniel Thompson
     
  • Issuing a stack dump feels ergonomically wrong when entering due to NMI.

    Entering due to NMI is normally a reaction to a user request, either the
    NMI button on a server or a "magic knock" on a UART. Therefore the
    backtrace behaviour on entry due to NMI should be like SysRq-g (no stack
    dump) rather than like oops.

    Note also that the stack dump does not offer any information that
    cannot be trivially retrieved using the 'bt' command.

    Signed-off-by: Daniel Thompson
    Signed-off-by: Jason Wessel

    Daniel Thompson
     
  • Currently when kdb traps printk messages then the raw log level prefix
    (consisting of '\001' followed by a numeral) does not get stripped off
    before the message is issued to the various I/O handlers supported by
    kdb. This causes annoying visual noise as well as causing problems
    grepping for ^. It is also a change of behaviour compared to normal usage
    of printk(). For example SysRq-h ends up with different output to
    that of kdb's "sr h".

    This patch addresses the problem by stripping log levels from messages
    before they are issued to the I/O handlers. printk() which can also
    act as an i/o handler in some cases is special cased; if the caller
    provided a log level then the prefix will be preserved when sent to
    printk().

    The addition of non-printable characters to the output of kdb commands is a
    regression, albeit an extremely elderly one, introduced by commit
    04d2c8c83d0e ("printk: convert the format for KERN_ to a 2 byte
    pattern"). Note also that this patch does *not* restore the original
    behaviour from v3.5. Instead it makes printk() from within a kdb command
    display the message without any prefix (i.e. like printk() normally does).
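
    The stripping step can be sketched in user space as follows. The helper
    name is hypothetical, and accepting only the digits '0' to '7' after
    the '\001' byte is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return a pointer past a leading log level prefix ('\001' followed by
 * a digit), or the unchanged pointer when no such prefix is present.
 */
static const char *skip_log_level(const char *msg)
{
    assert(msg != NULL);
    if (msg[0] == '\001' && msg[1] >= '0' && msg[1] <= '7')
        return msg + 2;
    return msg;
}
```

    An I/O handler would then print skip_log_level(msg) instead of msg, so
    the two prefix bytes never reach the console.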

    Signed-off-by: Daniel Thompson
    Cc: Joe Perches
    Cc: stable@vger.kernel.org
    Signed-off-by: Jason Wessel

    Daniel Thompson
     
  • There was a follow on replacement patch against the prior
    "kgdb: Timeout if secondary CPUs ignore the roundup".

    See: https://lkml.org/lkml/2015/1/7/442

    This patch is the delta vs the patch that was committed upstream:
    * Fix an off-by-one error in kdb_cpu().
    * Replace NR_CPUS with CONFIG_NR_CPUS to tell checkpatch that we
    really want a static limit.
    * Removed the "KGDB: " prefix from the pr_crit() in debug_core.c
    (kgdb-next contains a patch which introduced pr_fmt() to this file
    so the tag will now be applied automatically).

    Cc: Daniel Thompson
    Cc:
    Signed-off-by: Jason Wessel

    Jason Wessel
     
  • The output of KDB 'summary' command should report MemTotal, MemFree
    and Buffers output in kB. The current code reports them in units of
    pages.

    A K(x) macro is defined in the code, but not used.

    This patch would apply the define to convert the values to kB.
    Please include me on Cc on replies. I do not subscribe to linux-kernel.
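
    A pages-to-kB conversion of this kind might look like the sketch below.
    The page shift of 12 (4 KiB pages) and the function name are
    assumptions; the log does not show the macro body.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12  /* assumed: 4 KiB pages */

static_assert(SKETCH_PAGE_SHIFT >= 10, "a page must hold at least 1 kB");

/* Convert a page count to kB: each page is 2^(PAGE_SHIFT - 10) kB. */
static unsigned long pages_to_kb(unsigned long pages)
{
    return pages << (SKETCH_PAGE_SHIFT - 10);
}
```

    Under that assumption, a MemFree count of 256 pages would be reported
    as 1024 kB.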

    Signed-off-by: Jay Lan
    Cc:
    Signed-off-by: Jason Wessel

    Jay Lan
     
  • Pull kbuild updates from Michal Marek:

    - several cleanups in kbuild

    - serialize multiple *config targets so that 'make defconfig kvmconfig'
    works

    - The cc-ifversion macro got support for an else-branch

    * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
    kbuild,gcov: simplify kernel/gcov/Makefile more
    kbuild: allow cc-ifversion to have the argument for false condition
    kbuild,gcov: simplify kernel/gcov/Makefile
    kbuild,gcov: remove unnecessary workaround
    kbuild: do not add $(call ...) to invoke cc-version or cc-fullversion
    kbuild: fix cc-ifversion macro
    kbuild: drop $(version_h) from MRPROPER_FILES
    kbuild: use mixed-targets when two or more config targets are given
    kbuild: remove redundant line from bounds.h/asm-offsets.h
    kbuild: merge bounds.h and asm-offsets.h rules
    kbuild: Drop support for clean-rule

    Linus Torvalds
     

19 Feb, 2015

1 commit


18 Feb, 2015

7 commits

  • Setting the root group's cpu.rt_runtime_us to 0 is a bad thing; it
    would disallow the kernel creating RT tasks.

    One can of course still set it to 1, which will (likely) still wreck
    your kernel, but at least make it clear that setting it to 0 is not
    good.

    Collect both sanity checks into the one place while we're there.

    Suggested-by: Zefan Li
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20150209112715.GO24151@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Because task_group() uses a cache of autogroup_task_group(), whose
    output depends on sched_class, switching classes can generate
    problems.

    In particular, when started as fair, the cache points to the
    autogroup, so when switching to RT the tg_rt_schedulable() test fails
    for every cpu.rt_{runtime,period}_us change because now the autogroup
    has tasks and no runtime.

    Furthermore, going back to the previous semantics of varying
    task_group() with sched_class has the down-side that the sched_debug
    output varies as well, even though the task really is in the
    autogroup.

    Therefore add an autogroup exception to tg_has_rt_tasks() -- such that
    both (all) task_group() usages in sched/core now have one. And remove
    all the remnants of the variable task_group() output.

    Reported-by: Zefan Li
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Stefan Bader
    Fixes: 8323f26ce342 ("sched: Fix race in task_group()")
    Link: http://lkml.kernel.org/r/20150209112237.GR5029@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • update_curr_dl() needs actual rq clock.

    Signed-off-by: Kirill Tkhai
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1423040972.18770.10.camel@tkhai
    Signed-off-by: Ingo Molnar

    Kirill Tkhai
     
  • Additional validation of adjtimex freq values to avoid
    potential multiplication overflows was added in commit
    5e5aeb4367b (time: adjtimex: Validate the ADJ_FREQUENCY values).

    Unfortunately the patch used LONG_MAX/MIN instead of
    LLONG_MAX/MIN, which was fine on 64-bit systems, but being
    much smaller on 32-bit systems caused false positives,
    so most direct frequency adjustments failed with
    EINVAL.

    ntpd only does direct frequency adjustments at startup, so
    the issue was not as easily observed there, but other time
    sync applications like ptpd and chrony were more affected by
    the bug.

    See bugs:

    https://bugzilla.kernel.org/show_bug.cgi?id=92481
    https://bugzilla.redhat.com/show_bug.cgi?id=1188074

    This patch changes the checks to use LLONG_MAX for
    clarity, and additionally the checks are disabled
    on 32-bit systems since LLONG_MAX/PPM_SCALE is always
    larger than the 32-bit long freq value, so multiplication
    overflows aren't possible there.
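
    The effect of the limit choice can be reproduced with a small
    user-space sketch. The scale factor below is an assumed value used
    only for illustration, and the function is a hypothetical model of the
    range check, with the limits of the chosen integer type passed in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed scale factor for illustration; the kernel's value may differ. */
#define SKETCH_PPM_SCALE 65536000LL

/*
 * Accept freq only if freq * scale cannot overflow a type whose limits
 * are type_min and type_max.
 */
static bool freq_passes_check(int64_t freq, int64_t type_min, int64_t type_max)
{
    assert(type_min < 0 && type_max > 0);
    if (type_min / SKETCH_PPM_SCALE > freq)
        return false;
    if (type_max / SKETCH_PPM_SCALE < freq)
        return false;
    return true;
}
```

    With 32-bit limits the check only admits a very narrow band around
    zero, so an ordinary frequency value is rejected. With 64-bit limits
    every value that fits in a 32-bit long passes, which is why the check
    can simply be skipped on 32-bit systems.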

    Reported-by: Josh Boyer
    Reported-by: George Joseph
    Tested-by: George Joseph
    Signed-off-by: John Stultz
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: # v3.19+
    Cc: Linus Torvalds
    Cc: Sasha Levin
    Link: http://lkml.kernel.org/r/1423553436-29747-1-git-send-email-john.stultz@linaro.org
    [ Prettified the changelog and the comments a bit. ]
    Signed-off-by: Ingo Molnar

    John Stultz
     
  • io_schedule() calls blk_flush_plug() which, depending on the
    contents of current->plug, can initiate arbitrary blk-io requests.

    Note that this contrasts with blk_schedule_flush_plug() which requires
    all non-trivial work to be handed off to a separate thread.

    This makes it possible for io_schedule() to recurse, and initiating
    block requests could possibly call mempool_alloc() which, in times of
    memory pressure, uses io_schedule().

    Apart from any stack usage issues, io_schedule() will not behave
    correctly when called recursively as delayacct_blkio_start() does
    not allow for repeated calls.

    So:
    - use ->in_iowait to detect recursion. Set it earlier, and restore
    it to the old value.
    - move the call to "raw_rq" after the call to blk_flush_plug().
    As this is some sort of per-cpu thing, we want some chance that
    we are on the right CPU
    - When io_schedule() is called recursively, use blk_schedule_flush_plug()
    which cannot further recurse.
    - as this makes io_schedule() a lot more complex and as io_schedule()
    must match io_schedule_timeout(), put all the changes in io_schedule_timeout()
    and make io_schedule a simple wrapper for that.
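
    The recursion guard in the list above can be sketched as a user-space
    model. Every name below is hypothetical; the two flush helpers only
    count calls and stand in for the direct and the deferred flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_task {
    bool in_iowait;        /* models current->in_iowait */
    int nested_waits;      /* direct flushes that will wait for I/O again */
    int direct_flushes;
    int deferred_flushes;
};

static void io_wait_sketch(struct sketch_task *t);

/* Direct flush: may issue I/O that itself ends up waiting (recursion). */
static void flush_direct(struct sketch_task *t)
{
    t->direct_flushes++;
    if (t->nested_waits > 0) {
        t->nested_waits--;
        io_wait_sketch(t);
    }
}

/* Deferred flush: hands the work elsewhere, so it cannot recurse. */
static void flush_deferred(struct sketch_task *t)
{
    t->deferred_flushes++;
}

static void io_wait_sketch(struct sketch_task *t)
{
    bool old_iowait;

    assert(t != NULL);
    old_iowait = t->in_iowait;   /* already set means we are recursing */
    t->in_iowait = true;         /* set early, before any flushing */

    if (old_iowait)
        flush_deferred(t);
    else
        flush_direct(t);

    /* The real function would account the delay and sleep here. */
    t->in_iowait = old_iowait;   /* restore the previous value */
}
```

    In this model a wait that triggers one nested wait performs exactly one
    direct flush and one deferred flush, and the flag ends up back at its
    original value.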

    Signed-off-by: NeilBrown
    Signed-off-by: Peter Zijlstra (Intel)
    [ Moved the now rudimentary io_schedule() into sched.h. ]
    Cc: Jens Axboe
    Cc: Linus Torvalds
    Cc: Tony Battersby
    Link: http://lkml.kernel.org/r/20150213162600.059fffb2@notabene.brown
    Signed-off-by: Ingo Molnar

    NeilBrown
     
  • Commit de30ec47302c "Remove unnecessary ->wait.lock serialization when
    reading completion state" was not correct: without lock/unlock, code
    like stop_machine_from_inactive_cpu()

        while (!completion_done())
                cpu_relax();

    can return before complete() finishes its spin_unlock() which writes to
    this memory. And spin_unlock_wait().

    While at it, change try_wait_for_completion() to use READ_ONCE().

    Reported-by: Paul E. McKenney
    Reported-by: Davidlohr Bueso
    Tested-by: Paul E. McKenney
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Peter Zijlstra (Intel)
    [ Added a comment with the barrier. ]
    Cc: Linus Torvalds
    Cc: Nicholas Mc Guire
    Cc: raghavendra.kt@linux.vnet.ibm.com
    Cc: waiman.long@hp.com
    Fixes: de30ec47302c ("sched/completion: Remove unnecessary ->wait.lock serialization when reading completion state")
    Link: http://lkml.kernel.org/r/20150212195913.GA30430@redhat.com
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
     
  • Since the function graph tracer needs to disable preemption, it might
    call preempt_schedule() after reenabling it if something triggered the
    need for rescheduling in between.

    Therefore we can't trace preempt_schedule() itself because we would
    face a function tracing recursion otherwise as the tracer is always
    called before PREEMPT_ACTIVE gets set to prevent that recursion. This is
    why preempt_schedule() is tagged as "notrace".

    But the same issue applies to every function called by preempt_schedule()
    before PREEMPT_ACTIVE is actually set. And preempt_schedule_common() is
    one such example. Unfortunately we forgot to tag it as notrace as well
    and as a result we are encountering tracing recursion since it got
    introduced by:

    a18b5d0181923 ("sched: Fix missing preemption opportunity")

    Let's fix that by applying the appropriate function tag to
    preempt_schedule_common().

    Reported-by: Huang Ying
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Steven Rostedt
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1424110807-15057-1-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker