13 Jan, 2012

22 commits

  • Andrew explains:

    - various misc stuff

    - Most of the rest of MM: memcg, threaded hugepages, others.

    - cpumask

    - kexec

    - kdump

    - some direct-io performance tweaking

    - radix-tree optimisations

    - new selftests code

    A note on this: often people will develop a new userspace-visible
    feature and will develop userspace code to exercise/test that
    feature. Then they merge the patch and the selftest code dies.
    Sometimes we paste it into the changelog. Sometimes the code gets
    thrown into Documentation/(!).

    This saddens me. So this patch creates a bare-bones framework which
    will henceforth allow me to ask people to include their test apps in
    the kernel tree so we can keep them alive. Then when people enhance
    or fix the feature, I can ask them to update the test app too.

    The infrastructure is terribly trivial at present - let's see how it
    evolves.

    - checkpoint/restart feature work.

    A note on this: this is a project by various mad Russians to perform
    c/r mainly from userspace, with various oddball helper code added
    into the kernel where the need is demonstrated.

    So rather than some large central lump of code, what we have is
    little bits and pieces popping up in various places which either
    expose something new or which permit something which is normally
    kernel-private to be modified.

    The overall project is an ongoing thing. I've judged that the size
    and scope of the thing means that we're more likely to be successful
    with it if we integrate the support into mainline piecemeal rather
    than allowing it all to develop out-of-tree.

    However I'm less confident than the developers that it will all
    eventually work! So what I'm asking them to do is to wrap each piece
    of new code inside CONFIG_CHECKPOINT_RESTORE. So if it all
    eventually comes to tears and the project as a whole fails, it should
    be a simple matter to go through and delete all trace of it.

    This lot pretty much wraps up the -rc1 merge for me.

    * akpm: (96 commits)
    unlzo: fix input buffer free
    ramoops: update parameters only after successful init
    ramoops: fix use of rounddown_pow_of_two()
    c/r: prctl: add PR_SET_MM codes to set up mm_struct entries
    c/r: procfs: add start_data, end_data, start_brk members to /proc/$pid/stat v4
    c/r: introduce CHECKPOINT_RESTORE symbol
    selftests: new x86 breakpoints selftest
    selftests: new very basic kernel selftests directory
    radix_tree: take radix_tree_path off stack
    radix_tree: remove radix_tree_indirect_to_ptr()
    dio: optimize cache misses in the submission path
    vfs: cache request_queue in struct block_device
    fs/direct-io.c: calculate fs_count correctly in get_more_blocks()
    drivers/parport/parport_pc.c: fix warnings
    panic: don't print redundant backtraces on oops
    sysctl: add the kernel.ns_last_pid control
    kdump: add udev events for memory online/offline
    include/linux/crash_dump.h needs elf.h
    kdump: fix crash_kexec()/smp_send_stop() race in panic()
    kdump: crashk_res init check for /sys/kernel/kexec_crash_size
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits)
    pptp: Accept packet with seq zero
    RDS: Remove some unused iWARP code
    net: fsl: fec: handle 10Mbps speed in RMII mode
    drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c: add missing iounmap
    drivers/net/ethernet/tundra/tsi108_eth.c: add missing iounmap
    ksz884x: fix mtu for VLAN
    net_sched: sfq: add optional RED on top of SFQ
    dp83640: Fix NOHZ local_softirq_pending 08 warning
    gianfar: Fix invalid TX frames returned on error queue when time stamping
    gianfar: Fix missing sock reference when processing TX time stamps
    phylib: introduce mdiobus_alloc_size()
    net: decrement memcg jump label when limit, not usage, is changed
    net: reintroduce missing rcu_assign_pointer() calls
    inet_diag: Rename inet_diag_req_compat into inet_diag_req
    inet_diag: Rename inet_diag_req into inet_diag_req_v2
    bond_alb: don't disable softirq under bond_alb_xmit
    mac80211: fix rx->key NULL pointer dereference in promiscuous mode
    nl80211: fix old station flags compatibility
    mdio-octeon: use an unique MDIO bus name.
    mdio-gpio: use an unique MDIO bus name.
    ...

    Linus Torvalds
     
  • If a platform device exists on the system, but ramoops fails to attach to
    it, the module parameters are overridden before ramoops can fall back and
    try to use passed module parameters. Move update to end of init routine.

    Signed-off-by: Kees Cook
    Cc: Marco Stornelli
    Cc: Sergiu Iordache
    Cc: Seiji Aguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • The return value of rounddown_pow_of_two wasn't evaluated, so the
    operation was a no-op.

    Signed-off-by: Marco Stornelli
    Reported-by: Andrew Morton
    Reviewed-by: WANG Cong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marco Stornelli
     
  • drivers/parport/parport_pc.c: In function '__check_irq':
    drivers/parport/parport_pc.c:3415: warning: return from incompatible pointer type
    drivers/parport/parport_pc.c: In function '__check_dma':
    drivers/parport/parport_pc.c:3417: warning: return from incompatible pointer type

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Currently no udev events for memory hotplug "online" and "offline" are
    generated:

    # udevadm monitor
    # echo offline > /sys/devices/system/memory/memory4/state
    ==> No event

    When kdump is loaded, kexec detects the current memory configuration and
    stores it in the pre-allocated ELF core header. Therefore, for kdump it
    is necessary to reload the kdump kernel with kexec when the memory
    configuration changes (e.g. for online/offline hotplug memory).

    In order to do this automatically, udev rules should be used. This kernel
    patch adds udev events for "online" and "offline". Together with this
    kernel patch, the following udev rules for online/offline have to be added
    to "/etc/udev/rules.d/98-kexec.rules":

    SUBSYSTEM=="memory", ACTION=="online", PROGRAM="/etc/init.d/kdump restart"
    SUBSYSTEM=="memory", ACTION=="offline", PROGRAM="/etc/init.d/kdump restart"

    [sfr@canb.auug.org.au: fixups for class to subsystem conversion]
    Signed-off-by: Michael Holzheu
    Cc: Heiko Carstens
    Cc: Vivek Goyal
    Cc: "Eric W. Biederman"
    Cc: Kay Sievers
    Cc: Dave Hansen
    Cc: Martin Schwidefsky
    Cc: Greg KH
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Holzheu
     
  • KMSG_DUMP_KEXEC is useless because we already save kernel messages inside
    /proc/vmcore, and it is unsafe to allow modules to do other stuff in a
    crash dump scenario.

    [akpm@linux-foundation.org: fix powerpc build]
    Signed-off-by: WANG Cong
    Reported-by: Vivek Goyal
    Acked-by: Vivek Goyal
    Acked-by: Jarod Wilson
    Cc: "Eric W. Biederman"
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    WANG Cong
     
  • Fix the int/bool confusion in there.

    drivers/video/nvidia/nvidia.c:1602: warning: return from incompatible pointer type

    Cc: Florian Tobias Schandinat
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Initialize the PPTP "seq received" value to 0xffffffff, so we don't
    ignore packets with seq zero.

    Signed-off-by: Bradley Peterson
    Signed-off-by: David S. Miller

    Bradley Peterson
     
  • When the link is 10 Mbps and the mode is RMII, it's necessary
    to set FRCONT to 1 in MIIGSK_CFGR to divide the RMII source
    clock by 10 in order to support 10 Mbps operation.

    Signed-off-by: Eric Bénard
    Acked-by: Shawn Guo
    Signed-off-by: David S. Miller

    Eric Benard
     
  • Add missing iounmap in error handling code, in a case where the function
    already performs iounmap on some other execution path.

    A simplified version of the semantic match that finds this problem is as
    follows: (http://coccinelle.lip6.fr/)

    // <smpl>
    @@
    expression e;
    statement S,S1;
    int ret;
    @@
    e = \(ioremap\|ioremap_nocache\)(...)
    ... when != iounmap(e)
    if (<+...e...+>) S
    ... when any
        when != iounmap(e)
    *if (...)
       { ... when != iounmap(e)
         return ...; }
    ... when any
    iounmap(e);
    // </smpl>

    Signed-off-by: Julia Lawall
    Signed-off-by: David S. Miller

    Julia Lawall
     
  • Add missing iounmap in error handling code, in a case where the function
    already performs iounmap on some other execution path.

    A simplified version of the semantic match that finds this problem is as
    follows: (http://coccinelle.lip6.fr/)

    // <smpl>
    @@
    expression e;
    statement S,S1;
    int ret;
    @@
    e = \(ioremap\|ioremap_nocache\)(...)
    ... when != iounmap(e)
    if (<+...e...+>) S
    ... when any
        when != iounmap(e)
    *if (...)
       { ... when != iounmap(e)
         return ...; }
    ... when any
    iounmap(e);
    // </smpl>

    Signed-off-by: Julia Lawall
    Signed-off-by: David S. Miller

    Julia Lawall
     
  • The Ethernet header does not account for the addition of a VLAN header.
    Full size Ethernet frames containing VLAN header are not processed
    because the frame is larger than the resulting hw mtu.

    Signed-off-by: Doug Kehn
    Signed-off-by: David S. Miller

    Doug Kehn
     
  • Similar problem as in 481a8199142c050b72bff8a1956a49fd0a75bbe0 ("can:
    fix NOHZ local_softirq_pending 08 warning"). This fix replaces
    netif_rx() with netif_rx_ni() which has to be used from
    process/softirq context.

    Signed-off-by: Manfred Rudigier
    Signed-off-by: David S. Miller

    Manfred Rudigier
     
  • When TX time stamping for PTP messages is enabled on a socket, a time
    stamp is returned on the socket error queue to the user space application
    after the frame was transmitted. The transmitted frame is also returned on
    the error queue so that an application knows to which frame the time stamp
    belongs.

    In the current implementation the TxFCB is immediately followed by the
    frame. Since the eTSEC inserts the TX time stamp 8 bytes after the TxFCB,
    parts of the frame have been overwritten and an invalid frame was returned
    on the socket error queue.

    This patch fixes the described problem by adding 16 additional padding
    bytes between the TxFCB and the frame for all messages sent from a time
    stamping enabled socket (other sockets are not affected).

    Signed-off-by: Manfred Rudigier
    Signed-off-by: David S. Miller

    Manfred Rudigier
     
  • When there is not enough headroom in the skb a private copy will be made.
    However, the private copy had no reference to the socket and consequently
    no time stamp could be queued on the socket error queue during the
    skb_tstamp_tx function. This patch fixes this issue by also stealing the
    sock reference from the original skb after making the private copy.

    Signed-off-by: Manfred Rudigier
    Signed-off-by: David S. Miller

    Manfred Rudigier
     
  • Introduce function mdiobus_alloc_size() as an alternative to mdiobus_alloc().
    Most callers of mdiobus_alloc() also allocate a private data structure, and
    then manually point bus->priv to this object. mdiobus_alloc_size()
    combines the two operations into one, which simplifies memory management.

    The original mdiobus_alloc() now just calls mdiobus_alloc_size(0).

    Signed-off-by: Timur Tabi
    Signed-off-by: David S. Miller

    Timur Tabi
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: bcm5974 - set BUTTONPAD property
    Input: serio_raw - return proper result when serio_raw_write fails
    Input: serio_raw - really signal HUP upon disconnect
    Input: serio_raw - remove stray semicolon
    Input: revert some over-zealous conversions to module_platform_driver()

    Linus Torvalds
     
  • * tag 'to-linus' of git://github.com/rustyrussell/linux: (24 commits)
    lguest: Make sure interrupt is allocated ok by lguest_setup_irq
    lguest: move the lguest tool to the tools directory
    lguest: switch segment-voodoo-numbers to readable symbols
    virtio: balloon: Add freeze, restore handlers to support S4
    virtio: balloon: Move vq initialization into separate function
    virtio: net: Add freeze, restore handlers to support S4
    virtio: net: Move vq and vq buf removal into separate function
    virtio: net: Move vq initialization into separate function
    virtio: blk: Add freeze, restore handlers to support S4
    virtio: blk: Move vq initialization to separate function
    virtio: console: Disable callbacks for virtqueues at start of S4 freeze
    virtio: console: Add freeze and restore handlers to support S4
    virtio: console: Move vq and vq buf removal into separate functions
    virtio: pci: add PM notification handlers for restore, freeze, thaw, poweroff
    virtio: pci: switch to new PM API
    virtio_blk: fix config handler race
    virtio: add debugging if driver doesn't kick.
    virtio: expose added descriptors immediately.
    virtio: avoid modulus operation.
    virtio: support unlocked queue kick
    ...

    Linus Torvalds
     
  • It appears that you can only read the sprom contents with aligned 16-bit
    reads: anything else causes at least some versions of the broadcom
    chipset to abort the PCI transaction, returning 0xff.

    This apparently doesn't trigger very often, because most setups don't
    use an external srom chip, and the OTP sprom loading doesn't have this
    issue. But at least the current 11" Macbook Air does trigger it, and
    wireless communications were broken as a result.

    Acked-by: Arend van Spriel
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • David S. Miller
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (526 commits)
    ASoC: twl6040 - Add method to query optimum PDM_DL1 gain
    ALSA: hda - Fix the lost power-setup of seconary pins after PM resume
    ALSA: usb-audio: add Yamaha MOX6/MOX8 support
    ALSA: virtuoso: add S/PDIF input support for all Xonars
    ALSA: ice1724 - Support for ooAoo SQ210a
    ALSA: ice1724 - Allow card info based on model only
    ALSA: ice1724 - Create capture pcm only for ADC-enabled configurations
    ALSA: hdspm - Provide unique driver id based on card serial
    ASoC: Dynamically allocate the rtd device for a non-empty release()
    ASoC: Fix recursive dependency due to select ATMEL_SSC in SND_ATMEL_SOC_SSC
    ALSA: hda - Fix the detection of "Loopback Mixing" control for VIA codecs
    ALSA: hda - Return the error from get_wcaps_type() for invalid NIDs
    ALSA: hda - Use auto-parser for HP laptops with cx20459 codec
    ALSA: asihpi - Fix potential Oops in snd_asihpi_cmode_info()
    ALSA: hdsp - Fix potential Oops in snd_hdsp_info_pref_sync_ref()
    ALSA: hda/cirrus - support for iMac12,2 model
    ASoC: cx20442: add bias control over a platform provided regulator
    ALSA: usb-audio - Avoid flood of frame-active debug messages
    ALSA: snd-usb-us122l: Delete calls to preempt_disable
    mfd: Put WM8994 into cache only mode when suspending
    ...

    Fix up trivial conflicts in:
    - arch/arm/mach-s3c64xx/mach-crag6410.c:
    renamed speyside_wm8962 to tobermory, added littlemill right
    next to it
    - drivers/base/regmap/{regcache.c,regmap.c}:
    duplicate diff that had already come in with other changes in
    the regmap tree

    Linus Torvalds
     

12 Jan, 2012

18 commits

  • SH/R-Mobile updates for 3.3 merge window.

    * tag 'rmobile-for-linus' of git://github.com/pmundt/linux-sh: (32 commits)
    arm: mach-shmobile: add a resource name for shdma
    ARM: mach-shmobile: r8a7779 SMP support V3
    ARM: mach-shmobile: Add kota2 defconfig.
    ARM: mach-shmobile: Add marzen defconfig.
    ARM: mach-shmobile: r8a7779 power domain support V2
    ARM: mach-shmobile: Fix up marzen build for recent GIC changes.
    ARM: mach-shmobile: r8a7779 PFC function support
    ARM: mach-shmobile: Flush caches in platform_cpu_die()
    ARM: mach-shmobile: Allow SoC specific CPU kill code
    ARM: mach-shmobile: Fix headsmp.S code to use CPUINIT
    ARM: mach-shmobile: clock-r8a7779: clkz/clkzs support
    ARM: mach-shmobile: clock-r8a7779: add DIV4 clock support
    ARM: mach-shmobile: Marzen LAN89218 support
    ARM: mach-shmobile: Marzen SCIF2/SCIF4 support
    ARM: mach-shmobile: r8a7779 PFC GPIO-only support V2
    ARM: mach-shmobile: r8a7779 and Marzen base support V2
    sh: pfc: Unlock register support
    sh: pfc: Variable bitfield width config register support
    sh: pfc: Add config_reg_helper() function
    sh: pfc: Convert index to field and value pair
    ...

    Linus Torvalds
     
  • SuperH updates for 3.3 merge window.

    * tag 'sh-for-linus' of git://github.com/pmundt/linux-sh: (38 commits)
    sh: magicpanelr2: Update for parse_mtd_partitions() fallout.
    sh: mach-rsk: Update for parse_mtd_partitions() fallout.
    sh: sh2a: Improve cache flush/invalidate functions
    sh: also without PM_RUNTIME pm_runtime.o must be built
    sh: add a resource name for shdma
    sh: Remove redundant try_to_freeze() invocations.
    sh: Ensure IRQs are enabled across do_notify_resume().
    sh: Fix up store queue code for subsys_interface changes.
    sh: clkfwk: sh_clk_init_parent() should be called after clk_register()
    sh: add platform_device for renesas_usbhs in board-sh7757lcr
    sh: modify clock-sh7757 for renesas_usbhs
    sh: pfc: ioremap() support
    sh: use ioread32/iowrite32 and mapped_reg for div6
    sh: use ioread32/iowrite32 and mapped_reg for div4
    sh: use ioread32/iowrite32 and mapped_reg for mstp32
    sh: extend clock struct with mapped_reg member
    sh: clkfwk: clock-sh73a0: all div6_clks use SH_CLK_DIV6_EXT()
    sh: clkfwk: clock-sh7724: all div6_clks use SH_CLK_DIV6_EXT()
    sh: clock-sh7723: add CLKDEV_ICK_ID for cleanup
    serial: sh-sci: Handle GPIO function requests.
    ...

    Linus Torvalds
     
  • Make sure the interrupt is allocated correctly by lguest_setup_irq (check the
    return value of irq_alloc_desc_at for -ENOMEM)

    Signed-off-by: Stratos Psomadakis
    Signed-off-by: Rusty Russell (cleanups and commentary)

    Stratos Psomadakis
     
  • The tools directory is a better location for the lguest tool than
    Documentation.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Rusty Russell (fixed compile)

    Davidlohr Bueso
     
  • When studying lguest's x86 segment descriptor code, it is no longer
    necessary to have the Intel x86 architecture manual open on the page
    with the segment descriptor illustration to understand the crazy
    numbers assigned to both descriptor structure halves a/b.
    The fields of struct desc_struct are now used instead, as suggested by
    Glauber de Oliveira Costa in 2008.

    Signed-off-by: Jacek Galowicz
    Signed-off-by: Rusty Russell

    Jacek Galowicz
     
  • Handling balloon hibernate / restore is tricky. If the balloon was
    inflated before going into the hibernation state, upon resume, the host
    will not have any memory of that. Any pages that were passed on to the
    host earlier would most likely be invalid, and the host will have to
    re-balloon to the previous value to get in the pre-hibernate state.

    So the only sane thing for the guest to do here is to discard all the
    pages that were put in the balloon. When to discard the pages is the
    next question.

    One solution is to deflate the balloon just before writing the image to
    the disk (in the freeze() PM callback). However, asking for pages from
    the host just to discard them immediately after seems wasteful of
    resources. Hence, it makes sense to do this by just fudging our
    counters soon after wakeup. This means we don't deflate the balloon
    before sleep, and also don't put unnecessary pressure on the host.

    This also helps in the thaw case: if the freeze fails for whatever
    reason, the balloon should continue to remain in the inflated state.
    This was tested by issuing 'swapoff -a' and trying to go into the S4
    state. That fails, and the balloon stays inflated, as expected. Both
    the host and the guest are happy.

    Finally, in the restore() callback, we empty the list of pages that were
    previously given to the host, add the appropriate number of pages to
    the totalram_pages counter, reset the num_pages counter to 0, and
    all is fine.

    As a last step, delete the vqs on the freeze callback to prepare for
    hibernation, and re-create them in the restore and thaw callbacks to
    resume normal operation.

    The kthread doesn't race with any operations here, since it's frozen
    before the freeze() call and is thawed after the thaw() and restore()
    callbacks, so we're safe with that.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • The probe and PM restore functions will share this code.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • Remove all the vqs, disable napi and detach from the netdev on
    hibernation.

    Re-create vqs after restoring from a hibernated image, re-enable napi
    and re-attach the netdev. This keeps networking working across
    hibernation.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • The remove and PM freeze functions will share this code.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • The probe and PM restore functions will share this code.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • Delete the vq and flush any pending requests from the block queue on the
    freeze callback to prepare for hibernation.

    Re-create the vq in the restore callback to resume normal function.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • The probe and PM restore functions will share this code.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • To ensure we don't receive any more interrupts from the host after we
    enter the freeze function, disable all vq interrupts.

    There wasn't any problem seen due to this in tests, but applying this
    patch makes the freeze case more robust.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • Remove all vqs and associated buffers in the freeze callback which
    prepares us to go into hibernation state. On restore, re-create all the
    vqs and populate the input vqs with buffers to get to the pre-hibernate
    state.

    Note: Any outstanding unconsumed buffers are discarded, which means
    there's a possibility of data loss in case the host or the guest didn't
    consume any data already present in the vqs. This can be addressed in a
    later patch series, perhaps in virtio common code.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • This common code will be shared with the PM freeze function.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • Handle thaw, restore and freeze notifications from the PM core. Expose
    these to individual virtio drivers that can quiesce and resume vq
    operations. For drivers not implementing the thaw() method, use the
    restore method instead.

    These functions also save device-specific data so that the device can be
    put in pre-suspend state after resume, and disable and enable the PCI
    device in the freeze and resume functions, respectively.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • The older PM API doesn't have a way to get notifications on hibernate
    events. Switch to the newer one that gives us those notifications.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • Fix a theoretical race related to the config work handler: a config
    interrupt might happen after we flush config work but before we reset
    the device. It will then cause the config work to run during or after
    reset.

    Two problems with this:
    - if this runs after the device is gone we will get a use after free
    - access of config while reset is in progress is racy (as layout is
      changing).

    As a solution:
    1. flush after reset, when we know there will be no more interrupts
    2. add a flag to disable config access before reset

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin