12 Mar, 2011

1 commit

  • nfs4_schedule_state_recovery() should only be used when we need to force
    the state manager to check the lease. If we just want to start the
    state manager in order to handle a state recovery situation, we should be
    using nfs4_schedule_state_manager().

    This patch fixes the abuses of nfs4_schedule_state_recovery() by replacing
    its use with a set of helper functions that do the right thing.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

11 Mar, 2011

2 commits

  • Signed-off-by: Andy Adamson
    Signed-off-by: Trond Myklebust

    Andy Adamson
     
  • Although they run as rpciod background tasks, under normal operation
    (i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
    and nfs4_do_close() want to be fully synchronous. This means that when we
    exit, we want all references to the rpc_task to be gone, and we want
    any dentry references etc. held by that task to be released.

    For this reason these functions call __rpc_wait_for_completion_task(),
    followed by rpc_put_task() in the expectation that the latter will be
    releasing the last reference to the rpc_task, and thus ensuring that the
    callback_ops->rpc_release() has been called synchronously.

    This patch fixes a race which exists due to the fact that
    rpciod calls rpc_complete_task() (in order to wake up the callers of
    __rpc_wait_for_completion_task()) and then subsequently calls
    rpc_put_task() without ensuring that these two steps are done atomically.

    In order to avoid adding new spin locks, the patch uses the existing
    waitqueue spin lock to order the rpc_task reference count releases between
    the waiting process and rpciod.
    The common case where nobody is waiting for completion is optimised for by
    checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
    reference count is 1: in those cases we drop trying to grab the spin lock,
    and immediately free up the rpc_task.

    Those few processes that need to put the rpc_task from inside an
    asynchronous context and that do not care about ordering are given a new
    helper: rpc_put_task_async().

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

06 Mar, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    ceph: no .snap inside of snapped namespace
    libceph: fix msgr standby handling
    libceph: fix msgr keepalive flag
    libceph: fix msgr backoff
    libceph: retry after authorization failure
    libceph: fix handling of short returns from get_user_pages
    ceph: do not clear I_COMPLETE from d_release
    ceph: do not set I_COMPLETE
    Revert "ceph: keep reference to parent inode on ceph_dentry"

    Linus Torvalds
     

05 Mar, 2011

5 commits

  • Add a alloc_page_vma_node that allows passing the "local" node in. Used
    in a followon patch.

    Acked-by: Andrea Arcangeli
    Signed-off-by: Andi Kleen
    Reviewed-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • Currently alloc_pages_vma() always uses the local node as policy node for
    the LOCAL policy. Pass this node down as an argument instead.

    No behaviour change from this patch, but will be needed for followons.

    Acked-by: Andrea Arcangeli
    Signed-off-by: Andi Kleen
    Reviewed-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • There was some broken keepalive code using a dead variable. Shift to using
    the proper bit flag.

    Signed-off-by: Sage Weil

    Sage Weil
     
  • With commit f363e45f we replaced a bunch of hacky workqueue mutual
    exclusion logic with the WQ_NON_REENTRANT flag. One pieces of fallout is
    that the exponential backoff breaks in certain cases:

    * con_work attempts to connect.
    * we get an immediate failure, and the socket state change handler queues
    immediate work.
    * con_work calls con_fault, we decide to back off, but can't queue delayed
    work.

    In this case, we add a BACKOFF bit to make con_work reschedule delayed work
    next time it runs (which should be immediately).

    Signed-off-by: Sage Weil

    Sage Weil
     
  • They are only used inside kernel/ptrace.c, and have been for a long
    time. We don't want to go back to the bad-old-days when architectures
    did things on their own, so make them static and private.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

04 Mar, 2011

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits)
    MAINTAINERS: Add Andy Gospodarek as co-maintainer.
    r8169: disable ASPM
    RxRPC: Fix v1 keys
    AF_RXRPC: Handle receiving ACKALL packets
    cnic: Fix lost interrupt on bnx2x
    cnic: Prevent status block race conditions with hardware
    net: dcbnl: check correct ops in dcbnl_ieee_set()
    e1000e: disable broken PHY wakeup for ICH10 LOMs, use MAC wakeup instead
    igb: fix sparse warning
    e1000: fix sparse warning
    netfilter: nf_log: avoid oops in (un)bind with invalid nfproto values
    dccp: fix oops on Reset after close
    ipvs: fix dst_lock locking on dest update
    davinci_emac: Add Carrier Link OK check in Davinci RX Handler
    bnx2x: update driver version to 1.62.00-6
    bnx2x: properly calculate lro_mss
    bnx2x: perform statistics "action" before state transition.
    bnx2x: properly configure coefficients for MinBW algorithm (NPAR mode).
    bnx2x: Fix ethtool -t link test for MF (non-pmf) devices.
    bnx2x: Fix nvram test for single port devices.
    ...

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    block: kill loop_mutex
    blktrace: Remove blk_fill_rwbs_rq.
    block: blk-flush shouldn't call directly into q->request_fn() __blk_run_queue()
    block: add @force_kblockd to __blk_run_queue()
    block: fix kernel-doc format for blkdev_issue_zeroout
    blk-throttle: Do not use kblockd workqueue for throtl work

    Linus Torvalds
     

03 Mar, 2011

2 commits

  • If we enable trace events to trace block actions, We use
    blk_fill_rwbs_rq to analyze the corresponding actions
    in request's cmd_flags, but we only choose the minor 2 bits
    from it, so most of other flags(e.g, REQ_SYNC) are missing.
    For example, with a sync write we get:
    write_test-2409 [001] 160.013869: block_rq_insert: 3,64 W 0 () 258135 + =
    8 [write_test]

    Since now we have integrated the flags of both bio and request,
    it is safe to pass rq->cmd_flags directly to blk_fill_rwbs and
    blk_fill_rwbs_rq isn't needed any more.

    With this patch, after a sync write we get:
    write_test-2417 [000] 226.603878: block_rq_insert: 3,64 WS 0 () 258135 +=
    8 [write_test]

    Signed-off-by: Tao Ma
    Acked-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Tao Ma
     
  • commit 339412841d7 (RxRPC: Allow key payloads to be passed in XDR form)
    broke klog for me. I notice the v1 key struct had a kif_version field
    added:

    -struct rxkad_key {
    - u16 security_index; /* RxRPC header security index */
    - u16 ticket_len; /* length of ticket[] */
    - u32 expiry; /* time at which expires */
    - u32 kvno; /* key version number */
    - u8 session_key[8]; /* DES session key */
    - u8 ticket[0]; /* the encrypted ticket */
    -};

    +struct rxrpc_key_data_v1 {
    + u32 kif_version; /* 1 */
    + u16 security_index;
    + u16 ticket_length;
    + u32 expiry; /* time_t */
    + u32 kvno;
    + u8 session_key[8];
    + u8 ticket[0];
    +};

    However the code in rxrpc_instantiate strips it away:

    data += sizeof(kver);
    datalen -= sizeof(kver);

    Removing kif_version fixes my problem.

    Signed-off-by: Anton Blanchard
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    Anton Blanchard
     

02 Mar, 2011

3 commits

  • __blk_run_queue() automatically either calls q->request_fn() directly
    or schedules kblockd depending on whether the function is recursed.
    blk-flush implementation needs to be able to explicitly choose
    kblockd. Add @force_kblockd.

    All the current users are converted to specify %false for the
    parameter and this patch doesn't introduce any behavior change.

    stable: This is prerequisite for fixing ide oops caused by the new
    blk-flush implementation.

    Signed-off-by: Tejun Heo
    Cc: Jan Beulich
    Cc: James Bottomley
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • ASoC supports keeping the audio subsysetm active over suspend in order
    to support use cases such as audio passthrough from a cellular modem
    with the main CPU suspended. Ensure that we don't power down the CODEC
    when this is happening by checking to see if VMID is up and skipping
    suspend and resume when it is. If the CODEC has suspended then it'll
    turn VMID off before the core suspend() gets called.

    Signed-off-by: Mark Brown
    Signed-off-by: Samuel Ortiz

    Mark Brown
     
  • o Dominik Klein reported a system hang issue while doing some blkio
    throttling testing.

    https://lkml.org/lkml/2011/2/24/173

    o Some tracing revealed that CFQ was not dispatching any more jobs as
    queue unplug was not happening. And queue unplug was not happening
    because unplug work was not being called as there was one throttling
    work on same cpu which as not finished yet. And throttling work had not
    finished as it was tyring to dispatch a bio to CFQ but all the request
    descriptors were consume to it was put to sleep.

    o So basically it is a cyclic dependecny between CFQ unplug work and
    throtl dispatch work. Tejun suggested that use separate workqueue for
    such cases.

    o This patch uses a separate workqueue for throttle related work and
    does not rely on kblockd workqueue anymore.

    Cc: stable@kernel.org
    Reported-by: Dominik Klein
    Signed-off-by: Vivek Goyal
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Vivek Goyal
     

01 Mar, 2011

3 commits

  • Several ACPI drivers fail to build if CONFIG_NET is unset, because
    they refer to things depending on CONFIG_THERMAL that in turn depends
    on CONFIG_NET. However, CONFIG_THERMAL doesn't really need to depend
    on CONFIG_NET, because the only part of it requiring CONFIG_NET is
    the netlink interface in thermal_sys.c.

    Put the netlink interface in thermal_sys.c under #ifdef CONFIG_NET
    and remove the dependency of CONFIG_THERMAL on CONFIG_NET from
    drivers/thermal/Kconfig.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Randy Dunlap
    Cc: Ingo Molnar
    Cc: Len Brown
    Cc: Stephen Rothwell
    Cc: Luming Yu
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
    drm: fix unsigned vs signed comparison issue in modeset ctl ioctl.
    drm/nv50-nvc0: make sure vma is definitely unmapped when destroying bo

    Linus Torvalds
     
  • Commit e2cda3226481 ("thp: add pmd mangling generic functions") replaced
    some macros in with inline functions.

    If the functions are to be defined (not all architectures need them)
    then struct vm_area_struct must be defined first. So include
    .

    Fixes a build failure seen in Debian:

    CC [M] drivers/media/dvb/mantis/mantis_pci.o
    In file included from arch/arm/include/asm/pgtable.h:460,
    from drivers/media/dvb/mantis/mantis_pci.c:25:
    include/asm-generic/pgtable.h: In function 'ptep_test_and_clear_young':
    include/asm-generic/pgtable.h:29: error: dereferencing pointer to incomplete type

    Signed-off-by: Ben Hutchings
    Signed-off-by: Linus Torvalds

    Ben Hutchings
     

28 Feb, 2011

1 commit


26 Feb, 2011

3 commits


25 Feb, 2011

1 commit

  • Commit 074037e (PM / Wakeup: Introduce wakeup source objects and
    event statistics (v3)) caused ACPI wakeup to only work if
    CONFIG_PM_SLEEP is set, but it also worked for CONFIG_PM_SLEEP unset
    before. This can be fixed by making device_set_wakeup_enable(),
    device_init_wakeup() and device_may_wakeup() work in the same way
    as before commit 074037e when CONFIG_PM_SLEEP is unset.

    Reported-and-tested-by: Justin Maggard
    Cc: stable@kernel.org
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

24 Feb, 2011

4 commits

  • There are two cases when we call flush_disk.
    In one, the device has disappeared (check_disk_change) so any
    data will hold becomes irrelevant.
    In the oter, the device has changed size (check_disk_size_change)
    so data we hold may be irrelevant.

    In both cases it makes sense to discard any 'clean' buffers,
    so they will be read back from the device if needed.

    In the former case it makes sense to discard 'dirty' buffers
    as there will never be anywhere safe to write the data. In the
    second case it *does*not* make sense to discard dirty buffers
    as that will lead to file system corruption when you simply enlarge
    the containing devices.

    flush_disk calls __invalidate_devices.
    __invalidate_device calls both invalidate_inodes and invalidate_bdev.

    invalidate_inodes *does* discard I_DIRTY inodes and this does lead
    to fs corruption.

    invalidate_bev *does*not* discard dirty pages, but I don't really care
    about that at present.

    So this patch adds a flag to __invalidate_device (calling it
    __invalidate_device2) to indicate whether dirty buffers should be
    killed, and this is passed to invalidate_inodes which can choose to
    skip dirty inodes.

    flusk_disk then passes true from check_disk_change and false from
    check_disk_size_change.

    dm avoids tripping over this problem by calling i_size_write directly
    rathher than using check_disk_size_change.

    md does use check_disk_size_change and so is affected.

    This regression was introduced by commit 608aeef17a which causes
    check_disk_size_change to call flush_disk, so it is suitable for any
    kernel since 2.6.27.

    Cc: stable@kernel.org
    Acked-by: Jeff Moyer
    Cc: Andrew Patterson
    Cc: Jens Axboe
    Signed-off-by: NeilBrown

    NeilBrown
     
  • Michael Leun reported that running parallel opens on a fuse filesystem
    can trigger a "kernel BUG at mm/truncate.c:475"

    Gurudas Pai reported the same bug on NFS.

    The reason is, unmap_mapping_range() is not prepared for more than
    one concurrent invocation per inode. For example:

    thread1: going through a big range, stops in the middle of a vma and
    stores the restart address in vm_truncate_count.

    thread2: comes in with a small (e.g. single page) unmap request on
    the same vma, somewhere before restart_address, finds that the
    vma was already unmapped up to the restart address and happily
    returns without doing anything.

    Another scenario would be two big unmap requests, both having to
    restart the unmapping and each one setting vm_truncate_count to its
    own value. This could go on forever without any of them being able to
    finish.

    Truncate and hole punching already serialize with i_mutex. Other
    callers of unmap_mapping_range() do not, and it's difficult to get
    i_mutex protection for all callers. In particular ->d_revalidate(),
    which calls invalidate_inode_pages2_range() in fuse, may be called
    with or without i_mutex.

    This patch adds a new mutex to 'struct address_space' to prevent
    running multiple concurrent unmap_mapping_range() on the same mapping.

    [ We'll hopefully get rid of all this with the upcoming mm
    preemptibility series by Peter Zijlstra, the "mm: Remove i_mmap_mutex
    lockbreak" patch in particular. But that is for 2.6.39 ]

    Signed-off-by: Miklos Szeredi
    Reported-by: Michael Leun
    Reported-by: Gurudas Pai
    Tested-by: Gurudas Pai
    Acked-by: Hugh Dickins
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits)
    Added support for usb ethernet (0x0fe6, 0x9700)
    r8169: fix RTL8168DP power off issue.
    r8169: correct settings of rtl8102e.
    r8169: fix incorrect args to oob notify.
    DM9000B: Fix PHY power for network down/up
    DM9000B: Fix reg_save after spin_lock in dm9000_timeout
    net_sched: long word align struct qdisc_skb_cb data
    sfc: lower stack usage in efx_ethtool_self_test
    bridge: Use IPv6 link-local address for multicast listener queries
    bridge: Fix MLD queries' ethernet source address
    bridge: Allow mcast snooping for transient link local addresses too
    ipv6: Add IPv6 multicast address flag defines
    bridge: Add missing ntohs()s for MLDv2 report parsing
    bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report
    bridge: Fix IPv6 multicast snooping by storing correct protocol type
    p54pci: update receive dma buffers before and after processing
    fix cfg80211_wext_siwfreq lock ordering...
    rt2x00: Fix WPA TKIP Michael MIC failures.
    ath5k: Fix fast channel switching
    tcp: undo_retrans counter fixes
    ...

    Linus Torvalds
     
  • netem_skb_cb() does :

    return (struct netem_skb_cb *)qdisc_skb_cb(skb)->data;

    Unfortunatly struct qdisc_skb_cb data is not long word aligned, so
    access to psched_time_t time_to_send uses a non aligned access.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

23 Feb, 2011

3 commits


22 Feb, 2011

1 commit

  • We force particular alignment when we generate attribute structures
    when generation MODULE_VERSION() data and we need to make sure that
    this alignment is followed when we iterate over these structures,
    otherwise we may crash on platforms whose natural alignment is not
    sizeof(void *), such as m68k.

    Reported-by: Geert Uytterhoeven
    Signed-off-by: Dmitry Torokhov
    [ There are more issues here, but the fixes are incredibly ugly - Linus ]
    Signed-off-by: Linus Torvalds

    Dmitry Torokhov
     

20 Feb, 2011

3 commits


19 Feb, 2011

3 commits


18 Feb, 2011

2 commits

  • This patch re-enables UIE timer/polling emulation for rtc devices
    that do not support alarm irqs.

    CC: Uwe Kleine-König
    CC: Thomas Gleixner
    Reported-by: Uwe Kleine-König
    Tested-by: Uwe Kleine-König
    Signed-off-by: John Stultz

    John Stultz
     
  • Uwe pointed out that my alarm based UIE emulation is not sufficient
    to replace the older timer/polling based UIE emulation on devices
    where there is no alarm irq. This causes rtc devices without alarms
    to return -EINVAL to UIE ioctls. The fix is to re-instate the old
    timer/polling method for devices without alarm irqs.

    This patch reverts the following commits:
    042620a018afcfba1d678062b62e46 - Remove UIE emulation
    1daeddd5962acad1bea55e524fc0fa - Cleanup removed UIE emulation declaration
    b5cc8ca1c9c3a37eaddf709b2fd3e1 - Remove Kconfig symbol for UIE emulation

    The emulation mode will still need to be wired-in with a following
    patch before it will work.

    CC: Uwe Kleine-König
    CC: Thomas Gleixner
    Reported-by: Uwe Kleine-König
    Signed-off-by: John Stultz

    John Stultz