30 Mar, 2011

4 commits


29 Mar, 2011

1 commit


27 Mar, 2011

1 commit


26 Mar, 2011

1 commit

  • The release method for mds connections uses a backpointer to the
    mds_client, so we need to flush the workqueue of any pending work (and
    ceph_connection references) prior to freeing the mds_client. This fixes
    an oops easily triggered under UML by

    while true ; do mount ... ; umount ... ; done

    Also fix an outdated comment: the flush in ceph_destroy_client only flushes
    OSD connections out. This bug is basically an artifact of the ceph ->
    ceph+libceph conversion.

    Signed-off-by: Sage Weil

    Sage Weil
     

23 Mar, 2011

2 commits

  • Send notifications when we change the rbd header (e.g. create a snapshot)
    and wait for such notifications. This allows synchronizing the snapshot
    creation between different rbd clients/tools.

    Signed-off-by: Yehuda Sadeh
    Signed-off-by: Sage Weil

    Yehuda Sadeh
     
  • Lingering requests are requests that are sent to the OSD normally but
    are still tracked after we get a successful reply. This keeps the OSD
    connection open and resends the original request if the object moves to
    another OSD. The OSD can then send notification messages back to us
    if another client initiates a notify.

    This framework will be used by RBD so that the client gets notification
    when a snapshot is created by another node or tool.

    Signed-off-by: Yehuda Sadeh
    Signed-off-by: Sage Weil

    Yehuda Sadeh
     

22 Mar, 2011

9 commits

  • Signed-off-by: Sage Weil

    Sage Weil
     
  • Just for consistency's sake. Fix obsolete comment too.

    Signed-off-by: Sage Weil

    Sage Weil
     
  • In sync_write_wait(), we assume that the newest request is at the
    tail of the unsafe write list. We should maintain those semantics here.

    Signed-off-by: Henry C Chang
    Signed-off-by: Sage Weil

    Henry C Chang
     
  • This fixes a list corruption warning like this:

    ------------[ cut here ]------------
    WARNING: at lib/list_debug.c:30 __list_add+0x68/0x81()
    Hardware name: X8DTU
    list_add corruption. prev->next should be next (ffff880618931250), but was (null). (prev=ffff880c188b9130).
    Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs ceph libceph libcrc32c sunrpc ipv6 fuse igb i2c_i801 ioatdma i2c_core iTCO_wdt iTCO_vendor_support joydev dca serio_raw usb_storage [last unloaded: scsi_wait_scan]
    Pid: 10977, comm: smbd Tainted: G W 2.6.32.23-170.Elaster.xendom0.fc12.x86_64 #1
    Call Trace:
    [] warn_slowpath_common+0x7c/0x94
    [] warn_slowpath_fmt+0x41/0x43
    [] __list_add+0x68/0x81
    [] ceph_aio_write+0x614/0x8a2 [ceph]
    [] do_sync_write+0xe8/0x125
    [] ? autoremove_wake_function+0x0/0x39
    [] ? selinux_file_permission+0x5c/0xb3
    [] ? security_file_permission+0x16/0x18
    [] vfs_write+0xae/0x10b
    [] sys_pwrite64+0x5a/0x76
    [] system_call_fastpath+0x16/0x1b
    ---[ end trace 08573eb9f07ff6f4 ]---

    Signed-off-by: Henry C Chang
    Signed-off-by: Sage Weil

    Henry C Chang
     
  • Signed-off-by: Sage Weil

    Sage Weil
     
  • The ino32 mount option forces the ceph fs to report 32 bit
    ino values. This is useful for 64 bit kernels with 32 bit userspace.

    Signed-off-by: Yehuda Sadeh

    Yehuda Sadeh
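
A minimal sketch of how a 64-bit inode number could be folded into 32 bits for 32-bit userspace. The helper name `fold_ino32` and the XOR-folding scheme are assumptions made for illustration; the commit message does not say how the reported value is derived.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not taken from the patch): fold a 64-bit inode
 * number into 32 bits by XOR-ing its high and low halves. */
static uint32_t fold_ino32(uint64_t ino64)
{
	uint32_t ino = (uint32_t)(ino64 & 0xffffffffu) ^ (uint32_t)(ino64 >> 32);

	/* Assumption: avoid 0, which many tools treat as "no inode". */
	if (ino == 0)
		ino = 1;
	assert(ino != 0);
	return ino;
}
```

Folding like this is lossy, so two distinct 64-bit values can map to the same 32-bit number; that trade-off is presumably why the behavior is opt-in via a mount option.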
     
  • This updates the common header files used by the different ceph
    related modules. Specifically it adds definitions required by
    the rbd watch/notify feature.

    Signed-off-by: Yehuda Sadeh

    Yehuda Sadeh
     
  • Whoops!

    Signed-off-by: Sage Weil

    Sage Weil
     
  • If we send a request to osd A, and the request's pg remaps to osd B and
    then back to A in quick succession, we need to resend the request to A. The
    old code was only calling kick_requests after processing all incremental
    maps in a message, so it was very possible to not resend a request that
    needed to be resent. This would make the osd eventually time out (at least
    with the current default of osd timeouts enabled).

    The correct approach is to scan requests on every map incremental. This
    patch refactors the kick code in a few ways:
    - all requests are either on req_lru (in flight), req_unsent (ready to
      send), or req_notarget (currently mapped to no up osd)
    - mapping is always done by map_request (previously map_osds)
    - if the mapping changes, we requeue. Requests are resent only after all
      map incrementals are processed.
    - some osd reset code is moved out of kick_requests into a separate
      function
    - the "kick this osd" functionality is moved to kick_osd_requests, as it
      is unrelated to scanning for request->pg->osd mapping changes

    Signed-off-by: Sage Weil

    Sage Weil
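
The A -> B -> A case described above can be modeled in a few lines. This is a simplified userspace sketch, not the libceph code: the `model_*` names, the flat `pg_to_osd` arrays standing in for incremental maps, and the `needs_resend` flag are all assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of an in-flight request. */
struct model_request {
	int pg;            /* placement group the request targets */
	int osd;           /* OSD the request is currently mapped to */
	bool needs_resend; /* set when the mapping changed at any point */
};

/* Re-map every request against one incremental map (pg_to_osd[pg] gives
 * the OSD for that pg) and flag requests whose target changed. */
static void model_scan_requests(struct model_request *reqs, size_t n,
				const int *pg_to_osd)
{
	for (size_t i = 0; i < n; i++) {
		int osd = pg_to_osd[reqs[i].pg];

		if (osd != reqs[i].osd) {
			reqs[i].osd = osd;
			reqs[i].needs_resend = true;
		}
	}
}

/* Apply a batch of incrementals, scanning after each one. Returns the
 * number of requests to resend once the whole batch is processed. */
static size_t model_handle_maps(struct model_request *reqs, size_t n,
				const int *const *maps, size_t nmaps)
{
	size_t resend = 0;

	assert(reqs != NULL && maps != NULL);
	for (size_t m = 0; m < nmaps; m++)
		model_scan_requests(reqs, n, maps[m]);
	for (size_t i = 0; i < n; i++) {
		if (reqs[i].needs_resend) {
			resend++;
			reqs[i].needs_resend = false;
		}
	}
	return resend;
}
```

With a request mapped to osd 0 and two incrementals that move its pg to osd 1 and back to osd 0, scanning after each incremental flags the request for resend even though the final mapping equals the starting one. Comparing only the first and last mapping would miss it.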
     

16 Mar, 2011

2 commits

  • d_move puts the renamed dentry at the end of d_subdirs, screwing with our
    cached dentry directory offsets. We were just clearing I_COMPLETE to avoid
    any possibility of trouble. However, assigning the renamed dentry an
    offset at the end of the directory (to match its new d_subdirs position)
    is sufficient to maintain correct behavior and hold onto I_COMPLETE.

    This is especially important for workloads like rsync, which renames files
    into place. Before, we would lose I_COMPLETE and do MDS lookups for each
    file. With this patch we only talk to the MDS on create and rename.

    Signed-off-by: Sage Weil

    Sage Weil
     
  • It used to return -EINVAL because it thought the end was not aligned
    to 4 bytes.

    Also clean up the superfluous src < end test in the if; the enclosing
    while already guarantees it.

    Signed-off-by: Tommi Virtanen
    Signed-off-by: Sage Weil

    Tommi Virtanen
     

15 Mar, 2011

20 commits

  • Linus Torvalds
     
  • * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300:
    MN10300: atomic_read() should ensure it emits a load
    MN10300: The SMP_ICACHE_INV_FLUSH_RANGE IPI command does not exist
    MN10300: Proper use of macros get_user() in the case of incremented pointers

    Linus Torvalds
     
  • * 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (26 commits)
    MIPS: Alchemy: Fix reset for MTX-1 and XXS1500
    MIPS: MTX-1: Make au1000_eth probe all PHY addresses
    MIPS: Jz4740: Add HAVE_CLK
    MIPS: Move idle task creation to work queue
    MIPS, Perf-events: Use unsigned delta for right shift in event update
    MIPS, Perf-events: Work with the new callchain interface
    MIPS, Perf-events: Fix event check in validate_event()
    MIPS, Perf-events: Work with the new PMU interface
    MIPS, Perf-events: Work with irq_work
    MIPS: Fix always CONFIG_LOONGSON_UART_BASE=y
    MIPS: Loongson: Fix potentially wrong string handling
    MIPS: Fix GCC-4.6 'set but not used' warning in arch/mips/mm/init.c
    MIPS: Fix GCC-4.6 'set but not used' warning in ieee754int.h
    MIPS: Remove unused code from arch/mips/kernel/syscall.c
    MIPS: Fix GCC-4.6 'set but not used' warning in signal*.c
    MIPS: MSP: Fix MSP71xx bpci interrupt handler return value
    MIPS: Select R4K timer lib for all MSP platforms
    MIPS: Loongson: Remove ad-hoc cmdline default
    MIPS: Clear the correct flag in sysmips(MIPS_FIXADE, ...).
    MIPS: Add an unreachable return statement to satisfy buggy GCCs.
    ...

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86: ce4100: Set pci ops via callback instead of module init
    x86/mm: Fix pgd_lock deadlock
    x86/mm: Handle mm_fault_error() in kernel space
    x86: Don't check for BIOS corruption in first 64K when there's no need to

    Linus Torvalds
     
  • This reverts the parent commit. I hate doing that, but it's generating
    some discussion ("half of it is right"), and since I am planning on
    doing the 2.6.38 release later today we can punt it to stable if
    required. Let's not rock the boat right now.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • oom_kill_process() starts with victim_points == 0. This means that
    (most likely) any child has more points and can be killed erroneously.

    Also, "children has a different mm" doesn't match the reality, we should
    check child->mm != t->mm. This check is not exactly correct if t->mm ==
    NULL but this doesn't really matter, oom_kill_task() will kill them
    anyway.

    Note: "Kill all processes sharing p->mm" in oom_kill_task() is wrong
    too.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Since commit 32fd6901 (MIPS: Alchemy: get rid of common/reset.c)
    Alchemy-based boards use their own reset function. For MTX-1 and XXS1500,
    the reset function pokes at the BCSR.SYSTEM_RESET register, but this does
    not work. According to Bruno Randolf, this was not tested when written.

    Previously, the generic au1000_restart() routine called the board specific
    reset function, which for MTX-1 and XXS1500 did not work, but finally made
    a jump to the reset vector, which really triggers a system restart. Fix
    reboot for both targets by jumping to the reset vector.

    Signed-off-by: Florian Fainelli
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2093/
    Acked-by: Bruno Randolf
    Signed-off-by: Ralf Baechle

    Florian Fainelli
     
  • When au1000_eth probes the MII bus for PHY address, if we do not set
    au1000_eth platform data's phy_search_highest_address, the MII probing
    logic will exit early and will assume a valid PHY is found at address 0.
    For MTX-1, the PHY is at address 31, and without this patch, the link
    detection/speed/duplex would not work correctly.

    CC: stable@kernel.org
    Signed-off-by: Florian Fainelli
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2111/
    Signed-off-by: Ralf Baechle

    Florian Fainelli
     
  • Jz4740 supports the clock framework but doesn't have HAVE_CLK defined,
    so define it!

    Signed-off-by: Maurus Cuelenaere
    To: linux-mips@linux-mips.org
    To: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/2112/
    Acked-by: Lars-Peter Clausen
    Signed-off-by: Ralf Baechle

    Maurus Cuelenaere
     
  • To avoid forking a usermode thread when creating an idle task, move fork_idle
    to a work queue.

    If the kernel starts with a maxcpus= option that does not bring all available
    CPUs online at boot time, idle tasks for the offline CPUs are not created. If
    the offline CPUs are later hotplugged through sysfs, __cpu_up is called in
    the context of the user task, and fork_idle copies its non-zero mm
    pointer. This causes a BUG() in per_cpu_trap_init.

    This also avoids issues with resource limits of the CPU writing to sysfs,
    containers, maybe others.

    Signed-off-by: Maksim Rayskiy
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2070/
    Signed-off-by: Ralf Baechle

    Maksim Rayskiy
     
  • Leverage the commit for ARM by Will Deacon:

    - 446a5a8b1eb91a6990e5c8fe29f14e7a95b69132
    ARM: 6205/1: perf: ensure counter delta is treated as unsigned

    Hardware performance counters on ARM are 32-bits wide but atomic64_t
    variables are used to represent counter data in the hw_perf_event structure.

    The armpmu_event_update function right-shifts a signed 64-bit delta variable
    and adds the result to the event count. This can lead to shifting in sign-bits
    if the MSB of the 32-bit counter value is set. This results in perf output
    such as:

    Performance counter stats for 'sleep 20':

    18446744073460670464 cycles

    Acked-by: David Daney
    Signed-off-by: Deng-Cheng Zhu
    To: a.p.zijlstra@chello.nl
    To: fweisbec@gmail.com
    To: will.deacon@arm.com
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Cc: wuzhangjin@gmail.com
    Cc: paulus@samba.org
    Cc: mingo@elte.hu
    Cc: acme@redhat.com
    Cc: matt@console-pimps.org
    Cc: sshtylyov@mvista.com
    Patchwork: http://patchwork.linux-mips.org/patch/2015/
    Signed-off-by: Ralf Baechle

    Deng-Cheng Zhu
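
The sign-extension problem can be reproduced in userspace. The sketch below assumes a 32-bit counter held in a 64-bit variable, as the quoted ARM description states; the `model_*` names are hypothetical and the code is not the kernel's event-update routine.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_COUNTER_BITS 32
#define MODEL_SHIFT (64 - MODEL_COUNTER_BITS)

/* Buggy variant: the delta is signed, so the right shift copies the sign
 * bit into the upper half. (Right-shifting a negative value is
 * implementation-defined in C; GCC and Clang shift arithmetically.) */
static uint64_t model_delta_signed(uint64_t prev, uint64_t now)
{
	int64_t delta = (int64_t)((now << MODEL_SHIFT) - (prev << MODEL_SHIFT));

	delta >>= MODEL_SHIFT;
	return (uint64_t)delta;
}

/* Fixed variant: an unsigned delta shifts in zeros. */
static uint64_t model_delta_unsigned(uint64_t prev, uint64_t now)
{
	uint64_t delta = (now << MODEL_SHIFT) - (prev << MODEL_SHIFT);

	delta >>= MODEL_SHIFT;
	assert(delta <= UINT32_MAX);
	return delta;
}
```

For a counter that advances from 0 to 0x80000000, the unsigned variant returns 0x80000000, while the signed variant (on compilers that shift arithmetically) returns 0xFFFFFFFF80000000, the kind of huge bogus count shown in the perf output above.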
     
  • This is the MIPS part of the following commits by Frederic Weisbecker:

    - f72c1a931e311bb7780fee19e41a89ac42cab50e
    perf: Factorize callchain context handling

    Store the kernel and user contexts from the generic layer instead
    of archs, this gathers some repetitive code.

    - 56962b4449af34070bb1994621ef4f0265eed4d8
    perf: Generalize some arch callchain code

    - Most archs use one callchain buffer per cpu, except x86 that needs
    to deal with NMIs. Provide a default perf_callchain_buffer()
    implementation that x86 overrides.

    - Centralize all the kernel/user regs handling and invoke new arch
    handlers from there: perf_callchain_user() / perf_callchain_kernel()
    That avoid all the user_mode(), current->mm checks and so...

    - Invert some parameters in perf_callchain_*() helpers: entry to the
    left, regs to the right, following the traditional (dst, src).

    - 70791ce9ba68a5921c9905ef05d23f62a90bc10c
    perf: Generalize callchain_store()

    callchain_store() is the same on every archs, inline it in
    perf_event.h and rename it to perf_callchain_store() to avoid
    any collision.

    This removes repetitive code.

    - c1a65932fd7216fdc9a0db8bbffe1d47842f862c
    perf: Drop unappropriate tests on arch callchains

    Drop the TASK_RUNNING test on user tasks for callchains as
    this check doesn't seem to make any sense.

    Also remove the tests for !current that is not supposed to
    happen and current->pid as this should be handled at the
    generic level, with exclude_idle attribute.

    Reported-by: Wu Zhangjin
    Acked-by: Frederic Weisbecker
    Acked-by: David Daney
    Signed-off-by: Deng-Cheng Zhu
    To: a.p.zijlstra@chello.nl
    To: will.deacon@arm.com
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Cc: paulus@samba.org
    Cc: mingo@elte.hu
    Cc: acme@redhat.com
    Cc: dengcheng.zhu@gmail.com
    Cc: matt@console-pimps.org
    Cc: sshtylyov@mvista.com
    Patchwork: http://patchwork.linux-mips.org/patch/2014/
    Signed-off-by: Ralf Baechle

    Deng-Cheng Zhu
     
  • Ignore events that are in off/error state or belong to a different PMU.

    This patch originates from the following commit for ARM by Will Deacon:

    - 65b4711ff513767341aa1915c822de6ec0de65cb
    ARM: 6352/1: perf: fix event validation

    The validate_event function in the ARM perf events backend has the
    following problems:

    1.) Events that are disabled count towards the cost.
    2.) Events associated with other PMUs [for example, software events or
    breakpoints] do not count towards the cost, but do fail validation,
    causing the group to fail.

    This patch changes validate_event so that it ignores events in the
    PERF_EVENT_STATE_OFF state or that are scheduled for other PMUs.

    Acked-by: Will Deacon
    Acked-by: David Daney
    Signed-off-by: Deng-Cheng Zhu
    To: a.p.zijlstra@chello.nl
    To: fweisbec@gmail.com
    To: will.deacon@arm.com
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Cc: wuzhangjin@gmail.com
    Cc: paulus@samba.org
    Cc: mingo@elte.hu
    Cc: acme@redhat.com
    Cc: dengcheng.zhu@gmail.com
    Cc: matt@console-pimps.org
    Cc: sshtylyov@mvista.com
    Cc: ddaney@caviumnetworks.com
    Patchwork: http://patchwork.linux-mips.org/patch/2013/
    Signed-off-by: Ralf Baechle

    Deng-Cheng Zhu
     
  • This is the MIPS part of the following commits by Peter Zijlstra:

    - a4eaf7f14675cb512d69f0c928055e73d0c6d252
    perf: Rework the PMU methods

    Replace pmu::{enable,disable,start,stop,unthrottle} with
    pmu::{add,del,start,stop}, all of which take a flags argument.

    The new interface extends the capability to stop a counter while
    keeping it scheduled on the PMU. We replace the throttled state with
    the generic stopped state.

    This also allows us to efficiently stop/start counters over certain
    code paths (like IRQ handlers).

    It also allows scheduling a counter without it starting, allowing for
    a generic frozen state (useful for rotating stopped counters).

    The stopped state is implemented in two different ways, depending on
    how the architecture implemented the throttled state:

    1) We disable the counter:
    a) the pmu has per-counter enable bits, we flip that
    b) we program a NOP event, preserving the counter state

    2) We store the counter state and ignore all read/overflow events

    For MIPSXX, the stopped state is implemented in the way of 1.b as above.

    - 33696fc0d141bbbcb12f75b69608ea83282e3117
    perf: Per PMU disable

    Changes perf_disable() into perf_pmu_disable().

    - 24cd7f54a0d47e1d5b3de29e2456bfbd2d8447b7
    perf: Reduce perf_disable() usage

    Since the current perf_disable() usage is only an optimization,
    remove it for now. This eases the removal of the __weak
    hw_perf_enable() interface.

    - b0a873ebbf87bf38bf70b5e39a7cadc96099fa13
    perf: Register PMU implementations

    Simple registration interface for struct pmu, this provides the
    infrastructure for removing all the weak functions.

    - 51b0fe39549a04858001922919ab355dee9bdfcf
    perf: Deconstify struct pmu

    sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`

    Reported-by: Wu Zhangjin
    Acked-by: David Daney
    Signed-off-by: Deng-Cheng Zhu
    To: a.p.zijlstra@chello.nl
    To: fweisbec@gmail.com
    To: will.deacon@arm.com
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Cc: wuzhangjin@gmail.com
    Cc: paulus@samba.org
    Cc: mingo@elte.hu
    Cc: acme@redhat.com
    Cc: dengcheng.zhu@gmail.com
    Cc: matt@console-pimps.org
    Cc: sshtylyov@mvista.com
    Cc: ddaney@caviumnetworks.com
    Patchwork: http://patchwork.linux-mips.org/patch/2012/
    Signed-off-by: Ralf Baechle

    Deng-Cheng Zhu
     
  • This is the MIPS part of the following commit by Peter Zijlstra:

    - e360adbe29241a0194e10e20595360dd7b98a2b3
    irq_work: Add generic hardirq context callbacks

    Provide a mechanism that allows running code in IRQ context. It is
    most useful for NMI code that needs to interact with the rest of the
    system -- like wakeup a task to drain buffers.

    Perf currently has such a mechanism, so extract that and provide it as
    a generic feature, independent of perf so that others may also
    benefit.

    The IRQ context callback is generated through self-IPIs where
    possible, or on architectures like powerpc the decrementer (the
    built-in timer facility) is set to generate an interrupt immediately.

    Architectures that don't have anything like this get to do with a
    callback from the timer tick. These architectures can call
    irq_work_run() at the tail of any IRQ handlers that might enqueue such
    work (like the perf IRQ handler) to avoid undue latencies in
    processing the work.

    For MIPSXX, we need to call irq_work_run() at the tail of the perf IRQ
    handler as described above.

    Reported-by: Wu Zhangjin
    Acked-by: Peter Zijlstra
    Acked-by: David Daney
    Signed-off-by: Deng-Cheng Zhu
    To: fweisbec@gmail.com
    To: will.deacon@arm.com
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Cc: paulus@samba.org
    Cc: mingo@elte.hu
    Cc: acme@redhat.com
    Cc: matt@console-pimps.org
    Cc: sshtylyov@mvista.com
    Patchwork: http://patchwork.linux-mips.org/patch/2011/
    Signed-off-by: Ralf Baechle

    Deng-Cheng Zhu
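
The "run queued work at the tail of the IRQ handler" pattern can be sketched as a small userspace model. All `model_*` names are hypothetical, and the list handling here is not NMI-safe or atomic the way a real implementation would need to be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a deferred-work item. */
struct model_work {
	void (*func)(struct model_work *work);
	struct model_work *next;
	bool pending;
};

static struct model_work *model_work_list;
static int model_wakeups;

/* Example callback: stands in for "wake up a task to drain buffers". */
static void model_wakeup(struct model_work *work)
{
	(void)work;
	model_wakeups++;
}

/* Queue a work item unless it is already pending. Returns true if queued. */
static bool model_work_queue(struct model_work *work)
{
	assert(work != NULL && work->func != NULL);
	if (work->pending)
		return false;
	work->pending = true;
	work->next = model_work_list;
	model_work_list = work;
	return true;
}

/* Drain the list; meant to be called at the tail of an interrupt handler. */
static int model_work_run(void)
{
	int ran = 0;

	while (model_work_list != NULL) {
		struct model_work *work = model_work_list;

		model_work_list = work->next;
		work->pending = false;
		work->func(work);
		ran++;
	}
	return ran;
}

/* Model of a perf interrupt handler that queues work, then drains it. */
static int model_perf_irq_handler(struct model_work *wakeup)
{
	model_work_queue(wakeup);
	return model_work_run();
}
```

Draining at the end of the handler means the callback runs before the handler returns instead of waiting for the next timer tick, which is the latency concern the commit message raises.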
     
  • Signed-off-by: Yoichi Yuasa
    Cc: linux-mips
    Patchwork: https://patchwork.linux-mips.org/patch/2055/
    Signed-off-by: Ralf Baechle

    Yoichi Yuasa
     
  • This error was reported by cppcheck:
    arch/mips/loongson/common/machtype.c:56: error: Dangerous usage of 'str' (strncpy doesn't always 0-terminate it)

    If strncpy copied MACHTYPE_LEN bytes, the destination string str
    was not terminated.

    The patch adds one more byte to str and makes sure that this byte is
    always 0.

    Signed-off-by: Stefan Weil
    Cc: Wu Zhangjin
    Cc: Arnaud Patard
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/2053/
    Signed-off-by: Ralf Baechle

    Stefan Weil
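
The fix pattern described above (one extra byte that is always 0) looks roughly like this. `MODEL_MACHTYPE_LEN` and `model_copy_machtype` are hypothetical names, and the length of 8 is chosen only to keep the example short.

```c
#include <assert.h>
#include <string.h>

#define MODEL_MACHTYPE_LEN 8 /* hypothetical size, chosen for the example */

/* Copy at most MODEL_MACHTYPE_LEN bytes and always terminate the result.
 * dst must have room for MODEL_MACHTYPE_LEN + 1 bytes. */
static void model_copy_machtype(char dst[MODEL_MACHTYPE_LEN + 1],
				const char *src)
{
	assert(dst != NULL && src != NULL);
	/* strncpy() leaves dst unterminated when src fills the whole limit. */
	strncpy(dst, src, MODEL_MACHTYPE_LEN);
	dst[MODEL_MACHTYPE_LEN] = '\0'; /* the extra byte is always 0 */
}
```

A source string longer than the limit is truncated to exactly `MODEL_MACHTYPE_LEN` characters, and the result is still a valid C string.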
     
  • Under some combinations of CONFIG_*, lastpfn in page_is_ram is 'set
    but not used'. Mark it as __maybe_unused to quiet the warning/error.

    Signed-off-by: David Daney
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2033/
    Signed-off-by: Ralf Baechle

    David Daney
     
  • GCC-4.6 can find more unused code than previous versions could.

    In the case of arch/mips/math-emu/ieee754int.h, the COMPXSP and
    COMPXDP macros are used in several places, but a couple of them leave
    xs unused. The easiest thing to do is mark it as __maybe_unused to
    quiet the warning.

    Signed-off-by: David Daney
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2032/
    Signed-off-by: Ralf Baechle

    David Daney
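
Both "set but not used" fixes rely on the same attribute. Here is a minimal sketch, assuming GCC or Clang; `MODEL_MAYBE_UNUSED`, `MODEL_HAS_HIGHMEM`, and `model_page_is_ram` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>

/* Assumption: spelled the way GCC and Clang accept the attribute. */
#define MODEL_MAYBE_UNUSED __attribute__((unused))

/* Hypothetical function: last is only read when MODEL_HAS_HIGHMEM is
 * defined, so without the attribute GCC 4.6 and later can warn that it
 * is "set but not used" in the other configuration. */
static int model_page_is_ram(unsigned long pfn, unsigned long first,
			     unsigned long count)
{
	unsigned long last MODEL_MAYBE_UNUSED = first + count;

	assert(count > 0);
#ifdef MODEL_HAS_HIGHMEM
	if (pfn >= last)
		return 0;
#endif
	return pfn >= first;
}
```

Marking the variable keeps a single definition for every configuration, instead of wrapping the declaration itself in another `#ifdef`.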
     
  • The variable arg3 in _sys_sysmips() is unused. Remove it.

    Signed-off-by: David Daney
    To: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/2034/
    Signed-off-by: Ralf Baechle

    David Daney