09 Feb, 2007

2 commits

  • Conflicts:

    crypto/Kconfig

    David S. Miller
     
  • * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/mfasheh/ocfs2: (22 commits)
    configfs: Zero terminate data in configfs attribute writes.
    [PATCH] ocfs2 heartbeat: clean up bio submission code
    ocfs2: introduce sc->sc_send_lock to protect outbound messages
    [PATCH] ocfs2: drop INET from Kconfig, not needed
    ocfs2_dlm: Add timeout to dlm join domain
    ocfs2_dlm: Silence some messages during join domain
    ocfs2_dlm: disallow a domain join if node maps mismatch
    ocfs2_dlm: Ensure correct ordering of set/clear refmap bit on lockres
    ocfs2: Binds listener to the configured ip address
    ocfs2_dlm: Calling post handler function in assert master handler
    ocfs2: Added post handler callable function in o2net message handler
    ocfs2_dlm: Cookies in locks not being printed correctly in error messages
    ocfs2_dlm: Silence a failed convert
    ocfs2_dlm: wake up sleepers on the lockres waitqueue
    ocfs2_dlm: Dlm dispatch was stopping too early
    ocfs2_dlm: Drop inflight refmap even if no locks found on the lockres
    ocfs2_dlm: Flush dlm workqueue before starting to migrate
    ocfs2_dlm: Fix migrate lockres handler queue scanning
    ocfs2_dlm: Make dlmunlock() wait for migration to complete
    ocfs2_dlm: Fixes race between migrate and dirty
    ...

    Linus Torvalds
     

08 Feb, 2007

30 commits

  • Attributes in configfs are text files. As such, most handlers expect to be
    able to call functions like simple_strtoul() without checking the bounds
    of the buffer. Change the call to zero terminate the buffer before calling
    the client's ->store() method. This does reduce the attribute size from
    PAGE_SIZE to PAGE_SIZE-1.

    Also, change get_zeroed_page() to alloc_page(), as we are handling the
    termination.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
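
    A minimal user-space sketch of the termination step described above. It is
    not the configfs code: SKETCH_PAGE_SIZE, sketch_store() and sketch_write()
    are hypothetical names, and malloc() merely stands in for a page that is
    not zeroed on allocation.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096   /* assumed page size, for illustration only */

/* Hypothetical stand-in for a client's ->store() method. It parses text
 * without checking bounds, so it depends on a terminating '\0'. */
static unsigned long sketch_store(const char *page)
{
    return strtoul(page, NULL, 10);
}

/* Accept at most SKETCH_PAGE_SIZE - 1 bytes, terminate the buffer, then
 * call the handler. Returns the number of bytes accepted. */
static size_t sketch_write(const char *src, size_t count, unsigned long *out)
{
    char *page = malloc(SKETCH_PAGE_SIZE);   /* contents are not zeroed */

    if (!page)
        return 0;
    if (count >= SKETCH_PAGE_SIZE)
        count = SKETCH_PAGE_SIZE - 1;        /* leave room for the '\0' */
    memcpy(page, src, count);
    page[count] = '\0';                      /* explicit termination */
    *out = sketch_store(page);
    free(page);
    return count;
}
```

    Because the terminator is written explicitly, the buffer does not need to
    be zeroed in advance, which is the reasoning behind the allocation change.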
     
  • As Mathieu Avila already pointed out on Thu, 07 Sep 2006 03:15:25 -0700,
    OCFS2 expects bio_add_page() to add pages to BIOs in an easily
    predictable manner.

    That is not true, especially for devices with their own merge_bvec_fn().

    Therefore OCFS2's heartbeat code is very likely to fail on such devices.

    Move the bio_put() call into the bio's bi_end_io() function. This makes the
    whole idea of trying to predict the behaviour of bio_add_page() unnecessary.
    This also removes compute_max_sectors() and o2hb_compute_request_limits().

    Signed-off-by: Philipp Reisner
    Signed-off-by: Mark Fasheh

    Philipp Reisner
     
  • When there is a lot of multithreaded I/O usage, two threads can collide
    while sending out a message to the other nodes. This is due to the lack of
    locking between threads while sending out the messages.

    When a connected TCP send(), sendto(), or sendmsg() arrives in the Linux
    kernel, it eventually comes through tcp_sendmsg(). tcp_sendmsg() protects
    itself by acquiring a lock at invocation by calling lock_sock().
    tcp_sendmsg() then loops over the buffers in the iovec, allocating
    associated sk_buff's and cache pages for use in the actual send. As it does
    so, it pushes the data out to TCP for actual transmission. However, if one
    of those allocations fails (because a large number of large sends is being
    processed, for example), it must wait for memory to become available. It
    does so by jumping to wait_for_sndbuf or wait_for_memory, both of which
    eventually cause a call to sk_stream_wait_memory(). sk_stream_wait_memory()
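    contains a code path that calls sk_wait_event(). Finally, sk_wait_event()
    contains the call to release_sock(). Once the socket lock has been released,
    another thread can enter tcp_sendmsg() on the same socket and interleave
    its data with the partially sent message.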

    The following patch adds a lock to the socket container in order to
    properly serialize outbound requests.

    From: Zhen Wei
    Acked-by: Jeff Mahoney
    Signed-off-by: Mark Fasheh

    Zhen Wei
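
    The serialization idea can be sketched in user space as follows. This is
    an illustration under assumptions, not the o2net code: struct sketch_sc,
    its wire buffer and the helper names are hypothetical, and a pthread mutex
    stands in for the lock added to the socket container.

```c
#include <pthread.h>
#include <string.h>

/* Hypothetical socket container with a per-connection send lock. */
struct sketch_sc {
    pthread_mutex_t send_lock;
    char wire[256];             /* stands in for the TCP byte stream */
    size_t used;
};

static void sketch_sc_init(struct sketch_sc *sc)
{
    pthread_mutex_init(&sc->send_lock, NULL);
    sc->used = 0;
}

/* Append raw bytes to the stream. On its own this does not stop another
 * sender from slipping in between two consecutive calls. */
static int sketch_raw_send(struct sketch_sc *sc, const void *buf, size_t len)
{
    if (len > sizeof(sc->wire) - sc->used)
        return -1;
    memcpy(sc->wire + sc->used, buf, len);
    sc->used += len;
    return 0;
}

/* Send header and payload as one unit by holding the lock across both. */
static int sketch_send_message(struct sketch_sc *sc, const char *hdr,
                               const char *payload)
{
    int ret;

    pthread_mutex_lock(&sc->send_lock);
    ret = sketch_raw_send(sc, hdr, strlen(hdr));
    if (ret == 0)
        ret = sketch_raw_send(sc, payload, strlen(payload));
    pthread_mutex_unlock(&sc->send_lock);
    return ret;
}
```

    The point of the design is that the lock covers the whole message, so it
    stays held even if the lower layer drops its own lock while waiting for
    memory.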
     
  • OCFS2: drop 'depends on INET' since local mounts are now allowed.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Mark Fasheh

    Randy Dunlap
     
  • Currently the ocfs2 dlm has no timeout during dlm join domain. While this is
    not a problem in normal operation, this does become an issue if, say, the
    other node is refusing to let the node join the domain because of a stuck
    recovery. This patch adds a 90-second timeout.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
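
    A rough sketch of a bounded retry loop of this kind. Only the 90-second
    limit comes from the message above; the function names, the simulated
    clock and the -ETIMEDOUT return value are assumptions made for the
    example.

```c
#include <errno.h>

#define SKETCH_JOIN_TIMEOUT_MS 90000u   /* the 90-second limit */

/* Hypothetical join attempts: return 1 on success, 0 for "try again". */
static int sketch_never_joins(void)
{
    return 0;
}

static int sketch_attempts;

static int sketch_joins_on_third_try(void)
{
    return ++sketch_attempts >= 3;
}

/*
 * Retry try_join() until it succeeds or the accumulated backoff exceeds
 * the limit. Time is simulated: backoff_ms (which must be non-zero) is
 * added to *waited_ms where real code would sleep.
 */
static int sketch_join_domain(int (*try_join)(void), unsigned backoff_ms,
                              unsigned *waited_ms)
{
    *waited_ms = 0;
    while (!try_join()) {
        if (*waited_ms > SKETCH_JOIN_TIMEOUT_MS)
            return -ETIMEDOUT;          /* assumed error code */
        *waited_ms += backoff_ms;
    }
    return 0;
}
```

    With a peer that keeps refusing, the loop gives up once the accumulated
    wait passes the limit instead of retrying forever.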
     
  • These messages can easily be activated using the mlog infrastructure
    and don't need to be enabled by default.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • There is a small window where a joining node may not see the node(s) that
    just died but are still part of the domain. To fix this, we must disallow
    join requests if the joining node has a different node map.

    A new field, node_map, is added to dlm_query_join_request to send the
    joining node's nodemap along with the join request. On the receiving end,
    the nodes that are part of the cluster verify that the new node sees all the
    nodes that are still part of the cluster. They disallow the join if the maps
    mismatch.

    Signed-off-by: Srinivas Eeda
    Signed-off-by: Mark Fasheh

    Srinivas Eeda
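
    The check on the receiving end can be pictured as a bitmap comparison. The
    sketch below assumes one bit per node in a 64-bit word and treats a
    mismatch as "the receiver knows a live node that the joiner does not see";
    both the representation and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mark node number `node` (0..63 in this sketch) as present in the map. */
static uint64_t sketch_map_set(uint64_t map, unsigned node)
{
    return map | ((uint64_t)1 << node);
}

/* Allow the join only if every node in the receiver's domain map also
 * appears in the map sent by the joining node. */
static bool sketch_join_allowed(uint64_t domain_map, uint64_t joiner_map)
{
    return (domain_map & ~joiner_map) == 0;
}
```

    For example, if the domain holds nodes 0, 1 and 2 but the joiner's map
    lacks node 1, the join request is refused.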
     
  • Even though the set refmap bit message is sent before the clear refmap
    message, there is currently no guarantee that the set message will be
    handled before the clear. This patch prevents the clear refmap from being
    processed while the node is sending assert master messages to other
    nodes. (The set refmap message is sent as a response to the assert
    master request.)

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • This patch binds the o2net listener to the configured ip address
    instead of INADDR_ANY for security. Fixes oss.oracle.com bugzilla#814.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • This patch prevents the dlm from sending the clear refmap message
    before the set refmap. We use the newly created post function handler
    routine to accomplish the task.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Kurt Hackel
     
  • Currently o2net allows one handler function per message type. This
    patch adds the ability to call another function after the handler has
    returned the message to the other node.

    Handlers are now given the option of returning a context (in the form of a
    void **) which will be passed back into the post message handler function.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Kurt Hackel
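
    A small sketch of the handler and post-handler pairing. The types and
    names here are hypothetical and much simpler than the o2net ones; the only
    parts taken from the message above are the void ** context and the fact
    that the post handler runs after the reply has gone out.

```c
#include <stddef.h>

typedef int (*sketch_handler_fn)(const char *msg, void **ret_data);
typedef void (*sketch_post_fn)(int status, void *ret_data);

struct sketch_msg_type {
    sketch_handler_fn handler;
    sketch_post_fn post;            /* optional, may be NULL */
};

/* Event trace so the ordering can be inspected: H, R, P. */
static char sketch_trace[8];
static size_t sketch_trace_len;

static void sketch_note(char ev)
{
    if (sketch_trace_len < sizeof(sketch_trace) - 1)
        sketch_trace[sketch_trace_len++] = ev;
}

static int sketch_ctx_value = 7;
static int sketch_seen_ctx;

static int sketch_example_handler(const char *msg, void **ret_data)
{
    (void)msg;
    sketch_note('H');
    *ret_data = &sketch_ctx_value;  /* context for the post handler */
    return 0;
}

static void sketch_example_post(int status, void *ret_data)
{
    (void)status;
    sketch_note('P');
    sketch_seen_ctx = *(int *)ret_data;
}

static int sketch_dispatch(const struct sketch_msg_type *t, const char *msg)
{
    void *ret_data = NULL;
    int status = t->handler(msg, &ret_data);

    sketch_note('R');               /* stands in for sending the reply */
    if (t->post)
        t->post(status, ret_data);  /* runs only after the reply */
    return status;
}
```

    Dispatching one message with both callbacks set records the events in the
    order handler, reply, post handler.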
     
  • The dlm encodes the node number and a sequence number in the lock cookie.
    It also stores the cookie in the lockres in big-endian format to avoid
    swapping 8 bytes on each lock request. The bug here was that the code
    assumed the cookie was in CPU byte order when decoding it for printing the
    error message. This patch swaps the bytes before the print.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Kurt Hackel
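
    To make the byte-order issue concrete, here is a portable sketch. The bit
    layout (node number in the top 8 bits, sequence number in the low 56) is
    an assumption made for the example, as are all function names.

```c
#include <stdint.h>

/* Interpret the in-memory bytes of `be` as a big-endian 64-bit value. */
static uint64_t sketch_be64_to_cpu(uint64_t be)
{
    const unsigned char *p = (const unsigned char *)&be;
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Produce a value whose in-memory bytes are `v` in big-endian order. */
static uint64_t sketch_cpu_to_be64(uint64_t v)
{
    uint64_t be;
    unsigned char *p = (unsigned char *)&be;

    for (int i = 7; i >= 0; i--) {
        p[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
    return be;
}

/* Assumed layout: node in the top 8 bits, sequence in the low 56 bits. */
static unsigned sketch_cookie_node(uint64_t cpu_cookie)
{
    return (unsigned)(cpu_cookie >> 56);
}

static uint64_t sketch_cookie_seq(uint64_t cpu_cookie)
{
    return cpu_cookie & 0x00ffffffffffffffULL;
}
```

    Decoding must go through sketch_be64_to_cpu() first. On a little-endian
    machine, applying the field helpers directly to the stored value would
    print the wrong node and sequence numbers.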
     
  • When the lockres is in migrate or recovery state, all convert requests
    are denied with the appropriate error status that is handled on the
    requester node. This patch silences the erroneous error message printed
    on the master node.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Kurt Hackel
     
  • The dlm was not waking up threads waiting on the lockres wait queue
    for the lockres to no longer be in the DLM_LOCK_RES_IN_PROGRESS
    and the DLM_LOCK_RES_MIGRATING states.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Kurt Hackel
     
  • dlm_dispatch_work stopped processing the queued-up tasks at
    the first sign of the node leaving the domain, leading not
    only to incomplete tasks but also to a mismatch in the dlm refcnt.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Kurt Hackel
     
  • Signed-off-by: Kurt Hackel
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Kurt Hackel
     
  • This is to prevent the condition in which a previously queued
    up assert master asserts after we start the migration. Now
    migration ensures the workqueue is flushed before proceeding
    with migrating the lock to another node. This condition is
    typically encountered during parallel umounts.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Kurt Hackel
     
  • The migrate lockres handler was searching only the expected queue
    for its lock on the migrated lockres. This could be problematic
    as the new master could have also issued a convert request
    during the migration and thus moved the lock to the convert queue.
    We now search for the lock on all three queues.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Kurt Hackel
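
    A sketch of searching every queue rather than only the expected one. The
    queue names (granted, converting, blocked) and the structures are
    assumptions chosen for illustration, with locks matched by cookie.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    SKETCH_GRANTED,
    SKETCH_CONVERTING,
    SKETCH_BLOCKED,
    SKETCH_NUM_QUEUES
};

struct sketch_lock {
    uint64_t cookie;
    struct sketch_lock *next;
};

struct sketch_lockres {
    struct sketch_lock *queues[SKETCH_NUM_QUEUES];
};

/* Look for the lock on all queues. On success, report which queue held it
 * through *queue_out (if non-NULL). Returns NULL when not found. */
static struct sketch_lock *sketch_find_lock(struct sketch_lockres *res,
                                            uint64_t cookie, int *queue_out)
{
    for (int q = 0; q < SKETCH_NUM_QUEUES; q++) {
        for (struct sketch_lock *l = res->queues[q]; l; l = l->next) {
            if (l->cookie == cookie) {
                if (queue_out)
                    *queue_out = q;
                return l;
            }
        }
    }
    return NULL;
}
```

    A lock that moved to the converting queue during migration is still found
    this way, which a search limited to one queue would miss.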
     
  • dlmunlock() was not waiting for migration to complete before releasing locks
    on locally mastered locks.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Kurt Hackel
     
  • dlmthread was removing lockres' from the dirty list
    and resetting the dirty flag before shuffling the list.
    This patch retains the dirty state flag until the lists
    are shuffled.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Kurt Hackel
     
  • This patch makes some needlessly global functions static.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Mark Fasheh

    Adrian Bunk
     
  • This was previously broken and migration of some locks had to be temporarily
    disabled. We use a new (and backward-incompatible) set of network messages
    to account for all references to a lock resource held across the cluster.
    Once these are all freed, the master node may then free the lock resource
    memory once its local references are dropped.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Mark Fasheh

    Kurt Hackel
     
  • The problem: when implementing a network namespace, I need to be able
    to have multiple network devices with the same name. Currently this
    is a problem for /sys/class/net/*.

    What I want is a separate /sys/class/net directory in sysfs for each
    network namespace, and I want to name each of them /sys/class/net.

    I looked and the VFS actually allows that. All that is needed is
    for /sys/class/net to implement a follow link method to redirect
    lookups to the real directory you want.

    Implementing a follow link method that is sensitive to the current
    network namespace turns out to be 3 lines of code so it looks like a
    clean approach. Modifying sysfs so it doesn't get in my way is a bit
    trickier.

    I am calling the concept of multiple directories all at the same path
    in the filesystem shadow directories, and the directory entry that is
    really at that location the shadow master.

    The following patch modifies sysfs so it can handle a directory
    structure slightly different from the kobject tree so I can implement
    the shadow directories for handling /sys/class/net/.

    Signed-off-by: Eric W. Biederman
    Cc: Maneesh Soni
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • If a driver returns an error in fill_read_buffer(), the buffer will be
    marked as filled. Subsequent reads will return EOF. But there is
    no data because of an error, not because it has been read.
    Not marking the buffer filled is the obvious fix.

    Signed-off-by: Oliver Neukum
    Signed-off-by: Greg Kroah-Hartman

    Oliver Neukum
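
    The fix can be pictured with the simplified sketch below. The structure,
    the callback type and the two example callbacks are hypothetical. The
    point it shows is that the "filled" state changes only when the driver
    callback succeeds.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct sketch_rbuf {
    char page[64];
    int count;
    int needs_read_fill;    /* 1 until the buffer holds valid data */
};

typedef int (*sketch_show_fn)(char *page, size_t size);

/* Hypothetical driver callback that fails. */
static int sketch_show_fail(char *page, size_t size)
{
    (void)page;
    (void)size;
    return -EIO;
}

/* Hypothetical driver callback that produces three bytes of text. */
static int sketch_show_ok(char *page, size_t size)
{
    const char text[] = "42\n";

    if (size < sizeof(text))
        return -EINVAL;
    memcpy(page, text, sizeof(text));
    return (int)(sizeof(text) - 1);
}

static int sketch_fill_read_buffer(struct sketch_rbuf *b, sketch_show_fn show)
{
    int count = show(b->page, sizeof(b->page));

    if (count < 0)
        return count;       /* leave needs_read_fill set so a retry refills */
    b->needs_read_fill = 0;
    b->count = count;
    return 0;
}
```

    After a failed fill the buffer still reports that it needs filling, so a
    later read calls the driver again instead of returning EOF.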
     
  • This patch removes redundant argument checks for kobject_put().

    Signed-off-by: Mariusz Kozlowski
    Signed-off-by: Greg Kroah-Hartman

    Mariusz Kozlowski
     
  • Lockdep issues the following warning:
    [ 9.064000] =============================================
    [ 9.064000] [ INFO: possible recursive locking detected ]
    [ 9.064000] 2.6.20-rc3-mm1 #3
    [ 9.064000] ---------------------------------------------
    [ 9.064000] init/1 is trying to acquire lock:
    [ 9.064000] (&sysfs_inode_imutex_key){--..}, at: [] mutex_lock+0x1c/0x1f
    [ 9.064000]
    [ 9.064000] but task is already holding lock:
    [ 9.064000] (&sysfs_inode_imutex_key){--..}, at: [] mutex_lock+0x1c/0x1f
    [ 9.065000]
    [ 9.065000] other info that might help us debug this:
    [ 9.065000] 2 locks held by init/1:
    [ 9.065000] #0: (tty_mutex){--..}, at: [] mutex_lock+0x1c/0x1f
    [ 9.065000] #1: (&sysfs_inode_imutex_key){--..}, at: [] mutex_lock+0x1c/0x1f
    [ 9.065000]
    [ 9.065000] stack backtrace:
    [ 9.065000] [] show_trace_log_lvl+0x1a/0x30
    [ 9.066000] [] show_trace+0x12/0x14
    [ 9.066000] [] dump_stack+0x16/0x18
    [ 9.066000] [] print_deadlock_bug+0xb9/0xc3
    [ 9.066000] [] check_deadlock+0x55/0x5a
    [ 9.066000] [] __lock_acquire+0x371/0xbf0
    [ 9.066000] [] lock_acquire+0x69/0x83
    [ 9.066000] [] __mutex_lock_slowpath+0x75/0x2d1
    [ 9.066000] [] mutex_lock+0x1c/0x1f
    [ 9.066000] [] sysfs_drop_dentry+0xb1/0x133
    [ 9.066000] [] sysfs_hash_and_remove+0xb3/0x142
    [ 9.066000] [] sysfs_remove_file+0xd/0x10
    [ 9.067000] [] device_remove_file+0x23/0x2e
    [ 9.067000] [] device_del+0x188/0x1e6
    [ 9.067000] [] device_unregister+0xb/0x15
    [ 9.067000] [] device_destroy+0x9c/0xa9
    [ 9.067000] [] vcs_remove_sysfs+0x1c/0x3b
    [ 9.067000] [] con_close+0x5e/0x6b
    [ 9.067000] [] release_dev+0x4c4/0x6e5
    [ 9.067000] [] tty_release+0x12/0x1c
    [ 9.067000] [] __fput+0x177/0x1a0
    [ 9.067000] [] fput+0x3b/0x41
    [ 9.068000] [] filp_close+0x36/0x65
    [ 9.068000] [] sys_close+0x63/0xa4
    [ 9.068000] [] sysenter_past_esp+0x5f/0x99
    [ 9.068000] =======================

    This is due to sysfs_hash_and_remove() holding dir->d_inode->i_mutex
    before calling sysfs_drop_dentry() which calls orphan_all_buffers()
    which in turn takes node->i_mutex.

    Signed-off-by: Frederik Deweerdt
    Cc: Oliver Neukum
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Frederik Deweerdt
     
  • This patch prevents a race between IO and removing a file from sysfs.
    It introduces a list of sysfs_buffers associated with a file at the inode.
    Upon removal of a file the list is walked and the buffers marked orphaned.
    IO to orphaned buffers fails with -ENODEV. The driver can safely free
    associated data structures or be unloaded.

    Signed-off-by: Oliver Neukum
    Acked-by: Maneesh Soni
    Signed-off-by: Greg Kroah-Hartman

    Oliver Neukum
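
    A stripped-down sketch of the orphaning scheme. It omits all locking, and
    every name in it is hypothetical. Only the list of buffers at the inode,
    the orphaned marking and the -ENODEV result come from the description
    above.

```c
#include <errno.h>
#include <stddef.h>

struct sketch_obuf {
    int orphaned;
    struct sketch_obuf *next;
};

struct sketch_file_inode {
    struct sketch_obuf *buffers;    /* buffers of all open instances */
};

/* Opening the file links a new buffer into the per-inode list. */
static void sketch_open(struct sketch_file_inode *inode,
                        struct sketch_obuf *buf)
{
    buf->orphaned = 0;
    buf->next = inode->buffers;
    inode->buffers = buf;
}

/* Removing the file walks the list and marks every buffer orphaned. */
static void sketch_remove_file(struct sketch_file_inode *inode)
{
    for (struct sketch_obuf *b = inode->buffers; b; b = b->next)
        b->orphaned = 1;
}

/* I/O on an orphaned buffer fails without touching driver data. */
static int sketch_io(const struct sketch_obuf *buf)
{
    return buf->orphaned ? -ENODEV : 0;
}
```

    Once sketch_remove_file() has run, every open buffer fails its I/O with
    -ENODEV, so nothing reaches the driver's data afterwards.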
     
  • If we allow NULL as the new parent in device_move(), we need to make sure
    that the device is placed in the same location as it would be if it were
    newly registered:

    - Consider the device virtual tree. In order to be able to reuse code,
    setup_parent() has been tweaked a bit.
    - kobject_move() can fall back to the kset's kobject.
    - sysfs_move_dir() uses the sysfs root dir as fallback.

    Signed-off-by: Cornelia Huck
    Cc: Marcel Holtmann
    Signed-off-by: Greg Kroah-Hartman

    Cornelia Huck
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
    JFS: Remove incorrect kgdb define
    JFS: call io_schedule() instead of schedule() to avoid deadlock
    JFS: Add lockdep annotations
    JFS: Avoid BUG() on a damaged file system

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (57 commits)
    [GFS2] make gfs2_writepages() static
    [GFS2] Unlock page on prepare_write try lock failure
    [GFS2] nfsd readdirplus assertion failure
    [DLM] fix softlockup in dlm_recv
    [DLM] zero new user lvbs
    [DLM/GFS2] indent help text
    [GFS2] Fix unlink deadlocks
    [GFS2] Put back semaphore to avoid umount problem
    [GFS2] more CURRENT_TIME_SEC
    [GFS2/DLM] fix GFS2 circular dependency
    [GFS2/DLM] use sysfs
    [GFS2] make lock_dlm drop_count tunable in sysfs
    [GFS2] increase default lock limit
    [GFS2] Fix list corruption in lops.c
    [GFS2] Fix recursive locking attempt with NFS
    [DLM] can miss clearing resend flag
    [DLM] saved dlm message can be dropped
    [DLM] Make sock_sem into a mutex
    [GFS2] Fix typo in glock.c
    [GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2
    ...

    Linus Torvalds
     

07 Feb, 2007

7 commits


06 Feb, 2007

1 commit