26 Sep, 2006

3 commits

  • There is a potential deadlock in the driver core. It boils down to
    the fact that bus_remove_device() calls klist_remove() instead of
    klist_del(), thereby waiting until the reference count of the
    klist_node in the bus's klist of devices drops to 0. The refcount
    can't reach 0 so long as a modprobe process is trying to bind a new
    driver to the device being removed, by calling __driver_attach(). The
    problem is that __driver_attach() tries to acquire the device's
    parent's semaphore, but the caller of bus_remove_device() is quite
    likely to own that semaphore already.

    It isn't sufficient just to replace klist_remove() with klist_del().
    Doing so runs the risk that the device would remain on the bus's klist
    of devices for some time, and so could be bound to another driver even
    after it was unregistered. What's needed is a new way to distinguish
    whether or not a device is registered, based on a criterion other than
    whether its klist_node is linked into the bus's klist of devices. That
    way driver binding can fail when the device is unregistered, even if
    it is still linked into the klist.

    This patch (as782) implements the solution, by adding a new bitflag to
    indiate when a struct device is registered, by testing the flag before
    allowing a driver to bind a device, and by changing the definition of
    the device_is_registered() inline.

    Signed-off-by: Alan Stern
    Signed-off-by: Greg Kroah-Hartman

    Alan Stern
     
  • This adds the infrastructure for drivers to do a threaded probe, and
    waits at init time for all currently outstanding probes to complete.

    A new kernel thread will be created when the probe() function for the
    driver is called, if the multithread_probe bit is set in the driver
    saying it can support this kind of operation.

    I have tested this with USB and PCI, and it works, and shaves off a lot
    of time in the boot process, but there are issues with finding root boot
    disks, and some USB drivers assume that this can never happen, so it is
    currently not enabled for any bus type. Individual drivers can enable
    this right now if they wish, and bus authors can selectivly turn it on
    as well, once they determine that their subsystem will work properly
    with it.

    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • Add lots of return-value checking.

    : fix bus_rescan_devices()]
    Cc: "Randy.Dunlap"
    Signed-off-by: Cornelia Huck
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Andrew Morton
     

15 Apr, 2006

1 commit


14 Jan, 2006

1 commit

  • Add bus_type probe, remove and shutdown methods to replace the
    corresponding methods in struct device_driver. This matches
    the way we handle the suspend/resume methods.

    Since the bus methods override the device_driver methods, warn
    if a device driver is registered whose methods will not be
    called.

    The long-term idea is to remove the device_driver methods entirely.

    Signed-off-by: Russell King
    Signed-off-by: Greg Kroah-Hartman

    Russell King
     

05 Jan, 2006

1 commit

  • This patch (as604) makes the driver core hold a device's parent's lock
    as well as the device's lock during calls to the probe and remove
    methods in a driver. This facility is needed by USB device drivers,
    owing to the peculiar way USB devices work:

    A device provides multiple interfaces, and drivers are bound
    to interfaces rather than to devices;

    Nevertheless a reset, reset-configuration, suspend, or resume
    affects the entire device and requires the caller to hold the
    lock for the device, not just a lock for one of the interfaces.

    Since a USB driver's probe method is always called with the interface
    lock held, the locking order rules (always lock parent before child)
    prevent these methods from acquiring the device lock. The solution
    provided here is to call all probe and remove methods, for all devices
    (not just USB), with the parent lock already acquired.

    Although currently only the USB subsystem requires these changes, people
    have mentioned in prior discussion that the overhead of acquiring an
    extra semaphore in all the prove/remove sequences is not overly large.

    Up to now, the USB core has been using its own set of private
    semaphores. A followup patch will remove them, relying entirely on the
    device semaphores provided by the driver core.

    The code paths affected by this patch are:

    device_add and device_del: The USB core already holds the parent
    lock, so no actual change is needed.

    driver_register and driver_unregister: The driver core will now
    lock both the parent and the device before probing or removing.

    driver_bind and driver_unbind (in sysfs): These routines will
    now lock both the parent and the device before binding or
    unbinding.

    bus_rescan_devices: The helper routine will lock the parent
    before probing a device.

    I have not tested this patch for conflicts with other subsystems. As
    far as I can see, the only possibility of conflict would lie in the
    bus_rescan_devices pathway, and it seems pretty remote. Nevertheless,
    it would be good for this to get a lot of testing in -mm.

    Signed-off-by: Alan Stern
    Signed-off-by: Greg Kroah-Hartman

    Alan Stern
     

24 Nov, 2005

1 commit

  • This patch (as603) makes a few small fixes to the driver core:

    Change spin_lock_irq for a klist lock to spin_lock;

    Fix reference count leaks;

    Minor spelling and formatting changes.

    Signed-off-by: Alan Stern
    Acked-by Patrick Mochel
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Linus Torvalds

    Alan Stern
     

22 Sep, 2005

1 commit

  • bus_rescan_devices_helper() does not hold the dev->sem when it checks for
    !dev->driver(). device_attach() holds the sem, but calls again
    device_bind_driver() even when dev->driver is set.

    What happens is that a first device_attach() call (module insertion time)
    is on the way binding the device to a driver. Another thread calls
    bus_rescan_devices(). Now when bus_rescan_devices_helper() checks for
    dev->driver it is still NULL 'cos the the prior device_attach() is not yet
    finished. But as soon as the first one releases the dev->sem the second
    device_attach() tries to rebind the already bound device again.
    device_bind_driver() does this blindly which leads to a corrupt
    driver->klist_devices list (the device links itself, the head points to the
    device). Later a call to device_release_driver() sets dev->driver to NULL
    and breaks the link it has to itself on knode_driver. Rmmoding the driver
    later calls driver_detach() which leads to an endless loop 'cos the list
    head in klist_devices still points to the device. And since dev->driver is
    NULL it's stuck with the same device forever. Boom. And rmmod hangs.

    Very easy to reproduce with new-style pcmcia and a 16bit card. Just loop
    modprobe ;cardctl eject; rmmod .

    Easiest fix is to check if the device is already bound to a driver in
    device_bind_driver(). This avoids the double binding.

    Signed-off-by: Daniel Ritz
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Linus Torvalds

    Daniel Ritz
     

06 Sep, 2005

1 commit


30 Jun, 2005

1 commit

  • This adds a single file, "bind", to the sysfs directory of every driver
    registered with the driver core. To bind a device to a driver, write
    the bus id of the device you wish to bind to that specific driver to the
    "bind" file (remember to not add a trailing \n). If that bus id matches
    a device on that bus, and it does not currently have a driver bound to
    it, the probe sequence will be initiated with that driver and device.

    Note, this requires that the driver itself be willing and able to accept
    that device (usually through a device id type table). This patch does
    not make it possible to override the driver's id table.

    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

21 Jun, 2005

8 commits

  • The error handling in bus_add_device() and device_attach() is simply
    non-existing. This patch propagates any error from device_attach to
    the upper layers to allow for a proper recovery.

    From: Hannes Reinecke
    Signed-off-by: Greg Kroah-Hartman

    Hannes Reinecke
     
  • This patch is intended for your "driver" tree. It fixes several subtle
    races in driver_detach() and device_release_driver() in the driver-model
    core.

    The major change is to use klist_remove() rather than klist_del() when
    taking a device off its driver's list. There's no other way to guarantee
    that the list pointers will be updated before some other driver binds to
    the device. For this to work driver_detach() can't use a klist iterator,
    so the loop over the devices must be written out in full. In addition the
    patch protects against the possibility that, when a driver and a device
    are unregistered at the same time, one may be unloaded from memory before
    the other is finished using it.

    Signed-off-by: Alan Stern
    Signed-off-by: Greg Kroah-Hartman

    Alan Stern
     
  • There's no check to see if the device is already bound to a driver, which
    could do bad things. The first thing to go wrong is that it will try to match
    a driver with a device already bound to one. In some cases (it appears with
    USB with drivers/usb/core/usb.c::usb_match_id()), some drivers will match a
    device based on the class type, so it would be common (especially for HID
    devices) to match a device that is already bound.

    The fun comes when ->probe() is called, it fails, then
    driver_probe_device() does this:

    dev->driver = NULL;

    Later on, that pointer could be be dereferenced without checking and cause
    hell to break loose.

    This problem could be nasty. It's very hardware dependent, since some
    devices could have a different set of matching qualifiers than others.

    Now, I don't quite see exactly where/how you were getting that crash.
    You're dereferencing bad memory, but I'm not sure which pointer was bad
    and where it came from, but it could have come from a couple of different
    places.

    The patch below will hopefully fix it all up for you. It's against
    2.6.12-rc2-mm1, and does the following:

    - Move logic to driver_probe_device() and comments uncommon returns:
    1 - If device is bound
    0 - If device not bound, and no error
    error - If there was an error.

    - Move locking to caller of that function, since we want to lock a
    device for the entire time we're trying to bind it to a driver (to
    prevent against a driver being loaded at the same time).

    - Update __device_attach() and __driver_attach() to do that locking.

    - Check if device is already bound in __driver_attach()

    - Update the converse device_release_driver() so it locks the device
    around all of the operations.

    - Mark driver_probe_device() as static and remove export. It's an
    internal function, it should stay that way, and there are no other
    callers. If there is ever a need to export it, we can audit it as
    necessary.

    Signed-off-by: Andrew Morton

    Patrick Mochel
     
  • Also stops looping over the lists when a match is found.

    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de

    gregkh@suse.de
     
  • - Can't wait on removing the current item in the list (the positive refcount *because*
    we are using it causes it to deadlock).

    Signed-off-by: Patrick Mochel
    Signed-off-by: Greg Kroah-Hartman

    mochel@digitalimplant.org
     
  • - Now possible, since the lists are locked using the klist lock and not the
    global rwsem.

    Signed-off-by: Patrick Mochel
    Signed-off-by: Greg Kroah-Hartman

    mochel@digitalimplant.org
     
  • - Use it in driver_for_each_device() instead of the regular list_head and stop using
    the bus's rwsem for protection.
    - Use driver_for_each_device() in driver_detach() so we don't deadlock on the
    bus's rwsem.
    - Remove ->devices.
    - Move klist access and sysfs link access out from under device's semaphore, since
    they're synchronized through other means.

    Signed-off-by: Patrick Mochel
    Signed-off-by: Greg Kroah-Hartman

    mochel@digitalimplant.org
     
  • This relocates the driver binding/unbinding code to drivers/base/dd.c. This is done
    for two reasons: One, it's not code related to the bus_type itself; it uses some from
    that, some from devices, and some from drivers. And Two, it will make it easier to do
    some of the upcoming lock removal on that code..

    Signed-off-by: Patrick Mochel
    Signed-off-by: Greg Kroah-Hartman

    mochel@digitalimplant.org