Commit 0ac49527318bc388a881152d60f49d7951606024

Authored by Maciej W. Rozycki
Committed by David S. Miller
1 parent f7ab697d32

PHYLIB: IRQ event workqueue handling fixes

Keep track of disable_irq_nosync() invocations and call enable_irq() the
right number of times if work that would have included those calls has been
cancelled.

Now that the call to flush_work_keventd() (problematic because of
rtnl_mutex being held) has been replaced by cancel_work_sync() another
issue has arisen and been left unresolved.  As the MDIO bus cannot be
accessed from interrupt context, the PHY interrupt handler uses
disable_irq_nosync() to prevent it from looping and schedules some work to
be done in process context which, apart from handling the state change of
the originating PHY, is responsible for reenabling the interrupt.  Now if
the interrupt line is shared with another device and a queued call to the
work function has been cancelled, that call to enable_irq() never happens
and the other device cannot use its interrupt anymore, as it is stuck
disabled.

I decided to use a counter rather than a flag because there may be more
than one call to phy_change() cancelled in the queue -- a real one and a
fake one triggered by free_irq() if DEBUG_SHIRQ is used, if nothing else.
Therefore, because of its nesting property, enable_irq() has to be called
as many times as disable_irq_nosync() was called in order to restore the
original state.  This DEBUG_SHIRQ feature is also the
reason why free_irq() has to be called before cancel_work_sync().

While at it I updated the comment about phy_stop_interrupts() being called
from `keventd' -- this is no longer relevant as the use of
cancel_work_sync() makes such an approach unnecessary.  OTOH a similar
comment referring to flush_scheduled_work() in phy_stop() still applies as
using cancel_work_sync() there would be dangerous.

Checked with checkpatch.pl and at run time (both with and without
DEBUG_SHIRQ).

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Andy Fleming <afleming@freescale.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

Showing 2 changed files with 22 additions and 5 deletions

drivers/net/phy/phy.c
... ... @@ -7,7 +7,7 @@
7 7 * Author: Andy Fleming
8 8 *
9 9 * Copyright (c) 2004 Freescale Semiconductor, Inc.
10   - * Copyright (c) 2006 Maciej W. Rozycki
  10 + * Copyright (c) 2006, 2007 Maciej W. Rozycki
11 11 *
12 12 * This program is free software; you can redistribute it and/or modify it
13 13 * under the terms of the GNU General Public License as published by the
... ... @@ -35,6 +35,7 @@
35 35 #include <linux/timer.h>
36 36 #include <linux/workqueue.h>
37 37  
  38 +#include <asm/atomic.h>
38 39 #include <asm/io.h>
39 40 #include <asm/irq.h>
40 41 #include <asm/uaccess.h>
... ... @@ -562,6 +563,7 @@
562 563 * queue will write the PHY to disable and clear the
563 564 * interrupt, and then reenable the irq line. */
564 565 disable_irq_nosync(irq);
  566 + atomic_inc(&phydev->irq_disable);
565 567  
566 568 schedule_work(&phydev->phy_queue);
567 569  
... ... @@ -632,6 +634,7 @@
632 634  
633 635 INIT_WORK(&phydev->phy_queue, phy_change);
634 636  
  637 + atomic_set(&phydev->irq_disable, 0);
635 638 if (request_irq(phydev->irq, phy_interrupt,
636 639 IRQF_SHARED,
637 640 "phy_interrupt",
638 641  
639 642  
640 643  
... ... @@ -662,14 +665,23 @@
662 665 if (err)
663 666 phy_error(phydev);
664 667  
  668 + free_irq(phydev->irq, phydev);
  669 +
665 670 /*
666   - * Finish any pending work; we might have been scheduled to be called
667   - * from keventd ourselves, but cancel_work_sync() handles that.
  671 + * Cannot call flush_scheduled_work() here as desired because
  672 + * of rtnl_lock(), but we do not really care about what would
  673 + * be done, except from enable_irq(), so cancel any work
  674 + * possibly pending and take care of the matter below.
668 675 */
669 676 cancel_work_sync(&phydev->phy_queue);
  677 + /*
  678 + * If work indeed has been cancelled, disable_irq() will have
  679 + * been left unbalanced from phy_interrupt() and enable_irq()
  680 + * has to be called so that other devices on the line work.
  681 + */
  682 + while (atomic_dec_return(&phydev->irq_disable) >= 0)
  683 + enable_irq(phydev->irq);
670 684  
671   - free_irq(phydev->irq, phydev);
672   -
673 685 return err;
674 686 }
675 687 EXPORT_SYMBOL(phy_stop_interrupts);
... ... @@ -695,6 +707,7 @@
695 707 phydev->state = PHY_CHANGELINK;
696 708 spin_unlock_bh(&phydev->lock);
697 709  
  710 + atomic_dec(&phydev->irq_disable);
698 711 enable_irq(phydev->irq);
699 712  
700 713 /* Reenable interrupts */
... ... @@ -708,6 +721,7 @@
708 721  
709 722 irq_enable_err:
710 723 disable_irq(phydev->irq);
  724 + atomic_inc(&phydev->irq_disable);
711 725 phy_err:
712 726 phy_error(phydev);
713 727 }
include/linux/phy.h
... ... @@ -25,6 +25,8 @@
25 25 #include <linux/timer.h>
26 26 #include <linux/workqueue.h>
27 27  
  28 +#include <asm/atomic.h>
  29 +
28 30 #define PHY_BASIC_FEATURES (SUPPORTED_10baseT_Half | \
29 31 SUPPORTED_10baseT_Full | \
30 32 SUPPORTED_100baseT_Half | \
... ... @@ -281,6 +283,7 @@
281 283 /* Interrupt and Polling infrastructure */
282 284 struct work_struct phy_queue;
283 285 struct timer_list phy_timer;
  286 + atomic_t irq_disable;
284 287  
285 288 spinlock_t lock;
286 289