Commit 08bafc0341f2f7920e9045bc32c40299cac8c21b

Authored by Keith Mannthey
Committed by Jens Axboe
1 parent 7c239517d9

block: Suppress Buffer I/O errors when SCSI REQ_QUIET flag set

Allow the scsi request REQ_QUIET flag to be propagated to the buffer
file system layer. The basic idea is to pass the flag from the scsi
request to the bio (block IO) and then to the buffer layer.  The buffer
layer can then suppress needless printks.
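
The three-step hand-off described above can be sketched as a small
userspace model. This is only an illustration under stated assumptions:
every type, flag value, and function name prefixed with model_ is a
hypothetical stand-in, not a kernel definition, and the ratelimit result
is passed in as a plain boolean rather than taken from printk_ratelimit().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-layer quiet bits (not kernel values). */
#define MODEL_REQ_QUIET (1u << 0)	/* set on the request */
#define MODEL_BIO_QUIET (1u << 0)	/* set on the bio */
#define MODEL_BH_QUIET  (1u << 0)	/* set on the buffer head */

struct model_request { unsigned int cmd_flags; };
struct model_bio     { unsigned int bi_flags; };
struct model_bh      { unsigned int b_state; };

/* Step 1: when a request completes, copy its quiet hint onto the bio. */
static void model_request_done(const struct model_request *rq,
			       struct model_bio *bio)
{
	assert(rq != NULL && bio != NULL);
	if (rq->cmd_flags & MODEL_REQ_QUIET)
		bio->bi_flags |= MODEL_BIO_QUIET;
}

/* Step 2: when the bio completes, copy the hint onto the buffer head. */
static void model_bio_done(const struct model_bio *bio, struct model_bh *bh)
{
	assert(bio != NULL && bh != NULL);
	if (bio->bi_flags & MODEL_BIO_QUIET)
		bh->b_state |= MODEL_BH_QUIET;
}

/*
 * Step 3: decide whether the buffer layer should stay silent.
 * Returns true when the message is suppressed, either because the
 * quiet bit is set or because the (modelled) rate limiter said no.
 */
static bool model_quiet_error(const struct model_bh *bh, bool ratelimit_ok)
{
	assert(bh != NULL);
	if (!(bh->b_state & MODEL_BH_QUIET) && ratelimit_ok)
		return false;
	return true;
}
```

In this model a buffer whose request carried the quiet bit is always
silent, while an unmarked buffer is reported only when the rate limiter
allows it.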

This patch declutters the kernel log by removing the 40-50 (per LUN)
buffer I/O error messages seen during boot in my multipath setup.
Without this patch, there is a good chance that real errors will be
missed in the "noise" in the logs.

During boot I see blocks of messages like
"
__ratelimit: 211 callbacks suppressed
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242847
Buffer I/O error on device sdm, logical block 1
Buffer I/O error on device sdm, logical block 5242878
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242872
"
in my logs.

My disk environment is multipath Fibre Channel using the SCSI_DH_RDAC
code and multipathd.  This topology includes an "active" and a "ghost"
path for each LUN. I/Os to the "ghost" path will never complete, so the
SCSI layer, via the SCSI device handler RDAC code, quickly returns the
I/Os sent to these paths and sets the REQ_QUIET SCSI flag to suppress
the SCSI layer messages.
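
As a rough illustration of the quick return described above, the sketch
below models a device handler that rejects I/O on a non-active path and
marks the request quiet. It is a userspace model under stated
assumptions: the enum values, struct, and function names are all
hypothetical and are not the SCSI_DH_RDAC implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the request-level quiet bit. */
#define MODEL_REQ_QUIET (1u << 0)

/* Hypothetical path states and prep results for this model only. */
enum model_path_state  { MODEL_PATH_ACTIVE, MODEL_PATH_GHOST };
enum model_prep_result { MODEL_PREP_OK, MODEL_PREP_KILL };

struct model_request { unsigned int cmd_flags; };

/*
 * Fail the request immediately when the path is not active, and set
 * the quiet bit so that later layers can skip their error messages.
 */
static enum model_prep_result
model_prep_request(enum model_path_state state, struct model_request *rq)
{
	assert(rq != NULL);
	if (state != MODEL_PATH_ACTIVE) {
		rq->cmd_flags |= MODEL_REQ_QUIET;
		return MODEL_PREP_KILL;
	}
	return MODEL_PREP_OK;
}
```

Requests on the active path pass through with their flags untouched, so
genuine errors on that path are still reported.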

I would like to extend the QUIET behavior to include the buffer file
system layer to deal with these errors as well. I have been running this
patch for a while now on several boxes without issue.  A few runs of
bonnie++ show no noticeable difference in performance in my setup.

Thanks to John Stultz for the quiet_error finalization.

Submitted-by:  Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Showing 4 changed files with 20 additions and 4 deletions

@@ -153,6 +153,9 @@
 			nbytes = bio->bi_size;
 		}
 
+		if (unlikely(rq->cmd_flags & REQ_QUIET))
+			set_bit(BIO_QUIET, &bio->bi_flags);
+
 		bio->bi_size -= nbytes;
 		bio->bi_sector += (nbytes >> 9);
 
@@ -99,10 +99,18 @@
 	page_cache_release(page);
 }
 
+
+static int quiet_error(struct buffer_head *bh)
+{
+	if (!test_bit(BH_Quiet, &bh->b_state) && printk_ratelimit())
+		return 0;
+	return 1;
+}
+
+
 static void buffer_io_error(struct buffer_head *bh)
 {
 	char b[BDEVNAME_SIZE];
-
 	printk(KERN_ERR "Buffer I/O error on device %s, logical block %Lu\n",
 			bdevname(bh->b_bdev, b),
 			(unsigned long long)bh->b_blocknr);
@@ -144,7 +152,7 @@
 	if (uptodate) {
 		set_buffer_uptodate(bh);
 	} else {
-		if (!buffer_eopnotsupp(bh) && printk_ratelimit()) {
+		if (!buffer_eopnotsupp(bh) && !quiet_error(bh)) {
 			buffer_io_error(bh);
 			printk(KERN_WARNING "lost page write due to "
 					"I/O error on %s\n",
@@ -394,7 +402,7 @@
 		set_buffer_uptodate(bh);
 	} else {
 		clear_buffer_uptodate(bh);
-		if (printk_ratelimit())
+		if (!quiet_error(bh))
 			buffer_io_error(bh);
 		SetPageError(page);
 	}
@@ -455,7 +463,7 @@
 	if (uptodate) {
 		set_buffer_uptodate(bh);
 	} else {
-		if (printk_ratelimit()) {
+		if (!quiet_error(bh)) {
 			buffer_io_error(bh);
 			printk(KERN_WARNING "lost page write due to "
 					"I/O error on %s\n",
@@ -2912,6 +2920,9 @@
 		set_bit(BIO_EOPNOTSUPP, &bio->bi_flags);
 		set_bit(BH_Eopnotsupp, &bh->b_state);
 	}
+
+	if (unlikely (test_bit(BIO_QUIET,&bio->bi_flags)))
+		set_bit(BH_Quiet, &bh->b_state);
 
 	bh->b_end_io(bh, test_bit(BIO_UPTODATE, &bio->bi_flags));
 	bio_put(bio);
@@ -117,6 +117,7 @@
 #define BIO_CPU_AFFINE 8 /* complete bio on same CPU as submitted */
 #define BIO_NULL_MAPPED 9 /* contains invalid user pages */
 #define BIO_FS_INTEGRITY 10 /* fs owns integrity data, not block layer */
+#define BIO_QUIET 11 /* Make BIO Quiet */
 #define bio_flagged(bio, flag) ((bio)->bi_flags & (1 << (flag)))
 
 /*
include/linux/buffer_head.h
@@ -35,6 +35,7 @@
 	BH_Ordered, /* ordered write */
 	BH_Eopnotsupp, /* operation not supported (barrier) */
 	BH_Unwritten, /* Buffer is allocated on disk but not written */
+	BH_Quiet, /* Buffer Error Prinks to be quiet */
 
 	BH_PrivateStart,/* not a state bit, but the first bit available
 			 * for private allocation by other entities