On Wed 01-03-17 15:29:09, Jan Kara wrote: > On Mon 27-02-17 18:27:55, Al Viro wrote: > > On Mon, Feb 27, 2017 at 06:11:11PM +0100, Dmitry Vyukov wrote: > > > Hello, > > > > > > The following program triggers GPF in bdi_put: > > > https://gist.githubusercontent.com/dvyukov/15b3e211f937ff6abc558724369066ce/raw/cc017edf57963e30175a6a6fe2b8d917f6e92899/gistfile1.txt > > > > What happens is > > * attempt of, essentially, mount -t bdev ..., calls mount_pseudo() > > and then promptly destroys the new instance it has created. > > * the only inode created on that sucker (root directory, that > > is) gets evicted. > > * most of ->evict_inode() is harmless, until it gets to > > if (bdev->bd_bdi != &noop_backing_dev_info) > > bdi_put(bdev->bd_bdi); > > Thanks for the analysis! > > > added there by "block: Make blk_get_backing_dev_info() safe without open bdev". > > Since ->bd_bdi hadn't been initialized for that sucker (the same patch has > > placed initialization into bdget()), we step into shit of varying nastiness, > > depending on phase of moon, etc. > > Yup, I've missed that the root inode of bdev superblock does not go through > bdget() (in fact I didn't think what happens with root inode for bdev > superblock at all) and thus bd_bdi is left uninitialized in that case. I'll > send a fix for that in a while. > > > Could somebody explain WTF do we have those two lines in bdev_evict_inode(), > > anyway? We set ->bd_bdi to something other than noop_backing_dev_info only > > in __blkdev_get() when ->bd_openers goes from zero to positive, so why is > > the matching bdi_put() not in __blkdev_put()? Jan? > > The problem is writeback code (from flusher work or through sync(2) - > generally inode_to_bdi() users) can be looking at bdev inode independently > from it being open. So if they start looking while the bdev is open but the > dereference happens after it is closed and device removed, we oops. We have > seen oopses due to this for quite a while. And all the stuff that is done > in __blkdev_put() is not enough to prevent writeback code from having a > look whether there is not something to write. > > So what we do now is that once we establish valid bd_bdi reference, we > leave it alone until bdev inode gets evicted. And to handle the case when > underlying device actually changes, we unhash bdev inode when the device > gets removed from the system so that it cannot be found by bdget() anymore. Attached patch fixes the problem for me. I'll post it officially tomorrow once Al has a chance to reply... Honza -- Jan Kara SUSE Labs, CR