Date: Mon, 28 Aug 2023 07:27:36 -0700
From: Christoph Hellwig
To: Al Viro
Cc: Jan Kara, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	Christoph Hellwig, Alasdair Kergon, Andrew Morton, Anna Schumaker,
	Chao Yu, Christian Borntraeger, "Darrick J. Wong", Dave Kleikamp,
	David Sterba, dm-devel@redhat.com, drbd-dev@lists.linbit.com,
	Gao Xiang, Jack Wang, Jaegeuk Kim,
	jfs-discussion@lists.sourceforge.net, Joern Engel, Joseph Qi,
	Kent Overstreet, linux-bcache@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	linux-mm@kvack.org, linux-mtd@lists.infradead.org,
	linux-nfs@vger.kernel.org, linux-nilfs@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-pm@vger.kernel.org,
	linux-raid@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-xfs@vger.kernel.org,
	"Md. Haris Iqbal", Mike Snitzer, Minchan Kim,
	ocfs2-devel@oss.oracle.com, reiserfs-devel@vger.kernel.org,
	Sergey Senozhatsky, Song Liu, Sven Schnelle,
	target-devel@vger.kernel.org, Ted Tso, Trond Myklebust,
	xen-devel@lists.xenproject.org, Jens Axboe, Christian Brauner
Subject: Re: [PATCH v2 0/29] block: Make blkdev_get_by_*() return handle
References: <20230810171429.31759-1-jack@suse.cz>
	<20230825015843.GB95084@ZenIV>
	<20230825134756.o3wpq6bogndukn53@quack3>
	<20230826022852.GO3390869@ZenIV>
In-Reply-To: <20230826022852.GO3390869@ZenIV>

On Sat, Aug 26, 2023 at 03:28:52AM +0100, Al Viro wrote:
> I mean, look at claim_swapfile() for example:
>                 p->bdev = blkdev_get_by_dev(inode->i_rdev,
>                                    FMODE_READ | FMODE_WRITE | FMODE_EXCL, p);
>                 if (IS_ERR(p->bdev)) {
>                         error = PTR_ERR(p->bdev);
>                         p->bdev = NULL;
>                         return error;
>                 }
>                 p->old_block_size = block_size(p->bdev);
>                 error = set_blocksize(p->bdev, PAGE_SIZE);
>                 if (error < 0)
>                         return error;
> we already have the file opened, and we keep it opened all the way until
> the swapoff(2); here we have noticed that it's a block device and we
>	* open the fucker again (by device number), this time claiming
> it with our swap_info_struct as holder, to be closed at swapoff(2) time
> (just before we close the file)

Note that some drivers look at FMODE_EXCL/BLK_OPEN_EXCL in ->open.
These checks are probably bogus and maybe we want to kill them, but that
will need an audit first.

> BTW, what happens if two threads call ioctl(fd, BLKBSZSET, &n)
> for the same descriptor that happens to have been opened O_EXCL?
> Without O_EXCL they would've been unable to claim the sucker at the same
> time - the holder we are using is the address of a function argument,
> i.e. something that points to kernel stack of the caller.  Those would
> conflict and we either get set_blocksize() calls fully serialized, or
> one of the callers would eat -EBUSY.  Not so in "opened with O_EXCL"
> case - they can very well overlap and IIRC set_blocksize() does *not*
> expect that kind of crap...  It's all under CAP_SYS_ADMIN, so it's not
> as if it was a meaningful security hole anyway, but it does look fishy.

The user gets to keep the pieces.  BLKBSZSET is kinda bogus anyway, as
the soft blocksize only matters for buffer_head-like I/O, and there only
for file systems.  No idea why anyone would set it manually.
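
For anyone trying to follow the holder argument quoted above, here is a toy
userspace model of it.  This is not kernel code: struct toy_bdev,
toy_claim(), toy_release() and toy_second_claim() are made-up names, and the
only rule modelled is the one Al describes, namely that a second exclusive
claim fails with -EBUSY unless it comes from the same holder.  Treat it as a
sketch of the reasoning under that assumption, not as what the block layer
actually does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a block device's exclusive-claim state. */
struct toy_bdev {
	const void *holder;	/* NULL when nobody holds an exclusive claim */
	int claims;		/* nested claims taken by the current holder */
};

/* Succeeds if the device is unclaimed or already claimed by 'holder'. */
int toy_claim(struct toy_bdev *bdev, const void *holder)
{
	if (bdev->holder && bdev->holder != holder)
		return -EBUSY;
	bdev->holder = holder;
	bdev->claims++;
	return 0;
}

/* Drops one claim; the device becomes free when the last one goes away. */
void toy_release(struct toy_bdev *bdev, const void *holder)
{
	assert(bdev->holder == holder && bdev->claims > 0);
	if (--bdev->claims == 0)
		bdev->holder = NULL;
}

/*
 * Two callers race on one device: 'first' claims it and, while that claim
 * is still held, 'second' tries to claim it too.  Returns what the second
 * caller sees.
 */
int toy_second_claim(const void *first, const void *second)
{
	struct toy_bdev bdev = { NULL, 0 };
	int ret;

	ret = toy_claim(&bdev, first);
	assert(ret == 0);

	ret = toy_claim(&bdev, second);
	if (ret == 0)
		toy_release(&bdev, second);

	toy_release(&bdev, first);
	return ret;
}
```

With two distinct holders (think: two different kernel stack addresses) the
second caller gets -EBUSY, so the block size changes cannot overlap.  With
the same holder on both sides (think: one shared exclusive open) the second
claim goes through, which is the overlapping case Al is pointing at.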