From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 536DCC433EF for ; Thu, 28 Apr 2022 17:28:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id CD3B06B00C4; Thu, 28 Apr 2022 13:28:43 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C82406B00C5; Thu, 28 Apr 2022 13:28:43 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B4A708D0015; Thu, 28 Apr 2022 13:28:43 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (relay.hostedemail.com [64.99.140.28]) by kanga.kvack.org (Postfix) with ESMTP id A65926B00C4 for ; Thu, 28 Apr 2022 13:28:43 -0400 (EDT) Received: from smtpin31.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 7F9CB2A153 for ; Thu, 28 Apr 2022 17:28:43 +0000 (UTC) X-FDA: 79406972526.31.BFB7316 Received: from casper.infradead.org (casper.infradead.org [90.155.50.34]) by imf22.hostedemail.com (Postfix) with ESMTP id AB896C0053 for ; Thu, 28 Apr 2022 17:28:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=A+D+/+/O4qUxMhyVcAI68wJqSOKSL3LvvXpRZMY+WA0=; b=gn8Jdw8W3r5g87Spt44hs3Ak/H 8IPwr/zyyJJ3MK3Joqp2+JZzlpjmz69i95f1GEVjsBP/253F4jCt3Sp28RGX91uESfNdsydmuOYmp QYe5Fz+cG2ZpXEEfeTHKGRJXFlqnaF3N+7pHg3ZRBFi4Yoc9klI+S9EY+yRc8E2+rXta73Lo4rRFd jvM54Y4HQIvX4rsV6TZohyBlpSWQ5Q+cgzOXKbpRBzt2V1BviR7YSQYOLhE5AO7gaLmY+ohSEQUJJ RcnafZYUrytbirtOsuPc6dpSYwKYPpUpQD0F4wwxS3/Ya3UJM48TE/GtHS7VPAo5X+c0LF+OeP+Nh UOmEVnpQ==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1nk7wT-00BhXH-IS; Thu, 28 Apr 2022 17:28:37 +0000 Date: Thu, 28 Apr 2022 18:28:37 +0100 From: Matthew Wilcox To: NeilBrown Cc: linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Miaohe Lin Subject: Re: [RFC] Documentation for folio_lock() and friends Message-ID: References: <164913769939.10985.13675614818955421206@noble.neil.brown.name> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <164913769939.10985.13675614818955421206@noble.neil.brown.name> X-Rspamd-Queue-Id: AB896C0053 X-Stat-Signature: je7jqo85nyz1wu56pmgd4x3epm59b7kk X-Rspam-User: Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=infradead.org header.s=casper.20170209 header.b=gn8Jdw8W; spf=none (imf22.hostedemail.com: domain of willy@infradead.org has no SPF policy when checking 90.155.50.34) smtp.mailfrom=willy@infradead.org; dmarc=none X-Rspamd-Server: rspam09 X-HE-Tag: 1651166921-576146 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Apr 05, 2022 at 03:48:19PM +1000, NeilBrown wrote: > On Tue, 05 Apr 2022, Matthew Wilcox wrote: > > It's a shame to not have these functions documented. I'm sure I've > > missed a few things that would be useful to document here. > > > > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h > > index ab47579af434..47b7851f1b64 100644 > > --- a/include/linux/pagemap.h > > +++ b/include/linux/pagemap.h > > @@ -888,6 +888,18 @@ bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm, > > void unlock_page(struct page *page); > > void folio_unlock(struct folio *folio); > > > > +/** > > + * folio_trylock() - Attempt to lock a folio. > > + * @folio: The folio to attempt to lock. > > + * > > + * Sometimes it is undesirable to wait for a folio to be unlocked (eg > > + * when the locks are being taken in the wrong order, or if making > > + * progress through a batch of folios is more important than processing > > + * them in order). Usually folio_lock() is the correct function to call. > > Usually? > I think a "see also" type reference to folio_lock() is useful, but I > don't think "usually" is helpful. That was supposed to stand in contrast to the "Sometimes". How about this: * folio_lock() is a sleeping function. If sleeping is not the right * behaviour (eg when the locks are being taken in the wrong order, * or if making progress through a batch of folios is more important * than processing them in order) then you can use folio_trylock(). * It is never appropriate to implement a spinlock using folio_trylock(). ... if not, could you suggest some better wording? > > +/** > > + * folio_lock() - Lock this folio. > > + * @folio: The folio to lock. > > + * > > + * The folio lock protects against many things, probably more than it > > + * should. It is primarily held while a folio is being read from storage, > > + * either from its backing file or from swap. It is also held while a > > + * folio is being truncated from its address_space. > > + * > > + * Holding the lock usually prevents the contents of the folio from being > > + * modified by other kernel users, although it does not prevent userspace > > + * from modifying data if it's mapped. The lock is not consistently held > > + * while a folio is being modified by DMA. > > I don't think this paragraph is helpful... maybe if it listed which > change *are* prevented by the lock, rather than a few which aren't? I put that in because it's a common misconception ("holding the page locked will prevent it from being modified"). > I think it is significant that the lock prevents removal from the page > cache, and so ->mapping is only stable while the lock is held. It might > be worth adding something about that. That's implied by the last sentence of the first paragraph, but I can include that. Actually, ->mapping is also stable if the page is mapped and you hold the page table lock of a page table it's mapped in. That's not theoretical BTW, it's the conditions under which ->dirty_folio() is called -- either the folio lock is held, OR the folio is mapped and the PTL is held. That prevents truncation because truncation unmaps pages before clearing ->mapping, and it needs to take the PTL to unmap the page. Hard to express that in a lockdep expression because the PTL isn't passed to folio_mark_dirty(). That explanation probably doesn't belong here, so how about ... * folio_lock() - Lock this folio. * @folio: The folio to lock. * * The folio lock protects against many things, probably more than it * should. It is primarily held while a folio is being brought uptodate, * either from its backing file or from swap. It is also held while a * folio is being truncated from its address_space, so holding the lock * is sufficient to keep folio->mapping stable. * * The folio lock is also held while write() is modifying the page to * provide POSIX atomicity guarantees (as long as the write does not * cross a page boundary). Other modifications to the data in the folio * do not hold the folio lock and can race with writes, eg DMA and stores * to mapped pages. * * Context: May sleep. If you need to acquire the locks of two or * more folios, they must be in order of ascending index, if they are * in the same address_space. If they are in different address_spaces, * acquire the lock of the folio which belongs to the address_space which * has the lowest address in memory first. Looking at the comment on folio_mark_dirty(), it's a bit unhelpful. How about this: * folio_mark_dirty - Mark a folio as being modified. * @folio: The folio. * - * For folios with a mapping this should be done with the folio lock held - * for the benefit of asynchronous memory errors who prefer a consistent - * dirty state. This rule can be broken in some special cases, - * but should be better not to. + * The folio may not be truncated while this function is running. + * Holding the folio lock is sufficient to prevent truncation, but some + * callers cannot acquire a sleeping lock. These callers instead hold + * the page table lock for a page table which contains at least one page + * in this folio. Truncation will block on the page table lock as it + * unmaps pages before removing the folio from its mapping. * * Return: True if the folio was newly dirtied, false if it was already dirty.