From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 13559C02194 for ; Tue, 4 Feb 2025 10:19:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5A2786B0083; Tue, 4 Feb 2025 05:19:43 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 552506B0088; Tue, 4 Feb 2025 05:19:43 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 419DA6B0089; Tue, 4 Feb 2025 05:19:43 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 220346B0083 for ; Tue, 4 Feb 2025 05:19:43 -0500 (EST) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id AA1AE160925 for ; Tue, 4 Feb 2025 10:19:42 +0000 (UTC) X-FDA: 83081865804.19.3DDFBC9 Received: from mail-wm1-f51.google.com (mail-wm1-f51.google.com [209.85.128.51]) by imf29.hostedemail.com (Postfix) with ESMTP id 9951F12000D for ; Tue, 4 Feb 2025 10:19:40 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=ffwll.ch header.s=google header.b="jlV0/zgl"; dmarc=none; spf=none (imf29.hostedemail.com: domain of simona.vetter@ffwll.ch has no SPF policy when checking 209.85.128.51) smtp.mailfrom=simona.vetter@ffwll.ch ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1738664380; a=rsa-sha256; cv=none; b=WOhqmPflUj69ebYKXoxaTXiQqxMBX3vM0vKoXkx+iInIR6It1NhkolGcfVkWzfxnK582bg 3thke4+MzI17HHUTF3BZ3adJga7QiWGC+KrvyL1BI+t5l5rxBaBf1/DpYgwMEr+qB46ppx JfVnDFV8QCH4XrYPWoMlEAherRuwM6k= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=ffwll.ch header.s=google header.b="jlV0/zgl"; dmarc=none; spf=none (imf29.hostedemail.com: domain of simona.vetter@ffwll.ch has no SPF policy when checking 209.85.128.51) smtp.mailfrom=simona.vetter@ffwll.ch ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1738664380; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=ls4rUpX9KWpzDW4b2Z40ClskRs4xRfQGCjUvTHygnvU=; b=PTQCVqnxbbXWcV6QVKQuNg7hs4fghT4pJ5LkTB00govLZvxL0pnVb+SbjbBS2oKJe/s0o2 cDuj5Gims0sOpv7cK3hgZNA7Ah5sPJDZOqlf8qi0KXItneRyIfURq3VpOeqKWCsFQVr1D5 LQpSMBPvx3Ok/vnPq9pENS4jIWdP27k= Received: by mail-wm1-f51.google.com with SMTP id 5b1f17b1804b1-438a39e659cso36954555e9.2 for ; Tue, 04 Feb 2025 02:19:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; t=1738664379; x=1739269179; darn=kvack.org; h=in-reply-to:content-disposition:mime-version:references :mail-followup-to:message-id:subject:cc:to:from:date:from:to:cc :subject:date:message-id:reply-to; bh=ls4rUpX9KWpzDW4b2Z40ClskRs4xRfQGCjUvTHygnvU=; b=jlV0/zgl7tnKSR8K8YKYcrl3f4LrZxMQbSlvcJNm0duL4NefolDpTTrkc9T0d4ju54 cUvrQaXVIyMUvSmz5kSDlax9R/fwZyQcWxx+SxFvPWAO4jRS5Ev7dW34mtoXIOVYFjlT q7EEKmfLzAOtK7ayMziW20jeYjpLjMUjg8MKA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738664379; x=1739269179; h=in-reply-to:content-disposition:mime-version:references :mail-followup-to:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ls4rUpX9KWpzDW4b2Z40ClskRs4xRfQGCjUvTHygnvU=; b=dUClRMag1UqU9RfIbLMRw2gh1L/uDEoCJdg6T44o0Ex+HgW+hdbZgdeP2KfYAvT8iY 4gKJCfbV+nr5lfXzPiTspjVJ/R35VuMq6b12D9vgNWlJWGci+KP9T0fAUEiKvMu8Sn5J Ct46dm2bTaMqLrnV4n1abr4K+WnjQF00TG2n4n79ijpemWzlwXb8roXEXnGpXoyyc6dV PqBthkzsArAvlCdddEHCoj1X/4Bs+psCeL4AKCPYYQV3abVZwItxV+YUjv83QHc07u/u JkAmhrS3NWu4x2Ux6DVXmRG1Gg5tSje3uuAs9C08Lc3MhH/38eyKQrtPfS6/SVzdFQlo QaEw== X-Forwarded-Encrypted: i=1; AJvYcCWxJp8KxMja7vZTtvYxOo70uhxWS0FjEnRIE9AkyHIp4lnWqRYPEvAaSD0OmZm4cdCXrMNOlwJ19g==@kvack.org X-Gm-Message-State: AOJu0Yzu6J+YvjCQQHAzmsNFXnsGubGGocMdNLoNYmOSX5gVWLLmhMH1 Ix+KKt5Pul0OgBJ0eek2pMxgdgV0enJOJYUNcSBXzgQA78z1fisOSE2kwMytBXc= X-Gm-Gg: ASbGncutAxI2jATEiqZdHSIno2+HAvQzNYzhhedpC0RUKBjl9wV7Fx/knNj/AkAkhKt FBj8fmTdW1O6/+OYaJ8D4JX4PRoPgVkDaOyWISf+HjIFBS/KzWrBB/oP7LfB3HpIi37HG8+sIDq T+Qtw/WzyLAmpdrdn2jr320ArrE0ab3anyGUFZNwTjeOz9lSeGJgVHzRXhwXYje9QymAsLmhx28 xijRGTzxnXCjEyNdNQRaYCx3dh2hCMhbUr3P3pLN80M9oIh3SAv3A83FmdkKIe02txFhgJ3Sa8f h9rC2K6yGcv4jwbmDv8hvQjmSzI= X-Google-Smtp-Source: AGHT+IGfnTZuu7x6vWbYmuRmysx88ve6RVrYfx4SVhj9Cnq16Qs0v8RQpAhNnfDl/p101BKPt5R/Qw== X-Received: by 2002:a05:600c:6d5a:b0:436:e8b4:3cde with SMTP id 5b1f17b1804b1-438e01febf7mr214271695e9.14.1738664378625; Tue, 04 Feb 2025 02:19:38 -0800 (PST) Received: from phenom.ffwll.local ([2a02:168:57f4:0:5485:d4b2:c087:b497]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-438e245f492sm183438775e9.38.2025.02.04.02.19.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Feb 2025 02:19:38 -0800 (PST) Date: Tue, 4 Feb 2025 11:19:35 +0100 From: Simona Vetter To: Lorenzo Stoakes Cc: Andrew Morton , Jaya Kumar , Simona Vetter , Helge Deller , linux-fbdev@vger.kernel.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Matthew Wilcox , David Hildenbrand , Kajtar Zsolt , Maira Canal Subject: Re: [PATCH 2/3] mm: provide mapping_wrprotect_page() function Message-ID: Mail-Followup-To: Lorenzo Stoakes , Andrew Morton , Jaya Kumar , Simona Vetter , Helge Deller , linux-fbdev@vger.kernel.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Matthew Wilcox , David Hildenbrand , Kajtar Zsolt , Maira Canal References: <655f318b-d883-4ddd-9301-53a05ab06bc0@lucifer.local> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <655f318b-d883-4ddd-9301-53a05ab06bc0@lucifer.local> X-Operating-System: Linux phenom 6.12.11-amd64 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 9951F12000D X-Stat-Signature: 1wkrhb93kzjeshyr9ugmqb48wyzyr3sr X-Rspam-User: X-HE-Tag: 1738664380-847158 X-HE-Meta: U2FsdGVkX19kMkjCmMPEBu06A21qc+6pFj5hwnZ0mOjVKsRcmNmknfZi+s5F0ecWrdw9/sqprm6SnCCVKHK60+YuC2pndtiPrM8DSCvKVyRkeZRBElfmnXvAiy0CMVO90U4wcmenweaKecpk7/YSZSCpOQcv3UPdMSglgPNXyZMfWpqkcvmIdyMPwrFr5SkU3QDEzzQgDJU6aApvPwboOTo/yUKo/eH8kMOJw64WhJ0LSv3cb5yWgQvd2rJhMx1lYzjcjT9f81APu1DoamCuABjwnBJSJzRY/va6ttQMVz30LHZNfHqr/o078ljPj07FDdDldtQ2z6PkfU4iKJ6PkBsvPdKjCY4iRJGXZ6jlHDkxHDv3/FggJc/74OgvZKb222FZOFHxcUf9/Hv/zt934VXdVTjxVk8ycu52tA9xW08p4JnNovLGMfvizD3oK7ITNFJ0gSLup6upLM954LZudV6C8lMo4dHtQtQSvU/Z0hNvnEkQJCz7RKYu74Fota8nUFBYxoW8qN/Vg3RcAa6v2JRUsFO/H31ysiA1E6mUAvZ/OXs3AASyQ7ov7Xql7sr4+VvQ4Yvbs0TzkWCNAH2HpMPrUdsojMy6GlvB/dWpfm6/mtd+ANjBt1MXi+chJeww8BRUttLqFw5li1yIGIFFYh/WxsNyU8fdFOE6KUIzdk/LGOo1JCJX/jgrUviD976rGKrZp4WWwlY1VzsanSvjQaN0yelDXgDDI5AOfcKlX4sbS0EOLjbIJKYDYgJ+jaR8hz2uTOoDzDUb9SL0hVbHT7cNdHJtGlpqvvvLMmsguWn2BYvtN6ufbSGSkraVvSO0OaEYMEqUY1nr7cYB+cqaMZhK7Ai0ZZFIuf97DyGNdp84g8C/1bh38uDXq4CBCoDAWzY4a4XeOu006Dz3OA1prm/XEa6Li8plzsaFk1f70lDNoK497j+BXmlQrSzi/bbyfM5hXQPNnqlfQcS2V0T 0QZQakga dwqmEp3ExB2LxGB+MyaHLmmsNCrq2cs6Uh85bBcRbC3jGz8X3qGlHNoPTpbC9/S+RNAHOzLBPPeLypg+2QUbCrZTgwuFVCOtfhSuelH565G0AH6MlA3aMAkXw9P/y0n2PXfMKXfH5E11HWRk/DgdmvpQ37xR7YQWFJJezvuGUXumOV4/kCUoujvWVmORVDyAzlvHXjdPG1uF0/T4CDX6tBqJQNDSM+/p8DPFf8K04UV9NMivNt71j0xxTKDRbaYyFtXCAm46VYIPRAe+R7odTQX3p3QscGjK1S0vDOF0IZwXXUmspYPeMzGI7CL+ErmxUsqmQXTKh7gahLCPTL77maxMa3yet0JAw+uSDvpRM/3qYQHlWZOLRzrAz64pk0e98tzuVp3PVkcbavk2Sf6NjN5+sAk8yQu4OxxpmeT4dA3Ly3VoCxYnkSmdSzs3amajYXKeDfCuHJuSMJhc= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Feb 03, 2025 at 04:30:04PM +0000, Lorenzo Stoakes wrote: > On Mon, Feb 03, 2025 at 04:49:34PM +0100, Simona Vetter wrote: > > On Fri, Jan 31, 2025 at 06:28:57PM +0000, Lorenzo Stoakes wrote: > > > in the fb_defio video driver, page dirty state is used to determine when > > > frame buffer pages have been changed, allowing for batched, deferred I/O to > > > be performed for efficiency. > > > > > > This implementation had only one means of doing so effectively - the use of > > > the folio_mkclean() function. > > > > > > However, this use of the function is inappropriate, as the fb_defio > > > implementation allocates kernel memory to back the framebuffer, and then is > > > forced to specified page->index, mapping fields in order to permit the > > > folio_mkclean() rmap traversal to proceed correctly. > > > > > > It is not correct to specify these fields on kernel-allocated memory, and > > > moreover since these are not folios, page->index, mapping are deprecated > > > fields, soon to be removed. > > > > > > We therefore need to provide a means by which we can correctly traverse the > > > reverse mapping and write-protect mappings for a page backing an > > > address_space page cache object at a given offset. > > > > > > This patch provides this - mapping_wrprotect_page() allows for this > > > operation to be performed for a specified address_space, offset and page, > > > without requiring a folio nor, of course, an inappropriate use of > > > page->index, mapping. > > > > > > With this provided, we can subequently adjust the fb_defio implementation > > > to make use of this function and avoid incorrect invocation of > > > folio_mkclean() and more importantly, incorrect manipulation of > > > page->index, mapping fields. > > > > > > Signed-off-by: Lorenzo Stoakes > > > --- > > > include/linux/rmap.h | 3 ++ > > > mm/rmap.c | 73 ++++++++++++++++++++++++++++++++++++++++++++ > > > 2 files changed, 76 insertions(+) > > > > > > diff --git a/include/linux/rmap.h b/include/linux/rmap.h > > > index 683a04088f3f..0bf5f64884df 100644 > > > --- a/include/linux/rmap.h > > > +++ b/include/linux/rmap.h > > > @@ -739,6 +739,9 @@ unsigned long page_address_in_vma(const struct folio *folio, > > > */ > > > int folio_mkclean(struct folio *); > > > > > > +int mapping_wrprotect_page(struct address_space *mapping, pgoff_t pgoff, > > > + unsigned long nr_pages, struct page *page); > > > + > > > int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff, > > > struct vm_area_struct *vma); > > > > > > diff --git a/mm/rmap.c b/mm/rmap.c > > > index a2ff20c2eccd..bb5a42d95c48 100644 > > > --- a/mm/rmap.c > > > +++ b/mm/rmap.c > > > @@ -1127,6 +1127,79 @@ int folio_mkclean(struct folio *folio) > > > } > > > EXPORT_SYMBOL_GPL(folio_mkclean); > > > > > > +struct wrprotect_file_state { > > > + int cleaned; > > > + pgoff_t pgoff; > > > + unsigned long pfn; > > > + unsigned long nr_pages; > > > +}; > > > + > > > +static bool mapping_wrprotect_page_one(struct folio *folio, > > > + struct vm_area_struct *vma, unsigned long address, void *arg) > > > +{ > > > + struct wrprotect_file_state *state = (struct wrprotect_file_state *)arg; > > > + struct page_vma_mapped_walk pvmw = { > > > + .pfn = state->pfn, > > > + .nr_pages = state->nr_pages, > > > + .pgoff = state->pgoff, > > > + .vma = vma, > > > + .address = address, > > > + .flags = PVMW_SYNC, > > > + }; > > > + > > > + state->cleaned += page_vma_mkclean_one(&pvmw); > > > + > > > + return true; > > > +} > > > + > > > +static void __rmap_walk_file(struct folio *folio, struct address_space *mapping, > > > + pgoff_t pgoff_start, unsigned long nr_pages, > > > + struct rmap_walk_control *rwc, bool locked); > > > + > > > +/** > > > + * mapping_wrprotect_page() - Write protect all mappings of this page. > > > + * > > > + * @mapping: The mapping whose reverse mapping should be traversed. > > > + * @pgoff: The page offset at which @page is mapped within @mapping. > > > + * @nr_pages: The number of physically contiguous base pages spanned. > > > + * @page: The page mapped in @mapping at @pgoff. > > > + * > > > + * Traverses the reverse mapping, finding all VMAs which contain a shared > > > + * mapping of the single @page in @mapping at offset @pgoff and write-protecting > > > + * the mappings. > > > + * > > > + * The page does not have to be a folio, but rather can be a kernel allocation > > > + * that is mapped into userland. We therefore do not require that the page maps > > > + * to a folio with a valid mapping or index field, rather these are specified in > > > + * @mapping and @pgoff. > > > + * > > > + * Return: the number of write-protected PTEs, or an error. > > > + */ > > > +int mapping_wrprotect_page(struct address_space *mapping, pgoff_t pgoff, > > > + unsigned long nr_pages, struct page *page) > > > +{ > > > + struct wrprotect_file_state state = { > > > + .cleaned = 0, > > > + .pgoff = pgoff, > > > + .pfn = page_to_pfn(page), > > > > Could we go one step further and entirely drop the struct page? Similar to > > unmap_mapping_range for VM_SPECIAL mappings, except it only updates the > > write protection. The reason is that ideally we'd like fbdev defio to > > entirely get rid of any struct page usage, because with some dma_alloc() > > memory regions there's simply no struct page for them (it's a carveout). > > See e.g. Sa498d4d06d6 ("drm/fbdev-dma: Only install deferred I/O if > > necessary") for some of the pain this has caused. > > > > So entirely struct page less way to write protect a pfn would be best. And > > it doesn't look like you need the page here at all? > > In the original version [1] we did indeed take a PFN, so this shouldn't be > a problem to change. > > Since we make it possible here to explicitly reference the address_space > object mapping the range, and from that can find all the VMAs that map the > page range [pgoff, pgoff + nr_pages), I don't think we do need to think > about a struct page here at all. > > The defio code does seem to have some questionable assumptions in place, or > at least ones I couldn't explain away re: attempting to folio-lock (the > non-folios...), so there'd need to be changes on that side, which I suggest > would probably be best for a follow-up series given this one's urgency. Yeah there's a bunch more things we need to do to get there. It was the lack of a pfn-based core mm function that stopped us from doing that thus far, plus also fbdev defio being very low priority. But it would definitely avoid a bunch of corner cases and duplication in fbdev emulation code in drivers/gpu/drm. > But I'm more than happy to make this interface work with that by doing > another revision where we export PFN only, I think something like: > > int mapping_wrprotect_range(struct address_space *mapping, pgoff_t pgoff, > unsigned long pfn, unsigned long nr_pages); > > Should work? > > [1]:https://lore.kernel.org/all/cover.1736352361.git.lorenzo.stoakes@oracle.com/ Yup that looks like the thing we'll need to wean defio of all that questionable folio/page wrangling. But like you say, should be easy to add/update when we get there. Thanks, Sima > > > > > Cheers, Sima > > Thanks! > > > > > > > > + .nr_pages = nr_pages, > > > + }; > > > + struct rmap_walk_control rwc = { > > > + .arg = (void *)&state, > > > + .rmap_one = mapping_wrprotect_page_one, > > > + .invalid_vma = invalid_mkclean_vma, > > > + }; > > > + > > > + if (!mapping) > > > + return 0; > > > + > > > + __rmap_walk_file(/* folio = */NULL, mapping, pgoff, nr_pages, &rwc, > > > + /* locked = */false); > > > + > > > + return state.cleaned; > > > +} > > > +EXPORT_SYMBOL_GPL(mapping_wrprotect_page); > > > + > > > /** > > > * pfn_mkclean_range - Cleans the PTEs (including PMDs) mapped with range of > > > * [@pfn, @pfn + @nr_pages) at the specific offset (@pgoff) > > > -- > > > 2.48.1 > > > > > > > -- > > Simona Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch -- Simona Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch