Date: Mon, 9 Oct 2023 13:46:43 +0300
From: Mike Rapoport
To: Lorenzo Stoakes
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton,
 David Hildenbrand, "Liam R. Howlett", Vlastimil Babka
Subject: Re: [PATCH v2] mm/mprotect: allow unfaulted VMAs to be unaccounted on mprotect()
Message-ID: <20231009104643.GO3303@kernel.org>

On Sat, Oct 07, 2023 at 05:47:48PM +0100, Lorenzo Stoakes wrote:
> When mprotect() is used to make unwritable VMAs writable,
> they have the VM_ACCOUNT flag applied and memory accounted accordingly.
>
> If the VMA has had no pages faulted in and is then made unwritable once
> again, it will remain accounted for, despite not being capable of extending
> memory usage.
>
> Consider:-
>
> ptr = mmap(NULL, page_size * 3, PROT_READ, MAP_ANON | MAP_PRIVATE, -1, 0);
> mprotect(ptr + page_size, page_size, PROT_READ | PROT_WRITE);
> mprotect(ptr + page_size, page_size, PROT_READ);
>
> The first mprotect() splits the range into 3 VMAs and the second fails to
> merge the three as the middle VMA has VM_ACCOUNT set and the others do not,
> rendering them unmergeable.
>
> This is unnecessary, since no pages have actually been allocated and the
> middle VMA is not capable of utilising more memory, thereby introducing
> unnecessary VMA fragmentation (and accounting for more memory than is
> necessary).
>
> Since we cannot efficiently determine which pages map to an anonymous VMA,
> we have to be very conservative - determining whether any pages at all have
> been faulted in, by checking whether vma->anon_vma is NULL.
>
> We can see that the lack of anon_vma implies that no anonymous pages are
> present as evidenced by vma_needs_copy() utilising this on fork to
> determine whether page tables need to be copied.
>
> The only place where anon_vma is set NULL explicitly is on fork with
> VM_WIPEONFORK set, however since this flag is intended to cause the child
> process to not CoW on a given memory range, it is right to interpret this
> as indicating the VMA has no faulted-in anonymous memory mapped.
>
> If the VMA was forked without VM_WIPEONFORK set, then anon_vma_fork() will
> have ensured that a new anon_vma is assigned (and correctly related to its
> parent anon_vma) should any pages be CoW-mapped.
>
> The overall operation is safe against races as we hold a write lock against
> mm->mmap_lock.
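
For anyone who wants to observe the fragmentation from userspace, the
mmap()/mprotect() sequence quoted above can be wrapped in a small helper
that counts the VMAs covering the range via /proc/self/maps. This is only
a rough sketch: the helper names (count_vmas, toggle_middle_page) are made
up for illustration and are not part of the patch. I would expect the
second count to be 3 on a kernel without this change and 1 with it, but I
have not included measured output here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count the /proc/self/maps entries that overlap [start, start + len). */
static int count_vmas(const void *start, size_t len)
{
	uintptr_t lo = (uintptr_t)start, hi = lo + len;
	char line[4352];
	int n = 0;
	FILE *f = fopen("/proc/self/maps", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		uintmax_t s, e;

		/* Each entry starts with "start-end" in hex. */
		if (sscanf(line, "%jx-%jx", &s, &e) == 2 && s < hi && e > lo)
			n++;
	}
	fclose(f);
	return n;
}

/*
 * Run the sequence from the commit message on a fresh 3-page mapping.
 * Stores the VMA count after each mprotect(); returns 0 on success.
 */
static int toggle_middle_page(int *after_rw, int *after_ro)
{
	long page_size = sysconf(_SC_PAGESIZE);
	char *ptr;

	assert(page_size > 0);
	ptr = mmap(NULL, page_size * 3, PROT_READ, MAP_ANON | MAP_PRIVATE, -1, 0);
	if (ptr == MAP_FAILED)
		return -1;

	/* The mapping is never touched, so no page is faulted in. */
	if (mprotect(ptr + page_size, page_size, PROT_READ | PROT_WRITE))
		goto fail;
	*after_rw = count_vmas(ptr, page_size * 3);

	if (mprotect(ptr + page_size, page_size, PROT_READ))
		goto fail;
	*after_ro = count_vmas(ptr, page_size * 3);

	munmap(ptr, page_size * 3);
	return 0;
fail:
	munmap(ptr, page_size * 3);
	return -1;
}
```

Calling toggle_middle_page() from a trivial main() and printing both
counts should be enough to compare a patched and an unpatched kernel.
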
>
> If we could efficiently look up the VMA's faulted-in pages then we would
> unaccount all those pages not yet faulted in. However as the original
> comment alludes this simply isn't currently possible, so we are
> conservative and account all pages or none at all.
>
> Acked-by: Vlastimil Babka
> Signed-off-by: Lorenzo Stoakes

Acked-by: Mike Rapoport (IBM)

> ---
>
> v2:
> - Minor spelling correction.
>
> v1:
> https://lore.kernel.org/all/20230626204612.106165-1-lstoakes@gmail.com/
>
>
>  mm/mprotect.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index b94fbb45d5c7..10685ec35c5e 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -608,8 +608,11 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
>  	/*
>  	 * If we make a private mapping writable we increase our commit;
>  	 * but (without finer accounting) cannot reduce our commit if we
> -	 * make it unwritable again. hugetlb mapping were accounted for
> -	 * even if read-only so there is no need to account for them here
> +	 * make it unwritable again except in the anonymous case where no
> +	 * anon_vma has yet to be assigned.
> +	 *
> +	 * hugetlb mapping were accounted for even if read-only so there is
> +	 * no need to account for them here.
>  	 */
>  	if (newflags & VM_WRITE) {
>  		/* Check space limits when area turns into data. */
> @@ -623,6 +626,9 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
>  				return -ENOMEM;
>  			newflags |= VM_ACCOUNT;
>  		}
> +	} else if ((oldflags & VM_ACCOUNT) && vma_is_anonymous(vma) &&
> +		   !vma->anon_vma) {
> +		newflags &= ~VM_ACCOUNT;
>  	}
>
>  	/*
> @@ -653,6 +659,9 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
>  	}
>
>  success:
> +	if ((oldflags & VM_ACCOUNT) && !(newflags & VM_ACCOUNT))
> +		vm_unacct_memory(nrpages);
> +
>  	/*
>  	 * vm_flags and vm_page_prot are protected by the mmap_lock
>  	 * held in write mode.
> --
> 2.42.0
>

-- 
Sincerely yours,
Mike.