Date: Tue, 18 Apr 2023 22:01:12 +0300
From: "Kirill A. Shutemov"
To: Peter Xu
Cc: David Hildenbrand, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton, Mel Gorman
Subject: Re: [PATCH mm-unstable v1] mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
Message-ID: <20230418190112.2eyuhzq3hqwvlmyt@box.shutemov.name>
References: <20230418142113.439494-1-david@redhat.com>

On Tue, Apr 18, 2023 at 11:56:07AM -0400, Peter Xu wrote:
> On Tue, Apr 18, 2023 at 04:21:13PM +0200, David Hildenbrand wrote:
> > Staring at the comment "Recheck VMA as permissions can change since
> > migration started" in remove_migration_pte() can result in confusion,
> > because if the source PTE/PMD indicates write permissions, then there
> > should be no need to check VMA write permissions when restoring migration
> > entries or PTE-mapping a PMD.
> >
> > Commit d3cb8bf6081b ("mm: migrate: Close race between migration completion
> > and mprotect") introduced the maybe_mkwrite() handling in
> > remove_migration_pte() in 2014, stating that a race between mprotect() and
> > migration finishing would be possible, and that we could end up with
> > a writable PTE that should be readable.
> >
> > However, mprotect() code first updates vma->vm_flags / vma->vm_page_prot
> > and then walks the page tables to (a) set all present writable PTEs to
> > read-only and (b) convert all writable migration entries to readable
> > migration entries. While walking the page tables and modifying the
> > entries, migration code has to grab the PT locks to synchronize against
> > concurrent page table modifications.
>
> Makes sense to me.
>
> >
> > Assuming migration would find a writable migration entry (while holding
> > the PT lock) and replace it with a writable present PTE, surely mprotect()
> > code didn't stumble over the writable migration entry yet (converting it
> > into a readable migration entry) and would instead wait for the PT lock to
> > convert the now present writable PTE into a read-only PTE. As mprotect()
> > didn't finish yet, the behavior is just like migration didn't happen: a
> > writable PTE will be converted to a read-only PTE.
> >
> > So it's fine to rely on the writability information in the source
> > PTE/PMD and not recheck against the VMA as long as we're holding the PT
> > lock to synchronize with anyone who concurrently wants to downgrade write
> > permissions (like mprotect()) by first adjusting vma->vm_flags /
> > vma->vm_page_prot to then walk over the page tables to adjust the page
> > table entries.
> >
> > Running test cases that should reveal such races -- mprotect(PROT_READ)
> > racing with page migration or THP splitting -- for multiple hours did
> > not reveal an issue with this cleanup.
> >
> > Cc: Andrew Morton
> > Cc: Mel Gorman
> > Cc: Peter Xu
> > Signed-off-by: David Hildenbrand
> > ---
> >
> > This is a follow-up cleanup to [1]:
> > 	[PATCH v1 RESEND 0/6] mm: (pte|pmd)_mkdirty() should not
> > 	unconditionally allow for write access
> >
> > I wanted to be a bit careful and write some test cases to convince myself
> > that I am not missing something important. Of course, there is still the
> > possibility that my test cases are buggy ;)
> >
> > Test cases I'm running:
> >  https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/test_mprotect_migration.c
> >  https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/test_mprotect_thp_split.c
> >
> >
> > [1] https://lkml.kernel.org/r/20230411142512.438404-1-david@redhat.com
> >
> > ---
> >  mm/huge_memory.c | 4 ++--
> >  mm/migrate.c     | 5 +----
> >  2 files changed, 3 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index c23fa39dec92..624671aaa60d 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -2234,7 +2234,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
> >  		} else {
> >  			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
> >  			if (write)
> > -				entry = maybe_mkwrite(entry, vma);
> > +				entry = pte_mkwrite(entry);
>
> This is another change besides page migration. I also don't know why it's
> needed, but it's there since day 1 of thp split in eef1b3ba053, so maybe
> worthwhile to copy Kirill too (which I did).

I was concentrating on correctness at that point, and this small
inefficiency didn't catch my eye.

I was curious how we serialize here against mprotect(). Looks safe to me:

        CPU0                                    CPU1
__split_huge_pmd()
  pmd_lock()
  __split_huge_pmd_locked()
    pmdp_invalidate()
    // PMD is non-present, but huge at this point
                                        change_protection()
                                          change_pmd_range()
                                            pmd_none_or_clear_bad_unless_trans_huge() == 0
                                            // not skipped
                                            change_huge_pmd()
                                              __pmd_trans_huge_lock()
                                                pmd_lock()
                                                // serialized against __split_huge_pmd()

Acked-by: Kirill A. Shutemov

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
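
For readers who want to poke at the ordering argument from the quoted commit message, here is a toy userspace model. It is only a sketch under loud assumptions: every name in it (toy_mm, toy_race() and so on) is hypothetical, nothing here is kernel API, and the PT lock is modelled purely by running the two critical sections in one order or the other. It assumes, as the commit message states, that mprotect() updates the VMA first and only then walks the page tables under the PT lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy page-table slot: either a present PTE or a migration entry. */
enum slot_kind { SLOT_PRESENT, SLOT_MIGRATION };

/* Hypothetical stand-in for one VMA with a single page-table slot. */
struct toy_mm {
	bool vma_write;		/* stands in for VM_WRITE in vma->vm_flags */
	enum slot_kind kind;
	bool slot_write;	/* writable PTE or writable migration entry */
};

/* mprotect(PROT_READ), step 1: update the VMA before touching page tables. */
static void toy_mprotect_update_vma(struct toy_mm *mm)
{
	mm->vma_write = false;
}

/*
 * mprotect(PROT_READ), step 2, modelled as running under the PT lock:
 * a writable present PTE becomes read-only, a writable migration entry
 * becomes a readable migration entry. Both are "clear the write bit" here.
 */
static void toy_mprotect_walk(struct toy_mm *mm)
{
	mm->slot_write = false;
}

/*
 * Migration completion, modelled as running under the PT lock: the
 * present PTE inherits writability from the migration entry and the
 * VMA is deliberately not consulted.
 */
static void toy_remove_migration_entry(struct toy_mm *mm)
{
	assert(mm->kind == SLOT_MIGRATION);
	mm->kind = SLOT_PRESENT;	/* slot_write carries over unchanged */
}

/* Returns true if the race ends with a present, read-only PTE. */
static bool toy_race(bool migration_takes_lock_first)
{
	struct toy_mm mm = {
		.vma_write = true,
		.kind = SLOT_MIGRATION,
		.slot_write = true,
	};

	toy_mprotect_update_vma(&mm);
	if (migration_takes_lock_first) {
		toy_remove_migration_entry(&mm);	/* writable PTE, briefly */
		toy_mprotect_walk(&mm);			/* downgraded here */
	} else {
		toy_mprotect_walk(&mm);			/* entry made readable */
		toy_remove_migration_entry(&mm);	/* restored read-only */
	}
	return mm.kind == SLOT_PRESENT && !mm.slot_write && !mm.vma_write;
}
```

Calling toy_race(true) and toy_race(false) covers both lock orders. In this model the slot ends up read-only either way, which is the property the patch relies on. It says nothing about real kernel timing or about paths that downgrade permissions without taking the PT lock.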