From: David Hildenbrand
Organization: Red Hat
To: Peter Xu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton, Mel Gorman, "Kirill A. Shutemov"
Subject: Re: [PATCH mm-unstable v1] mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
Date: Tue, 18 Apr 2023 17:57:47 +0200
References: <20230418142113.439494-1-david@redhat.com>

On 18.04.23 17:56, Peter Xu wrote:
> On Tue, Apr 18, 2023 at 04:21:13PM +0200, David Hildenbrand wrote:
>> Staring at the comment "Recheck VMA as permissions can change since
>> migration started" in remove_migration_pte() can result in confusion,
>> because if the source PTE/PMD indicates write permissions, then there
>> should be no need to check VMA write permissions when restoring migration
>> entries or PTE-mapping a PMD.
>>
>> Commit d3cb8bf6081b ("mm: migrate: Close race between migration completion
>> and mprotect") introduced the maybe_mkwrite() handling in
>> remove_migration_pte() in 2014, stating that a race between mprotect() and
>> migration finishing would be possible, and that we could end up with
>> a writable PTE that should be readable.
>>
>> However, mprotect() code first updates vma->vm_flags / vma->vm_page_prot
>> and then walks the page tables to (a) set all present writable PTEs to
>> read-only and (b) convert all writable migration entries to readable
>> migration entries. While walking the page tables and modifying the
>> entries, migration code has to grab the PT locks to synchronize against
>> concurrent page table modifications.
>
> Makes sense to me.
>
>>
>> Assuming migration would find a writable migration entry (while holding
>> the PT lock) and replace it with a writable present PTE, surely mprotect()
>> code didn't stumble over the writable migration entry yet (converting it
>> into a readable migration entry) and would instead wait for the PT lock to
>> convert the now present writable PTE into a read-only PTE. As mprotect()
>> didn't finish yet, the behavior is just like migration didn't happen: a
>> writable PTE will be converted to a read-only PTE.
>>
>> So it's fine to rely on the writability information in the source
>> PTE/PMD and not recheck against the VMA as long as we're holding the PT
>> lock to synchronize with anyone who concurrently wants to downgrade write
>> permissions (like mprotect()) by first adjusting vma->vm_flags /
>> vma->vm_page_prot to then walk over the page tables to adjust the page
>> table entries.
>>
>> Running test cases that should reveal such races -- mprotect(PROT_READ)
>> racing with page migration or THP splitting -- for multiple hours did
>> not reveal an issue with this cleanup.
>>
>> Cc: Andrew Morton
>> Cc: Mel Gorman
>> Cc: Peter Xu
>> Signed-off-by: David Hildenbrand
>> ---
>>
>> This is a follow-up cleanup to [1]:
>> 	[PATCH v1 RESEND 0/6] mm: (pte|pmd)_mkdirty() should not
>> 	unconditionally allow for write access
>>
>> I wanted to be a bit careful and write some test cases to convince myself
>> that I am not missing something important. Of course, there is still the
>> possibility that my test cases are buggy ;)
>>
>> Test cases I'm running:
>>   https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/test_mprotect_migration.c
>>   https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/test_mprotect_thp_split.c
>>
>>
>> [1] https://lkml.kernel.org/r/20230411142512.438404-1-david@redhat.com
>>
>> ---
>>   mm/huge_memory.c | 4 ++--
>>   mm/migrate.c     | 5 +----
>>   2 files changed, 3 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index c23fa39dec92..624671aaa60d 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -2234,7 +2234,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>>   		} else {
>>   			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
>>   			if (write)
>> -				entry = maybe_mkwrite(entry, vma);
>> +				entry = pte_mkwrite(entry);
>
> This is another change besides page migration.  I also don't know why it's
> needed, but it's there since day 1 of thp split in eef1b3ba053, so maybe
> worthwhile to copy Kirill too (which I did).

Indeed (I wanted but forgot ...), thanks Peter!

-- 
Thanks,

David / dhildenb
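
The locking argument in the quoted patch description can be illustrated with a small userspace model. This is only a sketch and not kernel code: the toy_* names, the four entry states, and the single-entry VMA are all made up for illustration. The PT lock is modeled simply by running the two critical sections one after the other, in either order, and the VMA update is assumed to always precede the page table walk, as the description states.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy page-table entry states; these are NOT kernel types. */
enum entry_state {
	MIG_WRITABLE,	/* writable migration entry */
	MIG_READABLE,	/* readable migration entry */
	PTE_WRITABLE,	/* present, writable PTE */
	PTE_READONLY	/* present, read-only PTE */
};

/* Hypothetical VMA that maps exactly one entry. */
struct toy_vma {
	bool vm_write;			/* stands in for VM_WRITE in vma->vm_flags */
	enum entry_state entry;
};

/* mprotect(PROT_READ), step 1: update the VMA before touching entries. */
static void toy_mprotect_update_vma(struct toy_vma *vma)
{
	vma->vm_write = false;
}

/* mprotect(PROT_READ), step 2: with the PT lock held, downgrade the entry. */
static void toy_mprotect_walk(struct toy_vma *vma)
{
	if (vma->entry == PTE_WRITABLE)
		vma->entry = PTE_READONLY;
	else if (vma->entry == MIG_WRITABLE)
		vma->entry = MIG_READABLE;
}

/*
 * Migration completion, with the PT lock held: trust the write bit of the
 * migration entry and deliberately do not recheck vma->vm_write.
 */
static void toy_remove_migration_entry(struct toy_vma *vma)
{
	assert(vma->entry == MIG_WRITABLE || vma->entry == MIG_READABLE);
	vma->entry = (vma->entry == MIG_WRITABLE) ? PTE_WRITABLE : PTE_READONLY;
}

/* Run one of the two possible orderings of the serialized sections. */
enum entry_state toy_race(bool migration_first)
{
	struct toy_vma vma = { .vm_write = true, .entry = MIG_WRITABLE };

	toy_mprotect_update_vma(&vma);	/* always precedes the walk */
	if (migration_first) {
		toy_remove_migration_entry(&vma);
		toy_mprotect_walk(&vma);
	} else {
		toy_mprotect_walk(&vma);
		toy_remove_migration_entry(&vma);
	}
	return vma.entry;
}
```

In this model both toy_race(true) and toy_race(false) end in PTE_READONLY, which is the point of the argument: whichever side takes the lock first, the entry is read-only once mprotect() has finished, even though the restore path never looks at the VMA.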