From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 18 Apr 2023 11:56:07 -0400
From: Peter Xu
To: David Hildenbrand
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton,
	Mel Gorman, "Kirill A. Shutemov"
Subject: Re: [PATCH mm-unstable v1] mm: don't check VMA write permissions if
	the PTE/PMD indicates write permissions
References: <20230418142113.439494-1-david@redhat.com>
In-Reply-To: <20230418142113.439494-1-david@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

On Tue, Apr 18, 2023 at 04:21:13PM +0200, David Hildenbrand wrote:
> Staring at the comment "Recheck VMA as permissions can change since
> migration started" in remove_migration_pte() can result in confusion,
> because if the source PTE/PMD indicates write permissions, then there
> should be no need to check VMA write permissions when restoring migration
> entries or PTE-mapping a PMD.
>
> Commit d3cb8bf6081b ("mm: migrate: Close race between migration completion
> and mprotect") introduced the maybe_mkwrite() handling in
> remove_migration_pte() in 2014, stating that a race between mprotect() and
> migration finishing would be possible, and that we could end up with
> a writable PTE that should be readable.
>
> However, mprotect() code first updates vma->vm_flags / vma->vm_page_prot
> and then walks the page tables to (a) set all present writable PTEs to
> read-only and (b) convert all writable migration entries to readable
> migration entries. While walking the page tables and modifying the
> entries, migration code has to grab the PT locks to synchronize against
> concurrent page table modifications.

Makes sense to me.
>
> Assuming migration would find a writable migration entry (while holding
> the PT lock) and replace it with a writable present PTE, surely mprotect()
> code didn't stumble over the writable migration entry yet (converting it
> into a readable migration entry) and would instead wait for the PT lock to
> convert the now present writable PTE into a read-only PTE. As mprotect()
> didn't finish yet, the behavior is just like migration didn't happen: a
> writable PTE will be converted to a read-only PTE.
>
> So it's fine to rely on the writability information in the source
> PTE/PMD and not recheck against the VMA as long as we're holding the PT
> lock to synchronize with anyone who concurrently wants to downgrade write
> permissions (like mprotect()) by first adjusting vma->vm_flags /
> vma->vm_page_prot to then walk over the page tables to adjust the page
> table entries.
>
> Running test cases that should reveal such races -- mprotect(PROT_READ)
> racing with page migration or THP splitting -- for multiple hours did
> not reveal an issue with this cleanup.
>
> Cc: Andrew Morton
> Cc: Mel Gorman
> Cc: Peter Xu
> Signed-off-by: David Hildenbrand
> ---
>
> This is a follow-up cleanup to [1]:
> 	[PATCH v1 RESEND 0/6] mm: (pte|pmd)_mkdirty() should not
> 	unconditionally allow for write access
>
> I wanted to be a bit careful and write some test cases to convince myself
> that I am not missing something important. Of course, there is still the
> possibility that my test cases are buggy ;)
>
> Test cases I'm running:
>  https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/test_mprotect_migration.c
>  https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/test_mprotect_thp_split.c
>
>
> [1] https://lkml.kernel.org/r/20230411142512.438404-1-david@redhat.com
>
> ---
>  mm/huge_memory.c | 4 ++--
>  mm/migrate.c     | 5 +----
>  2 files changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index c23fa39dec92..624671aaa60d 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2234,7 +2234,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>  		} else {
>  			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
>  			if (write)
> -				entry = maybe_mkwrite(entry, vma);
> +				entry = pte_mkwrite(entry);

This is another change besides page migration.  I also don't know why the
maybe_mkwrite() here was needed, but it has been there since day 1 of THP
split in eef1b3ba053, so it may be worthwhile to copy Kirill too (which I
did).

>  			if (anon_exclusive)
>  				SetPageAnonExclusive(page + i);
>  			if (!young)
> @@ -3271,7 +3271,7 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
>  	if (pmd_swp_soft_dirty(*pvmw->pmd))
>  		pmde = pmd_mksoft_dirty(pmde);
>  	if (is_writable_migration_entry(entry))
> -		pmde = maybe_pmd_mkwrite(pmde, vma);
> +		pmde = pmd_mkwrite(pmde);
>  	if (pmd_swp_uffd_wp(*pvmw->pmd))
>  		pmde = pmd_mkuffd_wp(pmde);
>  	if (!is_migration_entry_young(entry))
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 5d95e09b1618..02cace7955d4 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -213,16 +213,13 @@ static bool remove_migration_pte(struct folio *folio,
>  		if (pte_swp_soft_dirty(*pvmw.pte))
>  			pte = pte_mksoft_dirty(pte);
>
> -		/*
> -		 * Recheck VMA as permissions can change since migration started
> -		 */
>  		entry = pte_to_swp_entry(*pvmw.pte);
>  		if (!is_migration_entry_young(entry))
>  			pte = pte_mkold(pte);
>  		if (folio_test_dirty(folio) && is_migration_entry_dirty(entry))
>  			pte = pte_mkdirty(pte);
>  		if (is_writable_migration_entry(entry))
> -			pte = maybe_mkwrite(pte, vma);
> +			pte = pte_mkwrite(pte);
>  		else if (pte_swp_uffd_wp(*pvmw.pte))
>  			pte = pte_mkuffd_wp(pte);
>
> --
> 2.39.2
>

-- 
Peter Xu