From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1238D59F56 for ; Wed, 6 Nov 2024 18:10:40 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0FE6B6B0085; Wed, 6 Nov 2024 13:10:40 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0AEB16B0088; Wed, 6 Nov 2024 13:10:40 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E91646B0089; Wed, 6 Nov 2024 13:10:39 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id CAC0A6B0085 for ; Wed, 6 Nov 2024 13:10:39 -0500 (EST) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 7DDFD80E46 for ; Wed, 6 Nov 2024 18:10:39 +0000 (UTC) X-FDA: 82756458834.07.050B36E Received: from mail-ed1-f46.google.com (mail-ed1-f46.google.com [209.85.208.46]) by imf13.hostedemail.com (Postfix) with ESMTP id BE75B2001E for ; Wed, 6 Nov 2024 18:10:01 +0000 (UTC) Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=ZbG3DnYA; spf=pass (imf13.hostedemail.com: domain of jannh@google.com designates 209.85.208.46 as permitted sender) smtp.mailfrom=jannh@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1730916470; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=jpqiovl0ffII9JvtOiEcilqO6gAo6Ysuk7IQTNDHH1E=; b=uQVFqRvOXkiso0JA8hp8TtQRZixB8KjrbvoIPMx896enz7adm7d2QPPptuBbRZpx2Az6Lm ur5qxtb7D1hJkZN7XwzWTlF9EB1RSxnoQ+ARrk9eKEJktgeOzYwGjwVJmsWSSA8D3gVeIh TuKf52t8lG0cHPibIy2cRLKXNWmZWYk= ARC-Authentication-Results: i=1; imf13.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=ZbG3DnYA; spf=pass (imf13.hostedemail.com: domain of jannh@google.com designates 209.85.208.46 as permitted sender) smtp.mailfrom=jannh@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1730916470; a=rsa-sha256; cv=none; b=Z8GNdPYxFAUC/bompw3A0o1o4uX0BO48n3jIBCyHfY4oZXekoROo/1sJ4uNALHquu2Gf5W N2vt9hsWHLOvUI6IzR1ofvH9xTvHOt1vWUZ61SOOx35ZbKhcYTB9QLzR8CeuiHInRzwFKO snKl9/8qOo/q2RaUEEMicYpqVqPOyv0= Received: by mail-ed1-f46.google.com with SMTP id 4fb4d7f45d1cf-5c947c00f26so1093a12.1 for ; Wed, 06 Nov 2024 10:10:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1730916636; x=1731521436; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=jpqiovl0ffII9JvtOiEcilqO6gAo6Ysuk7IQTNDHH1E=; b=ZbG3DnYAQRnn/JTPY0o0BQVm2T9oe1GdqwYobxL7yQjZJDgq8sGSzj2iCjc/KwP38j Ji7N04qvLsZoSe00JDe0RlTqCrwJR0eMrXEoVW6/WtAAVug5jZfmps6lxBuzaBH9JFKD u9j4tCTceu2kukshVqdFgVKBa59Go6ixC7v5KqqghKjKDykczifguWcnCjnd/4iSoUew coBJHZPUZ5/QI3siNIyQoyc/7hVeyOFDRIvZpGfyYUWJ8ItbA7nmEBAvvc5j/YSvwj2H eXoFCpU2OkG4sJGskB09CR//P+HAv4Jz4sGZsuu8T7zdgDlcATAxQkiHx6opdP8WG98b 94lQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730916636; x=1731521436; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=jpqiovl0ffII9JvtOiEcilqO6gAo6Ysuk7IQTNDHH1E=; b=ixz0RYWPFY9wVntO4Q+4KG8Q35waAbi3MV7uNLTkGWZNkZo4zWKKR69gLh7TbYoZuD NS23Y1mosrjY0vpxz2Fvqebq8OI4CkvsjiGx3umpGDck3fz333L+F+KSD6u1DOc5E/Ma Hft8eBuXAZ9Gu4LzFp/wOA/9lza8TNDiEXgxO8MTX42pWpbiklp92taS6HArgSsQG1R1 YN3tn3kvmcVcG+ua4To0qo687er/XDLaIwoWsJKNytwSvQa3YmPoUMD6IZuWo+CLWl69 K8CSBiVQDpdW/+sT/KyRKuWfr0EwOk+hCCzzswsTGnFSTvjifWGXIHO9zR53hwwAy36H vnUg== X-Forwarded-Encrypted: i=1; AJvYcCXqiipZQZKbJezwXvUkEoUjXlODPIGQ+OhBqYN/vVMJhAfadMx0r0eI1RejMJ7Z4DU+kCyU7UuTow==@kvack.org X-Gm-Message-State: AOJu0YzVqISIiUz5rQikJf44eAmll2Ohg1+3soJB0/1bO9Lq+xjF1yle Ihuv6+JPYGET+d+VW2l9iwgrztoLd7XnVtzKJCt01IricuhGZMX0oo9iuJWcP5xs/1M7m7FSszg 2dShNPz1CaXhlUSSEygdEBuOmKmwCxCdqyR6H X-Gm-Gg: ASbGncv2C4/gdqla1oEjTAdCjJmRbbShSk7zUiRAutY0V0w8QshrLr8bGy06WOEWjk5 Mn21+lSPs1pSq6EXFeEetQJzWJNxJCk58QJviYPu5H4QOB7jJPUQlyLKWbE0= X-Google-Smtp-Source: AGHT+IFfu8DjAm0U1ISPoTiJAxlj7uKrBs27eWrWlAcwhN7fibd3fMEyowQvMJcxtq9Vya2Xc31xMnfP1tRWRloFaGo= X-Received: by 2002:a05:6402:6c1:b0:5c9:85dc:5ae2 with SMTP id 4fb4d7f45d1cf-5cefbcd6d8bmr119927a12.7.1730916635289; Wed, 06 Nov 2024 10:10:35 -0800 (PST) MIME-Version: 1.0 References: <20241101185033.131880-1-lorenzo.stoakes@oracle.com> <2bf6329e-eb3b-4c5e-bd3a-b519eefffd63@lucifer.local> In-Reply-To: From: Jann Horn Date: Wed, 6 Nov 2024 19:09:58 +0100 Message-ID: Subject: Re: [RFC PATCH] docs/mm: add VMA locks documentation To: Qi Zheng Cc: Lorenzo Stoakes , Jonathan Corbet , Andrew Morton , "Liam R . Howlett" , Vlastimil Babka , Alice Ryhl , Boqun Feng , Matthew Wilcox , Mike Rapoport , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Suren Baghdasaryan , "open list:DOCUMENTATION" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: BE75B2001E X-Stat-Signature: h37zigom8fgapqqeyuxy1bjzyxrxfa6u X-Rspam-User: X-HE-Tag: 1730916601-402154 X-HE-Meta: U2FsdGVkX1+Xoo6y8JbDmJg+QBFEVPQJRKfi67oI98VcpVBjH7rOCRLL/gck0djBKV72USxHKAOM47EujIAaSPeXdM74xS2TGjIBylrd2n7uVeigiAqdpXPxXvEP2yQWuU/bD3kiivememgug5zhhVtGLEkDorTqoRyt0rhgggi4VOSVLNIEHRySorFQSstpVtMCgK4a85W41IiiwJkNBb8Qt/59047gpi/wyiEUz5xv5DRPfV9rFFnxgYNQK+G6nBn0VzMwyOhpk8pM7AZvQgENV7PaRSQar73077YuMkT0h0tZRNaodxZHw35+dS9yBsne6/wL7kHviB4oGfsQ/XFfNAZfugAJlpMERkOcCgCAUHLwWyaCSpl0OqJ0n5hUyHP/OhzCluV0Me+TWrW4833GvCVuytdCYz5efPp1gEukk9knBG0adhUgulVVDfY5Fsag00oVNxs3uqA0a1FmM/By8YtfYxnfB4qG503r6Hrm1YXxJeYq6bdpXN3lnd0xJotRtPW9gV68lqruH0MchL8dbms+ibpUO6NGgdwfSkU2W9Ui4UkJ6/MHh1F3SkHhnvIGjJztLeNIZfOvQKg+xY16/T/3I7iSvKh5tOTh0n41OPKIs7P0IiCfiUSPo3NgFIUBk89BQX18/8Qe4VWGk0hBeIj/YB3LwQbleKF+prWYC8hLnrERP0r/a0z2G9AVzUrhmxz5rlEd2nGLM9v2TWGNCfFK0DOzAExXCHmMq1u1qkuamne8y0kY7s/qzzVj/b/baWxCAkbHgzMiEr7j9kbu0CtVI5pEtYiZ3wUbF8cklCda1HZbn1NLzWGJJ8aZCSrt+T8//+ChYFbz31VOoCLG9PQF1qA716hE3W1KGgRsLcXPu1pjJ2LQDaTYi5f4tdSImSLm23GgvdeLP7KDS/7UUMXcWXVUuFDj7mg27rFXjhBQByTNMQLR2CHiYkKUQTn4eryaGkb7ZINcTpZ qEf4eC4M Nx6aDYhrG5wauMzlolv1cFw4tjKRKla85ELLN8wxbYyy5Q6GB7Ikc9QkpDh8CAoyc9YXjIPsvDdztFQZBU2tf8XXhFy3pUq+SR44JYF6EF7B1UpFgmbqwYG1Tr3VIw8NKhKkb72ADZxdkBrLioJA+YaJaDR9kF1PQBQIMQdtYbo5NTIQo9TGBip/KqjP035yvX6xO5bbuaUFiVmFll66WXkk2pzB1owtpYnrzsIwx4Yzf0aCqRE7DSdrTS5wh1xzLouHub3AjMeArB60B+FhZTCg7YLKVREiNHQ4Z X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Nov 6, 2024 at 4:09=E2=80=AFAM Qi Zheng wrote: > On 2024/11/5 05:29, Jann Horn wrote: > > On Mon, Nov 4, 2024 at 5:42=E2=80=AFPM Lorenzo Stoakes > > [...] > > > > > I think it's important to know about the existence of hardware writes > > because it means you need atomic operations when making changes to > > page tables. Like, for example, in many cases when changing a present > > PTE, you can't even use READ_ONCE()/WRITE_ONCE() for PTEs and need > > atomic RMW operations instead - see for example ptep_get_and_clear(), > > which is basically implemented in arch code as an atomic xchg so that > > it can't miss concurrent A/D bit updates. > > > > Totally agree! But I noticed before that ptep_clear() doesn't seem > to need atomic operations because it doesn't need to care about the > A/D bit. > > I once looked at the history of how the ptep_clear() was introduced. > If you are interested, you can take a look at my local draft below. > Maybe I missed something. > > ``` > mm: pgtable: make ptep_clear() non-atomic > > In the generic ptep_get_and_clear() implementation, it is just a sim= ple > combination of ptep_get() and pte_clear(). But for some architecture= s > (such as x86 and arm64, etc), the hardware will modify the A/D bits > of the > page table entry, so the ptep_get_and_clear() needs to be overwritte= n > and implemented as an atomic operation to avoid contention, which ha= s a > performance cost. > > The commit d283d422c6c4 ("x86: mm: add x86_64 support for page table > check") adds the ptep_clear() on the x86, and makes it call > ptep_get_and_clear() when CONFIG_PAGE_TABLE_CHECK is enabled. The pa= ge > table check feature does not actually care about the A/D bits, so on= ly > ptep_get() + pte_clear() should be called. But considering that the > page > table check is a debug option, this should not have much of an impac= t. > > But then the commit de8c8e52836d ("mm: page_table_check: add hooks t= o > public helpers") changed ptep_clear() to unconditionally call > ptep_get_and_clear(), so that the CONFIG_PAGE_TABLE_CHECK check can= be > put into the page table check stubs (in > include/linux/page_table_check.h). > This also cause performance loss to the kernel without > CONFIG_PAGE_TABLE_CHECK enabled, which doesn't make sense. > > To fix it, just calling ptep_get() and pte_clear() in the ptep_clear= (). > > Signed-off-by: Qi Zheng > > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h > index 117b807e3f894..2ace92293f5f5 100644 > --- a/include/linux/pgtable.h > +++ b/include/linux/pgtable.h > @@ -506,7 +506,10 @@ static inline void clear_young_dirty_ptes(struct > vm_area_struct *vma, > static inline void ptep_clear(struct mm_struct *mm, unsigned long addr, > pte_t *ptep) > { > - ptep_get_and_clear(mm, addr, ptep); > + pte_t pte =3D ptep_get(ptep); > + > + pte_clear(mm, addr, ptep); > + page_table_check_pte_clear(mm, pte); > } > > ``` ptep_clear() is currently only used in debug code and in khugepaged collapse paths, which are fairly expensive, so I don't think the cost of an extra atomic RMW op should matter here; but yeah, the change looks correct to me.