Date: Fri, 6 Jun 2025 06:30:06 +0700
From: Bagas Sanjaya
To: Lorenzo Stoakes, Andrew Morton
Cc: Suren Baghdasaryan, "Liam R . Howlett", Vlastimil Babka, Shakeel Butt, Jonathan Corbet, Jann Horn, Qi Zheng, linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
References: <20250604180308.137116-1-lorenzo.stoakes@oracle.com>
In-Reply-To: <20250604180308.137116-1-lorenzo.stoakes@oracle.com>

On Wed, Jun 04, 2025 at 07:03:08PM +0100, Lorenzo Stoakes wrote:
> diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
> index e6756e78b476..be49e2a269e4 100644
> --- a/Documentation/mm/process_addrs.rst
> +++ b/Documentation/mm/process_addrs.rst
> @@ -303,7 +303,9 @@ There are four key operations typically performed on page tables:
>  1. **Traversing** page tables - Simply reading page tables in order to traverse
>     them. This only requires that the VMA is kept stable, so a lock which
>     establishes this suffices for traversal (there are also lockless variants
> -   which eliminate even this requirement, such as :c:func:`!gup_fast`).
> +   which eliminate even this requirement, such as :c:func:`!gup_fast`). There is
> +   also a special case of page table traversal for non-VMA regions which we
> +   consider separately below.
>  2. **Installing** page table mappings - Whether creating a new mapping or
>     modifying an existing one in such a way as to change its identity. This
>     requires that the VMA is kept stable via an mmap or VMA lock (explicitly not
> @@ -335,15 +337,13 @@ ahead and perform these operations on page tables (though internally, kernel
>  operations that perform writes also acquire internal page table locks to
>  serialise - see the page table implementation detail section for more details).
> 
> +.. note:: We free empty PTE tables on zap under the RCU lock - this does not
> +          change the aforementioned locking requirements around zapping.
> +
>  When **installing** page table entries, the mmap or VMA lock must be held to
>  keep the VMA stable. We explore why this is in the page table locking details
>  section below.
> 
> -.. warning:: Page tables are normally only traversed in regions covered by VMAs.
> -             If you want to traverse page tables in areas that might not be
> -             covered by VMAs, heavier locking is required.
> -             See :c:func:`!walk_page_range_novma` for details.
> -
>  **Freeing** page tables is an entirely internal memory management operation and
>  has special requirements (see the page freeing section below for more details).
> 
> @@ -355,6 +355,44 @@ has special requirements (see the page freeing section below for more details).
>            from the reverse mappings, but no other VMAs can be permitted to be
>            accessible and span the specified range.
> 
> +Traversing non-VMA page tables
> +------------------------------
> +
> +We've focused above on traversal of page tables belonging to VMAs. It is also
> +possible to traverse page tables which are not represented by VMAs.
> +
> +Kernel page table mappings themselves are generally managed but whatever part of
> +the kernel established them and the aforementioned locking rules do not apply -
> +for instance vmalloc has its own set of locks which are utilised for
> +establishing and tearing down page its page tables.
> +
> +However, for convenience we provide the :c:func:`!walk_kernel_page_table_range`
> +function which is synchronised via the mmap lock on the :c:macro:`!init_mm`
> +kernel instantiation of the :c:struct:`!struct mm_struct` metadata object.
> +
> +If an operation requires exclusive access, a write lock is used, but if not, a
> +read lock suffices - we assert only that at least a read lock has been acquired.
> +
> +Since, aside from vmalloc and memory hot plug, kernel page tables are not torn
> +down all that often - this usually suffices, however any caller of this
> +functionality must ensure that any additionally required locks are acquired in
> +advance.
> +
> +We also permit a truly unusual case is the traversal of non-VMA ranges in
> +**userland** ranges, as provided for by :c:func:`!walk_page_range_debug`.
> +
> +This has only one user - the general page table dumping logic (implemented in
> +:c:macro:`!mm/ptdump.c`) - which seeks to expose all mappings for debug purposes
> +even if they are highly unusual (possibly architecture-specific) and are not
> +backed by a VMA.
> +
> +We must take great care in this case, as the :c:func:`!munmap` implementation
> +detaches VMAs under an mmap write lock before tearing down page tables under a
> +downgraded mmap read lock.
> +
> +This means such an operation could race with this, and thus an mmap **write**
> +lock is required.
> +
>  Lock ordering
>  -------------
> 
> @@ -461,6 +499,10 @@ Locking Implementation Details
>  Page table locking details
>  --------------------------
> 
> +.. note:: This section explores page table locking requirements for page tables
> +          encompassed by a VMA. See the above section on non-VMA page table
> +          traversal for details on how we handle that case.
> +
>  In addition to the locks described in the terminology section above, we have
>  additional locks dedicated to page tables:
> 

The wording looks good, thanks!

Reviewed-by: Bagas Sanjaya

-- 
An old man doll... just what I always wanted! - Clara