From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D8533C3DA4A for ; Mon, 29 Jul 2024 08:52:54 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 419186B008C; Mon, 29 Jul 2024 04:52:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3A24F6B0092; Mon, 29 Jul 2024 04:52:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 21BC86B0093; Mon, 29 Jul 2024 04:52:54 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 01AC86B008C for ; Mon, 29 Jul 2024 04:52:53 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 9CAB41601F7 for ; Mon, 29 Jul 2024 08:52:53 +0000 (UTC) X-FDA: 82392175026.14.4823C34 Received: from mail-pj1-f46.google.com (mail-pj1-f46.google.com [209.85.216.46]) by imf24.hostedemail.com (Postfix) with ESMTP id D0BF1180027 for ; Mon, 29 Jul 2024 08:52:50 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=S4U0NLFA; spf=pass (imf24.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.216.46 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1722243101; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=bwy22zocUuX93+ONQXiNwUNSliVIGdLep8Ilhq4sGGY=; b=aA6q8X4RHQ9oiOPuCvgptUJhDZxlr62I6MHV5AaWAUCINUDuU3/hRJGUtY/uxRzGR/fhWh oMfxZRo2+1EeX7lBCMMFd6lQQz/DKeD6LUMmdiNMegNA7NIBSQWRVRyzYNtMX7p3o1U3h7 ZZ4sK//vtIWRUBfQvJ1KSr24CDIAkGg= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=S4U0NLFA; spf=pass (imf24.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.216.46 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1722243101; a=rsa-sha256; cv=none; b=30mSHaFmJqkpWDiDdRAC8J8J9ES2LI4HwEAC364B4UpX5lY+DzZ1J8pJDYoDngNA+tB4JX /Z9giN+XZHKvJsX9RV4NDzJHNlmJyMy9TyfbjYbnDPWYKVDrbr1n0cB/B12zV+heqhZcbL hKQHWcccjzqEgsl9PaBa4KnEpPdMJ7I= Received: by mail-pj1-f46.google.com with SMTP id 98e67ed59e1d1-2cdba658093so615403a91.1 for ; Mon, 29 Jul 2024 01:52:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1722243169; x=1722847969; darn=kvack.org; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=bwy22zocUuX93+ONQXiNwUNSliVIGdLep8Ilhq4sGGY=; b=S4U0NLFA7RPJt5H1i41x4hCuAp/5f29bH+m+nJCADVzwVhEdg5UrQGGC+WpIZEDNkc z8n56TwUXXd1q7ToUo3/xo5T/NcGkWRXLOpaUQjKgLmfevbL7lnY2d75PkA+UeTsjPKg xLaOj69W+zh+bKT8mZJ4EuzYvLD9iryn4199aToovcDSy9JcQ+1epUqoUFXWLzv2gyf7 vdpEQOD4THU/EC2VK5ol4dXOxXegvFz8/SEF1rRrE/ybT6zS3aBqu94VJpMlAkEKl31x zcsX1IkeL0/o75xjDVxfNcuAGXpwkPzQzTgLzYWX7ul9UP20DdLfxbQ5G6EZkK54U7Rz rAfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722243169; x=1722847969; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=bwy22zocUuX93+ONQXiNwUNSliVIGdLep8Ilhq4sGGY=; b=aSmUabAsNl9GocLdi8ZNurybCoZ+AO1ykv0iVUocDl5vUd7oCEPrNkZR5EObivNgNJ csS9L66DBNEcJVN7hm328g9pguZXiGyDCS2fBHc5Ql+sZKxp0r1cVMOAuUUh1raLwkmv B1iIiIs5n3yQ9TSl5ZfsZvru2AeB4xoIdgtd7Si/FdWx39nv6g7F0tquCOKmXdlzZA/X X4twHeEDp5n0p8EzM56YofvVZUcRXBagdnVh5r/gafMUfLRBphl6DEM2RW/irDloAJxr BL+e3V8F9sNSxUDY9R633UaRMvOQr9VkYp6tt0TI+4LGLXC26E6ttAO3RYgd85GzTWKI 4nUA== X-Forwarded-Encrypted: i=1; AJvYcCX7R4gzKadGB7RorZ7PiBtMcbFDz+29YI4SpX3LDUOdx2e6vx5y/YoPsjOWV9PZs2CABPeBmWrwLR/A7JWdGkoSw1I= X-Gm-Message-State: AOJu0YwRTxo+MEorZoeXglIQpt72zSTUueAMQCz7RSjmwhSECfYXpK7f QF4D86it/nJ5TWq1C600PY+1+Gng/xTL42GAg5RDF1tJfXlaCT0o6CNNQDeMoNw= X-Google-Smtp-Source: AGHT+IFkmB1l+wjpZUszQ3u2ixYrhaPDUo4qcydMIVUBmHHd8vkjvarPmTReJYy2tlqntC/cggnGtA== X-Received: by 2002:a17:902:e544:b0:1fc:5b41:bac9 with SMTP id d9443c01a7336-1fed6cca233mr94301525ad.7.1722243169421; Mon, 29 Jul 2024 01:52:49 -0700 (PDT) Received: from [10.255.168.175] ([139.177.225.232]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1fed7ee15b6sm77097035ad.125.2024.07.29.01.52.46 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 29 Jul 2024 01:52:49 -0700 (PDT) Message-ID: <6edb1aa3-ea72-49fd-9aaf-189ad6b61ee5@bytedance.com> Date: Mon, 29 Jul 2024 16:52:44 +0800 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v1 1/2] mm: let pte_lockptr() consume a pte_t pointer Content-Language: en-US To: David Hildenbrand Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton , Muchun Song , Peter Xu , Oscar Salvador References: <20240725183955.2268884-1-david@redhat.com> <20240725183955.2268884-2-david@redhat.com> <100ecc66-c2ce-4dbb-8600-d782e75ab69c@bytedance.com> <498c936f-fa30-4670-9bbc-4cd8b7995091@redhat.com> From: Qi Zheng In-Reply-To: <498c936f-fa30-4670-9bbc-4cd8b7995091@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: D0BF1180027 X-Stat-Signature: ju1akai3rdfrhzk8cy9bywm3ssf954gc X-Rspam-User: X-HE-Tag: 1722243170-963825 X-HE-Meta: U2FsdGVkX19u1MuAxZAqnKoa6isH9Rrn1K7o1EH/V1xQ5I3TyWsXcaxYCq5fynVaunU/19CnvyGy+2W8cpby8rY4+U8jTgomcYV40j2OsYSF+tdGmWJYSKGiQFjVYwcW4EcXuyw1dWDJ4ouibQEbdm0LEzNKB19d9eXkslkuVQpL7OVBXYKuzN+/MmLVCaFArQjYRDMn20x1+pALJJpiTHwdJ9ezTed+soMiSG0CGvjMCPsFy2h0ZnrD6E3wi3e9zswO2zSN9F3YbC82DZRS4SvWHP7bGCRKQ6jVSZyZ4+ZbRF1aSkbLZjnXzGiDap5yVVE9MxD+UiycdbW8XE51ehzZwXbMOegqovYOljcOUkqwjoxRORvbWuN8srzWl/V9fhb4k4iSERAr1pFwC1Bqri2IaCLXuCLqZCL0h5rTpm1RZSRRc1eu83co43ciUp4G3JBn8qK0o+WyfNok/XGDvMHq3DkN144cgU8CzBtp6bEuCmUNiw8OwcsqwLmtshOpiyElQTotZpnBR+W0KEGNp6NuJx5wdL6OlyCru2UWgOgQjrjcYmM1s4mNJQYZ4qWzriXWq2lHf9miM1WUn+mNiHzDzXlKc6AnU/VHkimB9ldyb3C8TWzDTTF9BYbDZTFJFLqkFa7PePEj8MUsiPsu79TXvqCk4fdlg4HLNnOJFscKy6c/laMqdelvoinI7P9TrwZAfp6neryvQQv3G/GWP0jB7JzvYWzPHJmYKcUbGY4OXRVncEmCzfu/th7dJNGVJyGZ1J7+lUCGIQFjt62Z3mvJE3DsirPs8BAkYU2OeyLr3bSTkM/qANYqSqyh/2jGCdtqueEa3710HzS/F1ypxbHvy0tmGV8+IzhbBrKWxnabBVnua6WokjvKYGRuq937ypFnH1sbaGdrnV9JDJOrsMBVrtEL8j92t7+xOTZHzV4u0m1mTzpYlvDEpwJG2I7uoGRM5BP9vsAo7jIAcRy OjJQbhGB F1jFu+E8J9DwLLCE6RE24mJadNf0YFXMTtjm4dM+P/NKeEC4wyMNisowxdCpdIPxbtX0mZM/ipd4b81zI8PvC7wjwUhKj2eYec7OK1SiR8Urn023P4gDMz1z6T9GHJnQIU3Uk5RhYFozpD58hxIOM/8tc9fydvJeUY5Zgms/LKjOr0+ZQIOBcBErURQbsEY26Nxyj4V8DOA/OWT/MGAtK9qY4ofyQOovcEd7V5PZh3hOsnHCiqlapvfHyHt502pxZ0xsD8xlbsAgAJmCJjju+QvpImVv/55toEEEoobZ4tZdLlKOICo/uhE6/YJVkl4Zdc1FmD5gLRhCAb//9XMVSpwzOTkk7UlU+hrc7DptTrGm4KoL7AAjarZhtcD4QbrD6lmSoC5wtpVnryFD3vIQleemOjQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 2024/7/29 16:46, David Hildenbrand wrote: > On 29.07.24 09:48, Qi Zheng wrote: >> >> >> On 2024/7/26 02:39, David Hildenbrand wrote: >>> pte_lockptr() is the only *_lockptr() function that doesn't consume >>> what would be expected: it consumes a pmd_t pointer instead of a pte_t >>> pointer. >>> >>> Let's change that. The two callers in pgtable-generic.c are easily >>> adjusted. Adjust khugepaged.c:retract_page_tables() to simply do a >>> pte_offset_map_nolock() to obtain the lock, even though we won't >>> actually >>> be traversing the page table. >>> >>> This makes the code more similar to the other variants and avoids other >>> hacks to make the new pte_lockptr() version happy. pte_lockptr() users >>> reside now only in  pgtable-generic.c. >>> >>> Maybe, using pte_offset_map_nolock() is the right thing to do because >>> the PTE table could have been removed in the meantime? At least it >>> sounds >>> more future proof if we ever have other means of page table reclaim. >> >> Agree, this helps us recheck the pmd entry. >> >>> >>> It's not quite clear if holding the PTE table lock is really required: >>> what if someone else obtains the lock just after we unlock it? But we'll >>> leave that as is for now, maybe there are good reasons. >>> >>> This is a preparation for adapting hugetlb page table locking logic to >>> take the same locks as core-mm page table walkers would. >>> >>> Signed-off-by: David Hildenbrand >>> --- >>>    include/linux/mm.h   |  7 ++++--- >>>    mm/khugepaged.c      | 21 +++++++++++++++------ >>>    mm/pgtable-generic.c |  4 ++-- >>>    3 files changed, 21 insertions(+), 11 deletions(-) >> >> Since pte_lockptr() no longer has a pmd parameter, it is best to modify >> the comments above __pte_offset_map_lock() as well: >> >> ``` >> This helps the caller to avoid a later pte_lockptr(mm, *pmd), which >> might by that time act on a changed *pmd ... >> ``` > > Right, thanks a lot for the review! > > The following on top; > > > From a46b16aa9bfa02ffb425d364d7f00129a8e803ad Mon Sep 17 00:00:00 2001 > From: David Hildenbrand > Date: Mon, 29 Jul 2024 10:43:34 +0200 > Subject: [PATCH] fixup: mm: let pte_lockptr() consume a pte_t pointer > > Let's adjust the comment, passing a pte to pte_lockptr() and dropping > a detail about changed *pmd, which no longer applies. > > Signed-off-by: David Hildenbrand > --- >  mm/pgtable-generic.c | 10 +++++----- >  1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c > index 13a7705df3f87..f17465b43d344 100644 > --- a/mm/pgtable-generic.c > +++ b/mm/pgtable-generic.c > @@ -350,11 +350,11 @@ pte_t *pte_offset_map_nolock(struct mm_struct *mm, > pmd_t *pmd, >   * pte_offset_map_nolock(mm, pmd, addr, ptlp), above, is like > pte_offset_map(); >   * but when successful, it also outputs a pointer to the spinlock in > ptlp - as >   * pte_offset_map_lock() does, but in this case without locking it. > This helps > - * the caller to avoid a later pte_lockptr(mm, *pmd), which might by > that time > - * act on a changed *pmd: pte_offset_map_nolock() provides the correct > spinlock > - * pointer for the page table that it returns.  In principle, the > caller should > - * recheck *pmd once the lock is taken; in practice, no callsite needs > that - > - * either the mmap_lock for write, or pte_same() check on contents, is > enough. > + * the caller to avoid a later pte_lockptr(mm, pte): > pte_offset_map_nolock() > + * provides the correct spinlock pointer for the page table that it > returns. > + * In principle, the caller should recheck *pmd once the lock is taken; in > + * practice, no callsite needs that - either the mmap_lock for write, or > + * pte_same() check on contents, is enough. >   * >   * Note that free_pgtables(), used after unmapping detached vmas, or when >   * exiting the whole mm, does not take page table lock before freeing > a page LGTM. Thanks!