From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BF813D5AE6C for ; Thu, 7 Nov 2024 07:07:46 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 028536B0098; Thu, 7 Nov 2024 02:07:46 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id F1A476B0099; Thu, 7 Nov 2024 02:07:45 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D93B66B009A; Thu, 7 Nov 2024 02:07:45 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id B80D26B0098 for ; Thu, 7 Nov 2024 02:07:45 -0500 (EST) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 10C4F80149 for ; Thu, 7 Nov 2024 07:07:45 +0000 (UTC) X-FDA: 82758417798.30.045F53C Received: from mail-pg1-f178.google.com (mail-pg1-f178.google.com [209.85.215.178]) by imf09.hostedemail.com (Postfix) with ESMTP id C8939140011 for ; Thu, 7 Nov 2024 07:07:17 +0000 (UTC) Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=jRhlZz1b; spf=pass (imf09.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.215.178 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1730963213; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=t9J6ZkVLkF9TjYcxor30Dw8R0oFPmno4Aqm71Ans3kY=; b=mzFYvJSicwPXb9nMFAjD0oiAJZmMaUaeYkFrUqkrA4RzC+PD2KEYDUHQWgWX/cn8NXvmUo 7secqv0wQ0q9i2M5vH8R8RNMSzTkYCY1QPAXDlDbzd/v3P0obbmc7KBd2ojPTLwB86oF6c HHW8qVMEnh2ImzAEdjmRC+jAtW901E8= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1730963213; a=rsa-sha256; cv=none; b=jtE5BgWquMOAVqRwBpOk8OOCN7kyBpHaSMtWhrNPCIa+rXPxwjsPYYdHOmDPdIVrk3BDb8 DvlndHOERTgGe723bHJrJOoPtN1j/5cJnZoXSsNRA/9KKaQLXQ0jQL1igiFs99N4ubvHWT 0yNim8bxdvEgeLuWs6PmIIO/EQvwyPo= ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=pass header.d=bytedance.com header.s=google header.b=jRhlZz1b; spf=pass (imf09.hostedemail.com: domain of zhengqi.arch@bytedance.com designates 209.85.215.178 as permitted sender) smtp.mailfrom=zhengqi.arch@bytedance.com; dmarc=pass (policy=quarantine) header.from=bytedance.com Received: by mail-pg1-f178.google.com with SMTP id 41be03b00d2f7-7f3e30a43f1so487344a12.1 for ; Wed, 06 Nov 2024 23:07:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1730963261; x=1731568061; darn=kvack.org; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=t9J6ZkVLkF9TjYcxor30Dw8R0oFPmno4Aqm71Ans3kY=; b=jRhlZz1bP2IWzTwWBULpQMqFIY3jNeulJSoKhKgHEHenJmTd36h4aYPltIfnEKm4NT ph+aN9biigpoWjC+sV6WAzXQPLJW2VE7rJwcA31TDqaMpVuj3YzKwWVJIALB+UcH4S/i 5PPFgzFUH1z29qV8gr3NpVrMeWnNJjLMuZRl8ZR/nPhq/sR+j7w47QVGKGOCwu8WYaBn 0myxUnz43vy5wjeW8KuBuwHA3Qbu4DXTy7X/XURaBW2OnKQFOFnQU057i1zL3gRV900+ SUpdu3s3WV36Bk8V4Gan4JqOgOrGbtWCoZNCfo/Z8bIOd9G2R37weebb9XpQ/+EPRx6H Po7Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730963261; x=1731568061; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=t9J6ZkVLkF9TjYcxor30Dw8R0oFPmno4Aqm71Ans3kY=; b=i7uSiiLYtxlpMfEvTcf5QPYDaMeD5VhlTkx38qB5VfSA/MnB1YC8Ncsk150YXFTIXV kYHq6vxPCxbKoA4pNUE09wA8j2tfATIzMOXtrD/qs+4I93sMfPEvqynYik0ArDQ1qq++ 1BVMweftAxRaJaxFlI2m+aWGgkgKBrFw0rfQerEUTAxiBA88Y11sp7sneAdMTLYE8czv WXLBWMSqp/YQEheamiD1O8dOZUqd5OudaMVU2WTZBNjiRfb3rFQ2vrA+NnU13cAWWFHv 9nAetBwAOGkipI8b3fU+xLqNsToOf7eFjf+il4wnqaAygtFAGd6AjfY3rrucHTezaQAT QLfA== X-Forwarded-Encrypted: i=1; AJvYcCX19sVPrUBw56ODwGOcyR5GDKcWk02x7jOymCVQZrSLPK32NBuXVDaxHgdSQkHc8pYzhYM0xwfDFQ==@kvack.org X-Gm-Message-State: AOJu0YzY+KsmWquxb4Oa63nsc+cdoOK6ovVNPjzAGHMf8Oz1zqJeoZir 7ua5lJ+BxJuDPBKkVGUdEDfDg3551ObpW/h+WCMLczQw8RmTlSvnMZvN/Ia9Jj4= X-Google-Smtp-Source: AGHT+IHf4CApQHJTCaTaCg14Ohd9icv9JUFttYZZ8i+nL6y8/wPd1vccYgmdOWC5BimsZ1SOppfnwA== X-Received: by 2002:a05:6a21:3382:b0:1db:e0ad:8ff6 with SMTP id adf61e73a8af0-1dbe0ad944amr16241954637.1.1730963260628; Wed, 06 Nov 2024 23:07:40 -0800 (PST) Received: from [10.84.149.95] ([63.216.146.178]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-72407a17ebcsm746789b3a.138.2024.11.06.23.07.35 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 06 Nov 2024 23:07:40 -0800 (PST) Message-ID: <28a9f8d6-3b8e-44c6-9458-062a7fe2b8e1@bytedance.com> Date: Thu, 7 Nov 2024 15:07:32 +0800 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [RFC PATCH] docs/mm: add VMA locks documentation Content-Language: en-US To: Jann Horn Cc: Lorenzo Stoakes , Jonathan Corbet , Andrew Morton , "Liam R . Howlett" , Vlastimil Babka , Alice Ryhl , Boqun Feng , Matthew Wilcox , Mike Rapoport , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Suren Baghdasaryan , "open list:DOCUMENTATION" References: <20241101185033.131880-1-lorenzo.stoakes@oracle.com> <2bf6329e-eb3b-4c5e-bd3a-b519eefffd63@lucifer.local> From: Qi Zheng In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Stat-Signature: pgpg97hmp318q1mk7hjydjri6urk56jb X-Rspam-User: X-Rspamd-Queue-Id: C8939140011 X-Rspamd-Server: rspam02 X-HE-Tag: 1730963237-608747 X-HE-Meta: U2FsdGVkX19mL466daNp6JOn0dx7HDhrn1H4Kuz9nRsP+z4Y9rv3epWTR1QvExcYlzN+CflNvUOa9HefLMtpLSnkmXTnYHriPHQJEHk2X/UNzjp3vg3oN87SsNRrVDfda0NIOQhM6/QIBfIAXzr6tMeXv146PKjRpWH0MD/XFY4nVBTyvAAtLtAptpzLCa2yz39ek04QIb3j1HeDgVIoVOa488i7J0vEJgcKPFGiaw3Jh5b1TJhGuART74J9VezNlltfdLOufINAPDF90n9khPQEzdn3ANU0ub6objSwJo/w4Wr9GfNoshggdUcFqteZ+g5cmGHCmfKOQXRkh5ANLqrWFrYw7fu48AemUuTGWvxZEaHEJGTTGXz+qf+hMMdUaT0CiKIbRl0qbJE+zwZaSD5xakCft2ykjmA2kRxaZQXTYon+UxUM+1mQzipMEre2Dwz4MIL43soSE3ta+HiQHlAiIFG8Oj2z77Q+ySfMgWs8Mm8Oc7gQYDiMDOzVqff5Gi5x9+Q75bF/R4331EhsBFEb2rXBd6SnoIQSqf5CaY4RJY3e3nP4QmxC2n4FNLzetHjWFudgOIDhi/oLhtEtGB8uK66TKAJ4gLP5HzHeOOFU2rbDGDzkuXrDvy0MF3ZG6GiZQqFhxoPgYzX8voLYBijVEDJNcrohPNocdnVxhr4f+EHSuvRKGKN250bwIinzDUSFrjwYp9Cx7GlAdAO9RHdKFbReIQofZixSYPyum+hrwV/JwpV6K/mxNLQN9vAClkp8thbbNPw3MU4Lvqhl7hkTRHeD9Z3LbNnma6w7Pa/1USJfJxJu9EDQZTEgvND96SujisHzWRFCbbZgGO04jZFHEcplcFcG14dTqTp0l1Y0NuW6Q39KpBln3KAbIz496LYn4jHS5shNOdGP55k5et1q/D0QLsWcJdmiRz/gbUULJiuKaK/ZaShrQDsq7k3SrCo9hNfmz090820dvHE 47yQkOtv 1JlxSi3XuQ6IVJiJp17gzFbXpFKo1/+GE6cwzWpwbqc2GHaduBanpygHkfgsKbJeqLR1uSZ1yo7voYc7Q/B52Qkg/+wf9RBchp3QAbyviC/0W2pG50WAcDNalZDbV1lgyZzTHci7mcrKN9ESnJUTrZ72S7NT20E06FFdS4vjo+BQzZPuXpr4AB/4r/120PLKDUpmzWmzxT5tN3GTesjVWXE0cfubDGM5Lp56TwqTurzc2+SzWt7jZRqtY0Q9V2CiDrYUXfxh0b/NtCcProkiFMiZcEbLbcEEsKE6iNxJGdHRb1/RVLg/DUKH6geT/woYJA60QRx1bSH2Ex2dHRDKnewCjJMWPnvRVdcLvBfTpl3i64baym+Je80FqKLcFyy//2mjXSYmUSWQMX4qp3rpgtQ3QcWwUCwHLYSU4vilu/zURV0SIdsz8RWUrddhaWpc5h99tC3QMHrrBx/E= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 2024/11/7 02:09, Jann Horn wrote: > On Wed, Nov 6, 2024 at 4:09 AM Qi Zheng wrote: >> On 2024/11/5 05:29, Jann Horn wrote: >>> On Mon, Nov 4, 2024 at 5:42 PM Lorenzo Stoakes >> >> [...] >> >>> >>> I think it's important to know about the existence of hardware writes >>> because it means you need atomic operations when making changes to >>> page tables. Like, for example, in many cases when changing a present >>> PTE, you can't even use READ_ONCE()/WRITE_ONCE() for PTEs and need >>> atomic RMW operations instead - see for example ptep_get_and_clear(), >>> which is basically implemented in arch code as an atomic xchg so that >>> it can't miss concurrent A/D bit updates. >>> >> >> Totally agree! But I noticed before that ptep_clear() doesn't seem >> to need atomic operations because it doesn't need to care about the >> A/D bit. >> >> I once looked at the history of how the ptep_clear() was introduced. >> If you are interested, you can take a look at my local draft below. >> Maybe I missed something. >> >> ``` >> mm: pgtable: make ptep_clear() non-atomic >> >> In the generic ptep_get_and_clear() implementation, it is just a simple >> combination of ptep_get() and pte_clear(). But for some architectures >> (such as x86 and arm64, etc), the hardware will modify the A/D bits >> of the >> page table entry, so the ptep_get_and_clear() needs to be overwritten >> and implemented as an atomic operation to avoid contention, which has a >> performance cost. >> >> The commit d283d422c6c4 ("x86: mm: add x86_64 support for page table >> check") adds the ptep_clear() on the x86, and makes it call >> ptep_get_and_clear() when CONFIG_PAGE_TABLE_CHECK is enabled. The page >> table check feature does not actually care about the A/D bits, so only >> ptep_get() + pte_clear() should be called. But considering that the >> page >> table check is a debug option, this should not have much of an impact. >> >> But then the commit de8c8e52836d ("mm: page_table_check: add hooks to >> public helpers") changed ptep_clear() to unconditionally call >> ptep_get_and_clear(), so that the CONFIG_PAGE_TABLE_CHECK check can be >> put into the page table check stubs (in >> include/linux/page_table_check.h). >> This also cause performance loss to the kernel without >> CONFIG_PAGE_TABLE_CHECK enabled, which doesn't make sense. >> >> To fix it, just calling ptep_get() and pte_clear() in the ptep_clear(). >> >> Signed-off-by: Qi Zheng >> >> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h >> index 117b807e3f894..2ace92293f5f5 100644 >> --- a/include/linux/pgtable.h >> +++ b/include/linux/pgtable.h >> @@ -506,7 +506,10 @@ static inline void clear_young_dirty_ptes(struct >> vm_area_struct *vma, >> static inline void ptep_clear(struct mm_struct *mm, unsigned long addr, >> pte_t *ptep) >> { >> - ptep_get_and_clear(mm, addr, ptep); >> + pte_t pte = ptep_get(ptep); >> + >> + pte_clear(mm, addr, ptep); >> + page_table_check_pte_clear(mm, pte); >> } >> >> ``` > > ptep_clear() is currently only used in debug code and in khugepaged > collapse paths, which are fairly expensive, so I don't think the cost > of an extra atomic RMW op should matter here; but yeah, the change > looks correct to me. Thanks for double-checking it! And I agree that an extra atomic RMW op is not a problem in the current call path. But this may be used for other paths in the future. After all, for the present pte entry, we need to call ptep_clear() instead of pte_clear() to ensure that PAGE_TABLE_CHECK works properly. Maybe this is worth sending a formal patch. ;) Thanks!