Date: Wed, 6 Nov 2024 11:09:47 +0800
Subject: Re: [RFC PATCH] docs/mm: add VMA locks documentation
To: Jann Horn
Cc: Lorenzo Stoakes, Jonathan Corbet, Andrew Morton, "Liam R. Howlett",
 Vlastimil Babka, Alice Ryhl, Boqun Feng, Matthew Wilcox, Mike Rapoport,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Suren Baghdasaryan,
 "open list:DOCUMENTATION"
References: <20241101185033.131880-1-lorenzo.stoakes@oracle.com>
 <2bf6329e-eb3b-4c5e-bd3a-b519eefffd63@lucifer.local>
From: Qi Zheng

Hi Jann,

On 2024/11/5 05:29, Jann Horn wrote:
> On Mon, Nov 4, 2024 at 5:42 PM Lorenzo Stoakes

[...]

>
> I think it's important to know about the existence of hardware writes
> because it means you need atomic operations when making changes to
> page tables. Like, for example, in many cases when changing a present
> PTE, you can't even use READ_ONCE()/WRITE_ONCE() for PTEs and need
> atomic RMW operations instead - see for example ptep_get_and_clear(),
> which is basically implemented in arch code as an atomic xchg so that
> it can't miss concurrent A/D bit updates.
>

Totally agree!

But I noticed before that ptep_clear() doesn't seem to need atomic
operations, because it doesn't need to care about the A/D bits.

I once looked at the history of how ptep_clear() was introduced. If you
are interested, you can take a look at my local draft below. Maybe I
missed something.

```
mm: pgtable: make ptep_clear() non-atomic

The generic ptep_get_and_clear() implementation is just a simple
combination of ptep_get() and pte_clear(). But on some architectures
(such as x86 and arm64), the hardware modifies the A/D bits of the page
table entry, so ptep_get_and_clear() needs to be overridden and
implemented as an atomic operation so that concurrent A/D bit updates
are not lost, which has a performance cost.

The commit d283d422c6c4 ("x86: mm: add x86_64 support for page table
check") adds ptep_clear() on x86 and makes it call ptep_get_and_clear()
when CONFIG_PAGE_TABLE_CHECK is enabled. The page table check feature
does not actually care about the A/D bits, so only ptep_get() +
pte_clear() should be called. But considering that the page table check
is a debug option, this should not have much of an impact.

But then the commit de8c8e52836d ("mm: page_table_check: add hooks to
public helpers") changed ptep_clear() to unconditionally call
ptep_get_and_clear(), so that the CONFIG_PAGE_TABLE_CHECK check can be
put into the page table check stubs (in
include/linux/page_table_check.h). This also causes a performance loss
on kernels without CONFIG_PAGE_TABLE_CHECK enabled, which doesn't make
sense.

To fix it, just call ptep_get() and pte_clear() in ptep_clear().

Signed-off-by: Qi Zheng

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 117b807e3f894..2ace92293f5f5 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -506,7 +506,10 @@ static inline void clear_young_dirty_ptes(struct vm_area_struct *vma,
 static inline void ptep_clear(struct mm_struct *mm, unsigned long addr,
 			      pte_t *ptep)
 {
-	ptep_get_and_clear(mm, addr, ptep);
+	pte_t pte = ptep_get(ptep);
+
+	pte_clear(mm, addr, ptep);
+	page_table_check_pte_clear(mm, pte);
 }
```

Thanks!
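
For anyone who wants to see the distinction concretely, here is a rough
user-space model using C11 atomics. It is only a sketch and not kernel
code: all toy_* names and the bit layout are invented for illustration,
and the "between" hook is an assumption that stands in for a hardware
A/D write landing between the read and the clear.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Toy PTE bit layout, invented for this sketch (not a real architecture). */
#define TOY_PTE_PRESENT (UINT64_C(1) << 0)
#define TOY_PTE_DIRTY   (UINT64_C(1) << 6)

typedef _Atomic(uint64_t) toy_pte_t;

/* Stand-in for the MMU setting the dirty bit behind the kernel's back. */
static void toy_hw_set_dirty(toy_pte_t *ptep)
{
	atomic_fetch_or(ptep, TOY_PTE_DIRTY);
}

/*
 * Models an arch ptep_get_and_clear() done as one atomic exchange:
 * the returned value includes every bit that was set before the clear.
 */
static uint64_t toy_get_and_clear_atomic(toy_pte_t *ptep)
{
	assert(ptep != NULL);
	return atomic_exchange(ptep, 0);
}

/*
 * Models ptep_get() followed by pte_clear(): two separate steps.
 * "between" runs in the window between them to simulate a hardware write.
 */
static uint64_t toy_get_then_clear(toy_pte_t *ptep,
				   void (*between)(toy_pte_t *))
{
	assert(ptep != NULL);
	uint64_t old = atomic_load_explicit(ptep, memory_order_relaxed);

	if (between)
		between(ptep);
	atomic_store_explicit(ptep, 0, memory_order_relaxed);
	return old;
}

/* Returns 1 if the two-step variant clears the entry but misses the D bit. */
int toy_two_step_misses_dirty(void)
{
	toy_pte_t pte;

	atomic_init(&pte, TOY_PTE_PRESENT);
	uint64_t old = toy_get_then_clear(&pte, toy_hw_set_dirty);

	return !(old & TOY_PTE_DIRTY) && atomic_load(&pte) == 0;
}

/* Returns 1 if the atomic variant clears the entry and reports the D bit. */
int toy_atomic_sees_dirty(void)
{
	toy_pte_t pte;

	atomic_init(&pte, TOY_PTE_PRESENT);
	toy_hw_set_dirty(&pte);	/* hardware write lands before the xchg */
	uint64_t old = toy_get_and_clear_atomic(&pte);

	return (old & TOY_PTE_DIRTY) != 0 && atomic_load(&pte) == 0;
}
```

Both helpers return 1 in this model. The two-step variant loses the
dirty bit only in the value it returns, and ptep_clear() does not use
that value for anything that depends on the A/D bits, which is the
argument the draft above relies on.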