From: Kefeng Wang <wangkefeng.wang@huawei.com>
Date: Mon, 28 Oct 2024 22:22:07 +0800
Subject: Re: [PATCH v2 1/2] mm: use aligned address in clear_gigantic_page()
To: David Hildenbrand, Andrew Morton
Cc: Matthew Wilcox, Muchun Song, "Huang, Ying", linux-mm@kvack.org
Message-ID: <9b06805b-4f4f-4b37-861f-681e3ab9d470@huawei.com>
References: <20241026054307.3896926-1-wangkefeng.wang@huawei.com>
 <54f5f3ee-8442-4c49-ab4e-c46e8db73576@huawei.com>
 <4219a788-52ad-4d80-82e6-35a64c980d50@redhat.com>
 <127d4a00-29cc-4b45-aa96-eea4e0adaed2@huawei.com>
Content-Type: text/plain; charset="UTF-8"; format=flowed
On 2024/10/28 21:46, David Hildenbrand wrote:
> On 28.10.24 14:33, Kefeng Wang wrote:
>>
>> On 2024/10/28 21:14, David Hildenbrand wrote:
>>> On 28.10.24 13:52, Kefeng Wang wrote:
>>>>
>>>> On 2024/10/28 18:00, David Hildenbrand wrote:
>>>>> On 26.10.24
>>>>> 07:43, Kefeng Wang wrote:
>>>>>> When clearing a gigantic page, it zeroes the page from the first
>>>>>> page to the last page. If directly passing addr_hint, which may not
>>>>>> be the address of the first page of the folio, some archs could
>>>>>> flush the wrong cache if they do use addr_hint as a hint. For
>>>>>> non-gigantic pages, the base address is calculated inside, so even
>>>>>> if passed the wrong addr_hint there is only a performance impact
>>>>>> (process_huge_page() wants to process the target page last to keep
>>>>>> its cache lines hot), no functional impact.
>>>>>>
>>>>>> Let's pass the real accessed address to folio_zero_user() and use
>>>>>> the aligned address in clear_gigantic_page() to fix it.
>>>>>>
>>>>>> Fixes: 78fefd04c123 ("mm: memory: convert clear_huge_page() to
>>>>>> folio_zero_user()")
>>>>>> Signed-off-by: Kefeng Wang
>>>>>> ---
>>>>>> v2:
>>>>>> - update changelog to clarify the impact, per Andrew
>>>>>>
>>>>>>  fs/hugetlbfs/inode.c | 2 +-
>>>>>>  mm/memory.c          | 1 +
>>>>>>  2 files changed, 2 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
>>>>>> index a4441fb77f7c..a5ea006f403e 100644
>>>>>> --- a/fs/hugetlbfs/inode.c
>>>>>> +++ b/fs/hugetlbfs/inode.c
>>>>>> @@ -825,7 +825,7 @@ static long hugetlbfs_fallocate(struct file *file,
>>>>>> int mode, loff_t offset,
>>>>>>  			error = PTR_ERR(folio);
>>>>>>  			goto out;
>>>>>>  		}
>>>>>> -		folio_zero_user(folio, ALIGN_DOWN(addr, hpage_size));
>>>>>> +		folio_zero_user(folio, addr);
>>>>>>  		__folio_mark_uptodate(folio);
>>>>>>  		error = hugetlb_add_to_page_cache(folio, mapping, index);
>>>>>>  		if (unlikely(error)) {
>>>>>> diff --git a/mm/memory.c b/mm/memory.c
>>>>>> index 75c2dfd04f72..ef47b7ea5ddd 100644
>>>>>> --- a/mm/memory.c
>>>>>> +++ b/mm/memory.c
>>>>>> @@ -6821,6 +6821,7 @@ static void clear_gigantic_page(struct folio
>>>>>> *folio, unsigned long addr,
>>>>>>  	int i;
>>>>>>  	might_sleep();
>>>>>> +	addr = ALIGN_DOWN(addr, folio_size(folio));
>>>>>
>>>>> Right, that's what's effectively done in a very bad way in
>>>>> process_huge_page():
>>>>>
>>>>> unsigned long addr = addr_hint &
>>>>> 		~(((unsigned long)nr_pages << PAGE_SHIFT) - 1);
>>>>>
>>>>> That should all be cleaned up ... process_huge_page() likely shouldn't
>>>>
>>>> Yes, let's fix the bug first,
>>>>
>>>>> be even consuming "nr_pages".
>>>>
>>>> Not sure about this part; it uses nr_pages as the end and calculates
>>>> the 'base'.
>>>
>>> It should be using folio_nr_pages().
>>
>> But process_huge_page() has no explicit folio argument. I'd like to
>> move the aligned-address calculation into folio_zero_user() and
>> copy_user_large_folio() (which I will rename to folio_copy_user()) in
>> the following cleanup patches, or should I do it in the fix patches?
> 
> First, why does folio_zero_user() call process_huge_page() for *a small
> folio*? Because we like our code to be extra complicated to understand?
> Or am I missing something important?

folio_zero_user() was used only for PMD-sized THP and HugeTLB before;
since anon mTHP support, it is used for order-2 up to PMD-order THP and
HugeTLB, so it won't process a small folio, if I understand correctly.

> 
> Second, we should be passing the folio to "process_huge_page" and likely
> rename it to "folio_process_pages()" or sth like that. The function even
> documents "of the specified huge page", but there is none specified. The
> copy case might require a rework.
> 
> I think this code needs a serious cleanup ...

Yes, I'd like to do more cleanup and rework them. I also found a
performance issue on my arm64 machine [1] which needs to be addressed.

[1] https://lore.kernel.org/linux-mm/2524689c-08f5-446c-8cb9-924f9db0ee3a@huawei.com/