From: ran xiaokai
To: ryan.roberts@arm.com
Cc: 21cnbao@gmail.com, akpm@linux-foundation.org,
	baolin.wang@linux.alibaba.com, david@redhat.com, ioworker0@gmail.com,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, peterx@redhat.com, ran.xiaokai@zte.com.cn,
	ranxiaokai627@163.com, svetly.todorov@memverge.com, vbabka@suse.cz,
	yang.yang29@zte.com.cn, si.hao@zte.com.cn, wangkefeng.wang@huawei.com,
	willy@infradead.org, ziy@nvidia.com
Subject: Re: [PATCH 2/2] kpageflags: fix wrong KPF_THP on non-pmd-mappable compound pages
Date: Thu, 27 Jun 2024 12:46:13 +0000
Message-Id: <20240627124613.23377-1-ranxiaokai627@163.com>
In-Reply-To: <4e1a1878-4133-4d78-90fa-1d5bc99d179c@arm.com>
References: <4e1a1878-4133-4d78-90fa-1d5bc99d179c@arm.com>

>On 27/06/2024 10:16, Barry Song wrote:
>> On Thu, Jun 27, 2024 at 8:39 PM Ryan Roberts wrote:
>>>
>>> On 27/06/2024 05:10, Barry Song wrote:
>>>> On Thu, Jun 27, 2024 at 2:40 AM Zi Yan wrote:
>>>>>
>>>>> On Wed Jun 26, 2024 at 7:07 AM EDT, Ryan Roberts wrote:
>>>>>> On 26/06/2024 04:06, Zi Yan wrote:
>>>>>>> On Tue Jun 25, 2024 at 10:49 PM EDT, ran xiaokai wrote:
>>>>>>>> From: Ran Xiaokai
>>>>>>>>
>>>>>>>> KPF_COMPOUND_HEAD and KPF_COMPOUND_TAIL are set on "common" compound
>>>>>>>> pages, which means of any order, but KPF_THP should only be set
>>>>>>>> when the folio is a 2M pmd mappable THP.
>>>>>>
>>>>>> Why should KPF_THP only be set on 2M THP? What problem does it cause as it is
>>>>>> currently configured?
>>>>>>
>>>>>> I would argue that mTHP is still THP so should still have the flag. And since
>>>>>> these smaller mTHP sizes are disabled by default, only mTHP-aware user space
>>>>>> will be enabling them, so I'll naively state that it should not cause compat
>>>>>> issues as is.
>>>>>>
>>>>>> Also, the script at tools/mm/thpmaps relies on KPF_THP being set for all mTHP
>>>>>> sizes to function correctly. So that would need to be reworked if making this
>>>>>> change.
>>>>>
>>>>> + more folks working on mTHP
>>>>>
>>>>> I agree that mTHP is still THP, but we might want different
>>>>> stats/counters for it, since people might want to keep the old THP counters
>>>>> consistent. See recent commits on adding mTHP counters:
>>>>> ec33687c6749 ("mm: add per-order mTHP anon_fault_alloc and anon_fault_fallback
>>>>> counters"), 1f97fd042f38 ("mm: shmem: add mTHP counters for anonymous shmem")
>>>>>
>>>>> and changes to make THP counter to only count PMD THP:
>>>>> 835c3a25aa37 ("mm: huge_memory: add the missing folio_test_pmd_mappable() for
>>>>> THP split statistics")
>>>>>
>>>>> In this case, I wonder if we want a new KPF_MTHP bit for mTHP and some
>>>>> adjustment on tools/mm/thpmaps.
>>>>
>>>> It seems we have to do this though I think keeping KPF_THP and adding a
>>>> separate bit like KPF_PMD_MAPPED makes more sense. but those tools
>>>> relying on KPF_THP need to realize this and check the new bit, which is
>>>> not done now.
>>>> whether the mTHP's name is mTHP or THP will make no difference for
>>>> this case:-)
>>>
>>> I don't quite follow your logic for that last part; If there are 2 separate
>>> bits; KPF_THP and KPF_MTHP, and KPF_THP is only set for PMD-sized THP, that
>>> would be a safe/compatible approach, right? Whereas your suggestion requires
>>> changes to existing tools to work.
>>
>> Right, my point is that mTHP and THP are both types of THP. The only difference
>> is whether they are PMD-mapped or PTE-mapped. Adding a bit to describe how
>> the page is mapped would more accurately reflect reality. However, this change
>> would disrupt tools that assume KPF_THP always means PMD-mapped THP.
>> Therefore, we would still need separate bits for THP and mTHP in this case.
>
>I think perhaps PTE- vs PMD-mapped is a separate issue. The issue at hand is
>whether KPF_THP implies a fixed size (and alignment). If compat is an issue,
>then KPF_THP must continue to imply PMD-size. If compat is not an issue, then
>size can be determined by iterating over the entries.
>
>Having a mechanism to determine the level at which a block is mapped would
>potentially be a useful feature, but seems orthogonal to me.
>
>>
>> I saw Willy complain about mTHP being called "mTHP," but in this case, calling
>> it "mTHP" or just "THP" doesn't change anything if old tools continue to assume
>> that KPF_THP means PMD-mapped THP.
>
>I think Willy was just ribbing me because he preferred calling it "anonymous
>large folios". That's how I took it anyway.
>
>>
>>>
>>> Thinking about this a bit more, I wonder if KPF_MTHP is the right name for a new
>>> flag; We don't currently expose the term "mTHP" to user space. I can't think of
>>> a better name though.
>>
>> Yes. If "compatibility" is a requirement, we cannot disregard it.
>>
>>> I'd still like to understand what is actually broken that this change is fixing.
>>> Is the concern that a user could see KPF_THP and advance forward by
>>> "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size / getpagesize()" entries?
>>>
>>
>> Maybe we need an example which is thinking that KPF_THP is PMD-mapped.
>
>Yes, that would help.

For now, the example is the test case in
tools/testing/selftests/mm/split_huge_page_test: if we try to split a THP
to any order other than 0, the test case breaks.

Maybe we can use KPF_COMPOUND_HEAD and KPF_COMPOUND_TAIL to work out a
compound page's start, end, and order. But these two flags are not set
only on userspace memory.
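
To make the HEAD/TAIL idea concrete, here is a rough userspace sketch, not
a proposal for the selftest itself. It takes the 64-bit kpageflags words for
a range of consecutive PFNs and reports each compound page as (start index,
number of pages, order). The bit numbers are the ones in
include/uapi/linux/kernel-page-flags.h; the function name compound_runs() and
the require_thp switch are made up for illustration. Because HEAD/TAIL are
also set on non-userspace compound pages, require_thp additionally checks
KPF_THP on the head entry, which assumes KPF_THP is still reported for every
THP size as it is today.

```python
# Bit positions from include/uapi/linux/kernel-page-flags.h
KPF_COMPOUND_HEAD = 15
KPF_COMPOUND_TAIL = 16
KPF_THP = 22


def compound_runs(flags, require_thp=False):
    """Return a list of (start, nr_pages, order) for compound pages.

    `flags` holds one 64-bit kpageflags word per consecutive PFN.
    A run is one HEAD entry followed by contiguous TAIL entries.
    If the window cuts a compound page short, nr_pages is not a power
    of two and `order` is rounded down.
    """
    runs = []
    i = 0
    n = len(flags)
    while i < n:
        is_head = (flags[i] >> KPF_COMPOUND_HEAD) & 1
        is_thp = (flags[i] >> KPF_THP) & 1
        if is_head and (is_thp or not require_thp):
            j = i + 1
            # Consume the tail pages that belong to this head.
            while j < n and (flags[j] >> KPF_COMPOUND_TAIL) & 1:
                j += 1
            nr_pages = j - i
            order = nr_pages.bit_length() - 1  # floor(log2(nr_pages))
            runs.append((i, nr_pages, order))
            i = j
        else:
            i += 1
    return runs


# Synthetic example: one order-2 THP and one order-1 non-THP compound page.
HEAD = 1 << KPF_COMPOUND_HEAD
TAIL = 1 << KPF_COMPOUND_TAIL
THP = 1 << KPF_THP
sample = [0, HEAD | THP, TAIL | THP, TAIL | THP, TAIL | THP, 0, HEAD, TAIL]
print(compound_runs(sample))
print(compound_runs(sample, require_thp=True))
```

With real data the words would come from /proc/kpageflags, which is indexed
by PFN and needs root to read; that part is left out here so the sketch stays
self-contained.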