From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <2450e4f8-236f-49ce-8bd3-b30a6d8c5e57@arm.com>
Date: Tue, 2 Jul 2024 09:24:31 +0100
Subject: Re: [PATCH] support "THPeligible" semantics for mTHP with anonymous shmem
To: Yang Shi, David Hildenbrand
Cc: Baolin Wang, Bang Li, hughd@google.com, akpm@linux-foundation.org,
 wangkefeng.wang@huawei.com, ziy@nvidia.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
References: <20240628104926.34209-1-libang.li@antgroup.com>
 <4b38db15-0716-4ffb-a38b-bd6250eb93da@arm.com>
 <4d54880e-03f4-460a-94b9-e21b8ad13119@linux.alibaba.com>
 <516aa6b3-617c-4642-b12b-0c5f5b33d1c9@arm.com>
 <597ac51e-3f27-4606-8647-395bb4e60df4@redhat.com>
 <6f68fb9d-3039-4e38-bc08-44948a1dae4d@arm.com>
 <992cdbf9-80df-4a91-aea6-f16789c5afd7@redhat.com>
 <2e0a1554-d24f-4d0d-860b-0c2cf05eb8da@arm.com>
 <06c74db8-4d10-4a41-9a05-776f8dca7189@redhat.com>
 <429f2873-8532-4cc8-b0e1-1c3de9f224d9@arm.com>
 <7a0bbe69-1e3d-4263-b206-da007791a5c4@redhat.com>
From: Ryan Roberts

On 01/07/2024 19:20, Yang Shi wrote:
> On Mon, Jul 1, 2024 at 3:23 AM David Hildenbrand wrote:
>>
>> On 01.07.24 12:16, Ryan Roberts wrote:
>>> On 01/07/2024 10:17, David Hildenbrand wrote:
>>>> On 01.07.24 11:14, Ryan Roberts wrote:
>>>>> On 01/07/2024 09:57, David Hildenbrand wrote:
>>>>>> On 01.07.24 10:50, Ryan Roberts wrote:
>>>>>>> On 01/07/2024 09:48, David Hildenbrand wrote:
>>>>>>>> On 01.07.24 10:40, Ryan Roberts wrote:
>>>>>>>>> On 01/07/2024 09:33, Baolin Wang wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2024/7/1 15:55, Ryan Roberts wrote:
>>>>>>>>>>> On 28/06/2024 11:49, Bang Li wrote:
>>>>>>>>>>>> After the commit 7fb1b252afb5 ("mm: shmem: add mTHP support for
>>>>>>>>>>>> anonymous shmem"), we can configure different policies through
>>>>>>>>>>>> the multi-size THP sysfs interface for anonymous shmem. But
>>>>>>>>>>>> currently "THPeligible" indicates only whether the mapping is
>>>>>>>>>>>> eligible for allocating THP-pages as well as the THP is PMD
>>>>>>>>>>>> mappable or not for anonymous shmem, we need to support semantics
>>>>>>>>>>>> for mTHP with anonymous shmem similar to those for mTHP with
>>>>>>>>>>>> anonymous memory.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Bang Li
>>>>>>>>>>>> ---
>>>>>>>>>>>>  fs/proc/task_mmu.c | 10 +++++++---
>>>>>>>>>>>>  include/linux/huge_mm.h | 11 +++++++++++
>>>>>>>>>>>>  mm/shmem.c | 9 +--------
>>>>>>>>>>>>  3 files changed, 19 insertions(+), 11 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>>>>>>>>>>>> index 93fb2c61b154..09b5db356886 100644
>>>>>>>>>>>> --- a/fs/proc/task_mmu.c
>>>>>>>>>>>> +++ b/fs/proc/task_mmu.c
>>>>>>>>>>>> @@ -870,6 +870,7 @@ static int show_smap(struct seq_file *m, void *v)
>>>>>>>>>>>>  {
>>>>>>>>>>>>  struct vm_area_struct *vma = v;
>>>>>>>>>>>>  struct mem_size_stats mss = {};
>>>>>>>>>>>> + bool thp_eligible;
>>>>>>>>>>>>  smap_gather_stats(vma, &mss, 0);
>>>>>>>>>>>> @@ -882,9 +883,12 @@ static int show_smap(struct seq_file *m, void *v)
>>>>>>>>>>>>  __show_smap(m, &mss, false);
>>>>>>>>>>>> - seq_printf(m, "THPeligible: %8u\n",
>>>>>>>>>>>> - !!thp_vma_allowable_orders(vma, vma->vm_flags,
>>>>>>>>>>>> - TVA_SMAPS | TVA_ENFORCE_SYSFS, THP_ORDERS_ALL));
>>>>>>>>>>>> + thp_eligible = !!thp_vma_allowable_orders(vma, vma->vm_flags,
>>>>>>>>>>>> + TVA_SMAPS | TVA_ENFORCE_SYSFS, THP_ORDERS_ALL);
>>>>>>>>>>>> + if (vma_is_anon_shmem(vma))
>>>>>>>>>>>> + thp_eligible = !!shmem_allowable_huge_orders(file_inode(vma->vm_file),
>>>>>>>>>>>> + vma, vma->vm_pgoff, thp_eligible);
>>>>>>>>>>>
>>>>>>>>>>> Afraid I haven't been following the shmem mTHP support work as much as I
>>>>>>>>>>> would have liked, but is there a reason why we need a separate function
>>>>>>>>>>> for shmem?
>>>>>>>>>>
>>>>>>>>>> Since shmem_allowable_huge_orders() only uses shmem specific logic to
>>>>>>>>>> determine if huge orders are allowable, there is no need to complicate the
>>>>>>>>>> thp_vma_allowable_orders() function by adding more shmem related logic,
>>>>>>>>>> making it more bloated. In my view, providing a dedicated helper
>>>>>>>>>> shmem_allowable_huge_orders(), specifically for shmem, simplifies the logic.
>>>>>>>>>
>>>>>>>>> My point was really that a single interface (thp_vma_allowable_orders)
>>>>>>>>> should be used to get this information. I have no strong opinion on how the
>>>>>>>>> implementation of that interface looks. What you suggest below seems
>>>>>>>>> perfectly reasonable to me.
>>>>>>>>
>>>>>>>> Right. thp_vma_allowable_orders() might require some care as discussed in
>>>>>>>> other context (cleanly separate dax and shmem handling/orders). But that
>>>>>>>> would be follow-up cleanups.
>>>>>>>
>>>>>>> Are you planning to do that, or do you want me to send a patch?
>>>>>>
>>>>>> I'm planning on looking into some details, especially the interaction with large
>>>>>> folios in the pagecache. I'll let you know once I have a better idea what
>>>>>> actually should be done :)
>>>>>
>>>>> OK great - I'll scrub it from my todo list... really getting things done today :)
>>>>
>>>> Resolved the khugepaged thingy already? :P
>>>>
>>>> [khugepaged not active when only enabling the sub-size via the 2M folder IIRC]
>>>
>>> Hmm... baby brain?
>>
>> :)
>>
>> I think I only mentioned it in a private mail at some point.
>>
>>>
>>> Sorry about that. I've been a bit useless lately. For some reason it wasn't on
>>> my list, but it's there now. Will prioritise it, because I agree it's not good.
>>
>>
>> IIRC, if you do
>>
>> echo never > /sys/kernel/mm/transparent_hugepage/enabled
>> echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
>>
>> khugepaged will not get activated.
>
> khugepaged is controlled by the top level knob.

What do you mean by "top level knob"? I assume
/sys/kernel/mm/transparent_hugepage/enabled?

If so, that's not really a thing in its own right; it's just the legacy PMD-size
THP control, and we only take any notice of it if a per-size control is set to
"inherit". So if we have:

# echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled

Then by design, /sys/kernel/mm/transparent_hugepage/enabled should be ignored.

> But the above setting sounds confusing, can we disable the top level knob,
> but enable it on a per-order basis? TBH, it sounds weird and doesn't make
> too much sense to me.

Well that's the design and that's how it's documented. It's done this way for
back-compat. All controls are now per-size. But at boot, we default all per-size
controls to "never" except for the PMD-sized control, which is defaulted to
"inherit". That way, an unenlightened user-space can still control PMD-sized THP
via the legacy (top-level) control. But enlightened apps can directly control
per-size.

I'm not sure how your way would work, because you would have 2 controls
competing to do the same thing?

>
>>
>> --
>> Cheers,
>>
>> David / dhildenb
>>
>>
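
For illustration only, the "inherit" rule described in the reply above can be
written down as a toy Python model. This is a sketch under stated assumptions,
not kernel code: boot_defaults(), effective_policy() and the two-entry size
list are hypothetical names invented for the example, and treating 2048kB as
the PMD size assumes 4K base pages.

```python
# Toy model of the per-size THP "enabled" controls described in the mail.
# Not kernel code: all names and the size list are illustrative assumptions.

PMD_SIZE = "hugepages-2048kB"  # assumption: 4K base pages, so PMD size is 2M


def boot_defaults(sizes):
    """Per-size controls at boot: "never", except the PMD size is "inherit"."""
    return {size: ("inherit" if size == PMD_SIZE else "never") for size in sizes}


def effective_policy(per_size_value, top_level_value):
    """The top-level control only matters when the per-size one is "inherit"."""
    if per_size_value == "inherit":
        return top_level_value
    return per_size_value


sizes = ["hugepages-64kB", PMD_SIZE]
per_size = boot_defaults(sizes)

# Unenlightened user space: only writes the legacy top-level control.
top_level = "always"
legacy_view = {s: effective_policy(v, top_level) for s, v in per_size.items()}

# The case from this thread: top-level "never", 2M control set to "always".
top_level = "never"
per_size[PMD_SIZE] = "always"
thread_view = {s: effective_policy(v, top_level) for s, v in per_size.items()}

print(legacy_view)
print(thread_view)
```

In both cases the model reports the 2M size as "always": in the first through
inheritance from the top-level control, in the second because the explicit
per-size value takes precedence and the top-level "never" is ignored.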