From: Ryan Roberts <ryan.roberts@arm.com>
To: David Hildenbrand <david@redhat.com>,
Lance Yang <ioworker0@gmail.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>, Jonathan Corbet <corbet@lwn.net>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Barry Song <baohua@kernel.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 2/2] mm: mTHP stats for pagecache folio allocations
Date: Wed, 17 Jul 2024 10:50:57 +0100 [thread overview]
Message-ID: <dca879ca-e430-4759-b992-4f763344319c@arm.com> (raw)
In-Reply-To: <5472faf5-1fbe-4a89-a17e-83716fc00b5a@redhat.com>
On 17/07/2024 09:44, David Hildenbrand wrote:
>>>> I guess the real supported orders are:
>>>>
>>>> anon:
>>>> min order: 2
>>>> max order: PMD_ORDER
>>>> anon-shmem:
>>>> min order: 1
>>>> max order: MAX_PAGECACHE_ORDER
>>>> tmpfs-shmem:
>>>> min order: PMD_ORDER <= 11 ? PMD_ORDER : NONE
>>>> max order: PMD_ORDER <= 11 ? PMD_ORDER : NONE
>>>> file:
>>>> min order: 1
>>>> max order: MAX_PAGECACHE_ORDER
>>>
>>> That's my understanding. But not sure about anon-shmem really supporting
>>> order-1, maybe we do.
>>
>> Oh, I thought we only had the restriction for anon folios now (due to deferred
>> split queue), so assumed it would just work. With Gavin's
>> THP_ORDERS_ALL_FILE_DEFAULT change, that certainly implies that shmem must
>> support order-1. If it doesn't then we we might want to tidy that further.
>>
>> Baolin, perhaps you can confirm either way?
>
> Currently there would not have been a way to enable it, right? (maybe I'm wrong)
__thp_vma_allowable_orders() doesn't do anything special for shmem if TVA_IN_PF
is set, so I guess it could concievably return order-1 in that path. Not sure if
it ever gets called that way for shmem though - I don't think so. But agree that
shmem_allowable_huge_orders() will not currently return order-1.
>
>>
>>>
>>>>
>>>> But today, controls and stats are exposed for:
>>>>
>>>> anon:
>>>> min order: 2
>>>> max order: PMD_ORDER
>>>> anon-shmem:
>>>> min order: 2
>>>> max order: PMD_ORDER
>>>> tmpfs-shmem:
>>>> min order: PMD_ORDER
>>>> max order: PMD_ORDER
>>>> file:
>>>> min order: Nothing yet (this patch proposes 1)
>>>> max order: Nothing yet (this patch proposes MAX_PAGECACHE_ORDER)
>>>>
>>>> So I think there is definitely a bug for shmem where the minimum order control
>>>> should be order-1 but its currently order-2.
>>>
>>> Maybe, did not play with that yet. Likely order-1 will work. (although probably
>>> of questionable use :) )
>>
>> You might have to expand on why its of "questionable use". I'd assume it has the
>> same amount of value as using order-1 for regular page cache pages? i.e. half
>> the number of objects to manage for the same amount of memory.
>
> order-1 was recently added for the pagecache to get some device setups running
> (IIRC, where we cannot use order-0, because device blocksize > PAGE_SIZE).
>
> You might be right about "half the number of objects", but likely just going for
> order-2, order-3, order-4 ... for shmem might be even better. And simply falling
> back to order-0 when you cannot get the larger orders.
Sure, but then you're into the territory of baking in policy. Remember that
originally I was only interested in 64K but the concensus was to expose all the
sizes. Same argument applies to 8K; expose it and let others decide policy.
>
> I could have sworn you mentioned something like that in your "configurable
> orders for pagecache" RFC that I only briefly skimmed so far :P
I'm exposing the 8K control for pagecache in that series.
>
> ... only enabling "order-1" and none of the other orders for shmem sounds rather
> "interesting".
>
> But yeah, maybe there is valid use for it, so I'm all for allowing it if it can
> be done.
>
>>
>>>
>>>>
>>>> I also wonder about PUD-order for DAX? We don't currently have a stat/control.
>>>> If we wanted to add it in future, if we take the "expose all stats/controls for
>>>> all orders" approach, we would end up extending all the way to PUD-order and
>>>> all
>>>> the orders between PMD and PUD would be dummy for all memory types. That really
>>>> starts to feel odd, so I still favour only populating what's really supported.
>>>
>>> I would go further and say that calling the fsdax thing a THP is borderline
>>> wrong and we should not expose any new toggles for it that way.
>>>
>>> It really behaves much more like hugetlb folios that can be PTE-mapped ... we
>>> cannot split these things, and they are not allocated from the buddy. So I
>>> wouldn't worry about fsdax for now.
>>>
>>> fsdax support for compound pages (now large folios) probably never should have
>>> been glued to any THP toggle.
>>
>> Yeah fair enough. I wasn't really arguing for adding any dax controls; I was
>> just trying to think of examples as to why adding dummy controls might be a bad
>> idea.
>
> Yes.
>
>>>
>>>>
>>>> I propose to fix shmem (extend down to 1, stop at MAX_PAGECACHE_ORDER) and
>>>> continue with the approach of "indicating only what really exists" for v2.
>>>>
>>>> Shout if you disagree.
>>>
>>> Makes sense.
>>
>> Excellent. I posted v2, which has these changes, yesterday afternoon. :)
>
> Yes, still digging through mails ... in-between having roughly 1000 meetings a
> day :)
No problem. You're in-demand. I can wait. :)
next prev parent reply other threads:[~2024-07-17 9:51 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-11 7:29 [PATCH v1 0/2] mTHP allocation stats for file-backed memory Ryan Roberts
2024-07-11 7:29 ` [PATCH v1 1/2] mm: Cleanup count_mthp_stat() definition Ryan Roberts
2024-07-11 8:20 ` Barry Song
2024-07-12 2:31 ` Baolin Wang
2024-07-12 11:57 ` Lance Yang
2024-07-11 7:29 ` [PATCH v1 2/2] mm: mTHP stats for pagecache folio allocations Ryan Roberts
2024-07-12 3:00 ` Baolin Wang
2024-07-12 12:22 ` Lance Yang
2024-07-13 1:08 ` David Hildenbrand
2024-07-13 10:45 ` Ryan Roberts
2024-07-16 8:31 ` Ryan Roberts
2024-07-16 10:19 ` David Hildenbrand
2024-07-16 11:14 ` Ryan Roberts
2024-07-17 8:02 ` David Hildenbrand
2024-07-17 8:29 ` Ryan Roberts
2024-07-17 8:44 ` David Hildenbrand
2024-07-17 9:50 ` Ryan Roberts [this message]
2024-07-17 10:03 ` David Hildenbrand
2024-07-17 10:18 ` Ryan Roberts
2024-07-17 10:25 ` David Hildenbrand
2024-07-17 10:48 ` Ryan Roberts
2024-07-13 11:00 ` Ryan Roberts
2024-07-13 12:54 ` Baolin Wang
2024-07-14 9:05 ` Ryan Roberts
2024-07-22 3:52 ` Baolin Wang
2024-07-22 7:36 ` Ryan Roberts
2024-07-12 22:44 ` kernel test robot
2024-07-15 13:55 ` Ryan Roberts
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=dca879ca-e430-4759-b992-4f763344319c@arm.com \
--to=ryan.roberts@arm.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=corbet@lwn.net \
--cc=david@redhat.com \
--cc=hughd@google.com \
--cc=ioworker0@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox