From: Ryan Roberts <ryan.roberts@arm.com>
To: David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Matthew Wilcox <willy@infradead.org>,
Yin Fengwei <fengwei.yin@intel.com>, Yu Zhao <yuzhao@google.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Yang Shi <shy828301@gmail.com>,
"Huang, Ying" <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
Luis Chamberlain <mcgrof@kernel.org>,
Itaru Kitayama <itaru.kitayama@gmail.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
John Hubbard <jhubbard@nvidia.com>,
David Rientjes <rientjes@google.com>,
Vlastimil Babka <vbabka@suse.cz>, Hugh Dickins <hughd@google.com>,
Kefeng Wang <wangkefeng.wang@huawei.com>,
Barry Song <21cnbao@gmail.com>,
Alistair Popple <apopple@nvidia.com>
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, Barry Song <v-songbaohua@oppo.com>
Subject: Re: [PATCH v9 03/10] mm: thp: Introduce multi-size THP sysfs interface
Date: Tue, 12 Dec 2023 15:32:29 +0000
Message-ID: <e424982c-8a2f-4c98-83aa-fdb0ee765776@arm.com>
In-Reply-To: <ff7a3e9c-53cb-4283-9298-781d4fb7c7f8@redhat.com>
On 12/12/2023 14:54, David Hildenbrand wrote:
> On 07.12.23 17:12, Ryan Roberts wrote:
>> In preparation for adding support for anonymous multi-size THP,
>> introduce new sysfs structure that will be used to control the new
>> behaviours. A new directory is added under transparent_hugepage for each
>> supported THP size, and contains an `enabled` file, which can be set to
>> "inherit" (to inherit the global setting), "always", "madvise" or
>> "never". For now, the kernel still only supports PMD-sized anonymous
>> THP, so only 1 directory is populated.
>>
>> The first half of the change converts transhuge_vma_suitable() and
>> hugepage_vma_check() so that they take a bitfield of orders for which
>> the user wants to determine support, and the functions filter out all
>> the orders that can't be supported, given the current sysfs
>> configuration and the VMA dimensions. The resulting functions are
>> renamed to thp_vma_suitable_orders() and thp_vma_allowable_orders()
>> respectively. Convenience functions that take a single, unencoded order
>> and return a boolean are also defined as thp_vma_suitable_order() and
>> thp_vma_allowable_order().
>>
>> The second half of the change implements the new sysfs interface. It has
>> been done so that each supported THP size has a `struct thpsize`, which
>> describes the relevant metadata and is itself a kobject. This is pretty
>> minimal for now, but should make it easy to add new per-thpsize files to
>> the interface if needed in future (e.g. per-size defrag). Rather than
>> keep the `enabled` state directly in the struct thpsize, I've elected to
>> directly encode it into huge_anon_orders_[always|madvise|inherit]
>> bitfields since this reduces the amount of work required in
>> thp_vma_allowable_orders() which is called for every page fault.
>>
>> See Documentation/admin-guide/mm/transhuge.rst, as modified by this
>> commit, for details of how the new sysfs interface works.
>>
>> Reviewed-by: Barry Song <v-songbaohua@oppo.com>
>> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> Tested-by: John Hubbard <jhubbard@nvidia.com>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>
> [...]
>
>> +
>> +static ssize_t thpsize_enabled_store(struct kobject *kobj,
>> + struct kobj_attribute *attr,
>> + const char *buf, size_t count)
>> +{
>> + int order = to_thpsize(kobj)->order;
>> + ssize_t ret = count;
>> +
>> + if (sysfs_streq(buf, "always")) {
>> + spin_lock(&huge_anon_orders_lock);
>> + clear_bit(order, &huge_anon_orders_inherit);
>> + clear_bit(order, &huge_anon_orders_madvise);
>> + set_bit(order, &huge_anon_orders_always);
>> + spin_unlock(&huge_anon_orders_lock);
>> + } else if (sysfs_streq(buf, "inherit")) {
>> + spin_lock(&huge_anon_orders_lock);
>> + clear_bit(order, &huge_anon_orders_always);
>> + clear_bit(order, &huge_anon_orders_madvise);
>> + set_bit(order, &huge_anon_orders_inherit);
>> + spin_unlock(&huge_anon_orders_lock);
>> + } else if (sysfs_streq(buf, "madvise")) {
>> + spin_lock(&huge_anon_orders_lock);
>> + clear_bit(order, &huge_anon_orders_always);
>> + clear_bit(order, &huge_anon_orders_inherit);
>> + set_bit(order, &huge_anon_orders_madvise);
>> + spin_unlock(&huge_anon_orders_lock);
>> + } else if (sysfs_streq(buf, "never")) {
>> + spin_lock(&huge_anon_orders_lock);
>> + clear_bit(order, &huge_anon_orders_always);
>> + clear_bit(order, &huge_anon_orders_inherit);
>> + clear_bit(order, &huge_anon_orders_madvise);
>> + spin_unlock(&huge_anon_orders_lock);
>
> Why not perform lock/unlock only once in surrounding code? :)

I was nervous that sysfs_streq() may be unhappy in atomic context... Unfounded?

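One way to get a single lock/unlock pair without calling sysfs_streq() under the lock at all is to parse the string first and only then touch the bitfields. The following is a minimal userspace sketch of that shape, not the code from the patch: enum thp_mode, parse_mode(), set_order_mode() and streq_nl() are hypothetical names, a pthread mutex stands in for huge_anon_orders_lock, plain unsigned longs stand in for the huge_anon_orders_* bitfields, and the -1 return for an unrecognised string is an assumption, since the quoted hunk does not show the error path.

```c
#include <pthread.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel state in the patch. */
enum thp_mode { THP_ALWAYS, THP_INHERIT, THP_MADVISE, THP_NEVER, THP_INVALID };

static pthread_mutex_t orders_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long orders_always, orders_inherit, orders_madvise;

/* Rough stand-in for sysfs_streq(): equal, ignoring one trailing newline. */
static int streq_nl(const char *a, const char *b)
{
	size_t la = strlen(a), lb = strlen(b);

	if (la && a[la - 1] == '\n')
		la--;
	if (lb && b[lb - 1] == '\n')
		lb--;
	return la == lb && memcmp(a, b, la) == 0;
}

/* All string comparison happens here, before any lock is taken. */
static enum thp_mode parse_mode(const char *buf)
{
	if (streq_nl(buf, "always"))
		return THP_ALWAYS;
	if (streq_nl(buf, "inherit"))
		return THP_INHERIT;
	if (streq_nl(buf, "madvise"))
		return THP_MADVISE;
	if (streq_nl(buf, "never"))
		return THP_NEVER;
	return THP_INVALID;
}

/* Returns 0 on success, -1 (assumed error value) for an unknown string. */
static int set_order_mode(int order, const char *buf)
{
	enum thp_mode mode = parse_mode(buf);
	unsigned long bit = 1UL << order;

	if (mode == THP_INVALID)
		return -1;

	/* Single critical section: clear the order everywhere, then set one. */
	pthread_mutex_lock(&orders_lock);
	orders_always &= ~bit;
	orders_inherit &= ~bit;
	orders_madvise &= ~bit;
	if (mode == THP_ALWAYS)
		orders_always |= bit;
	else if (mode == THP_INHERIT)
		orders_inherit |= bit;
	else if (mode == THP_MADVISE)
		orders_madvise |= bit;
	pthread_mutex_unlock(&orders_lock);

	return 0;
}
```

With this shape, "never" needs no branch of its own: it is simply the case where the order is cleared from all three bitfields and nothing is set afterwards.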
>
>
> Much better
>
> Acked-by: David Hildenbrand <david@redhat.com>
>