From: Vlastimil Babka <vbabka@suse.cz>
To: David Wang <00107082@163.com>,
akpm@linux-foundation.org, surenb@google.com, mhocko@suse.com,
jackmanb@google.com, hannes@cmpxchg.org, ziy@nvidia.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Shakeel Butt <shakeel.butt@linux.dev>
Subject: Re: [PATCH] mm/codetag: sub in advance when free non-compound high order pages
Date: Mon, 5 May 2025 15:12:55 +0200
Message-ID: <8edbd2be-d495-4bfc-a9f3-6eaae7a66d91@suse.cz>
In-Reply-To: <20250504061923.66914-1-00107082@163.com>
On 5/4/25 08:19, David Wang wrote:
> When page is non-compound, page[0] could be released by other
> thread right after put_page_testzero failed in current thread,
> pgalloc_tag_sub_pages afterwards would manipulate an invalid
> page for accounting remaining pages:
>
>     [timeline]   [thread1]                  [thread2]
>       |          alloc_page non-compound
>       V
>       |                                     get_page, rf counter inc
>       V
>       |          in ___free_pages
>       |          put_page_testzero fails
>       V
>       |                                     put_page, page released
>       V
>       |          in ___free_pages,
>       |          pgalloc_tag_sub_pages
>       |          manipulate an invalid page
>       V
>       V
>
> Move the tag page accounting ahead, and only account remaining pages
> for non-compound pages with non-zero order.
>
> Signed-off-by: David Wang <00107082@163.com>

Hmm, I think the problem was introduced by 51ff4d7486f0 ("mm: avoid extra
mem_alloc_profiling_enabled() checks"). Previously we'd get the tag pointer
upfront and avoid the page use-after-free.

It would likely be nicer to fix it by going back to that approach for
___free_pages(), while hopefully keeping the optimisations of 51ff4d7486f0
for the other call sites where it applies?
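
To make the ordering concrete, here is a minimal user-space sketch of
"get the tag pointer upfront". It is not kernel code and not a proposed
patch: struct page_model, put_page_testzero_model(), tag_sub_pages_model()
and free_pages_model() are hypothetical stand-ins, and the per-CPU
counters are reduced to a single plain field. The only point it shows is
that the page is not dereferenced once our reference has been dropped.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical stand-ins for the kernel's alloc_tag and struct page. */
struct alloc_tag {
	unsigned long bytes;
};

struct page_model {
	int refcount;
	struct alloc_tag *tag;	/* must not be read after our ref is gone */
};

/* Drop one reference; returns 1 if this was the last one. */
static int put_page_testzero_model(struct page_model *page)
{
	assert(page->refcount > 0);
	return --page->refcount == 0;
}

/* Subtract nr pages from a tag that was looked up earlier. */
static void tag_sub_pages_model(struct alloc_tag *tag, unsigned int nr)
{
	if (tag)
		tag->bytes -= MODEL_PAGE_SIZE * nr;
}

/*
 * Free path for a non-compound order-N block. The tag is read while we
 * still hold our reference, so the failed-testzero branch only uses the
 * saved pointer.
 */
static void free_pages_model(struct page_model *page, unsigned int order)
{
	struct alloc_tag *tag = page->tag;	/* safe: reference still held */

	if (put_page_testzero_model(page)) {
		/* Last reference: this model accounts the whole block here. */
		tag_sub_pages_model(tag, 1U << order);
		page->tag = NULL;
		return;
	}

	/*
	 * Another holder keeps page[0] alive and may release it at any
	 * time, so `page` is not touched again. Only the remaining
	 * (1 << order) - 1 pages are accounted, via the saved tag.
	 */
	tag_sub_pages_model(tag, (1U << order) - 1);
}
```

With an order-3 block and one extra reference held elsewhere, the model
subtracts 7 pages through the saved tag and leaves the last page to be
accounted by whoever drops the final reference.
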
> ---
> mm/page_alloc.c | 36 +++++++++++++++++++++++++++++++++---
> 1 file changed, 33 insertions(+), 3 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5669baf2a6fe..c42e41ed35fe 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1163,12 +1163,25 @@ static inline void pgalloc_tag_sub_pages(struct page *page, unsigned int nr)
>  		this_cpu_sub(tag->counters->bytes, PAGE_SIZE * nr);
>  }
>
> +static inline void pgalloc_tag_add_pages(struct page *page, unsigned int nr)
> +{
> +	struct alloc_tag *tag;
> +
> +	if (!mem_alloc_profiling_enabled())
> +		return;
> +
> +	tag = __pgalloc_tag_get(page);
> +	if (tag)
> +		this_cpu_add(tag->counters->bytes, PAGE_SIZE * nr);
> +}
> +
>  #else /* CONFIG_MEM_ALLOC_PROFILING */
>
>  static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
>  				   unsigned int nr) {}
>  static inline void pgalloc_tag_sub(struct page *page, unsigned int nr) {}
>  static inline void pgalloc_tag_sub_pages(struct page *page, unsigned int nr) {}
> +static inline void pgalloc_tag_add_pages(struct page *page, unsigned int nr) {}
>
>  #endif /* CONFIG_MEM_ALLOC_PROFILING */
>
> @@ -5065,11 +5078,28 @@ static void ___free_pages(struct page *page, unsigned int order,
>  {
>  	/* get PageHead before we drop reference */
>  	int head = PageHead(page);
> +	/*
> +	 * For remaining pages other than the first page of
> +	 * a non-compound allocation, we decrease its tag
> +	 * pages in advance, in case the first page is released
> +	 * by other thread inbetween our put_page_testzero and any
> +	 * accounting behavior afterwards.
> +	 */
> +	unsigned int remaining_tag_pages = 0;
>
> -	if (put_page_testzero(page))
> +	if (order > 0 && !head) {
> +		if (unlikely(page_ref_count(page) > 1)) {
> +			remaining_tag_pages = (1 << order) - 1;
> +			pgalloc_tag_sub_pages(page, remaining_tag_pages);
> +		}
> +	}
> +
> +	if (put_page_testzero(page)) {
> +		/* no need special treat for remaining pages, add it back. */
> +		if (unlikely(remaining_tag_pages > 0))
> +			pgalloc_tag_add_pages(page, remaining_tag_pages);
>  		__free_frozen_pages(page, order, fpi_flags);
> -	else if (!head) {
> -		pgalloc_tag_sub_pages(page, (1 << order) - 1);
> +	} else if (!head) {
>  		while (order-- > 0)
>  			__free_frozen_pages(page + (1 << order), order,
>  					    fpi_flags);
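
For anyone following along, the (1 << order) - 1 count used for the
remaining pages falls out of how the loop at the end of this hunk splits
the tail of the block. A small user-space sketch that records what the
loop would free (struct tail_block, split_tail_model() and
tail_pages_model() are made-up names for illustration, not kernel API):

```c
#include <assert.h>
#include <stddef.h>

/* One block handed back: index of its first page and its order. */
struct tail_block {
	unsigned int first;
	unsigned int order;
};

/*
 * Mirror of the `while (order-- > 0)` loop: record each block it would
 * free. `out` must be non-NULL with room for at least `order` entries.
 * Returns the number of blocks recorded.
 */
static size_t split_tail_model(unsigned int order, struct tail_block *out)
{
	size_t n = 0;

	assert(out != NULL);
	while (order-- > 0) {
		out[n].first = 1U << order;	/* page + (1 << order) */
		out[n].order = order;
		n++;
	}
	return n;
}

/* Total number of pages covered by the recorded blocks. */
static unsigned int tail_pages_model(const struct tail_block *blocks, size_t n)
{
	unsigned int pages = 0;

	for (size_t i = 0; i < n; i++)
		pages += 1U << blocks[i].order;
	return pages;
}
```

For order 3 this gives blocks at page + 4 (order 2), page + 2 (order 1)
and page + 1 (order 0), which is 7 pages in total; page[0] is left to
whoever drops the last reference.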