From: "Liam R. Howlett" <Liam.Howlett@oracle.com>
To: jane.chu@oracle.com
Cc: "David Hildenbrand (Red Hat)" <david@kernel.org>,
muchun.song@linux.dev, osalvador@suse.de, linmiaohe@huawei.com,
jiaqiyan@google.com, william.roche@oracle.com,
rientjes@google.com, akpm@linux-foundation.org,
lorenzo.stoakes@oracle.com, rppt@kernel.org, surenb@google.com,
mhocko@suse.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
Date: Thu, 18 Dec 2025 23:09:55 -0500 [thread overview]
Message-ID: <urxtx66muia2uggyr4mysxzdakavu5gypn6hu4xlwizisl4cjz@btsapum2gzbo> (raw)
In-Reply-To: <55542653-8902-46ea-b416-776fc58bd644@oracle.com>
* jane.chu@oracle.com <jane.chu@oracle.com> [251218 18:10]:
...
> > * jane.chu@oracle.com <jane.chu@oracle.com> [251218 14:01]:
...
> > > On 12/18/2025 12:41 AM, David Hildenbrand (Red Hat) wrote:
> > > > On 12/16/25 22:56, Jane Chu wrote:
...
> > > > > -static int folio_set_hugetlb_hwpoison(struct folio *folio, struct
> > > > > page *page)
> > > > > +static int folio_set_hugetlb_hwpoison(struct folio *folio, struct
> > > > > page *page,
> > > > > + bool *samepg)
> > > > > {
> > > > > struct llist_head *head;
> > > > > struct raw_hwp_page *raw_hwp;
> > > > > @@ -1889,17 +1890,16 @@ static int folio_set_hugetlb_hwpoison(struct
> > > > > folio *folio, struct page *page)
> > > > > return -EHWPOISON;
> > > > > head = raw_hwp_list_head(folio);
> > > > > llist_for_each_entry(p, head->first, node) {
> > > > > - if (p->page == page)
> > > > > + if (p->page == page) {
> > > > > + *samepg = true;
> > > > > return -EHWPOISON;
> > > > > + }
> > > > > }
> > > > > raw_hwp = kmalloc(sizeof(struct raw_hwp_page), GFP_ATOMIC);
> > > > > if (raw_hwp) {
> > > > > raw_hwp->page = page;
> > > > > llist_add(&raw_hwp->node, head);
> > > > > - /* the first error event will be counted in action_result(). */
> > > > > - if (ret)
> > > > > - num_poisoned_pages_inc(page_to_pfn(page));
> > > > > } else {
> > > > > /*
> > > > > * Failed to save raw error info. We no longer trace all
...
> > > > > try_memory_failure_hugetlb(unsigned long pfn, int flags, int
> > > > > *hugetlb
> > > > > folio = page_folio(p);
> > > > > res = kill_accessing_process(current,
> > > > > folio_pfn(folio), flags);
> > > > > }
> > > > > - action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
> > > > > + if (samepg)
> > > > > + action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
> > > > > + else
> > > > > + action_result(pfn, MF_MSG_HUGE, MF_FAILED);
> > > >
> > > > Can't we somehow return that result from get_huge_page_for_hwpoison()
> > > > ... folio_set_hugetlb_hwpoison() differently? E.g., return an enum
> > > > instead of "-EHWPOISON" or magic value "2".
> > >
> > > This is an option. The existing return codes are as follow.
> > > __get_huge_page_for_hwpoison():
> > > * Return values:
> > > * 0 - free hugepage
> > > * 1 - in-use hugepage
> > > * 2 - not a hugepage
> > > * -EBUSY - the hugepage is busy (try to retry)
> > > * -EHWPOISON - the hugepage is already hwpoisoned
> > >
> > > folio_set_hugetlb_hwpoison()
> > > returns
> > > 0: folio was not poisoned before
> > > -EHWPOISON: folio was poisoned before
> > >
> > > To get rid of 'samepg', how about
> > >
> > > __get_huge_page_for_hwpoison():
> > > * Return values:
> > > * 0 - free hugepage
> > > * 1 - in-use hugepage
> > > * 2 - not a hugepage
> > > * 3 - the hugepage is already hwpoisoned in different page
> > > * 4 - the hugepage is already hwpoisoned in the same page
> > > * -EBUSY - the hugepage is busy (try to retry)
> > >
> > > folio_set_hugetlb_hwpoison()
> > > returns
> > > 0: folio was not poisoned before
> > > 1: folio was poisoned before in different page
> > > 2: folio was poisoned before in the same page
> > >
> > > The whole point about identifying the same page is so that the re-poison
> > > event is not doubled counted.
> >
> > This means folio_set_hugetlb_hwpoison() returns 0 on success but
> > positives on error.. this seems to be going further away from the
> > standard way of doing things?
>
> Yes.
> > > It would actually be good to remove all magic values instead of
> > expanding them.
> >
> > I think what David was trying to say is to have a local enum that states
> > what these numbers mean so that the code reads more cleanly, instead of
> > digging for the right comment to decode it.
> >
> > For example, in try_memory_failure_hugetlb():
> >
> > if (res == 2) { /* fallback to normal page handling */
> >
> > vs:
> >
> > if (res == MEMORY_FAILURE_NOT_HUGEPAGE) { /* fallback to normal page handling */
> >
> > You could spell out your other options as well. Maybe something like
> > MEMORY_FAILURE_HWPOISONED_ALREADY_COUNTED
> > MEMORY_FAILURE_HWPOISONED
> >
> > This would avoid adding more magic values and increase readability.
> >
> > If you changed try_memory_failure_hugetlb() to use a switch statement,
> > then the compiler can catch unchecked enums for us too.
> >
> > If you don't want to go the enum route, then you could use a different
> > error code and propagate it through, like -EEXISTS for the new case?
> > That way the return is still 0 on success and less than 0 on failure,
> > but I think the enum idea has a number of advantages.
>
> I am open, actually prefer enum with switch statement as you suggested
> above.
It's David's suggestion, really :)
>
> What about folio_set_hugetlb_hwpoison()?
> Indeed the conventional way of folio_set_X_Y() returns only two possible
> values, but we need three.
I am having a hard time finding any folio_set_* that returns anything.
$ git grep int\ folio_set_|wc -l
1
$ git grep void\ folio_set_|wc -l
20
> How about changing the function name to set_hugetlb_hwpoison() to deviate
> from the convention? afterall, the function does more than conventional bit
> setting, it maintains a per folio raw-error linked list
> to track the poisoned pages within.
I think that's a good idea considering it seems like folio_set_ implies
that it will be setting something unconditionally, and this function
does not always set something and does even more work.
In fact the names make no sense to begin with.
get_huge_page_for_hwpoison() ends up actually calling
folio_set_hugetlb_hwpoison(), so it's not getting the huge page at all,
it's doing (at least some of) the work.
So, yeah, renaming it isn't going to make things worse. I'd try to
indicate that folio_set_hugetlb_hwpoison() might not do anything. It's
static and has hugetlb in the name, so we're pretty safe with anything.
Maybe update_hugetlb_hwpoison_list() ?
Thanks,
Liam
prev parent reply other threads:[~2025-12-19 4:10 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-16 21:56 Jane Chu
2025-12-18 8:41 ` David Hildenbrand (Red Hat)
2025-12-18 19:01 ` jane.chu
2025-12-18 20:26 ` Liam R. Howlett
2025-12-18 23:10 ` jane.chu
2025-12-19 4:09 ` Liam R. Howlett [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=urxtx66muia2uggyr4mysxzdakavu5gypn6hu4xlwizisl4cjz@btsapum2gzbo \
--to=liam.howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=david@kernel.org \
--cc=jane.chu@oracle.com \
--cc=jiaqiyan@google.com \
--cc=linmiaohe@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mhocko@suse.com \
--cc=muchun.song@linux.dev \
--cc=osalvador@suse.de \
--cc=rientjes@google.com \
--cc=rppt@kernel.org \
--cc=surenb@google.com \
--cc=william.roche@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox