linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Jiaqi Yan <jiaqiyan@google.com>,
	nao.horiguchi@gmail.com, linmiaohe@huawei.com,
	sidhartha.kumar@oracle.com, muchun.song@linux.dev
Cc: jane.chu@oracle.com, akpm@linux-foundation.org,
	osalvador@suse.de, rientjes@google.com, jthoughton@google.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v1 0/2] How HugeTLB handle HWPoison page at truncation
Date: Mon, 20 Jan 2025 11:59:29 +0100	[thread overview]
Message-ID: <6f97f3b6-3e7b-4ff8-8d67-ef972791cccd@redhat.com> (raw)
In-Reply-To: <20250119180608.2132296-1-jiaqiyan@google.com>

On 19.01.25 19:06, Jiaqi Yan wrote:
> While I was working on userspace MFR via memfd [1], I spent some time
> understanding what the current kernel does when a HugeTLB-backed memfd
> is truncated. My expectation is: if there is a HWPoison HugeTLB folio
> mapped via the memfd to userspace, it will be unmapped right away but
> still kept in the page cache [2]; however, when the memfd is truncated
> to zero or after the memfd is closed, the kernel should dissolve the
> HWPoison folio in the page cache and free only the clean raw pages to
> the buddy allocator, excluding the poisoned raw page.
> 
> So I wrote a hugetlb-mfr-base.c selftest (flow sketched below) and
> expect:
> 0. Say nr_hugepages is initially 64 as the system configuration.
> 1. After MADV_HWPOISON, nr_hugepages should still be 64, as we keep
>     even the HWPoison huge folio in the page cache. free_hugepages
>     should be nr_hugepages minus whatever amount is in use.
> 2. After truncating the memfd to zero, nr_hugepages should be reduced
>     to 63, as the kernel dissolved and freed the HWPoison huge folio.
>     free_hugepages should also be 63.
> 
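For reference, here is a minimal userspace sketch of the flow described
in the quoted list above. It is hypothetical (not the actual
hugetlb-mfr-base.c selftest from patch 1/2) and assumes 2 MiB hugepages
have already been reserved via nr_hugepages and a glibc that exposes
memfd_create():

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define HPAGE_SIZE	(2UL << 20)	/* assumes 2 MiB hugepages */
#define SYSFS_HP	"/sys/kernel/mm/hugepages/hugepages-2048kB/"

static unsigned long read_counter(const char *name)
{
	char path[256], buf[64];
	unsigned long val = 0;
	FILE *f;

	snprintf(path, sizeof(path), SYSFS_HP "%s", name);
	f = fopen(path, "r");
	if (f && fgets(buf, sizeof(buf), f))
		val = strtoul(buf, NULL, 10);
	if (f)
		fclose(f);
	return val;
}

int main(void)
{
	int fd = memfd_create("hugetlb-mfr", MFD_CLOEXEC | MFD_HUGETLB);
	char *map;

	if (fd < 0 || ftruncate(fd, HPAGE_SIZE))
		return 1;

	map = mmap(NULL, HPAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		return 1;
	map[0] = 1;	/* fault the huge folio in */

	/* Poison one raw page; needs CAP_SYS_ADMIN and CONFIG_MEMORY_FAILURE. */
	if (madvise(map, getpagesize(), MADV_HWPOISON))
		perror("MADV_HWPOISON");
	printf("after poison:   nr=%lu free=%lu\n",
	       read_counter("nr_hugepages"), read_counter("free_hugepages"));

	/* Truncate to zero; per expectation 2, nr_hugepages should drop by one. */
	munmap(map, HPAGE_SIZE);
	if (ftruncate(fd, 0))
		perror("ftruncate");
	printf("after truncate: nr=%lu free=%lu\n",
	       read_counter("nr_hugepages"), read_counter("free_hugepages"));

	close(fd);
	return 0;
}

Both counters are read from sysfs right after MADV_HWPOISON and right
after the truncate, so expectations 1 and 2 can be checked directly
from the program's output.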
> However, when testing at the head of mm-stable commit 2877a83e4a0a
> ("mm/hugetlb: use folio->lru int demote_free_hugetlb_folios()"), I found
> that although free_hugepages is reduced to 63, nr_hugepages is not
> reduced and stays at 64.
> 
> Is my expectation outdated? Or is this some kind of bug?
> 
> I assume this is a bug and so dug a little bit deeper. It seems there
> are two issues, or two things I don't really understand.
> 
> 1. During try_memory_failure_hugetlb, we increase the target in-use
>     folio's refcount via get_hwpoison_hugetlb_folio. However, even at
>     the end of try_memory_failure_hugetlb, this refcount is not put.
>     I can make sense of this given we keep the in-use huge folio in
>     the page cache.
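The pattern described in point 1 can be sketched roughly as follows.
This is an illustration only, not the real try_memory_failure_hugetlb()
code, and pin_hwpoisoned_folio() is a made-up name; the point is that
the extra reference is taken and then deliberately kept:

#include <linux/mm.h>

/*
 * Illustration only: once an in-use hugetlb folio is found to be
 * hwpoisoned, the reference taken here is intentionally never dropped,
 * so the folio's refcount stays elevated and the poisoned memory
 * cannot be freed and handed out again.
 */
static bool pin_hwpoisoned_folio(struct folio *folio)
{
	if (!folio_try_get(folio))
		return false;	/* folio was already being freed */

	/* ... mark the folio / raw page as hwpoisoned ... */

	/* Deliberately no folio_put() here: keep the refcount raised. */
	return true;
}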

Isn't the general rule that hwpoisoned folios have a raised refcount 
such that they won't get freed + reused? At least that's how the buddy 
deals with them, and I suspect also hugetlb?

> [ 1069.320976] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2780000
> [ 1069.320978] head: order:18 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> [ 1069.320980] flags: 0x400000000100044(referenced|head|hwpoison|node=0|zone=1)
> [ 1069.320982] page_type: f4(hugetlb)
> [ 1069.320984] raw: 0400000000100044 ffffffff8760bbc8 ffffffff8760bbc8 0000000000000000
> [ 1069.320985] raw: 0000000000000000 0000000000000000 00000001f4000000 0000000000000000
> [ 1069.320987] head: 0400000000100044 ffffffff8760bbc8 ffffffff8760bbc8 0000000000000000
> [ 1069.320988] head: 0000000000000000 0000000000000000 00000001f4000000 0000000000000000
> [ 1069.320990] head: 0400000000000012 ffffdd53de000001 ffffffffffffffff 0000000000000000
> [ 1069.320991] head: 0000000000040000 0000000000000000 00000000ffffffff 0000000000000000
> [ 1069.320992] page dumped because: track hwpoison folio's ref
> 
> 2. Even if the folio's refcount does drop to zero and we get into
>     free_huge_folio, it is not clear to me which part of free_huge_folio
>     handles the case where the folio is HWPoison. In my test what I
>     observed is that eventually the folio is enqueue_hugetlb_folio()-ed.
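For illustration, this is the kind of check the question in point 2 is
looking for in the free path. It is hypothetical code, not what upstream
free_huge_folio() currently does, and dissolve_hwpoisoned_hugetlb_folio()
is a made-up helper standing in for "dissolve and keep the bad raw page
out of circulation":

/*
 * Hypothetical sketch of the expected behavior, not current upstream
 * code: on the final free of a hugetlb folio, a hwpoisoned folio would
 * be dissolved instead of being re-queued onto the hstate free list.
 */
static void hugetlb_free_or_dissolve(struct hstate *h, struct folio *folio)
{
	if (folio_test_hwpoison(folio)) {
		/*
		 * Expected: drop the folio from the hstate (so nr_hugepages
		 * shrinks) and release only the clean raw pages to the buddy
		 * allocator, leaving the poisoned raw page unusable.
		 */
		dissolve_hwpoisoned_hugetlb_folio(h, folio);	/* made-up helper */
		return;
	}

	/* Healthy folio: put it back on the hstate free list for reuse. */
	enqueue_hugetlb_folio(h, folio);
}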

How would we get a refcount of 0 if we assume the raised refcount on a 
hwpoisoned hugetlb folio?

I'm probably missing something: are you saying that you can trigger a 
hwpoisoned hugetlb folio to get reallocated again, in upstream code?


-- 
Cheers,

David / dhildenb




Thread overview: 10+ messages
2025-01-19 18:06 Jiaqi Yan
2025-01-19 18:06 ` [RFC PATCH v1 1/2] selftest/mm: test HWPoison hugetlb truncation behavior Jiaqi Yan
2025-01-19 20:18   ` Pedro Falcato
2025-01-19 18:06 ` [RFC PATCH v1 2/2] mm/hugetlb: immature fix to handle HWPoisoned folio Jiaqi Yan
2025-01-20 10:59 ` David Hildenbrand [this message]
2025-01-21  1:21   ` [RFC PATCH v1 0/2] How HugeTLB handle HWPoison page at truncation Jiaqi Yan
2025-01-21  5:00     ` jane.chu
2025-01-21  5:08       ` Jiaqi Yan
2025-01-21  5:22         ` jane.chu
2025-01-21  8:02     ` David Hildenbrand
