linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Miaohe Lin <linmiaohe@huawei.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: <linux-mm@kvack.org>, Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Andrew Morton <akpm@linux-foundation.org>, <ak@linux.intel.com>
Subject: Re: [PATCH 6/8] mm/memory-failure: Convert memory_failure() to use a folio
Date: Tue, 12 Mar 2024 15:07:39 +0800	[thread overview]
Message-ID: <5eab08d7-ae38-4f99-401f-f361466e34e0@huawei.com> (raw)
In-Reply-To: <Ze75lfChfOA97YQO@casper.infradead.org>

On 2024/3/11 20:31, Matthew Wilcox wrote:
> On Fri, Mar 08, 2024 at 04:48:33PM +0800, Miaohe Lin wrote:
>> On 2024/3/1 5:20, Matthew Wilcox (Oracle) wrote:
>>> @@ -2277,8 +2277,8 @@ int memory_failure(unsigned long pfn, int flags)
>>>  		}
>>>  	}
>>>  
>>> -	hpage = compound_head(p);
>>> -	if (PageTransHuge(hpage)) {
>>> +	folio = page_folio(p);
>>> +	if (folio_test_large(folio)) {
>>>  		/*
>>>  		 * The flag must be set after the refcount is bumped
>>>  		 * otherwise it may race with THP split.
> [...]
>>> @@ -2318,11 +2319,11 @@ int memory_failure(unsigned long pfn, int flags)
>>>  	 * race window. If this happens, we could try again to hopefully
>>>  	 * handle the page next round.
>>>  	 */
>>> -	if (PageCompound(p)) {
>>> +	if (folio_test_large(folio)) {
>>
>> folio_test_large() only checks whether PG_head is set but PageCompound() also checks PageTail().
>> So folio_test_large() and PageCompound() are not equivalent?
> 
> Assuming we have a refcount on this page so it can't be simultaneously
> split/freed/whatever, these three sequences are equivalent:

If page is stable after page refcnt is held, I agree below three sequences are equivalent.

> 
> 1	if (PageCompound(p))
> 
> 2	struct page *head = compound_head(p);
> 2	if (PageHead(head))
> 
> 3	struct folio *folio = page_folio(p);
> 3	if (folio_test_large(folio))
> 
> .
> 

But please see below commit:

"""
commit f37d4298aa7f8b74395aa13c728677e2ed86fdaf
Author: Andi Kleen <ak@linux.intel.com>
Date:   Wed Aug 6 16:06:49 2014 -0700

    hwpoison: fix race with changing page during offlining

    When a hwpoison page is locked it could change state due to parallel
    modifications.  The original compound page can be torn down and then
    this 4k page becomes part of a differently-size compound page is is a
    standalone regular page.

    Check after the lock if the page is still the same compound page.

    We could go back, grab the new head page and try again but it should be
    quite rare, so I thought this was safest.  A retry loop would be more
    difficult to test and may have more side effects.

    The hwpoison code by design only tries to handle cases that are
    reasonably common in workloads, as visible in page-flags.

    I'm not really that concerned about handling this (likely rare case),
    just not crashing on it.

    Signed-off-by: Andi Kleen <ak@linux.intel.com>
    Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index a013bc94ebbe..44c6bd201d3a 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1172,6 +1172,16 @@ int memory_failure(unsigned long pfn, int trapno, int flags)

 	lock_page(hpage);

+	/*
+	 * The page could have changed compound pages during the locking.
+	 * If this happens just bail out.
+	 */
+	if (compound_head(p) != hpage) {
+		action_result(pfn, "different compound page after locking", IGNORED);
+		res = -EBUSY;
+		goto out;
+	}
+
 	/*
 	 * We use page flags to determine what action should be taken, but
 	 * the flags can be modified by the error containment action.  One

"""

It says a page could still change to a differently-size compound page due to parallel
modifications even if extra page refcnt is held and page is locked. But this commit is
early (ten years ago) things might have changed. Any thoughts?

Thanks.



  reply	other threads:[~2024-03-12  7:07 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-29 21:20 [PATCH 0/8] Some cleanups for memory-failure Matthew Wilcox (Oracle)
2024-02-29 21:20 ` [PATCH 1/8] mm/memory-failure: Remove fsdax_pgoff argument from __add_to_kill Matthew Wilcox (Oracle)
2024-03-04 12:09   ` Miaohe Lin
2024-03-13  2:07   ` Jane Chu
2024-03-13  3:23     ` Matthew Wilcox
2024-03-13 18:11       ` Jane Chu
2024-03-14  3:51         ` Matthew Wilcox
2024-03-14 17:54           ` Jane Chu
2024-03-19  0:36   ` Dan Williams
2024-02-29 21:20 ` [PATCH 2/8] mm/memory-failure: Pass addr to __add_to_kill() Matthew Wilcox (Oracle)
2024-03-04 12:10   ` Miaohe Lin
2024-02-29 21:20 ` [PATCH 3/8] mm: Return the address from page_mapped_in_vma() Matthew Wilcox (Oracle)
2024-03-04 12:31   ` Miaohe Lin
2024-03-05 20:09     ` Matthew Wilcox
2024-03-06  8:10       ` Miaohe Lin
2024-03-06  8:17   ` Miaohe Lin
2024-02-29 21:20 ` [PATCH 4/8] mm/memory-failure: Convert shake_page() to shake_folio() Matthew Wilcox (Oracle)
2024-03-06  9:31   ` Miaohe Lin
2024-04-08 15:36     ` Matthew Wilcox
2024-04-08 18:31       ` Jane Chu
2024-04-10  4:01         ` Miaohe Lin
2024-02-29 21:20 ` [PATCH 5/8] mm: Convert hugetlb_page_mapping_lock_write to folio Matthew Wilcox (Oracle)
2024-03-08  8:33   ` Miaohe Lin
2024-02-29 21:20 ` [PATCH 6/8] mm/memory-failure: Convert memory_failure() to use a folio Matthew Wilcox (Oracle)
2024-03-08  8:48   ` Miaohe Lin
2024-03-11 12:31     ` Matthew Wilcox
2024-03-12  7:07       ` Miaohe Lin [this message]
2024-03-12 14:14         ` Matthew Wilcox
2024-03-13  1:23           ` Jane Chu
2024-03-14  2:34             ` Miaohe Lin
2024-03-14 18:15               ` Jane Chu
2024-03-15  6:25                 ` Miaohe Lin
2024-03-15  8:32             ` Miaohe Lin
2024-03-15 19:22               ` Jane Chu
2024-03-18  2:28                 ` Miaohe Lin
2024-02-29 21:20 ` [PATCH 7/8] mm/memory-failure: Convert hwpoison_user_mappings to take " Matthew Wilcox (Oracle)
2024-03-11 11:44   ` Miaohe Lin
2024-02-29 21:20 ` [PATCH 8/8] mm/memory-failure: Add some folio conversions to unpoison_memory Matthew Wilcox (Oracle)
2024-03-11 11:29   ` Miaohe Lin
2024-03-01  6:28 ` [PATCH 0/8] Some cleanups for memory-failure Miaohe Lin
2024-03-01 12:40   ` Muhammad Usama Anjum
2024-03-04  1:55     ` Miaohe Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5eab08d7-ae38-4f99-401f-f361466e34e0@huawei.com \
    --to=linmiaohe@huawei.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-mm@kvack.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox