linux-mm.kvack.org archive mirror
From: Yang Shi <shy828301@gmail.com>
To: Yin Fengwei <fengwei.yin@intel.com>
Cc: linux-mm@kvack.org, naoya.horiguchi@nec.com,
	linmiaohe@huawei.com,  willy@infradead.org, aaron.lu@intel.com,
	tony.luck@intel.com,  qiuxu.zhuo@intel.com
Subject: Re: [PATCH v2] mm: release private data before split THP
Date: Mon, 8 Aug 2022 10:49:19 -0700
Message-ID: <CAHbLzkptEctyoN6BCH7GhT_8skLx3izxfrwPnWCXOG4J50wk1g@mail.gmail.com>
In-Reply-To: <20220805062844.439152-1-fengwei.yin@intel.com>

On Thu, Aug 4, 2022 at 11:29 PM Yin Fengwei <fengwei.yin@intel.com> wrote:
>
> If there is private data attached to THP, the refcount of
> THP will be increased and block the THP split. Release
> private data attached to THP before split it to increase
> the chance of splitting THP successfully.
>
> There was a memory failure issue hit during HW error
> injection testing with 5.18 kernel + xfs as rootfs. Test
> got killed and system reboot was required to re-run the
> test.
>
> The issue was tracked down to THP split failure caused the
> memory failure not being handled. The page dump showed:
>
> [ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
> [ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
> [ 1785.452408] memcg:ff4247f2d28e9000
> [ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
> [ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
> [ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
> [ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000
>
> It was like the error was injected to a large folio for xfs
> with private data attached.
>
> With private data released before split THP, the test case
> could be run successfully many times without reboot system.
>
> Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Reviewed-by: Aaron Lu <aaron.lu@intel.com>
> ---
> Changelog from v1:
>  - Move private release to split_huge_page_to_list
>    to cover wider path per Yang's comment
>  - Update to commit message
>
> Changelog from RFC:
>  - Use new folio API per Mathhew Wilcox's suggestion
>  - Add one line comment before re-get folio of page per
>    Miaohe's comment
>  - Remove RFC tag
>  - Add Co-developed-by of Qiuxu who did a lot of debugging
>    work to locate where the real issue is
>
>  mm/huge_memory.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 15965084816d..edcbc6c2bb3f 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2590,6 +2590,12 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>                         goto out;
>                 }
>
> +               if (folio_test_private(folio) &&
> +                               !filemap_release_folio(folio, GFP_KERNEL)) {

GFP_KERNEL is fine for most THP split callsites, but not for the
memory reclaim path, which may disallow certain flags to avoid
recursion (for example nested reclaim or issuing I/O). Most
filesystems clear __GFP_FS. It should not be a real-life problem
right now, though, since AFAIK only xfs supports large folios at the
moment, and xfs uses the iomap release_folio() method, which actually
ignores the gfp flags.

So it sounds safer to follow the gfp convention used by
xas_split_alloc() below. The best way would be to pass in the gfp
flags from the reclaimer IMO, but that seems like overkill at the
moment.
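
To illustrate why the mask matters, here is a minimal userspace sketch
of the masking convention. It is a toy model, not kernel code: the
M_GFP_* bit values, split_gfp() and may_enter_fs() are hypothetical
names made up for this example, and the real GFP_RECLAIM_MASK covers
more bits than the three modelled here. It only shows that a mapping
whose mask has the FS bit cleared keeps it cleared after masking,
whereas a hard-coded GFP_KERNEL would not.

```c
#include <stdbool.h>

/* Stand-in flag bits for illustration only; NOT the kernel's values. */
typedef unsigned int gfp_model_t;

#define M_GFP_IO       0x1u  /* may start physical I/O */
#define M_GFP_FS       0x2u  /* may call back into the filesystem */
#define M_GFP_RECLAIM  0x4u  /* may enter reclaim */
#define M_GFP_HIGHMEM  0x8u  /* placement bit, unrelated to reclaim */

#define M_GFP_KERNEL        (M_GFP_IO | M_GFP_FS | M_GFP_RECLAIM)
#define M_GFP_RECLAIM_MASK  (M_GFP_IO | M_GFP_FS | M_GFP_RECLAIM)

/* Models "mapping_gfp_mask(mapping) & GFP_RECLAIM_MASK". */
static gfp_model_t split_gfp(gfp_model_t mapping_mask)
{
	return mapping_mask & M_GFP_RECLAIM_MASK;
}

/* True if an allocation or release with these flags may enter the fs. */
static bool may_enter_fs(gfp_model_t gfp)
{
	return (gfp & M_GFP_FS) != 0;
}
```

With these stand-ins, split_gfp(M_GFP_KERNEL & ~M_GFP_FS) leaves the
FS bit clear, so a release_folio() implementation that honours the
flags would not recurse into the filesystem.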

> +                       ret = -EBUSY;
> +                       goto out;
> +               }
> +
>                 xas_split_alloc(&xas, head, compound_order(head),
>                                 mapping_gfp_mask(mapping) & GFP_RECLAIM_MASK);
>                 if (xas_error(&xas)) {
>
> base-commit: 31be1d0fbd950395701d9fd47d8fb1f99c996f61
> --
> 2.25.1
>



Thread overview: 6+ messages
2022-08-05  6:28 Yin Fengwei
2022-08-08 17:49 ` Yang Shi [this message]
2022-08-09  1:12   ` Yin Fengwei
2022-08-09  9:08     ` Aaron Lu
2022-08-09 16:45       ` Yang Shi
2022-08-09 23:55         ` Yin Fengwei
