From: Wang Yugui <wangyugui@e16-tech.com>
To: Yang Shi <shy828301@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Linux MM <linux-mm@kvack.org>
Subject: Re: kernel BUG at mm/huge_memory.c:2736(linux 5.10.29)
Date: Fri, 23 Apr 2021 10:16:55 +0800
Message-ID: <20210423101654.1242.409509F4@e16-tech.com>
In-Reply-To: <CAHbLzkpiVjDs9qPL=sX7PRMTweyi9TForGB3B4yGhqR575p_Xg@mail.gmail.com>
Hi,
> On Sat, Apr 17, 2021 at 1:33 AM Wang Yugui <wangyugui@e16-tech.com> wrote:
> >
> > Hi,
> >
> > > On Mon, Apr 12, 2021 at 3:07 AM Wang Yugui <wangyugui@e16-tech.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > kernel BUG at mm/huge_memory.c:2736(linux 5.10.29) is triggered
> > > > by some files write test.
> > > >
> > > > mm/huge_memory.c:
> > > > if (IS_ENABLED(CONFIG_DEBUG_VM) && mapcount) {
> > > > pr_alert("total_mapcount: %u, page_count(): %u\n",
> > > > mapcount, count);
> > > > if (PageTail(page))
> > > > dump_page(head, NULL);
> > > > dump_page(page, "total_mapcount(head) > 0");
> > > > L2736: BUG();
> > > > }
> > >
> > > We just can tell the mapcount of the page is not zero from the current
> > > log, it might mean the unmap_page() call is failed. It seems you have
> > > CONFIG_DEBUG_VM enabled, could you please paste more log? There is
> > > "VM_BUG_ON_PAGE(!unmap_success, page)" in unmap_page(). It should be
> > > able to tell us if unmap_page() is failed or not, or something else
> > > happened.
> >
> > This is the full dmesg output
> >
> > [63080.331513] huge_memory: total_mapcount: 511, page_count(): 512
> > [63080.332167] page:00000000d2e1a982 refcount:512 mapcount:0 mapping:0000000000000000 index:0x7fe260582 pfn:0x676a00
> > [63080.332167] head:00000000d2e1a982 order:9 compound_mapcount:0 compound_pincount:0
> > [63080.332167] anon flags: 0x17ffffc009001d(locked|uptodate|dirty|lru|head|swapbacked)
> > [63080.332167] raw: 0017ffffc009001d ffffc93cda0d0008 ffffc93cd9ab0008 ffff8f21be9f0cb9
> > [63080.332167] raw: 00000007fe260582 0000000000000000 00000200ffffffff ffff8f1021810000
> > [63080.332167] page->mem_cgroup:ffff8f1021810000
> > [63080.332167] page:00000000bc78ac24 refcount:512 mapcount:1 mapping:0000000000000000 index:0x7fe260584 pfn:0x676a02
> > [63080.332167] head:00000000d2e1a982 order:9 compound_mapcount:0 compound_pincount:0
> > [63080.332167] anon flags: 0x17ffffc009001d(locked|uptodate|dirty|lru|head|swapbacked)
> > [63080.332167] raw: 0017ffffc0000000 ffffc93cd9da8001 dead000000000000 ffffc93d428d0098
> > [63080.332167] raw: ffffa002cd183bf0 0000000000000000 0000000000000000 0000000000000000
> > [63080.332167] head: 0017ffffc009001d ffffc93cda0d0008 ffffc93cd9ab0008 ffff8f21be9f0cb9
> > [63080.332167] head: 00000007fe260582 0000000000000000 00000200ffffffff ffff8f1021810000
> > [63080.332167] page dumped because: total_mapcount(head) > 0
>
> Added Kirill in this loop too, he may have some insights.
>
> Thanks a lot for pasting the full log. It seems the BUG_ON in
> unmap_page() and VM_BUG_ON_PAGE(compound_mapcount(head), head) were
> not triggered. But the dumped page shows its total_mapcount is 511. It
> means 511 subpages of the huge page are PTE mapped. It seems all tail
> pages are PTE mapped. It may be because unmap_page() is failed or they
> are mapped again after unmap_page().
>
> But the VM_BUG_ON_PAGE just checks compound_mapcount, and it seems
> page_mapcount() call in unmap_page() also just checks
> compound_mapcount and the mapcount of the head page. If the mapcount
> of the head page is 0 and compound_mapcount is also 0, try_to_unmap()
> considers unmap is successful.
>
> So we can't tell which case it is although I don't think of how
> unmap_page() could fail for this case. I think we should check the
> total mapcount in try_to_unmap() instead.
>
> Can you please try the below debug patch (untested) to help narrow
> down the problem?
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index ae907a9c2050..c10e89be1c99 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2726,7 +2726,7 @@ int split_huge_page_to_list(struct page *page,
> struct list_head *list)
> }
>
> unmap_page(head);
> - VM_BUG_ON_PAGE(compound_mapcount(head), head);
> + VM_BUG_ON_PAGE(total_mapcount(head), head);
>
> /* block interrupt reentry in xa_lock and spinlock */
> local_irq_disable();
> diff --git a/mm/rmap.c b/mm/rmap.c
> index b0fc27e77d6d..537dfc557744 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1777,7 +1777,7 @@ bool try_to_unmap(struct page *page, enum ttu_flags flags)
> else
> rmap_walk(page, &rwc);
>
> - return !page_mapcount(page) ? true : false;
> + return !total_mapcount(page) ? true : false;
> }
>
> /**
>
>
With this patch applied, the problem has not happened yet after 4 test
runs (5.10.x).

By the way, the problem does not happen on 5.4.x (more than about 120
test runs). Does this match the code version?
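
To make the difference between the two checks in the debug patch above
concrete, here is a small user-space model. It is only a sketch under
stated assumptions, not kernel code: struct model_thp and all model_*
names are hypothetical, the per-subpage counts store the number of PTE
mappings directly, and the filled-in state assumes (as the dump
suggests) that the head subpage is unmapped while each of the 511 tail
subpages is PTE-mapped once.

```c
#include <assert.h>

/* An order-9 compound page (as in the dump) has 512 subpages. */
#define MODEL_HPAGE_NR 512

/* Hypothetical user-space model; this is not the kernel's struct page. */
struct model_thp {
	int compound_mapcount;            /* mappings of the whole huge page */
	int pte_mapcount[MODEL_HPAGE_NR]; /* PTE mappings of each subpage */
};

/* Models a check that looks at one subpage plus the compound count. */
static int model_page_mapcount(const struct model_thp *thp, int idx)
{
	assert(idx >= 0 && idx < MODEL_HPAGE_NR);
	return thp->compound_mapcount + thp->pte_mapcount[idx];
}

/* Models a check that sums the mappings of every subpage. */
static int model_total_mapcount(const struct model_thp *thp)
{
	int total = thp->compound_mapcount;

	for (int i = 0; i < MODEL_HPAGE_NR; i++)
		total += thp->pte_mapcount[i];
	return total;
}

/*
 * Assumed state, based on the dmesg output: compound_mapcount is 0,
 * the head subpage has no PTE mapping, every tail subpage has one.
 */
static void model_fill_from_dump(struct model_thp *thp)
{
	thp->compound_mapcount = 0;
	thp->pte_mapcount[0] = 0;
	for (int i = 1; i < MODEL_HPAGE_NR; i++)
		thp->pte_mapcount[i] = 1;
}
```

In this model the head-only check returns 0, so it would report the
unmap as successful, while the summed check returns 511, which matches
the "total_mapcount: 511" line in the log.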
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2021/04/23