From: Johannes Weiner <hannes@cmpxchg.org>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mgorman@techsingularity.net>,
Miaohe Lin <linmiaohe@huawei.com>,
Kefeng Wang <wangkefeng.wang@huawei.com>, Zi Yan <ziy@nvidia.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene
Date: Tue, 19 Sep 2023 02:49:14 -0400 [thread overview]
Message-ID: <20230919064914.GA124289@cmpxchg.org> (raw)
In-Reply-To: <20230918174037.GA112714@monkey>
On Mon, Sep 18, 2023 at 10:40:37AM -0700, Mike Kravetz wrote:
> On 09/18/23 10:52, Johannes Weiner wrote:
> > On Mon, Sep 18, 2023 at 09:16:58AM +0200, Vlastimil Babka wrote:
> > > On 9/16/23 21:57, Mike Kravetz wrote:
> > > > On 09/15/23 10:16, Johannes Weiner wrote:
> > > >> On Thu, Sep 14, 2023 at 04:52:38PM -0700, Mike Kravetz wrote:
> > > >
> > > > With the patch below applied, a slightly different workload triggers the
> > > > following warnings. It seems related, and appears to go away when
> > > > reverting the series.
> > > >
> > > > [ 331.595382] ------------[ cut here ]------------
> > > > [ 331.596665] page type is 5, passed migratetype is 1 (nr=512)
> > > > [ 331.598121] WARNING: CPU: 2 PID: 935 at mm/page_alloc.c:662 expand+0x1c9/0x200
> > >
> > > Initially I thought this demonstrates the possible race I was suggesting in
> > > reply to 6/6. But, assuming you have CONFIG_CMA, page type 5 is cma and we
> > > are trying to get a MOVABLE page from a CMA page block, which is something
> > > that's normally done and the pageblock stays CMA. So yeah if the warnings
> > > are to stay, they need to handle this case. Maybe the same can happen with
> > > HIGHATOMIC blocks?
Ok, the CMA thing gave me pause because Mike's pagetypeinfo didn't
show any CMA pages.
5 is actually MIGRATE_ISOLATE - see the double use of 3 for PCPTYPES
and HIGHATOMIC.
> > This means we have an order-10 page where one half is MOVABLE and the
> > other is CMA.
This means the scenario is different:
We get a MAX_ORDER page off the MOVABLE freelist. The removal checks
that the first pageblock is indeed MOVABLE. During the expand, the
second pageblock turns out to be of type MIGRATE_ISOLATE.
The page allocator wouldn't have merged those types. It triggers a bit
too fast to be a race condition.
It appears that MIGRATE_ISOLATE is simply set on the tail pageblock
while the head is on the list, and then stranded there.
Could this be an issue in the page_isolation code? Maybe a range
rounding error?
Zi Yan, does this ring a bell for you?
I don't quite see how my patches could have caused this. But AFAICS we
also didn't have warnings for this scenario so it could be an old bug.
> > Mike, could you describe the workload that is triggering this?
>
> This 'slightly different workload' is actually a slightly different
> environment. Sorry for mis-speaking! The slight difference is that this
> environment does not use the 'alloc hugetlb gigantic pages from CMA'
> (hugetlb_cma) feature that triggered the previous issue.
>
> This is still on a 16G VM. Kernel command line here is:
> "BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.6.0-rc1-next-20230913+
> root=UUID=49c13301-2555-44dc-847b-caabe1d62bdf ro console=tty0
> console=ttyS0,115200 audit=0 selinux=0 transparent_hugepage=always
> hugetlb_free_vmemmap=on"
>
> The workload is just running this script:
> while true; do
> echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
> echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/demote
> echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> done
>
> >
> > Does this reproduce instantly and reliably?
> >
>
> It is not 'instant' but will reproduce fairly reliably within a minute
> or so.
>
> Note that the 'echo 4 > .../hugepages-1048576kB/nr_hugepages' is going
> to end up calling alloc_contig_pages -> alloc_contig_range. Those pages
> will eventually be freed via __free_pages(folio, 9).
No luck reproducing this yet, but I have a question. In that crash
stack trace, the expand() is called via this:
[ 331.645847] get_page_from_freelist+0x3ed/0x1040
[ 331.646837] ? prepare_alloc_pages.constprop.0+0x197/0x1b0
[ 331.647977] __alloc_pages+0xec/0x240
[ 331.648783] alloc_buddy_hugetlb_folio.isra.0+0x6a/0x150
[ 331.649912] __alloc_fresh_hugetlb_folio+0x157/0x230
[ 331.650938] alloc_pool_huge_folio+0xad/0x110
[ 331.651909] set_max_huge_pages+0x17d/0x390
I don't see an __alloc_fresh_hugetlb_folio() in my tree. Only
alloc_fresh_hugetlb_folio(), which has this:
if (hstate_is_gigantic(h))
folio = alloc_gigantic_folio(h, gfp_mask, nid, nmask);
else
folio = alloc_buddy_hugetlb_folio(h, gfp_mask,
nid, nmask, node_alloc_noretry);
where gigantic is defined as the order exceeding MAX_ORDER, which
should be the case for 1G pages on x86.
So the crashing stack must be from a 2M allocation, no? I'm confused
how that could happen with the above test case.
next prev parent reply other threads:[~2023-09-19 6:49 UTC|newest]
Thread overview: 83+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-11 19:41 Johannes Weiner
2023-09-11 19:41 ` [PATCH 1/6] mm: page_alloc: remove pcppage migratetype caching Johannes Weiner
2023-09-11 19:59 ` Zi Yan
2023-09-11 21:09 ` Andrew Morton
2023-09-12 13:47 ` Vlastimil Babka
2023-09-12 14:50 ` Johannes Weiner
2023-09-13 9:33 ` Vlastimil Babka
2023-09-13 13:24 ` Johannes Weiner
2023-09-13 13:34 ` Vlastimil Babka
2023-09-12 15:03 ` Johannes Weiner
2023-09-14 7:29 ` Vlastimil Babka
2023-09-14 9:56 ` Mel Gorman
2023-09-27 5:42 ` Huang, Ying
2023-09-27 14:51 ` Johannes Weiner
2023-09-30 4:26 ` Huang, Ying
2023-10-02 14:58 ` Johannes Weiner
2023-09-11 19:41 ` [PATCH 2/6] mm: page_alloc: fix up block types when merging compatible blocks Johannes Weiner
2023-09-11 20:01 ` Zi Yan
2023-09-13 9:52 ` Vlastimil Babka
2023-09-14 10:00 ` Mel Gorman
2023-09-11 19:41 ` [PATCH 3/6] mm: page_alloc: move free pages when converting block during isolation Johannes Weiner
2023-09-11 20:17 ` Zi Yan
2023-09-11 20:47 ` Johannes Weiner
2023-09-11 20:50 ` Zi Yan
2023-09-13 14:31 ` Vlastimil Babka
2023-09-14 10:03 ` Mel Gorman
2023-09-11 19:41 ` [PATCH 4/6] mm: page_alloc: fix move_freepages_block() range error Johannes Weiner
2023-09-11 20:23 ` Zi Yan
2023-09-13 14:40 ` Vlastimil Babka
2023-09-14 13:37 ` Johannes Weiner
2023-09-14 10:03 ` Mel Gorman
2023-09-11 19:41 ` [PATCH 5/6] mm: page_alloc: fix freelist movement during block conversion Johannes Weiner
2023-09-13 19:52 ` Vlastimil Babka
2023-09-14 14:47 ` Johannes Weiner
2023-09-11 19:41 ` [PATCH 6/6] mm: page_alloc: consolidate free page accounting Johannes Weiner
2023-09-13 20:18 ` Vlastimil Babka
2023-09-14 4:11 ` Johannes Weiner
2023-09-14 23:52 ` [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene Mike Kravetz
2023-09-15 14:16 ` Johannes Weiner
2023-09-15 15:05 ` Mike Kravetz
2023-09-16 19:57 ` Mike Kravetz
2023-09-16 20:13 ` Andrew Morton
2023-09-18 7:16 ` Vlastimil Babka
2023-09-18 14:52 ` Johannes Weiner
2023-09-18 17:40 ` Mike Kravetz
2023-09-19 6:49 ` Johannes Weiner [this message]
2023-09-19 12:37 ` Zi Yan
2023-09-19 15:22 ` Zi Yan
2023-09-19 18:47 ` Mike Kravetz
2023-09-19 20:57 ` Zi Yan
2023-09-20 0:32 ` Mike Kravetz
2023-09-20 1:38 ` Zi Yan
2023-09-20 6:07 ` Vlastimil Babka
2023-09-20 13:48 ` Johannes Weiner
2023-09-20 16:04 ` Johannes Weiner
2023-09-20 17:23 ` Zi Yan
2023-09-21 2:31 ` Zi Yan
2023-09-21 10:19 ` David Hildenbrand
2023-09-21 14:47 ` Zi Yan
2023-09-25 21:12 ` Zi Yan
2023-09-26 17:39 ` Johannes Weiner
2023-09-28 2:51 ` Zi Yan
2023-10-03 2:26 ` Zi Yan
2023-10-10 21:12 ` Johannes Weiner
2023-10-11 15:25 ` Johannes Weiner
2023-10-11 15:45 ` Johannes Weiner
2023-10-11 15:57 ` Zi Yan
2023-10-13 0:06 ` Zi Yan
2023-10-13 14:51 ` Zi Yan
2023-10-16 13:35 ` Zi Yan
2023-10-16 14:37 ` Johannes Weiner
2023-10-16 15:00 ` Zi Yan
2023-10-16 18:51 ` Johannes Weiner
2023-10-16 19:49 ` Zi Yan
2023-10-16 20:26 ` Johannes Weiner
2023-10-16 20:39 ` Johannes Weiner
2023-10-16 20:48 ` Zi Yan
2023-09-26 18:19 ` David Hildenbrand
2023-09-28 3:22 ` Zi Yan
2023-10-02 11:43 ` David Hildenbrand
2023-10-03 2:35 ` Zi Yan
2023-09-18 7:07 ` Vlastimil Babka
2023-09-18 14:09 ` Johannes Weiner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230919064914.GA124289@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=linmiaohe@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=mike.kravetz@oracle.com \
--cc=vbabka@suse.cz \
--cc=wangkefeng.wang@huawei.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox