From: Johannes Weiner <hannes@cmpxchg.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>, Zi Yan <ziy@nvidia.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene
Date: Mon, 18 Sep 2023 10:52:04 -0400
Message-ID: <20230918145204.GB16104@cmpxchg.org>
In-Reply-To: <a88b7339-beab-37c6-7d32-0292b325916d@suse.cz>

On Mon, Sep 18, 2023 at 09:16:58AM +0200, Vlastimil Babka wrote:
> On 9/16/23 21:57, Mike Kravetz wrote:
> > On 09/15/23 10:16, Johannes Weiner wrote:
> >> On Thu, Sep 14, 2023 at 04:52:38PM -0700, Mike Kravetz wrote:
> >> > In next-20230913, I started hitting the following BUG.  Seems related
> >> > to this series.  And, if series is reverted I do not see the BUG.
> >> > 
> >> > I can easily reproduce on a small 16G VM.  kernel command line contains
> >> > "hugetlb_free_vmemmap=on hugetlb_cma=4G".  Then run the script,
> >> > while true; do
> >> >  echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
> >> >  echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/demote
> >> >  echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> >> > done
> >> > 
> >> > For the BUG below I believe it was the first (or second) 1G page creation from
> >> > CMA that triggered:  cma_alloc of 1G.
> >> > 
> >> > Sorry, have not looked deeper into the issue.
> >> 
> >> Thanks for the report, and sorry about the breakage!
> >> 
> >> I was scratching my head at this:
> >> 
> >>                         /* MIGRATE_ISOLATE page should not go to pcplists */
> >>                         VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
> >> 
> >> because there is nothing in page isolation that prevents setting
> >> MIGRATE_ISOLATE on something that's on the pcplist already. So why
> >> didn't this trigger before already?
> >> 
> >> Then it clicked: it used to only check the *pcpmigratetype* determined
> >> by free_unref_page(), which of course mustn't be MIGRATE_ISOLATE.
> >> 
> >> Pages that get isolated while *already* on the pcplist are fine, and
> >> are handled properly:
> >> 
> >>                         mt = get_pcppage_migratetype(page);
> >> 
> >>                         /* MIGRATE_ISOLATE page should not go to pcplists */
> >>                         VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
> >> 
> >>                         /* Pageblock could have been isolated meanwhile */
> >>                         if (unlikely(isolated_pageblocks))
> >>                                 mt = get_pageblock_migratetype(page);
> >> 
> >> So this was purely a sanity check against the pcpmigratetype cache
> >> operations. With that gone, we can remove it.
> > 
> > With the patch below applied, a slightly different workload triggers the
> > following warnings.  It seems related, and appears to go away when
> > reverting the series.
> > 
> > [  331.595382] ------------[ cut here ]------------
> > [  331.596665] page type is 5, passed migratetype is 1 (nr=512)
> > [  331.598121] WARNING: CPU: 2 PID: 935 at mm/page_alloc.c:662 expand+0x1c9/0x200
> 
> Initially I thought this demonstrates the possible race I was suggesting in
> reply to 6/6. But, assuming you have CONFIG_CMA, page type 5 is cma and we
> are trying to get a MOVABLE page from a CMA page block, which is something
> that's normally done and the pageblock stays CMA. So yeah if the warnings
> are to stay, they need to handle this case. Maybe the same can happen with
> HIGHATOMIC blocks?

Hm, I don't think that's quite it.

CMA and HIGHATOMIC have their own freelists. When MOVABLE requests dip
into CMA and HIGHATOMIC, we explicitly pass that migratetype to
__rmqueue_smallest(). This takes a chunk of e.g. CMA, expands the
remainder to the CMA freelist, then returns the page. While you get a
different mt than requested, the freelist typing should be consistent.

In this splat, the migratetype passed to __rmqueue_smallest() is
MOVABLE. There is no preceding warning from del_page_from_freelist()
(Mike, correct me if I'm wrong), so we got a confirmed MOVABLE
order-10 block from the MOVABLE list. So far so good. However, when we
expand() the order-9 tail of this block to the MOVABLE list, it warns
that its pageblock type is CMA.

This means we have an order-10 page where one half is MOVABLE and the
other is CMA.
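
To make the reasoning above concrete, here is a toy userspace model of
the split. It is only a sketch, not the kernel's expand(): the toy_*
names, the two-entry pageblock table and the mismatch counter are all
made up for illustration, and the enum values are not the kernel's. It
walks an order-10 block down to the requested order and compares the
pageblock type of every tail against the freelist type it is being
placed on.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical types for illustration; values do not match the kernel. */
enum toy_mt { TOY_MOVABLE, TOY_CMA };

#define TOY_PAGEBLOCK_ORDER 9   /* 512 pages per pageblock (assumption) */
#define TOY_MAX_ORDER       10  /* largest buddy block spans 2 pageblocks */
#define TOY_NR_BLOCKS       2

/* Migratetype recorded for each pageblock of the modelled range. */
static enum toy_mt toy_pageblock_mt[TOY_NR_BLOCKS];

static enum toy_mt toy_get_pageblock_mt(unsigned long pfn)
{
	unsigned long idx = pfn >> TOY_PAGEBLOCK_ORDER;

	assert(idx < TOY_NR_BLOCKS);
	return toy_pageblock_mt[idx];
}

/*
 * Split the block starting at pfn from order 'high' down to 'low'.
 * Each split leaves a tail of the new, smaller order that would go
 * back onto the freelist of type 'mt'. Returns how many tails sit in
 * a pageblock whose type disagrees with 'mt'.
 */
static int toy_expand(unsigned long pfn, int low, int high, enum toy_mt mt)
{
	int mismatches = 0;

	assert(low >= 0 && low <= high && high <= TOY_MAX_ORDER);

	while (high > low) {
		unsigned long tail;

		high--;
		tail = pfn + (1UL << high);
		if (toy_get_pageblock_mt(tail) != mt) {
			printf("tail pfn %lu order %d: block type %d, list type %d\n",
			       tail, high, (int)toy_get_pageblock_mt(tail), (int)mt);
			mismatches++;
		}
	}
	return mismatches;
}
```

With pageblock 0 typed MOVABLE and pageblock 1 typed CMA, splitting the
order-10 block on behalf of the MOVABLE list flags only the order-9
tail, because every smaller tail lies inside pageblock 0. With both
pageblocks typed the same as the list the block came from, nothing is
flagged, which is the consistent case described for CMA requests.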

I don't see how the merging code in __free_one_page() could have done
that. The CMA buddy would have failed the migratetype_is_mergeable()
test and we should have left it at order-9s.
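
For reference, the merge rule I have in mind looks roughly like the
sketch below. This is a simplified userspace model, not the actual
__free_one_page() code: the merge_* names and the enum layout are
assumptions for illustration. The idea is that buddies below pageblock
order always share a pageblock, so only merges at pageblock order and
above need to compare types, and a type mismatch is tolerated only when
both sides are ordinary (mergeable) types.

```c
#include <assert.h>
#include <stdbool.h>

#define MERGE_PAGEBLOCK_ORDER 9	/* assumption: 512-page pageblocks */

/* Hypothetical layout: ordinary types first, special types after. */
enum merge_mt {
	MERGE_UNMOVABLE,
	MERGE_MOVABLE,
	MERGE_RECLAIMABLE,
	MERGE_SPECIAL_TYPES,		/* everything from here on is special */
	MERGE_CMA = MERGE_SPECIAL_TYPES,
	MERGE_ISOLATE,
};

/* Ordinary types may be merged across pageblocks; special ones may not. */
static bool merge_mt_is_mergeable(enum merge_mt mt)
{
	return mt < MERGE_SPECIAL_TYPES;
}

/*
 * Decide whether a free block of the given order and type may merge
 * with its buddy. Below pageblock order both halves live in the same
 * pageblock, so there is nothing to compare.
 */
static bool merge_allowed(int order, enum merge_mt mt, enum merge_mt buddy_mt)
{
	assert(order >= 0);

	if (order < MERGE_PAGEBLOCK_ORDER)
		return true;
	if (mt == buddy_mt)
		return true;
	return merge_mt_is_mergeable(mt) && merge_mt_is_mergeable(buddy_mt);
}
```

Under this rule an order-9 MOVABLE block next to an order-9 CMA buddy
stays unmerged, so a mixed order-10 block should not come out of the
free path.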

I also don't see how the CMA setup could have done this because
MIGRATE_CMA is set on the range before the pages are fed to the buddy.

Mike, could you describe the workload that is triggering this?

Does this reproduce instantly and reliably?

Is there high load on the system, or is it requesting the huge page
with not much else going on?

Do you see compact_* history in /proc/vmstat after this triggers?
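
If it helps, something like the following sketch would pull out the
relevant counters. It assumes only the "name value" per-line layout of
/proc/vmstat; the helper name vmstat_count_nonzero() is made up here
and is not an existing tool or kernel interface. It prints every
counter whose name starts with the given prefix and whose value is
nonzero, and returns how many it found.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan a vmstat-style stream ("name value" per line). Print each
 * counter whose name begins with 'prefix' and whose value is nonzero,
 * and return the number of such counters.
 */
static int vmstat_count_nonzero(FILE *fp, const char *prefix)
{
	char name[128];
	unsigned long long val;
	size_t plen = strlen(prefix);
	int hits = 0;

	assert(fp != NULL);

	while (fscanf(fp, "%127s %llu", name, &val) == 2) {
		if (strncmp(name, prefix, plen) == 0 && val != 0) {
			printf("%s %llu\n", name, val);
			hits++;
		}
	}
	return hits;
}
```

Calling it on a stream opened with fopen("/proc/vmstat", "r") and the
prefix "compact_" right after the warning fires would show whether
compaction has been active at all.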

Could you please also provide /proc/zoneinfo, /proc/pagetypeinfo and
the hugetlb_cma= parameter you're using?

Thanks!

