From: Johannes Weiner <hannes@cmpxchg.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mgorman@techsingularity.net>,
Miaohe Lin <linmiaohe@huawei.com>,
Kefeng Wang <wangkefeng.wang@huawei.com>, Zi Yan <ziy@nvidia.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 6/6] mm: page_alloc: consolidate free page accounting
Date: Thu, 14 Sep 2023 00:11:44 -0400 [thread overview]
Message-ID: <20230914041144.GD48476@cmpxchg.org> (raw)
In-Reply-To: <37dbd4d0-c125-6694-dec4-6322ae5b6dee@suse.cz>
On Wed, Sep 13, 2023 at 10:18:17PM +0200, Vlastimil Babka wrote:
> Pageblock migratetype is set under zone->lock but there are
> places that read it outside of zone->lock and then trust it to perform the
> freelist placement. See for example __free_pages_ok(), or free_unref_page()
> in the cases it calls free_one_page(). These determine pageblock migratetype
> before taking the zone->lock. Only for has_isolate_pageblock() cases we are
> more careful, because previously isolation was the only case where precision
> was needed. So I think this warning is going to trigger?
Good catch on these two. It didn't get the warning, but the code is
indeed not quite right. The fix looks straight-forward: move the
lookup in those two cases into the zone->lock.
__free_pages_ok() is used
- when freeing >COSTLY order pages that aren't THPs
- when bootstrapping the allocator
- when allocating with alloc_pages_exact()
- when "accepting" unaccepted vm host memory before first use
none of which are too hot to tolerate the bitmap access inside the
lock instead of outside. free_one_page() is used in the isolated pages
and pcp-contended paths, both of which are exceptions as well.
Plus, it saves two branches (under the lock) in those paths to test
for the isolate conditions.
And a double lookup in case there *is* isolation.
I checked the code again and didn't see any other instances like the
two here. But I'll double check tomorrow and then send a fix.
> > @@ -757,23 +783,21 @@ static inline void __free_one_page(struct page *page,
> > VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP, page);
> >
> > VM_BUG_ON(migratetype == -1);
> > - if (likely(!is_migrate_isolate(migratetype)))
> > - __mod_zone_freepage_state(zone, 1 << order, migratetype);
> > -
> > VM_BUG_ON_PAGE(pfn & ((1 << order) - 1), page);
> > VM_BUG_ON_PAGE(bad_range(zone, page), page);
> >
> > while (order < MAX_ORDER) {
> > - if (compaction_capture(capc, page, order, migratetype)) {
> > - __mod_zone_freepage_state(zone, -(1 << order),
> > - migratetype);
> > + int buddy_mt;
> > +
> > + if (compaction_capture(capc, page, order, migratetype))
> > return;
> > - }
> >
> > buddy = find_buddy_page_pfn(page, pfn, order, &buddy_pfn);
> > if (!buddy)
> > goto done_merging;
> >
> > + buddy_mt = get_pfnblock_migratetype(buddy, buddy_pfn);
>
> You should assume buddy_mt equals migratetype, no? It's the same assumption
> as the VM_WARN_ONCE() I've discussed?
Ah, you're right, that lookup can be removed.
Actually, that section is brainfarts. There is an issue with updating
the buddy type before removing it from the list. I was confused why
this didn't warn for me, but it's because on my test setup, the
pageblock_order == MAX_ORDER since I don't have hugetlb compiled in.
The fix looks simple. I'll test it, with pb order 9 as well, and
follow-up.
> > @@ -801,9 +825,9 @@ static inline void __free_one_page(struct page *page,
> > * merge with it and move up one order.
> > */
> > if (page_is_guard(buddy))
> > - clear_page_guard(zone, buddy, order, migratetype);
> > + clear_page_guard(zone, buddy, order);
> > else
> > - del_page_from_free_list(buddy, zone, order);
> > + del_page_from_free_list(buddy, zone, order, buddy_mt);
>
> Ugh so this will add account_freepages() call to each iteration of the
> __free_one_page() hot loop, which seems like a lot of unnecessary overhead -
> as long as we are within pageblock_order the migratetype should be the same,
> and thus also is_migrate_isolate() and is_migrate_cma() tests should return
> the same value so we shouldn't need to call __mod_zone_page_state()
> piecemeal like this.
Good point, this is unnecessarily naive. The net effect on the
counters from the buddy merging is nil. That's true even when we go
beyond the page block, because we don't merge isolated/cma blocks with
anything except their own.
I'll move the accounting out into a single call.
Thanks for your thorough review!
next prev parent reply other threads:[~2023-09-14 4:11 UTC|newest]
Thread overview: 83+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-11 19:41 [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene Johannes Weiner
2023-09-11 19:41 ` [PATCH 1/6] mm: page_alloc: remove pcppage migratetype caching Johannes Weiner
2023-09-11 19:59 ` Zi Yan
2023-09-11 21:09 ` Andrew Morton
2023-09-12 13:47 ` Vlastimil Babka
2023-09-12 14:50 ` Johannes Weiner
2023-09-13 9:33 ` Vlastimil Babka
2023-09-13 13:24 ` Johannes Weiner
2023-09-13 13:34 ` Vlastimil Babka
2023-09-12 15:03 ` Johannes Weiner
2023-09-14 7:29 ` Vlastimil Babka
2023-09-14 9:56 ` Mel Gorman
2023-09-27 5:42 ` Huang, Ying
2023-09-27 14:51 ` Johannes Weiner
2023-09-30 4:26 ` Huang, Ying
2023-10-02 14:58 ` Johannes Weiner
2023-09-11 19:41 ` [PATCH 2/6] mm: page_alloc: fix up block types when merging compatible blocks Johannes Weiner
2023-09-11 20:01 ` Zi Yan
2023-09-13 9:52 ` Vlastimil Babka
2023-09-14 10:00 ` Mel Gorman
2023-09-11 19:41 ` [PATCH 3/6] mm: page_alloc: move free pages when converting block during isolation Johannes Weiner
2023-09-11 20:17 ` Zi Yan
2023-09-11 20:47 ` Johannes Weiner
2023-09-11 20:50 ` Zi Yan
2023-09-13 14:31 ` Vlastimil Babka
2023-09-14 10:03 ` Mel Gorman
2023-09-11 19:41 ` [PATCH 4/6] mm: page_alloc: fix move_freepages_block() range error Johannes Weiner
2023-09-11 20:23 ` Zi Yan
2023-09-13 14:40 ` Vlastimil Babka
2023-09-14 13:37 ` Johannes Weiner
2023-09-14 10:03 ` Mel Gorman
2023-09-11 19:41 ` [PATCH 5/6] mm: page_alloc: fix freelist movement during block conversion Johannes Weiner
2023-09-13 19:52 ` Vlastimil Babka
2023-09-14 14:47 ` Johannes Weiner
2023-09-11 19:41 ` [PATCH 6/6] mm: page_alloc: consolidate free page accounting Johannes Weiner
2023-09-13 20:18 ` Vlastimil Babka
2023-09-14 4:11 ` Johannes Weiner [this message]
2023-09-14 23:52 ` [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene Mike Kravetz
2023-09-15 14:16 ` Johannes Weiner
2023-09-15 15:05 ` Mike Kravetz
2023-09-16 19:57 ` Mike Kravetz
2023-09-16 20:13 ` Andrew Morton
2023-09-18 7:16 ` Vlastimil Babka
2023-09-18 14:52 ` Johannes Weiner
2023-09-18 17:40 ` Mike Kravetz
2023-09-19 6:49 ` Johannes Weiner
2023-09-19 12:37 ` Zi Yan
2023-09-19 15:22 ` Zi Yan
2023-09-19 18:47 ` Mike Kravetz
2023-09-19 20:57 ` Zi Yan
2023-09-20 0:32 ` Mike Kravetz
2023-09-20 1:38 ` Zi Yan
2023-09-20 6:07 ` Vlastimil Babka
2023-09-20 13:48 ` Johannes Weiner
2023-09-20 16:04 ` Johannes Weiner
2023-09-20 17:23 ` Zi Yan
2023-09-21 2:31 ` Zi Yan
2023-09-21 10:19 ` David Hildenbrand
2023-09-21 14:47 ` Zi Yan
2023-09-25 21:12 ` Zi Yan
2023-09-26 17:39 ` Johannes Weiner
2023-09-28 2:51 ` Zi Yan
2023-10-03 2:26 ` Zi Yan
2023-10-10 21:12 ` Johannes Weiner
2023-10-11 15:25 ` Johannes Weiner
2023-10-11 15:45 ` Johannes Weiner
2023-10-11 15:57 ` Zi Yan
2023-10-13 0:06 ` Zi Yan
2023-10-13 14:51 ` Zi Yan
2023-10-16 13:35 ` Zi Yan
2023-10-16 14:37 ` Johannes Weiner
2023-10-16 15:00 ` Zi Yan
2023-10-16 18:51 ` Johannes Weiner
2023-10-16 19:49 ` Zi Yan
2023-10-16 20:26 ` Johannes Weiner
2023-10-16 20:39 ` Johannes Weiner
2023-10-16 20:48 ` Zi Yan
2023-09-26 18:19 ` David Hildenbrand
2023-09-28 3:22 ` Zi Yan
2023-10-02 11:43 ` David Hildenbrand
2023-10-03 2:35 ` Zi Yan
2023-09-18 7:07 ` Vlastimil Babka
2023-09-18 14:09 ` Johannes Weiner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230914041144.GD48476@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=linmiaohe@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=vbabka@suse.cz \
--cc=wangkefeng.wang@huawei.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox