From: Barry Song <21cnbao@gmail.com>
To: zhongjinji <zhongjinji@honor.com>
Cc: zhanghongru06@gmail.com, Liam.Howlett@oracle.com,
akpm@linux-foundation.org, axelrasmussen@google.com,
david@kernel.org, hannes@cmpxchg.org, jackmanb@google.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
lorenzo.stoakes@oracle.com, mhocko@suse.com, rppt@kernel.org,
surenb@google.com, vbabka@suse.cz, weixugc@google.com,
yuanchu@google.com, zhanghongru@xiaomi.com, ziy@nvidia.com
Subject: Re: [PATCH 2/3] mm/vmstat: get fragmentation statistics from per-migragetype count
Date: Sat, 29 Nov 2025 15:55:19 +0800
Message-ID: <CAGsJ_4wUQdQyB_3y0Buf3uG34hvgpMAP3qHHwJM3=R01RJOuvw@mail.gmail.com>
In-Reply-To: <CAGsJ_4xtc7ipFKYNQkGa-dSn7C8S7-J8LURqYrehfgenfPT=+w@mail.gmail.com>
On Sat, Nov 29, 2025 at 8:00 AM Barry Song <21cnbao@gmail.com> wrote:
>
> > > if (order >= pageblock_order && !is_migrate_isolate(migratetype))
> > > __mod_zone_page_state(zone, NR_FREE_PAGES_BLOCKS, -nr_pages);
> > > diff --git a/mm/vmstat.c b/mm/vmstat.c
> > > index bb09c032eecf..9334bbbe1e16 100644
> > > --- a/mm/vmstat.c
> > > +++ b/mm/vmstat.c
> > > @@ -1590,32 +1590,16 @@ static void pagetypeinfo_showfree_print(struct seq_file *m,
> > > zone->name,
> > > migratetype_names[mtype]);
> > > for (order = 0; order < NR_PAGE_ORDERS; ++order) {
> > > - unsigned long freecount = 0;
> > > - struct free_area *area;
> > > - struct list_head *curr;
> > > + unsigned long freecount;
> > > bool overflow = false;
> > >
> > > - area = &(zone->free_area[order]);
> > > -
> > > - list_for_each(curr, &area->free_list[mtype]) {
> > > - /*
> > > - * Cap the free_list iteration because it might
> > > - * be really large and we are under a spinlock
> > > - * so a long time spent here could trigger a
> > > - * hard lockup detector. Anyway this is a
> > > - * debugging tool so knowing there is a handful
> > > - * of pages of this order should be more than
> > > - * sufficient.
> > > - */
> > > - if (++freecount >= 100000) {
> > > - overflow = true;
> > > - break;
> > > - }
> > > + /* Keep the same output format for user-space tools compatibility */
> > > + freecount = READ_ONCE(zone->free_area[order].mt_nr_free[mtype]);
> >
> > I think it might be better for using an array of size NR_PAGE_ORDERS to store
> > the free count for each order. Like the code below.
>
> Right. If we want the freecount to accurately reflect the current system
> state, we still need to take the zone lock.
>
> Multiple independent WRITE_ONCE and READ_ONCE operations do not guarantee
> correctness. They may ensure single-copy atomicity per access, but not for the
> overall result.
On second thought, the original code releases and re-acquires the zone
spinlock for each order, so it never provided a consistent snapshot across
orders either, and cross-variable consistency may not be a real issue here.
Would adding data_race() to silence the KCSAN warnings be sufficient?
I mean something like the following:
@@ -843,8 +842,8 @@ static inline void move_to_free_list(struct page *page, struct zone *zone,
get_pageblock_migratetype(page), old_mt, nr_pages);
list_move_tail(&page->buddy_list, &area->free_list[new_mt]);
- WRITE_ONCE(area->mt_nr_free[old_mt], area->mt_nr_free[old_mt] - 1);
- WRITE_ONCE(area->mt_nr_free[new_mt], area->mt_nr_free[new_mt] + 1);
+ area->mt_nr_free[old_mt]--;
+ area->mt_nr_free[new_mt]++;
account_freepages(zone, -nr_pages, old_mt);
account_freepages(zone, nr_pages, new_mt);
@@ -875,8 +874,7 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon
__ClearPageBuddy(page);
set_page_private(page, 0);
area->nr_free--;
- WRITE_ONCE(area->mt_nr_free[migratetype],
- area->mt_nr_free[migratetype] - 1);
+ area->mt_nr_free[migratetype]--;
if (order >= pageblock_order && !is_migrate_isolate(migratetype))
__mod_zone_page_state(zone, NR_FREE_PAGES_BLOCKS, -nr_pages);
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 7e1e931eb209..d74004eb8c4d 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1599,7 +1599,7 @@ static void pagetypeinfo_showfree_print(struct seq_file *m,
bool overflow = false;
/* Keep the same output format for user-space tools compatibility */
- freecount = READ_ONCE(zone->free_area[order].mt_nr_free[mtype]);
+ freecount = data_race(zone->free_area[order].mt_nr_free[mtype]);
if (freecount >= 100000) {
overflow = true;
freecount = 100000;
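
To make the intended split concrete, here is a minimal user-space sketch
of the pattern in the diff above. It is not kernel code: the struct, the
constants, and the data_race_sketch() macro are hypothetical stand-ins (the
real data_race() only tells KCSAN that a race is intentional; here it simply
evaluates the expression). No lock or real concurrency is modeled. The writer
is assumed to run with the zone lock held and uses plain updates; the reader
takes a lockless, per-counter sample and caps it the same way as the patch.

```c
#include <stdbool.h>

#define NR_MT_SKETCH   3         /* hypothetical number of migratetypes */
#define FREECOUNT_CAP  100000UL  /* same cap as in the diff above */

/* Hypothetical stand-in for the kernel's data_race(): evaluates expr as-is. */
#define data_race_sketch(expr) (expr)

/* Simplified stand-in for struct free_area with per-migratetype counts. */
struct free_area_sketch {
	unsigned long nr_free;
	unsigned long mt_nr_free[NR_MT_SKETCH];
};

/*
 * Writer side. The caller is assumed to hold the lock that protects the
 * area, so plain decrement/increment is enough among writers.
 */
static void move_between_types(struct free_area_sketch *area,
			       int old_mt, int new_mt)
{
	area->mt_nr_free[old_mt]--;
	area->mt_nr_free[new_mt]++;
}

/*
 * Reader side. Lockless sample of a single counter; the value may be
 * slightly stale, which is acceptable for a debugging interface.
 * The result is capped and the overflow flag is set when the cap is hit.
 */
static unsigned long read_freecount(const struct free_area_sketch *area,
				    int mt, bool *overflow)
{
	unsigned long freecount = data_race_sketch(area->mt_nr_free[mt]);

	*overflow = false;
	if (freecount >= FREECOUNT_CAP) {
		*overflow = true;
		freecount = FREECOUNT_CAP;
	}
	return freecount;
}
```

Each call to read_freecount() is an independent sample, so values read for
different orders or migratetypes are not guaranteed to be mutually
consistent, which matches what the per-order lock drop gave us before.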
Thanks
Barry