From: Jens Axboe <axboe@kernel.dk>
To: Linux-MM <linux-mm@kvack.org>
Cc: Yu Zhao <yuzhao@google.com>, Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Muchun Song <muchun.song@linux.dev>
Subject: Hugepage program taking forever to exit
Date: Tue, 10 Sep 2024 12:21:42 -0600
Message-ID: <02ffa542-ce49-4755-9d2b-29841f9973e0@kernel.dk>

Hi,

While investigating another issue, I wrote a simple program that allocates
and faults in 500 1GB huge pages, and then registers them with io_uring. Each
step is timed:

Got 500 huge pages (each 1024MB) in 0 msec
Faulted in 500 huge pages in 38632 msec
Registered 500 pages in 867 msec

and as expected, faulting in the pages takes (by far) the longest. From
the above, you'd also expect the total runtime to be around 39 seconds.
But it is not... In fact it takes 82 seconds in total for this program
to exit. Looking at why, I see:

[<0>] __wait_rcu_gp+0x12b/0x160
[<0>] synchronize_rcu_normal.part.0+0x2a/0x30
[<0>] hugetlb_vmemmap_restore_folios+0x22/0xe0
[<0>] update_and_free_pages_bulk+0x4c/0x220
[<0>] return_unused_surplus_pages+0x80/0xa0
[<0>] hugetlb_acct_memory.part.0+0x2dd/0x3b0
[<0>] hugetlb_vm_op_close+0x160/0x180
[<0>] remove_vma+0x20/0x60
[<0>] exit_mmap+0x199/0x340
[<0>] mmput+0x49/0x110
[<0>] do_exit+0x261/0x9b0
[<0>] do_group_exit+0x2c/0x80
[<0>] __x64_sys_exit_group+0x14/0x20
[<0>] x64_sys_call+0x714/0x720
[<0>] do_syscall_64+0x5b/0x160
[<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53

and yes, it does look like the program sits idle for most of the time
it spends returning these huge pages. The trace also tells us exactly
why we're just sitting idle: we're waiting for an RCU grace period.
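
The [<0>] prefix on each frame matches the format of /proc/<pid>/stack, so
a trace like the one above can presumably be captured by reading that file
for the stuck process. A minimal sketch, assuming root privileges and a
kernel that exposes the file; dump_kernel_stack() is a hypothetical helper
name, not something taken from the original program:

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Copy /proc/<pid>/stack to 'out'.
 * Returns the number of lines copied, or -1 if the file cannot be opened
 * (typically: not root, no such pid, or no stack trace support).
 */
int dump_kernel_stack(pid_t pid, FILE *out)
{
	char path[64], line[256];
	int lines = 0;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%ld/stack", (long)pid);
	f = fopen(path, "r");
	if (!f)
		return -1;

	/* Stack frames are short, so one fgets() call is one line in practice. */
	while (fgets(line, sizeof(line), f)) {
		fputs(line, out);
		lines++;
	}
	fclose(f);
	return lines;
}
```

Sampling this a few times while the program is exiting should show whether
it stays parked in the same place.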

With the quick change below, the runtime of the program is pretty much
just the time it takes to execute its individual steps, as you can see
from the full output after the change:

axboe@r7525 ~> time sudo ./reg-huge
Got 500 huge pages (each 1024MB) in 0 msec
Faulted in 500 huge pages in 38632 msec
Registered 500 pages in 867 msec

________________________________________________________
Executed in   39.53 secs      fish           external
   usr time    4.88 millis  238.00 micros    4.64 millis
   sys time    0.00 millis    0.00 micros    0.00 millis

where 38632 + 867 msec == 39.50s.
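
For anyone wanting to reproduce the timings, the allocate and fault steps
could look roughly like the sketch below. It is not the original reg-huge
program: the function names are made up for illustration, it assumes one
private anonymous mmap() per 1GB page with MAP_HUGETLB | MAP_HUGE_1GB, and
it leaves out the io_uring registration step entirely.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>

/* Fallbacks in case the libc headers predate these flags. */
#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)
#endif

#define HUGE_PAGE_SIZE (1024UL * 1024 * 1024)	/* 1GB per huge page */

/* Milliseconds between two CLOCK_MONOTONIC timestamps. */
static long elapsed_msec(const struct timespec *start, const struct timespec *end)
{
	long long nsec = (long long)(end->tv_sec - start->tv_sec) * 1000000000LL +
			 (end->tv_nsec - start->tv_nsec);

	return (long)(nsec / 1000000LL);
}

/* Map up to nr huge pages, one mapping each; returns how many succeeded. */
static size_t map_huge_pages(void **pages, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		void *p = mmap(NULL, HUGE_PAGE_SIZE, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
			       -1, 0);
		if (p == MAP_FAILED)
			break;
		pages[i] = p;
	}
	return i;
}

/* One write per mapping is enough to fault in the whole huge page. */
static void fault_in_pages(void **pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		*(volatile char *)pages[i] = 1;
}

/* Time the map and fault steps for nr pages; returns pages obtained. */
size_t time_huge_page_steps(size_t nr)
{
	struct timespec t0, t1, t2;
	void **pages;
	size_t got;

	if (!nr)
		return 0;
	pages = calloc(nr, sizeof(*pages));
	if (!pages)
		return 0;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	got = map_huge_pages(pages, nr);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	fault_in_pages(pages, got);
	clock_gettime(CLOCK_MONOTONIC, &t2);

	printf("Got %zu huge pages (each %luMB) in %ld msec\n",
	       got, HUGE_PAGE_SIZE >> 20, elapsed_msec(&t0, &t1));
	printf("Faulted in %zu huge pages in %ld msec\n",
	       got, elapsed_msec(&t1, &t2));

	/* The mappings are left in place so they are torn down at exit. */
	free(pages);
	return got;
}
```

Calling time_huge_page_steps(500) from main() and then returning leaves the
teardown of the mappings to process exit, which is the phase the report is
about. It needs enough 1GB huge pages available on the system to map
anything at all.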

Looks like this was introduced by:

commit bd225530a4c717714722c3731442b78954c765b3
Author: Yu Zhao <yuzhao@google.com>
Date:   Thu Jun 27 16:27:05 2024 -0600

    mm/hugetlb_vmemmap: fix race with speculative PFN walkers

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 0c3f56b3578e..95f6ad8f8232 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -517,7 +517,7 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 	long ret = 0;
 
 	/* avoid writes from page_ref_add_unless() while unfolding vmemmap */
-	synchronize_rcu();
+	synchronize_rcu_expedited();
 
 	list_for_each_entry_safe(folio, t_folio, folio_list, lru) {
 		if (folio_test_hugetlb_vmemmap_optimized(folio)) {

--
Jens Axboe