From: Johannes Weiner <hannes@cmpxchg.org>
To: Vern Hao <haoxing990@gmail.com>
Cc: mhocko@kernel.org, roman.gushchin@linux.dev, shakeelb@google.com,
akpm@linux-foundation.org, cgroups@vger.kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Xin Hao <vernhao@tencent.com>
Subject: Re: [PATCH v2] mm: memcg: add THP swap out info for anonymous reclaim
Date: Wed, 13 Sep 2023 11:52:17 -0400 [thread overview]
Message-ID: <20230913155217.GC45543@cmpxchg.org> (raw)
In-Reply-To: <20230912021727.61601-1-vernhao@tencent.com>
On Tue, Sep 12, 2023 at 10:17:25AM +0800, Vern Hao wrote:
> From: Xin Hao <vernhao@tencent.com>
>
> At present, we support per-memcg reclaim strategy, however we do not
> know the number of transparent huge pages being reclaimed, as we know
> the transparent huge pages need to be splited before reclaim them, and
> they will bring some performance bottleneck effect. for example, when
> two memcg (A & B) are doing reclaim for anonymous pages at same time,
> and 'A' memcg is reclaiming a large number of transparent huge pages, we
> can better analyze that the performance bottleneck will be caused by 'A'
> memcg. therefore, in order to better analyze such problems, there add
> THP swap out info for per-memcg.
>
> Signed-off-by: Xin Hao <vernhao@tencent.com>
> ---
> v1 -> v2
> - Do some fix as Johannes Weiner suggestion.
> v1:
> https://lore.kernel.org/linux-mm/20230911160824.GB103342@cmpxchg.org/T/
>
> mm/memcontrol.c | 2 ++
> mm/page_io.c | 4 +++-
> mm/vmscan.c | 1 +
> 3 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ecc07b47e813..32d50db9ea0d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -752,6 +752,8 @@ static const unsigned int memcg_vm_event_stat[] = {
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> THP_FAULT_ALLOC,
> THP_COLLAPSE_ALLOC,
> + THP_SWPOUT,
> + THP_SWPOUT_FALLBACK,
Can you please add documentation to
Documentation/admin-guide/cgroup-v2.rst?
Sorry, I missed this in v1.
> @@ -208,8 +208,10 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
> static inline void count_swpout_vm_event(struct folio *folio)
> {
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> - if (unlikely(folio_test_pmd_mappable(folio)))
> + if (unlikely(folio_test_pmd_mappable(folio))) {
> + count_memcg_folio_events(folio, THP_SWPOUT, 1);
> count_vm_event(THP_SWPOUT);
> + }
> #endif
> count_vm_events(PSWPOUT, folio_nr_pages(folio));
> }
Looking through the callers, they seem mostly fine except this one:
static void sio_write_complete(struct kiocb *iocb, long ret)
{
struct swap_iocb *sio = container_of(iocb, struct swap_iocb, iocb);
struct page *page = sio->bvec[0].bv_page;
int p;
if (ret != sio->len) {
/*
* In the case of swap-over-nfs, this can be a
* temporary failure if the system has limited
* memory for allocating transmit buffers.
* Mark the page dirty and avoid
* folio_rotate_reclaimable but rate-limit the
* messages but do not flag PageError like
* the normal direct-to-bio case as it could
* be temporary.
*/
pr_err_ratelimited("Write error %ld on dio swapfile (%llu)\n",
ret, page_file_offset(page));
for (p = 0; p < sio->pages; p++) {
page = sio->bvec[p].bv_page;
set_page_dirty(page);
ClearPageReclaim(page);
}
} else {
for (p = 0; p < sio->pages; p++)
count_swpout_vm_event(page_folio(sio->bvec[p].bv_page));
This is called at the end of IO where the page isn't locked
anymore. Since it's not locked, page->memcg is not stable and might
get freed (charge moving is deprecated but still possible).
The fix is simple, though. Every other IO path bumps THP_SWPOUT before
starting the IO while the page is still locked. We don't really care
if we get SWPOUT events even for failed IOs. So we can just adjust
this caller to fit the others, and count while still locked:
diff --git a/mm/page_io.c b/mm/page_io.c
index fe4c21af23f2..7925e19aeedd 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -278,9 +278,6 @@ static void sio_write_complete(struct kiocb *iocb, long ret)
set_page_dirty(page);
ClearPageReclaim(page);
}
- } else {
- for (p = 0; p < sio->pages; p++)
- count_swpout_vm_event(page_folio(sio->bvec[p].bv_page));
}
for (p = 0; p < sio->pages; p++)
@@ -296,6 +293,7 @@ static void swap_writepage_fs(struct page *page, struct writeback_control *wbc)
struct file *swap_file = sis->swap_file;
loff_t pos = page_file_offset(page);
+ count_swpout_vm_event(page_folio(sio->bvec[p].bv_page));
set_page_writeback(page);
unlock_page(page);
if (wbc->swap_plug)
prev parent reply other threads:[~2023-09-13 15:52 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-12 2:17 Vern Hao
2023-09-13 15:52 ` Johannes Weiner [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230913155217.GC45543@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=cgroups@vger.kernel.org \
--cc=haoxing990@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=roman.gushchin@linux.dev \
--cc=shakeelb@google.com \
--cc=vernhao@tencent.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox