From: Barry Song <21cnbao@gmail.com>
To: SeongJae Park <sj@kernel.org>
Cc: akpm@linux-foundation.org, damon@lists.linux.dev,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    minchan@kernel.org, mhocko@suse.com, hannes@cmpxchg.org,
    Barry Song <v-songbaohua@oppo.com>
Subject: Re: [PATCH RFC] mm: madvise: pageout: ignore references rather than clearing young
Date: Sun, 25 Feb 2024 03:50:48 +0800
Message-ID: <CAGsJ_4x-p+8SzyHQq_EJpbq+hSEu5MCtwpGWvafpk4xfpB1gKg@mail.gmail.com>
In-Reply-To: <20240224190255.45616-1-sj@kernel.org>

On Sun, Feb 25, 2024 at 3:02 AM SeongJae Park <sj@kernel.org> wrote:
>
> On Fri, 23 Feb 2024 17:15:50 +1300 Barry Song <21cnbao@gmail.com> wrote:
>
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > While doing MADV_PAGEOUT, the current code will clear PTE young
> > so that vmscan won't read young flags to allow the reclamation
> > of madvised folios to go ahead.
> > It seems we can do it by directly ignoring references, thus we
> > can remove tlb flush in madvise and rmap overhead in vmscan.
> >
> > Regarding the side effect, in the original code, if a parallel
> > thread runs side by side to access the madvised memory with the
> > thread doing madvise, folios will get a chance to be re-activated
> > by vmscan. But with the patch, they will still be reclaimed. But
> > this behaviour doing PAGEOUT and doing access at the same time is
> > quite silly like DoS. So probably, we don't need to care.
>
> I think we might need to take care of the case, since users may use just a
> best-effort estimation like DAMON for the target pages. In such cases, the
> page granularity re-check of the access could be helpful. So I concern if this
> could be a visible behavioral change for some valid use cases.

Hi SeongJae,

If you read the code of MADV_PAGEOUT, you will find it is not best-effort.
It clears PTE young and, immediately after the PTEs are cleared, it reads
the PTEs again and checks whether they are young. If they are not, the
folio is reclaimed. So the purpose of clearing PTE young is to make the
young check in folio_references return false. The gap between clearing the
PTEs and re-checking them is very small, on the order of microseconds.

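As a rough illustration of the two behaviours discussed here, the following
is a user-space toy model, not kernel code. All names (toy_pte,
toy_pageout_clear_young, toy_pageout_ignore_references) are hypothetical,
and the racing_access flag simply stands in for another thread touching the
memory inside the small window between clearing and re-checking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a PTE: only the "young" (accessed) bit is modeled. */
struct toy_pte {
        bool young;
};

/* Returns true if any PTE mapping the folio has its young bit set. */
static bool toy_any_young(const struct toy_pte *ptes, size_t n)
{
        for (size_t i = 0; i < n; i++)
                if (ptes[i].young)
                        return true;
        return false;
}

/*
 * Model of the existing behaviour: clear young on every PTE, then
 * re-check shortly afterwards. Returns true if the folio would be
 * reclaimed.
 */
static bool toy_pageout_clear_young(struct toy_pte *ptes, size_t n,
                                    bool racing_access)
{
        assert(n > 0);
        for (size_t i = 0; i < n; i++)
                ptes[i].young = false;  /* madvise side: clear young */
        if (racing_access)
                ptes[0].young = true;   /* access inside the small gap */
        return !toy_any_young(ptes, n); /* reclaim side: re-check young */
}

/*
 * Model of the proposed behaviour: references are ignored, so there is
 * neither a clear step nor a re-check. Always returns true.
 */
static bool toy_pageout_ignore_references(struct toy_pte *ptes, size_t n,
                                          bool racing_access)
{
        assert(n > 0);
        if (racing_access)
                ptes[0].young = true;   /* access happens, but is ignored */
        return true;
}
```

In this model the two variants differ only when racing_access is true,
which corresponds to the microsecond-level window described above.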
>
> >
> > A microbench as below has shown 6% decrement on the latency of
> > MADV_PAGEOUT,
>
> I assume some of the users may use MADV_PAGEOUT for proactive reclamation of
> the memory. In the use case, I think latency of MADV_PAGEOUT might be not that
> important.
>
> Hence I think the cons of the behavioral change might outweigh the pros of the
> latency improvement, for such best-effort proactive reclamation use case. Hope
> to hear and learn from others' opinions.

I don't see a behavioral change for MADV_PAGEOUT, as only the ping-pong
is removed. The only window is that very small time gap, in which somebody
could access the cleared PTEs and make them young again. Considering how
small this gap is, I don't think it is worth caring about. Thus, I don't
see any cons for the MADV_PAGEOUT case, while we improve the efficiency of
MADV_PAGEOUT and save power on Android phones.

>
> >
> > #include <sys/mman.h>
> >
> > #define PGSIZE 4096
> > #define SIZE (512UL * 1024 * 1024)
> >
> > int main(void)
> > {
> >         unsigned long i;
> >         volatile long *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
> >                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> >
> >         /* touch one word per page so that every page is faulted in */
> >         for (i = 0; i < SIZE / sizeof(long); i += PGSIZE / sizeof(long))
> >                 p[i] = 0x11;
> >
> >         madvise((void *)p, SIZE, MADV_PAGEOUT);
> >         return 0;
> > }
> >
> >          w/o patch                      w/ patch
> > root@10:~# time ./a.out        root@10:~# time ./a.out
> > real    0m49.634s              real    0m46.334s
> > user    0m0.637s               user    0m0.648s
> > sys     0m47.434s              sys     0m44.265s
> >
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
>
>
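For reference, the relative improvement implied by the timings quoted above
can be checked with a small helper. This is only a sketch of the
arithmetic; percent_reduction is a hypothetical name and not part of the
patch.

```c
#include <assert.h>

/* Relative reduction, in percent, when going from `before` to `after`. */
static double percent_reduction(double before, double after)
{
        assert(before > 0.0);
        return (before - after) / before * 100.0;
}
```

Calling percent_reduction(49.634, 46.334) for the real time and
percent_reduction(47.434, 44.265) for the sys time both give a value
between 6.6 and 6.7, in line with the roughly 6% figure in the changelog.
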
Thanks
Barry