From: Zhaoyang Huang <huangzhaoyang@gmail.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Zhaoyang Huang <zhaoyang.huang@unisoc.com>,
"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm: skip current when memcg reclaim
Date: Tue, 19 Oct 2021 20:17:16 +0800
Message-ID: <CAGWkznHSPAu572BjoE510Sm+G9vGetKg-v2TkjwtcmZGo8MPVw@mail.gmail.com>
In-Reply-To: <YW6LSVK+NTiZ05+X@dhcp22.suse.cz>
On Tue, Oct 19, 2021 at 5:09 PM Michal Hocko <mhocko@suse.com> wrote:
>
> On Tue 19-10-21 15:11:30, Zhaoyang Huang wrote:
> > On Mon, Oct 18, 2021 at 8:41 PM Michal Hocko <mhocko@suse.com> wrote:
> > >
> > > On Mon 18-10-21 17:25:23, Zhaoyang Huang wrote:
> > > > On Mon, Oct 18, 2021 at 4:23 PM Michal Hocko <mhocko@suse.com> wrote:
> [...]
> > > > > I would be really curious about more specifics of the used hierarchy.
> > > > What I am facing is a typical scenario on Android: a big
> > > > memory-consuming app (camera etc.) is launched while the background is
> > > > filled by other processes. The hierarchy is like what you describe
> > > > above, where B represents the app and memory.low is set to help warm
> > > > restart. Both kswapd and direct reclaim work together to reclaim pages
> > > > in this scenario, which can cause 20MB of file pages to be deleted
> > > > from the LRU within several seconds. This change could help the
> > > > current process's pages escape being reclaimed, which would otherwise
> > > > cause page thrashing. We observed the result via systrace, which shows
> > > > that uninterruptible sleep (blocked on a page bit) and iowait get
> > > > smaller than usual.
> > >
> > > I still have a hard time understanding the exact setup and why the patch
> > > helps you. If you want to protect B more than the low limit would allow
> > > for by stealing from C then the same thing can happen from anybody
> > > reclaiming from C so in the end there is no protection. The same would
> > > apply for any global direct memory reclaim done by a 3rd party. So I
> > > suspect that your patch just happens to work by luck.
> > B and C compete fairly with each other and take priority over the others.
> > The idea is based on the assumption that NOT all groups will enter direct
> > reclaim concurrently, so we want the groups to steal pages from the
> > processes under root (not memory sensitive), from other groups with
> > lower thresholds (high memory tolerance), or from ones that are totally
> > asleep (not busy for the time being, so we borrow some pages).
>
> I am really confused now. The memcg reclaim cannot really reclaim
> anything from outside of the reclaimed hierarchy. Protected memcgs are
> only considered if the reclaim was not able to reclaim anything during
> the first hierarchy walk. That would imply that the reclaimed hierarchy
> has either all memcgs with memory protected or non-protected memcgs do
> not have any memory to reclaim.
>
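
For reference, the two-pass walk described above can be modelled in
userspace roughly as follows. This is only a sketch under assumptions:
toy_memcg, walk_hierarchy and toy_reclaim are names made up for
illustration and are not kernel APIs, and protection is treated as
all-or-nothing, which is simpler than what the kernel really does. As far
as I understand, the real retry is driven by sc->memcg_low_skipped and
sc->memcg_low_reclaim.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one memcg in the reclaimed hierarchy. */
struct toy_memcg {
	const char *name;
	unsigned long usage;	/* pages currently charged */
	unsigned long low;	/* memory.low protection, in pages */
};

/*
 * One hierarchy walk. A group whose usage is within its low protection is
 * skipped unless low_reclaim is set; *low_skipped records any such skip.
 * Returns the number of pages reclaimed.
 */
static unsigned long walk_hierarchy(struct toy_memcg *cg, size_t n,
				    unsigned long want, bool low_reclaim,
				    bool *low_skipped)
{
	unsigned long reclaimed = 0;

	for (size_t i = 0; i < n && reclaimed < want; i++) {
		unsigned long take;

		if (cg[i].low > 0 && cg[i].usage <= cg[i].low) {
			if (!low_reclaim) {
				*low_skipped = true;
				continue;
			}
			take = cg[i].usage;	/* protection overridden */
		} else {
			take = cg[i].usage - cg[i].low;	/* only the excess */
		}
		if (take > want - reclaimed)
			take = want - reclaimed;
		cg[i].usage -= take;
		reclaimed += take;
	}
	return reclaimed;
}

/*
 * The first pass honours memory.low. The walk is retried with protection
 * ignored only if nothing was reclaimed and some group was skipped.
 */
static unsigned long toy_reclaim(struct toy_memcg *cg, size_t n,
				 unsigned long want)
{
	bool low_skipped = false;
	unsigned long got = walk_hierarchy(cg, n, want, false, &low_skipped);

	if (!got && low_skipped)
		got = walk_hierarchy(cg, n, want, true, &low_skipped);
	return got;
}
```

With a protected B and an unprotected C that still has pages, toy_reclaim()
takes everything from C and leaves B alone. B is only touched once C has
nothing left to give.
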
> I think it would really help to provide more details about what is going
> on here before we can move forward.
>
> > > Why do both B and C have a low limit set up, and why can neither be
> > > reclaimed? Isn't that a weird setup where A's hard limit is too close
> > > to the sum of the low limits of B and C?
> > >
> > > In other words could you share a more detailed configuration you are
> > > using and some more details why both B and C have been skipped during
> > > the first pass of the reclaim?
> > My practical scenario is that important processes (VIP app etc.) are
> > placed into a protected memcg while other processes are kept just under
> > root. Current triggers direct reclaim because of alloc_pages (DMA_ALLOC
> > etc.), where the amount allocated would be much larger than low but
> > would NOT be charged to the LRU. At the same time, current also wants
> > to keep its pages (.so files to exec) on the LRU.
>
> I am sorry but this description makes even less sense to me. If your
> important process runs under a protected memcg and everything else is
> running under root memcg then your memcg will get protected as long as
> there is reclaimable memory. There should only ever be global memory
> reclaim happening, unless you specify a hard/high limit on your
> important memcg. If you do so then there is no way to reclaim from
> outside of that specific memcg.
>
> I really fail to see how your patch can help with either of those situations.
Please find the cgroup v2 hierarchy on my system [1], where uid_2000 is
a cgroup under root, along with the output [3] of the trace_printk that
I embedded in shrink_node [2]. I don't know why you say there should be
no reclaim from groups under root, which is the opposite of what [3]
shows.
[1]
/sys/fs/cgroup # ls uid_2000
cgroup.controllers      cgroup.max.depth        cgroup.stat
cgroup.type             io.pressure             memory.events.local
memory.max              memory.pressure         memory.swap.events
cgroup.events           cgroup.max.descendants  cgroup.subtree_control
cpu.pressure            memory.current          memory.high
memory.min              memory.stat             memory.swap.max
cgroup.freeze           cgroup.procs            cgroup.threads
cpu.stat                memory.events           memory.low
memory.oom.group        memory.swap.current     pid_275
[2]
@@ -2962,6 +2962,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 			reclaimed = sc->nr_reclaimed;
 			scanned = sc->nr_scanned;
+			trace_printk("root %x memcg %x reclaimed %ld\n", root_mem_cgroup, memcg, sc->nr_reclaimed);
 			shrink_node_memcg(pgdat, memcg, sc, &lru_pages);
 			node_lru_pages += lru_pages;
[3]
allocator@4.0-s-1034 [005] .... 442.077013: shrink_node: root ef022800 memcg ef027800 reclaimed 41
kworker/u16:3-931 [002] .... 442.077019: shrink_node: root ef022800 memcg c7e54000 reclaimed 17
allocator@4.0-s-1034 [005] .... 442.077019: shrink_node: root ef022800 memcg ef025000 reclaimed 41
allocator@4.0-s-1034 [005] .... 442.077024: shrink_node: root ef022800 memcg ef023000 reclaimed 41
kworker/u16:3-931 [002] .... 442.077026: shrink_node: root ef022800 memcg c7e57800 reclaimed 17
allocator@4.0-s-1034 [005] .... 442.077028: shrink_node: root ef022800 memcg ef026800 reclaimed 41
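
In case it helps when reading [3], one line of that output can be split
into its fields with something like the following. It is only a sketch:
shrink_rec and parse_shrink_node_line are names made up for illustration,
and it assumes exactly the format string used in [2]. Note that the
"reclaimed" value is sc->nr_reclaimed printed before shrink_node_memcg()
runs, so it is the running total of that reclaim context at that point,
not the amount taken from the memcg on that line.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one shrink_node trace line. */
struct shrink_rec {
	char task[64];		/* task-pid, e.g. "allocator@4.0-s-1034" */
	unsigned long root;	/* printed root_mem_cgroup value */
	unsigned long memcg;	/* memcg about to be shrunk */
	long reclaimed;		/* sc->nr_reclaimed so far */
};

/* Returns 0 on success, -1 if the line does not match the format in [2]. */
static int parse_shrink_node_line(const char *line, struct shrink_rec *out)
{
	const char *p = strstr(line, "shrink_node: root ");

	/* The first whitespace-delimited token is the task-pid field. */
	if (!p || sscanf(line, "%63s", out->task) != 1)
		return -1;
	if (sscanf(p, "shrink_node: root %lx memcg %lx reclaimed %ld",
		   &out->root, &out->memcg, &out->reclaimed) != 3)
		return -1;
	return 0;
}
```

Feeding the lines of [3] through this and grouping by task shows each
reclaimer visiting several different memcg values while the root value
stays the same.
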
> --
> Michal Hocko
> SUSE Labs