RE: [LSF/MM/BPF TOPIC] MGLRU on Android: Real-World Problems and Challenges

From: wangzicheng <wangzicheng@honor.com>
To: Barry Song <21cnbao@gmail.com>
Cc: "lsf-pc@lists.linux-foundation.org"
	<lsf-pc@lists.linux-foundation.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	wangxin 00023513 <wangxin23@honor.com>, gao xu <gaoxu2@honor.com>,
	wangtao <tao.wangtao@honor.com>,
	liulu 00013167 <liulu.liu@honor.com>,
	zhouxiaolong <zhouxiaolong9@honor.com>,
	linkunli <linkunli@honor.com>,
	"kasong@tencent.com" <kasong@tencent.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"axelrasmussen@google.com" <axelrasmussen@google.com>,
	"yuanchu@google.com" <yuanchu@google.com>,
	"weixugc@google.com" <weixugc@google.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	"Liam.Howlett@oracle.com" <Liam.Howlett@oracle.com>,
	"willy@infradead.org" <willy@infradead.org>,
	"surenb@google.com" <surenb@google.com>,
	yangxuzhe 00017436 <yangxuzhe@honor.com>,
	Kalesh Singh <kaleshsingh@google.com>,
	android-mm <android-mm@google.com>
Subject: RE: [LSF/MM/BPF TOPIC] MGLRU on Android: Real-World Problems and Challenges
Date: Thu, 26 Feb 2026 13:29:40 +0000
Message-ID: <d91b347b9639488580182a9032ba1f2f@honor.com>
In-Reply-To: <CAGsJ_4zPYpTDY4U0wSGhoa9dVHus_FChXaaD45H170TSXJ+RvQ@mail.gmail.com>

> > > > reclaim via try_to_free_mem_cgroup_pages() lacks clear abort
> > > > semantics and can overshoot the intended reclaim amount.
> > > > We currently use OEM hooks [2] to early-exit or bypass reclaim under
> > > > some conditions
> > > >
> > > > Q3: High reclaim cost and long uninterruptible sleep on lower-end
> > > > devices
> > > > On lower-end devices, reclaim cost and latency are harder to control:
> > > > throttle_direct_reclaim can make tasks wait for kswapd instead of
> > > > doing direct reclaim;
> > > > sometimes the target generations in many memcgs have very few
> > > > reclaimable
> > > > pages, so the CPU spends time scanning with little progress.
> > > > We observe tasks staying in uninterruptible sleep in
> > > > try_to_free_pages().
> > > > We haven't found any proper way to fix it.
> > >
> > > Have you identified the exact line of code where direct reclaim enters
> > > uninterruptible sleep? Is it waiting on a lock or something else?
> > >
> > Yes, we’ve identified the exact code locations where this happens:
> >
> > in slow path
> >
> > static bool throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
> >                                         nodemask_t *nodemask)
> > {
> > ...
> >         if (!(gfp_mask & __GFP_FS))
> >                 wait_event_interruptible_timeout(pgdat->pfmemalloc_wait,
> >                         allow_direct_reclaim(pgdat), HZ);
> >         else
> >                 /* Throttle until kswapd wakes the process */
> >                 wait_event_killable(zone->zone_pgdat->pfmemalloc_wait,
> >                         allow_direct_reclaim(pgdat));
> > ...
> > }
> >
> > in kswapd
> >
> > static bool prepare_kswapd_sleep(pg_data_t *pgdat, int order,
> >                                 int highest_zoneidx)
> > {
> > ...
> >         if (waitqueue_active(&pgdat->pfmemalloc_wait))
> >                 wake_up_all(&pgdat->pfmemalloc_wait);
> > ...
> > }
> 
> Thanks. I understand it could be problematic if throttling occurs,
> especially on threads related to user experience.
> 
Thank you for the detailed follow-up.

For Q3, the throttling is dangerous for UX‑critical threads. Kalesh also shared similar
observations about long direct reclaim tail latencies.

> >
> > > >
> > > > Q4: Lack of global hot/cold + priority view with per-app memcg
> > > > Android uses a per-app memcg model and foreground/background levels
> > > > for resource control. root reclaim lacks a cross-memcg hot/cold and
> > > > priority view;
> > > > foreground app file pages may be reclaimed and reloaded frequently,
> > > > causing visible stalls;
> > > > We currently use a hook [3] to skip reclaim for foreground apps.
> > >
> > > Interesting. This somehow reflects that the LRU lacks the user’s
> > > context, especially on Android systems—for example, which apps are in
> > > the foreground, which are in the background, and how long an app has
> > > been in the background.
> > >
> > > But this is not specific to MGLRU; it can also be an issue for the
> > > active/inactive LRU?
> > >
> > That's right, this affects both MGLRU and the traditional LRU.
> > We believe this comes from a semantic gap between the kernel and Android
> > (e.g. fg/bg, per-app priorities), and this is one of the main topics we’d
> > like to discuss.
> > Additionally, even with MGLRU’s memcg-LRU, this issue is still not fully
> > resolved in our workloads.
> 
> We might be able to leverage some existing infrastructure. MGLRU
> maintains an LRU of LRUs, and within this structure, it may be
> possible to adjust positions based on whether a cgroup is in the
> foreground or background. In other words, the LRU of LRUs could
> receive hints from userspace to influence a cgroup’s position.
> 
Regarding the LRU-of-LRUs idea, that does sound like a promising direction
(compared to vendor hooks).
However, it seems hard to support some more complex policies with it, e.g.:
- reclaiming from bg apps first, while capping the reclaim amount and
  protecting the bg 'super apps'
- preserving memcg MGLRU generation info after an app is frozen and no longer
  running.
Looking forward to the discussion.
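
To make the first policy above more concrete, here is a minimal userspace
sketch in C. It is only an illustration under stated assumptions: every name
in it (struct app_memcg, reclaim_background_first(), the per-memcg cap) is
hypothetical and not an existing kernel or Android interface, and it models
only the background pass, not any fallback to other memcgs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum app_state { APP_FOREGROUND, APP_BACKGROUND };

struct app_memcg {
	enum app_state state;
	bool protected_app;        /* e.g. a background 'super app' */
	unsigned long reclaimable; /* pages that could still be reclaimed */
	unsigned long reclaimed;   /* pages taken by the last pass */
};

/*
 * Take up to nr_to_reclaim pages in total, only from background memcgs that
 * are not protected, and at most per_memcg_cap pages from each of them.
 * Returns the number of pages actually taken, which may be less than
 * nr_to_reclaim if the eligible memcgs run out.
 */
unsigned long reclaim_background_first(struct app_memcg *apps, size_t nr_apps,
				       unsigned long nr_to_reclaim,
				       unsigned long per_memcg_cap)
{
	unsigned long total = 0;

	assert(apps || nr_apps == 0);

	for (size_t i = 0; i < nr_apps; i++)
		apps[i].reclaimed = 0;

	for (size_t i = 0; i < nr_apps && total < nr_to_reclaim; i++) {
		struct app_memcg *app = &apps[i];
		unsigned long want = nr_to_reclaim - total;

		/* Foreground and protected apps are never touched here. */
		if (app->state != APP_BACKGROUND || app->protected_app)
			continue;

		if (want > per_memcg_cap)
			want = per_memcg_cap;
		if (want > app->reclaimable)
			want = app->reclaimable;

		app->reclaimable -= want;
		app->reclaimed = want;
		total += want;
	}

	return total;
}
```

The point of the sketch is that the policy needs three inputs per memcg (fg/bg
state, a protection flag, and a cap), which is more than a single position in
an LRU of LRUs can express.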

> >
> > >
> > > Additionally, I’d like to add Q5 based on my observations:
> > > Q5:
> > > MGLRU places readahead folios in the newest generation. For example, if
> > > a page fault occurs at address 5, readahead fetches addresses 1–16, and
> > > all 16 folios are put in the youngest generation, even though many may
> > > not be needed. This can seriously impact reclamation performance, as
> > > these cold readahead folios occupy active slots.
> > >
> > > See the code below and the checks performed by lru_gen_in_fault().
> > >
> > > void folio_add_lru(struct folio *folio)
> > > {
> > >         ...
> > >         /* see the comment in lru_gen_folio_seq() */
> > >         if (lru_gen_enabled() && !folio_test_unevictable(folio) &&
> > >             lru_gen_in_fault() && !(current->flags & PF_MEMALLOC))
> > >                 folio_set_active(folio);
> > >
> > >         folio_batch_add_and_move(folio, lru_add);
> > > }
> > > EXPORT_SYMBOL(folio_add_lru);
> > >
> > > I could have submitted a patchset to address this by initially marking
> > > only address 5 as active, and activating the other addresses later when
> > > they are actually mapped or accessed.
> > >
> > This sounds very reasonable to us, and we look forward to discussing
> > and evaluating this direction together.
> 
> I sent an RFC today for Q5. I hope you can review, comment,
> and test it together:
> 
For Q5, we are happy to see your RFC. We’ve already commented on the thread
and will run it on our workloads to provide feedback.
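
For readers following Q5, the behaviour under discussion can be modelled in a
few lines of userspace C. This is a simplified sketch and not the RFC's code:
struct model_folio and the helpers below are hypothetical, and the "active"
flag merely stands in for placement in the youngest generation. It uses the
example from the thread (fault at address 5, readahead of addresses 1-16).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not kernel types or helpers. */
struct model_folio {
	unsigned long index; /* address (page offset) covered by this folio */
	bool active;         /* stands in for "in the youngest generation" */
};

/*
 * Populate a readahead window of nr folios starting at 'start', marking
 * only the folio that actually faulted as active.
 */
void add_readahead_window(struct model_folio *folios, size_t nr,
			  unsigned long start, unsigned long fault_index)
{
	assert(folios || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		folios[i].index = start + i;
		folios[i].active = (folios[i].index == fault_index);
	}
}

/*
 * Model a later real access (map or read): activate the folio on first
 * use. Returns true if this call changed the folio from inactive to active.
 */
bool access_folio(struct model_folio *folio)
{
	assert(folio);

	if (folio->active)
		return false;
	folio->active = true;
	return true;
}

/* Count how many folios in the window are currently active. */
size_t count_active(const struct model_folio *folios, size_t nr)
{
	size_t n = 0;

	for (size_t i = 0; i < nr; i++)
		if (folios[i].active)
			n++;
	return n;
}
```

With the window from the example, only one of the 16 folios starts out
active, and the other 15 become active only if they are really used.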

Thanks,
Zicheng

> https://lore.kernel.org/linux-mm/20260225223712.3685-1-21cnbao@gmail.com/
> 
> Thanks
> Barry
