* memcg reclaim demotion wrt. isolation
From: Michal Hocko @ 2022-12-13 15:41 UTC
  To: Dave Hansen, Huang, Ying
  Cc: Yang Shi, Wei Xu, Johannes Weiner, Andrew Morton, linux-mm, LKML

Hi,
I have just noticed that pages allocated for demotion targets include
__GFP_KSWAPD_RECLAIM (through GFP_NOWAIT). This has been the case since
the code was introduced by commit 26aa2d199d6f ("mm/migrate: demote
pages during reclaim"). I suspect the intention is to trigger aging on
the fallback node and either drop or further demote the oldest pages
there.
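
For reference, the allocation context used for the demotion target is
set up roughly like this in mm/vmscan.c (paraphrased, not a verbatim
quote; GFP_NOWAIT expands to __GFP_KSWAPD_RECLAIM):

	struct migration_target_control mtc = {
		/*
		 * Allocate on the demotion target, but fail quickly and
		 * quietly rather than entering direct reclaim. GFP_NOWAIT
		 * still carries __GFP_KSWAPD_RECLAIM, so allocations under
		 * pressure wake kswapd on that node.
		 */
		.gfp_mask = (GFP_HIGHUSER_MOVABLE & ~__GFP_RECLAIM) |
			    __GFP_NOWARN | __GFP_NOMEMALLOC | GFP_NOWAIT,
		.nid = target_nid,
	};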

This makes sense, but I suspect it wasn't intended for memcg-triggered
reclaim as well. It would mean that memory pressure in one hierarchy
could trigger paging out pages of a different hierarchy if the demotion
target is close to full.

I haven't really checked the current kswapd wake-up conditions, but I
suspect kswapd would back off in most cases, so this shouldn't cause
any big problems. Still, I guess it would be better to simply not wake
kswapd up for memcg reclaim at all. What do you think?
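
For context, the cgroup_reclaim(sc) check used in the diff below is the
existing mm/vmscan.c helper, which (roughly) just tests whether the
scan targets a memcg rather than being global reclaim:

	static bool cgroup_reclaim(struct scan_control *sc)
	{
		return sc->target_mem_cgroup;
	}
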
---
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8fcc5fa768c0..1f3161173b85 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1568,7 +1568,7 @@ static struct page *alloc_demote_page(struct page *page, unsigned long private)
  * Folios which are not demoted are left on @demote_folios.
  */
 static unsigned int demote_folio_list(struct list_head *demote_folios,
-				     struct pglist_data *pgdat)
+				     struct pglist_data *pgdat, bool cgroup_reclaim)
 {
 	int target_nid = next_demotion_node(pgdat->node_id);
 	unsigned int nr_succeeded;
@@ -1589,6 +1589,10 @@ static unsigned int demote_folio_list(struct list_head *demote_folios,
 	if (list_empty(demote_folios))
 		return 0;
 
+	/* local memcg reclaim shouldn't directly reclaim from other memcgs */
+	if (cgroup_reclaim)
+		mtc.gfp_mask &= ~__GFP_RECLAIM;
+
 	if (target_nid == NUMA_NO_NODE)
 		return 0;
 
@@ -2066,7 +2070,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 	/* 'folio_list' is always empty here */
 
 	/* Migrate folios selected for demotion */
-	nr_reclaimed += demote_folio_list(&demote_folios, pgdat);
+	nr_reclaimed += demote_folio_list(&demote_folios, pgdat, cgroup_reclaim(sc));
 	/* Folios that could not be demoted are still in @demote_folios */
 	if (!list_empty(&demote_folios)) {
 		/* Folios which weren't demoted go back on @folio_list for retry: */
-- 
Michal Hocko
SUSE Labs

