* [PATCH] memcg: page_cgroup_ino() get memcg from the page's folio
@ 2023-04-12 0:34 Yosry Ahmed
From: Yosry Ahmed @ 2023-04-12 0:34 UTC
To: Hugh Dickins, Johannes Weiner, Michal Hocko, Roman Gushchin,
Shakeel Butt, Muchun Song, Andrew Morton, Naoya Horiguchi,
Miaohe Lin, Vladimir Davydov, Matthew Wilcox
Cc: linux-mm, cgroups, Yosry Ahmed
In a kernel with an added WARN_ON_ONCE(PageTail) in page_memcg_check(), we
observed a warning from page_cgroup_ino() when reading
/proc/kpagecgroup. This warning was added to catch fragile reads of
a page's memcg. Make page_cgroup_ino() get the memcg from the page's folio
using folio_memcg_check(): that gives it the correct memcg for each page
of a folio, so it is the right fix.
Note that page_folio() is racy: the page's folio can change from under
us. But the entire function is racy and documented as such.
I dithered between the right fix and the safer "fix": it's unlikely but
conceivable that some userspace has learnt that /proc/kpagecgroup gives
no memcg on tail pages, and compensates for that in some (racy) way: so
continuing to give no memcg on tails, without warning, might be safer.
But hwpoison_filter_task(), the only other user of page_cgroup_ino(),
persuaded me. It looks as if it currently leaves out tail pages of the
selected memcg, by mistake: whereas hwpoison_inject() uses compound_head()
and expects the tails to be included. So hwpoison testing coverage has
probably been restricted by the wrong output from page_cgroup_ino() (if
that memcg filter is used at all): in the short term, it might be safer
not to enable wider coverage there, but long term we would regret that.
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
---
This is based on a patch originally written by Hugh Dickins and retains
most of the original commit log:
https://lore.kernel.org/linux-mm/20230313083452.1319968-1-yosryahmed@google.com/
The patch was changed to use folio_memcg_check(page_folio(page)) instead
of page_memcg_check(compound_head(page)) based on discussions with
Matthew Wilcox, who stated that callers of page_memcg_check()
should stop using it due to the ambiguity around tail pages -- instead
they should use folio_memcg_check() and handle tail pages themselves.
I dropped Michal's Ack because the only line the patch changes was itself
rewritten, but the patch should be functionally the same.
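For readers who want to see why the two lookups differ on tail pages, here
is a minimal userspace model. It is not kernel code: the struct layouts and
every model_* name are hypothetical stand-ins, and the only behavior it
assumes is what the commit log states, namely that the per-page lookup
reports no memcg on a tail page while the folio lookup reports the memcg
recorded on the head.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel types. */
struct mem_cgroup {
	unsigned long ino;
};

struct page {
	struct page *head;        /* NULL on a head page, else the folio's head */
	struct mem_cgroup *memcg; /* recorded on the head page only */
};

/* Models the pre-patch lookup: a tail page reports no memcg. */
static struct mem_cgroup *model_page_memcg_check(struct page *page)
{
	if (page->head)
		return NULL;
	return page->memcg;
}

/* Models page_folio(): map any page to the head of its folio. */
static struct page *model_page_folio(struct page *page)
{
	return page->head ? page->head : page;
}

/* Models the post-patch lookup, which goes through the folio. */
static struct mem_cgroup *model_folio_memcg_check(struct page *folio)
{
	return folio->memcg;
}

/* Inode number reported for a page; 0 means "no memcg". */
static unsigned long model_ino_old(struct page *page)
{
	struct mem_cgroup *memcg = model_page_memcg_check(page);

	return memcg ? memcg->ino : 0;
}

static unsigned long model_ino_new(struct page *page)
{
	struct mem_cgroup *memcg =
		model_folio_memcg_check(model_page_folio(page));

	return memcg ? memcg->ino : 0;
}

/* Usage: a four-page folio charged to a single memcg. */
static void model_demo(void)
{
	struct mem_cgroup cg = { .ino = 42 };
	struct page pages[4] = { { .head = NULL, .memcg = &cg } };

	for (int i = 1; i < 4; i++)
		pages[i].head = &pages[0];

	assert(model_ino_old(&pages[0]) == 42); /* head: both lookups agree */
	assert(model_ino_new(&pages[0]) == 42);
	assert(model_ino_old(&pages[3]) == 0);  /* tail: old lookup sees nothing */
	assert(model_ino_new(&pages[3]) == 42); /* tail: folio lookup finds it */
}
```

The model ignores the race mentioned in the commit log; in the kernel the
page's folio can change while the lookup runs.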
---
mm/memcontrol.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5abffe6f8389..fec3c4fd9c1c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -395,7 +395,8 @@ ino_t page_cgroup_ino(struct page *page)
 	unsigned long ino = 0;
 
 	rcu_read_lock();
-	memcg = page_memcg_check(page);
+	/* page_folio() is racy here, but the entire function is racy anyway */
+	memcg = folio_memcg_check(page_folio(page));
 
 	while (memcg && !(memcg->css.flags & CSS_ONLINE))
 		memcg = parent_mem_cgroup(memcg);
--
2.40.0.577.gac1e443424-goog
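
The tail-page behavior that the commit log describes for /proc/kpagecgroup
can be checked from userspace by reading single entries. The helper below
is a rough sketch, not part of the patch: read_kpagecgroup_entry() is a
hypothetical name, and it assumes the file holds one 64-bit memcg inode
number per page frame, indexed by PFN. Reading the real file normally
requires root and a kernel built with memcg support.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the 64-bit entry for page frame number `pfn` from a file laid out
 * like /proc/kpagecgroup (one u64 per PFN). Returns 0 on success and
 * stores the value in *ino, or -1 on an open error or short read.
 */
static int read_kpagecgroup_entry(const char *path, uint64_t pfn,
				  uint64_t *ino)
{
	int fd = open(path, O_RDONLY);
	ssize_t n;

	if (fd < 0)
		return -1;

	/* Each entry is 8 bytes, so the entry for pfn starts at pfn * 8. */
	n = pread(fd, ino, sizeof(*ino), (off_t)(pfn * sizeof(*ino)));
	close(fd);

	return n == (ssize_t)sizeof(*ino) ? 0 : -1;
}
```

Calling it with "/proc/kpagecgroup" and the PFN of a tail page of a charged
compound page should yield 0 before this patch and the memcg's inode number
after it, if the commit log's description holds on the running kernel.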