From: Johannes Weiner <hannes@cmpxchg.org>
To: Roman Gushchin <guro@fb.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@kernel.org>,
Shakeel Butt <shakeelb@google.com>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
linux-kernel@vger.kernel.org, kernel-team@fb.com,
Bharata B Rao <bharata@linux.ibm.com>,
Yafang Shao <laoar.shao@gmail.com>
Subject: Re: [PATCH v2 16/28] mm: memcg/slab: allocate obj_cgroups for non-root slab pages
Date: Mon, 3 Feb 2020 17:29:44 -0500 [thread overview]
Message-ID: <20200203222944.GB7345@cmpxchg.org> (raw)
In-Reply-To: <20200203211915.GB6781@xps.dhcp.thefacebook.com>
On Mon, Feb 03, 2020 at 01:19:15PM -0800, Roman Gushchin wrote:
> On Mon, Feb 03, 2020 at 03:46:27PM -0500, Johannes Weiner wrote:
> > On Mon, Feb 03, 2020 at 10:34:52AM -0800, Roman Gushchin wrote:
> > > On Mon, Feb 03, 2020 at 01:27:56PM -0500, Johannes Weiner wrote:
> > > > On Mon, Jan 27, 2020 at 09:34:41AM -0800, Roman Gushchin wrote:
> > > > > Allocate and release memory to store obj_cgroup pointers for each
> > > > > non-root slab page. Reuse page->mem_cgroup pointer to store a pointer
> > > > > to the allocated space.
> > > > >
> > > > > To distinguish between obj_cgroups and memcg pointers in case
> > > > > when it's not obvious which one is used (as in page_cgroup_ino()),
> > > > > let's always set the lowest bit in the obj_cgroup case.
> > > > >
> > > > > Signed-off-by: Roman Gushchin <guro@fb.com>
> > > > > ---
> > > > > include/linux/mm.h | 25 ++++++++++++++++++--
> > > > > include/linux/mm_types.h | 5 +++-
> > > > > mm/memcontrol.c | 5 ++--
> > > > > mm/slab.c | 3 ++-
> > > > > mm/slab.h | 51 +++++++++++++++++++++++++++++++++++++++-
> > > > > mm/slub.c | 2 +-
> > > > > 6 files changed, 83 insertions(+), 8 deletions(-)
> > > > >
> > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > > > index 080f8ac8bfb7..65224becc4ca 100644
> > > > > --- a/include/linux/mm.h
> > > > > +++ b/include/linux/mm.h
> > > > > @@ -1264,12 +1264,33 @@ static inline void set_page_links(struct page *page, enum zone_type zone,
> > > > > #ifdef CONFIG_MEMCG
> > > > > static inline struct mem_cgroup *page_memcg(struct page *page)
> > > > > {
> > > > > - return page->mem_cgroup;
> > > > > + struct mem_cgroup *memcg = page->mem_cgroup;
> > > > > +
> > > > > + /*
> > > > > + * The lowest bit set means that memcg isn't a valid memcg pointer,
> > > > > + * but a obj_cgroups pointer. In this case the page is shared and
> > > > > + * isn't charged to any specific memory cgroup. Return NULL.
> > > > > + */
> > > > > + if ((unsigned long) memcg & 0x1UL)
> > > > > + memcg = NULL;
> > > > > +
> > > > > + return memcg;
> > > >
> > > > That should really WARN instead of silently returning NULL. Which
> > > > callsite optimistically asks a page's cgroup when it has no idea
> > > > whether that page is actually a userpage or not?
> > >
> > > For instance, look at page_cgroup_ino() called from the
> > > reading /proc/kpageflags.
> >
> > But that checks PageSlab() and implements memcg_from_slab_page() to
> > handle that case properly. And that's what we expect all callsites to
> > do: make sure that the question asked actually makes sense, instead of
> > having the interface paper over bogus requests.
> >
> > If that function is completely racy and PageSlab isn't stable, then it
> > should really just open-code the lookup, rather than require weakening
> > the interface for everybody else.
>
> Why though?
>
> Another example: process stack can be depending on the machine config and
> platform a vmalloc allocation, a slab allocation or a "high-order slab allocation",
> which is executed by the page allocator directly.
>
> It's kinda nice to have a function that hides accounting details
> and returns a valid memcg pointer for any kind of objects.
Hm? I'm not objecting to that, memcg_from_obj() makes perfect sense to
me, to use with kvmalloc() objects for example.
I'm objecting to page_memcg() silently swallowing bogus inputs. That
function shouldn't silently say "there is no cgroup associated with
this page" when the true answer is "this page has MANY cgroups
associated with it, this question doesn't make any sense".
It's not exactly hard to imagine how this could cause bugs, is it?
Where a caller should implement a slab case (exactly like
page_cgroup_ino()) but is confused about the type of page it has,
whether it's charged or not etc.?
next prev parent reply other threads:[~2020-02-03 22:29 UTC|newest]
Thread overview: 84+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-27 17:34 [PATCH v2 00/28] The new cgroup slab memory controller Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 01/28] mm: kmem: cleanup (__)memcg_kmem_charge_memcg() arguments Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 02/28] mm: kmem: cleanup memcg_kmem_uncharge_memcg() arguments Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 03/28] mm: kmem: rename memcg_kmem_(un)charge() into memcg_kmem_(un)charge_page() Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 04/28] mm: kmem: switch to nr_pages in (__)memcg_kmem_charge_memcg() Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 05/28] mm: memcg/slab: cache page number in memcg_(un)charge_slab() Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 06/28] mm: kmem: rename (__)memcg_kmem_(un)charge_memcg() to __memcg_kmem_(un)charge() Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 07/28] mm: memcg/slab: introduce mem_cgroup_from_obj() Roman Gushchin
2020-02-03 16:05 ` Johannes Weiner
2020-01-27 17:34 ` [PATCH v2 08/28] mm: fork: fix kernel_stack memcg stats for various stack implementations Roman Gushchin
2020-02-03 16:12 ` Johannes Weiner
2020-01-27 17:34 ` [PATCH v2 09/28] mm: memcg/slab: rename __mod_lruvec_slab_state() into __mod_lruvec_obj_state() Roman Gushchin
2020-02-03 16:13 ` Johannes Weiner
2020-01-27 17:34 ` [PATCH v2 10/28] mm: memcg: introduce mod_lruvec_memcg_state() Roman Gushchin
2020-02-03 17:39 ` Johannes Weiner
2020-01-27 17:34 ` [PATCH v2 11/28] mm: slub: implement SLUB version of obj_to_index() Roman Gushchin
2020-02-03 17:44 ` Johannes Weiner
2020-01-27 17:34 ` [PATCH v2 12/28] mm: vmstat: use s32 for vm_node_stat_diff in struct per_cpu_nodestat Roman Gushchin
2020-02-03 17:58 ` Johannes Weiner
2020-02-03 18:25 ` Roman Gushchin
2020-02-03 20:34 ` Johannes Weiner
2020-02-03 22:28 ` Roman Gushchin
2020-02-03 22:39 ` Johannes Weiner
2020-02-04 1:44 ` Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 13/28] mm: vmstat: convert slab vmstat counter to bytes Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 14/28] mm: memcontrol: decouple reference counting from page accounting Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 15/28] mm: memcg/slab: obj_cgroup API Roman Gushchin
2020-02-03 19:31 ` Johannes Weiner
2020-01-27 17:34 ` [PATCH v2 16/28] mm: memcg/slab: allocate obj_cgroups for non-root slab pages Roman Gushchin
2020-02-03 18:27 ` Johannes Weiner
2020-02-03 18:34 ` Roman Gushchin
2020-02-03 20:46 ` Johannes Weiner
2020-02-03 21:19 ` Roman Gushchin
2020-02-03 22:29 ` Johannes Weiner [this message]
2020-01-27 17:34 ` [PATCH v2 17/28] mm: memcg/slab: save obj_cgroup for non-root slab objects Roman Gushchin
2020-02-03 19:53 ` Johannes Weiner
2020-01-27 17:34 ` [PATCH v2 18/28] mm: memcg/slab: charge individual slab objects instead of pages Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 19/28] mm: memcg/slab: deprecate memory.kmem.slabinfo Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 20/28] mm: memcg/slab: move memcg_kmem_bypass() to memcontrol.h Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 21/28] mm: memcg/slab: use a single set of kmem_caches for all memory cgroups Roman Gushchin
2020-02-03 19:50 ` Johannes Weiner
2020-02-03 20:58 ` Roman Gushchin
2020-02-03 22:17 ` Johannes Weiner
2020-02-03 22:38 ` Roman Gushchin
2020-02-04 1:15 ` Roman Gushchin
2020-02-04 2:47 ` Johannes Weiner
2020-02-04 4:35 ` Roman Gushchin
2020-02-04 18:41 ` Johannes Weiner
2020-02-05 15:58 ` Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 22/28] mm: memcg/slab: simplify memcg cache creation Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 23/28] mm: memcg/slab: deprecate memcg_kmem_get_cache() Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 24/28] mm: memcg/slab: deprecate slab_root_caches Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 25/28] mm: memcg/slab: remove redundant check in memcg_accumulate_slabinfo() Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 26/28] tools/cgroup: add slabinfo.py tool Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 27/28] tools/cgroup: make slabinfo.py compatible with new slab controller Roman Gushchin
2020-01-30 2:17 ` Bharata B Rao
2020-01-30 2:44 ` Roman Gushchin
2020-01-31 22:24 ` Roman Gushchin
2020-02-12 5:21 ` Bharata B Rao
2020-02-12 20:42 ` Roman Gushchin
2020-01-27 17:34 ` [PATCH v2 28/28] kselftests: cgroup: add kernel memory accounting tests Roman Gushchin
2020-01-30 2:06 ` [PATCH v2 00/28] The new cgroup slab memory controller Bharata B Rao
2020-01-30 2:41 ` Roman Gushchin
2020-08-12 23:16 ` Pavel Tatashin
2020-08-12 23:18 ` Pavel Tatashin
2020-08-13 0:04 ` Roman Gushchin
2020-08-13 0:31 ` Pavel Tatashin
2020-08-28 16:47 ` Pavel Tatashin
2020-09-01 5:28 ` Bharata B Rao
2020-09-01 12:52 ` Pavel Tatashin
2020-09-02 6:23 ` Bharata B Rao
2020-09-02 12:34 ` Pavel Tatashin
2020-09-02 9:53 ` Vlastimil Babka
2020-09-02 10:39 ` David Hildenbrand
2020-09-02 12:42 ` Pavel Tatashin
2020-09-02 13:50 ` Michal Hocko
2020-09-02 14:20 ` Pavel Tatashin
2020-09-03 18:09 ` David Hildenbrand
2020-09-02 11:26 ` Michal Hocko
2020-09-02 12:51 ` Pavel Tatashin
2020-09-02 13:51 ` Michal Hocko
2020-09-02 11:32 ` Michal Hocko
2020-09-02 12:53 ` Pavel Tatashin
2020-09-02 13:52 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200203222944.GB7345@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=bharata@linux.ibm.com \
--cc=guro@fb.com \
--cc=kernel-team@fb.com \
--cc=laoar.shao@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=shakeelb@google.com \
--cc=vdavydov.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox