Date: Tue, 19 Jun 2018 12:24:29 -0400
From: Johannes Weiner
Subject: Re: [PATCH 1/3] mm: memcg: remote memcg charging for kmem allocations
Message-ID: <20180619162429.GB27423@cmpxchg.org>
References: <20180619051327.149716-1-shakeelb@google.com>
 <20180619051327.149716-2-shakeelb@google.com>
In-Reply-To: <20180619051327.149716-2-shakeelb@google.com>
To: Shakeel Butt
Cc: Andrew Morton, Michal Hocko, Vladimir Davydov, Jan Kara, Greg Thelen,
 linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Jan Kara,
 Amir Goldstein, Christoph Lameter, Pekka Enberg, David Rientjes,
 Joonsoo Kim, Mel Gorman, Vlastimil Babka, Alexander Viro

On Mon, Jun 18, 2018 at 10:13:25PM -0700, Shakeel Butt wrote:
> @@ -248,6 +248,30 @@ static inline void memalloc_noreclaim_restore(unsigned int flags)
>  	current->flags = (current->flags & ~PF_MEMALLOC) | flags;
>  }
>  
> +#ifdef CONFIG_MEMCG
> +static inline struct mem_cgroup *memalloc_memcg_save(struct mem_cgroup *memcg)
> +{
> +	struct mem_cgroup *old_memcg = current->target_memcg;
> +
> +	current->target_memcg = memcg;
> +	return old_memcg;
> +}
> +
> +static inline void memalloc_memcg_restore(struct mem_cgroup *memcg)
> +{
> +	current->target_memcg = memcg;
> +}

The use_mm() and friends naming scheme would be better here:
memalloc_use_memcg(), memalloc_unuse_memcg(), current->active_memcg

> @@ -375,6 +376,27 @@ static __always_inline void kfree_bulk(size_t size, void **p)
>  	kmem_cache_free_bulk(NULL, size, p);
>  }
>  
> +/*
> + * Calling kmem_cache_alloc_memcg implicitly assumes that the caller wants
> + * a __GFP_ACCOUNT allocation. However if memcg is NULL then
> + * kmem_cache_alloc_memcg is same as kmem_cache_alloc.
> + */
> +static __always_inline void *kmem_cache_alloc_memcg(struct kmem_cache *cachep,
> +						    gfp_t flags,
> +						    struct mem_cgroup *memcg)
> +{
> +	struct mem_cgroup *old_memcg;
> +	void *ptr;
> +
> +	if (!memcg)
> +		return kmem_cache_alloc(cachep, flags);
> +
> +	old_memcg = memalloc_memcg_save(memcg);
> +	ptr = kmem_cache_alloc(cachep, flags | __GFP_ACCOUNT);
> +	memalloc_memcg_restore(old_memcg);
> +	return ptr;

I'm not a big fan of these functions as an interface because it
implies that kmem_cache_alloc() et al wouldn't charge a memcg - but
they do, just using current's memcg. It's also a lot of churn to
duplicate all the various slab functions.

Can you please inline the save/restore (or use/unuse) functions into
the callsites? If you make them handle NULL as parameters, it merely
adds two bracketing lines around the allocation call in the
callsites, which I think would be better to understand - in
particular with a comment on why we are charging *that* group instead
of current's.
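
Something along these lines at the callsite is what I have in mind -
rough sketch only, with made-up names (target_memcg, cachep, gfp)
standing in for whatever the callsite actually has:

	/*
	 * Charge the object to the group on whose behalf we are
	 * allocating, not to the task that happens to trigger the
	 * allocation.
	 */
	old_memcg = memalloc_use_memcg(target_memcg);
	ptr = kmem_cache_alloc(cachep, gfp | __GFP_ACCOUNT);
	memalloc_unuse_memcg(old_memcg);

That way the comment explaining the charging decision lives right
next to the allocation it applies to.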
> +static __always_inline struct mem_cgroup *get_mem_cgroup(
> +				struct mem_cgroup *memcg, struct mm_struct *mm)
> +{
> +	if (unlikely(memcg)) {
> +		rcu_read_lock();
> +		if (css_tryget_online(&memcg->css)) {
> +			rcu_read_unlock();
> +			return memcg;
> +		}
> +		rcu_read_unlock();
> +	}
> +	return get_mem_cgroup_from_mm(mm);
> +}
> +
>  /**
>   * mem_cgroup_iter - iterate over memory cgroup hierarchy
>   * @root: hierarchy root
> @@ -2260,7 +2274,7 @@ struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep)
>  	if (current->memcg_kmem_skip_account)
>  		return cachep;
>  
> -	memcg = get_mem_cgroup_from_mm(current->mm);
> +	memcg = get_mem_cgroup(current->target_memcg, current->mm);

get_mem_cgroup_from_current(), which uses current->active_memcg if
set and current->mm->memcg otherwise, would be a nicer abstraction
IMO.
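
I.e. something like this (untested, just your get_mem_cgroup() from
above with the arguments folded in and the use/unuse naming):

	static inline struct mem_cgroup *get_mem_cgroup_from_current(void)
	{
		struct mem_cgroup *memcg = current->active_memcg;

		if (unlikely(memcg)) {
			rcu_read_lock();
			if (css_tryget_online(&memcg->css)) {
				rcu_read_unlock();
				return memcg;
			}
			rcu_read_unlock();
		}
		return get_mem_cgroup_from_mm(current->mm);
	}

Then memcg_kmem_get_cache() doesn't need to know where the memcg
comes from at all.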