From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ie0-f181.google.com (mail-ie0-f181.google.com [209.85.223.181])
	by kanga.kvack.org (Postfix) with ESMTP id DE5996B0031
	for ; Thu, 27 Mar 2014 00:31:56 -0400 (EDT)
Received: by mail-ie0-f181.google.com with SMTP id tp5so2870838ieb.40
	for ; Wed, 26 Mar 2014 21:31:56 -0700 (PDT)
Received: from mail-ie0-x249.google.com (mail-ie0-x249.google.com [2607:f8b0:4001:c03::249])
	by mx.google.com with ESMTPS id l4si6763803igx.25.2014.03.26.21.31.55
	for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
	Wed, 26 Mar 2014 21:31:56 -0700 (PDT)
Received: by mail-ie0-f201.google.com with SMTP id rd18so647314iec.0
	for ; Wed, 26 Mar 2014 21:31:53 -0700 (PDT)
References: <5a5b09d4cb9a15fc120b4bec8be168630a3b43c2.1395846845.git.vdavydov@parallels.com>
From: Greg Thelen
Subject: Re: [PATCH -mm 1/4] sl[au]b: do not charge large allocations to memcg
In-reply-to: <5a5b09d4cb9a15fc120b4bec8be168630a3b43c2.1395846845.git.vdavydov@parallels.com>
Date: Wed, 26 Mar 2014 21:31:51 -0700
Message-ID:
MIME-Version: 1.0
Content-Type: text/plain
Sender: owner-linux-mm@kvack.org
List-ID:
To: Vladimir Davydov
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@suse.cz,
	glommer@gmail.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	devel@openvz.org, Christoph Lameter, Pekka Enberg

On Wed, Mar 26 2014, Vladimir Davydov wrote:

> We don't track any random page allocation, so we shouldn't track kmalloc
> that falls back to the page allocator.

This seems like a change which will lead to confusing (and arguably
improper) kernel behavior.  I prefer the behavior prior to this patch.

Before this change both of the following allocations are charged to
memcg (assuming kmem accounting is enabled):
 a = kmalloc(KMALLOC_MAX_CACHE_SIZE, GFP_KERNEL)
 b = kmalloc(KMALLOC_MAX_CACHE_SIZE + 1, GFP_KERNEL)

After this change only 'a' is charged; 'b' goes directly to the page
allocator, which no longer does accounting.
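
To make the asymmetry concrete, here is a rough userspace model of the
charging decision on either side of the threshold.  It is only a sketch,
not kernel code: the model_* names and the MODEL_* constants are
hypothetical, and the assumed page size and cache limit stand in for
values that really depend on the architecture and the configured slab
allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumed values for illustration only.  The real PAGE_SIZE and
 * KMALLOC_MAX_CACHE_SIZE depend on the architecture and on which
 * slab allocator is configured.
 */
#define MODEL_PAGE_SIZE              4096u
#define MODEL_KMALLOC_MAX_CACHE_SIZE (2u * MODEL_PAGE_SIZE)

/*
 * Hypothetical helper: true if a kmalloc() of this size is served from
 * a kmalloc cache, false if it falls back to the page allocator.
 */
bool model_served_by_slab_cache(size_t size)
{
	assert(size > 0);
	return size <= MODEL_KMALLOC_MAX_CACHE_SIZE;
}

/*
 * Before the patch (kmem accounting assumed enabled): the cache path is
 * charged, and the page allocator fallback is charged as well.
 */
bool model_charged_before_patch(size_t size)
{
	assert(size > 0);
	(void)size;	/* the size does not affect the outcome here */
	return true;
}

/*
 * After the patch: only allocations that stay within a kmalloc cache
 * are charged; the page allocator fallback is no longer accounted.
 */
bool model_charged_after_patch(size_t size)
{
	return model_served_by_slab_cache(size);
}
```

In this model, model_charged_after_patch() flips from true to false
between MODEL_KMALLOC_MAX_CACHE_SIZE and MODEL_KMALLOC_MAX_CACHE_SIZE + 1,
while model_charged_before_patch() stays true for both sizes.  A caller
adding one byte to a request would see accounting silently disappear.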

> Signed-off-by: Vladimir Davydov
> Cc: Johannes Weiner
> Cc: Michal Hocko
> Cc: Glauber Costa
> Cc: Christoph Lameter
> Cc: Pekka Enberg
> ---
>  include/linux/slab.h |    2 +-
>  mm/memcontrol.c      |   27 +--------------------------
>  mm/slub.c            |    4 ++--
>  3 files changed, 4 insertions(+), 29 deletions(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 3dd389aa91c7..8a928ff71d93 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -363,7 +363,7 @@ kmalloc_order(size_t size, gfp_t flags, unsigned int order)
>  {
>  	void *ret;
>
> -	flags |= (__GFP_COMP | __GFP_KMEMCG);
> +	flags |= __GFP_COMP;
>  	ret = (void *) __get_free_pages(flags, order);
>  	kmemleak_alloc(ret, size, 1, flags);
>  	return ret;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index b4b6aef562fa..81a162d01d4d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3528,35 +3528,10 @@ __memcg_kmem_newpage_charge(gfp_t gfp, struct mem_cgroup **_memcg, int order)
>
>  	*_memcg = NULL;
>
> -	/*
> -	 * Disabling accounting is only relevant for some specific memcg
> -	 * internal allocations. Therefore we would initially not have such
> -	 * check here, since direct calls to the page allocator that are marked
> -	 * with GFP_KMEMCG only happen outside memcg core. We are mostly
> -	 * concerned with cache allocations, and by having this test at
> -	 * memcg_kmem_get_cache, we are already able to relay the allocation to
> -	 * the root cache and bypass the memcg cache altogether.
> -	 *
> -	 * There is one exception, though: the SLUB allocator does not create
> -	 * large order caches, but rather service large kmallocs directly from
> -	 * the page allocator. Therefore, the following sequence when backed by
> -	 * the SLUB allocator:
> -	 *
> -	 *	memcg_stop_kmem_account();
> -	 *	kmalloc()
> -	 *	memcg_resume_kmem_account();
> -	 *
> -	 * would effectively ignore the fact that we should skip accounting,
> -	 * since it will drive us directly to this function without passing
> -	 * through the cache selector memcg_kmem_get_cache. Such large
> -	 * allocations are extremely rare but can happen, for instance, for the
> -	 * cache arrays. We bring this test here.
> -	 */
> -	if (!current->mm || current->memcg_kmem_skip_account)
> +	if (!current->mm)
>  		return true;
>
>  	memcg = get_mem_cgroup_from_mm(current->mm);
> -
>  	if (!memcg_can_account_kmem(memcg)) {
>  		css_put(&memcg->css);
>  		return true;
> diff --git a/mm/slub.c b/mm/slub.c
> index 5e234f1f8853..c2e58a787443 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3325,7 +3325,7 @@ static void *kmalloc_large_node(size_t size, gfp_t flags, int node)
>  	struct page *page;
>  	void *ptr = NULL;
>
> -	flags |= __GFP_COMP | __GFP_NOTRACK | __GFP_KMEMCG;
> +	flags |= __GFP_COMP | __GFP_NOTRACK;
>  	page = alloc_pages_node(node, flags, get_order(size));
>  	if (page)
>  		ptr = page_address(page);
> @@ -3395,7 +3395,7 @@ void kfree(const void *x)
>  	if (unlikely(!PageSlab(page))) {
>  		BUG_ON(!PageCompound(page));
>  		kfree_hook(x);
> -		__free_memcg_kmem_pages(page, compound_order(page));
> +		__free_pages(page, compound_order(page));
>  		return;
>  	}
>  	slab_free(page->slab_cache, page, object, _RET_IP_);
> --
> 1.7.10.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org