From: Johannes Weiner <hannes@cmpxchg.org>
To: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	David Miller <davem@davemloft.net>,
	akpm@linux-foundation.org, tj@kernel.org, netdev@vger.kernel.org,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/8] mm: memcontrol: account socket memory on unified hierarchy
Date: Fri, 6 Nov 2015 11:35:55 -0500
Message-ID: <20151106163555.GB7813@cmpxchg.org>
In-Reply-To: <20151106090555.GK29259@esperanza>

On Fri, Nov 06, 2015 at 12:05:55PM +0300, Vladimir Davydov wrote:
> On Thu, Nov 05, 2015 at 03:55:22PM -0500, Johannes Weiner wrote:
> > On Thu, Nov 05, 2015 at 03:40:02PM +0100, Michal Hocko wrote:
> ...
> > > 3) keep only some (safe) cache types enabled by default with the current
> > >    failing semantics and require explicit enabling for the complete
> > >    kmem accounting. [di]cache code paths should be robust enough to
> > >    handle allocation failures.
> > 
> > Vladimir, what would be your opinion on this?
> 
> I'm all for this option. Actually, I've been thinking about this since I
> introduced the __GFP_NOACCOUNT flag. Not because of the failing
> semantics, since we can always let kmem allocations breach the limit.
> This shouldn't be critical, because I don't think it's possible to issue
> a series of kmem allocations w/o a single user page allocation, which
> would reclaim/kill the excess.
> 
> The point is there are allocations that are shared system-wide and
> therefore shouldn't go to any memcg. Most obvious examples are: mempool
> users and radix_tree/idr preloads. Accounting them to memcg is likely to
> result in noticeable memory overhead as memory cgroups are
> created/destroyed, because they pin dead memory cgroups with all their
> kmem caches, which aren't tiny.
> 
> Another funny example is objects destroyed lazily for performance
> reasons, e.g. vmap_area. Such objects are usually very small, so
> delaying destruction of a bunch of them will normally go unnoticed.
> However, if kmemcg is used, the effective memory consumption caused by
> such objects can be multiplied many times over due to dangling kmem
> caches.
> 
> We can, of course, mark all such allocations as __GFP_NOACCOUNT, but the
> problem is they are tricky to identify, because they are scattered all
> over the kernel source tree. E.g. Dave Chinner mentioned that XFS
> internals do a lot of allocations that are shared among all XFS
> filesystems and therefore should not be accounted (BTW that's why
> list_lru's used by XFS are not marked as memcg-aware). There must be
> more out there. Besides, kernel developers don't usually even know about
> kmemcg (they just write the code for their subsys, so why should they?)
> so they won't think about using __GFP_NOACCOUNT, and hence new
> falsely-accounted allocations are likely to appear.
> 
> That said, by switching from black-list (__GFP_NOACCOUNT) to white-list
> (__GFP_ACCOUNT) kmem accounting policy we would make the system more
> predictable and robust IMO. OTOH what would we lose? Security? Well,
> containers aren't secure IMHO. In fact, I doubt they will ever be (as
> secure as VMs). Anyway, if a runaway allocation is reported, it should
> be trivial to fix by adding __GFP_ACCOUNT where appropriate.

I wholeheartedly agree with all of this.
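
To make the difference between the two policies concrete, here is a
minimal userspace sketch. It is not kernel code: the DEMO_* flag bits,
the demo_policy enum, and demo_should_charge() are hypothetical names
made up for illustration, standing in for __GFP_NOACCOUNT and the
proposed __GFP_ACCOUNT. It only models the charging decision, not the
actual memcg machinery.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; NOT the kernel's gfp_t values. */
#define DEMO_NOACCOUNT 0x1u /* black-list model: caller opts out */
#define DEMO_ACCOUNT   0x2u /* white-list model: caller opts in  */

/* Which accounting policy the (simulated) allocator follows. */
enum demo_policy {
	DEMO_BLACKLIST, /* charge everything unless DEMO_NOACCOUNT is set */
	DEMO_WHITELIST  /* charge nothing unless DEMO_ACCOUNT is set      */
};

/*
 * Return true if an allocation made with 'flags' would be charged
 * to the caller's cgroup under the given policy.
 */
static bool demo_should_charge(enum demo_policy policy, unsigned int flags)
{
	assert(policy == DEMO_BLACKLIST || policy == DEMO_WHITELIST);

	if (policy == DEMO_BLACKLIST)
		return (flags & DEMO_NOACCOUNT) == 0;

	return (flags & DEMO_ACCOUNT) != 0;
}
```

A call site written by someone unaware of kmemcg passes neither flag.
Under DEMO_BLACKLIST that allocation is charged, which is the
falsely-accounted case described above. Under DEMO_WHITELIST it is not
charged, and accounting only happens where somebody added the opt-in
flag on purpose.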

> If there are no objections, I'll prepare a patch switching to the
> white-list approach. Let's start from obvious things like fs_struct,
> mm_struct, task_struct, signal_struct, dentry, inode, which can be
> easily allocated from user space. This should cover 90% of all
> allocations that should be accounted AFAICS. The rest will be added
> later if necessary.

Awesome, I'm looking forward to that patch!

