From: Shakeel Butt <shakeelb@google.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Greg Thelen <gthelen@google.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux MM <linux-mm@kvack.org>,
linux-fsdevel@vger.kernel.org,
LKML <linux-kernel@vger.kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH] fs, mm: account filp and names caches to kmemcg
Date: Tue, 10 Oct 2017 15:21:53 -0700 [thread overview]
Message-ID: <CALvZod5VzPRRbhxLSn5GkgPbJEVJ9X5SfA=rjzRtTqLbCAe+eA@mail.gmail.com> (raw)
In-Reply-To: <20171010091042.eokqlrqec33w3qzt@dhcp22.suse.cz>
On Sun, Oct 8, 2017 at 11:24 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Fri 06-10-17 12:33:03, Shakeel Butt wrote:
>> >> names_cachep = kmem_cache_create("names_cache", PATH_MAX, 0,
>> >> - SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
>> >> + SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT, NULL);
>> >
>> > I might be wrong but isn't name cache only holding temporary objects
>> > used for path resolution which are not stored anywhere?
>> >
>>
>> Even though they're temporary, many containers can together use a
>> significant amount of transient uncharged memory. We've seen machines
>> with 100s of MiBs in names_cache.
>
> Yes that might be possible but are we prepared for random ENOMEM from
> vfs calls which need to allocate a temporary name?
>
I looked at all the syscalls which invoke allocations from
'names_cache' and tried to narrow down whose man page does not mention
that they can return ENOMEM. I found couple of syscalls like
truncate(), readdir() & getdents() which does not mention that they
can return ENOMEM but this patch will make them return ENOMEM.
>>
>> >> filp_cachep = kmem_cache_create("filp", sizeof(struct file), 0,
>> >> - SLAB_HWCACHE_ALIGN | SLAB_PANIC, NULL);
>> >> + SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT, NULL);
>> >> percpu_counter_init(&nr_files, 0, GFP_KERNEL);
>> >> }
>> >
>> > Don't we have a limit for the maximum number of open files?
>> >
>>
>> Yes, there is a system limit of maximum number of open files. However
>> this limit is shared between different users on the system and one
>> user can hog this resource. To cater that, we set the maximum limit
>> very high and let the memory limit of each user limit the number of
>> files they can open.
>
> Similarly here. Are all syscalls allocating a fd prepared to return
> ENOMEM?
For filp, I found _sysctl(). However the man page says not to use it.
On Tue, Oct 10, 2017 at 2:10 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Mon 09-10-17 20:17:54, Michal Hocko wrote:
>> the primary concern for this patch was whether we really need/want to
>> charge short therm objects which do not outlive a single syscall.
>
> Let me expand on this some more. What is the benefit of kmem accounting
> of such an object? It cannot stop any runaway as a syscall lifetime
> allocations are bound to number of processes which we kind of contain by
> other means.
We can contain by limited the number of processes or thread but for us
applications having thousands of threads is very common. So, limiting
the number of threads/processes will not work.
> If we do account then we put a memory pressure due to
> something that cannot be reclaimed by no means. Even the memcg OOM
> killer would simply kick a single path while there might be others
> to consume the same type of memory.
>
> So what is the actual point in accounting these? Does it help to contain
> any workload better? What kind of workload?
>
I think the benefits will be isolation and more accurate billing. As I
have said before we have observed 100s of MiBs in names_cache on many
machines and cumulative amount is not something we can ignore as just
memory overhead.
> Or am I completely wrong and name objects can outlive a syscall
> considerably?
>
No, I didn't find any instance of the name objects outliving the syscall.
Anyways, we can discuss more on names_cache, do you have any objection
regarding charging filp?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-10-10 22:21 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-05 22:21 Shakeel Butt
2017-10-06 7:59 ` Michal Hocko
2017-10-06 19:33 ` Shakeel Butt
2017-10-09 6:24 ` Michal Hocko
2017-10-09 17:52 ` Greg Thelen
2017-10-09 18:04 ` Michal Hocko
2017-10-09 18:17 ` Michal Hocko
2017-10-10 9:10 ` Michal Hocko
2017-10-10 22:21 ` Shakeel Butt [this message]
2017-10-11 9:09 ` Michal Hocko
2017-10-09 20:26 ` Johannes Weiner
2017-10-10 9:14 ` Michal Hocko
2017-10-10 14:17 ` Johannes Weiner
2017-10-10 14:24 ` Michal Hocko
2017-10-12 19:03 ` Johannes Weiner
2017-10-12 23:57 ` Greg Thelen
2017-10-13 6:51 ` Michal Hocko
2017-10-13 6:35 ` Michal Hocko
2017-10-13 7:00 ` Michal Hocko
2017-10-13 15:24 ` Michal Hocko
2017-10-24 12:18 ` Michal Hocko
2017-10-24 17:54 ` Johannes Weiner
2017-10-24 16:06 ` Johannes Weiner
2017-10-24 16:22 ` Michal Hocko
2017-10-24 17:23 ` Johannes Weiner
2017-10-24 17:55 ` Michal Hocko
2017-10-24 18:58 ` Johannes Weiner
2017-10-24 20:15 ` Michal Hocko
2017-10-25 6:51 ` Greg Thelen
2017-10-25 7:15 ` Michal Hocko
2017-10-25 13:11 ` Johannes Weiner
2017-10-25 14:12 ` Michal Hocko
2017-10-25 16:44 ` Johannes Weiner
2017-10-25 17:29 ` Michal Hocko
2017-10-25 18:11 ` Johannes Weiner
2017-10-25 19:00 ` Michal Hocko
2017-10-25 21:13 ` Johannes Weiner
2017-10-25 22:49 ` Greg Thelen
2017-10-26 7:49 ` Michal Hocko
2017-10-26 12:45 ` Tetsuo Handa
2017-10-26 14:31 ` Johannes Weiner
2017-10-26 19:56 ` Greg Thelen
2017-10-27 8:20 ` Michal Hocko
2017-10-27 20:50 ` Shakeel Butt
2017-10-30 8:29 ` Michal Hocko
2017-10-30 19:28 ` Shakeel Butt
2017-10-31 8:00 ` Michal Hocko
2017-10-31 16:49 ` Johannes Weiner
2017-10-31 18:50 ` Michal Hocko
2017-10-24 15:45 ` Johannes Weiner
2017-10-24 16:30 ` Michal Hocko
2017-10-10 23:32 ` Al Viro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CALvZod5VzPRRbhxLSn5GkgPbJEVJ9X5SfA=rjzRtTqLbCAe+eA@mail.gmail.com' \
--to=shakeelb@google.com \
--cc=akpm@linux-foundation.org \
--cc=gthelen@google.com \
--cc=hannes@cmpxchg.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=vdavydov.dev@gmail.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox