From: Josh Poimboeuf <jpoimboe@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Jeff Layton <jlayton@kernel.org>,
	Chuck Lever <chuck.lever@oracle.com>,
	Shakeel Butt <shakeelb@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>
Cc: linux-kernel@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
	Tejun Heo <tj@kernel.org>,
	Vasily Averin <vasily.averin@linux.dev>,
	Michal Koutny <mkoutny@suse.com>,
	Waiman Long <longman@redhat.com>,
	Muchun Song <muchun.song@linux.dev>,
	Jiri Kosina <jikos@kernel.org>,
	cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH RFC 1/4] fs/locks: Fix file lock cache accounting, again
Date: Wed, 17 Jan 2024 08:14:43 -0800
Message-ID: <ac84a832feba5418e1b58d1c7f3fe6cc7bc1de58.1705507931.git.jpoimboe@kernel.org>
In-Reply-To: <cover.1705507931.git.jpoimboe@kernel.org>

A container can exceed its memcg limits by allocating a large number of
file locks, because the slab caches backing them are not charged to the
memcg.

This bug was originally fixed by commit 0f12156dff28 ("memcg: enable
accounting for file lock caches"), but was later reverted by commit
3754707bcc3e ("Revert "memcg: enable accounting for file lock caches"")
due to performance issues.

Unfortunately those performance issues were never addressed and the bug
has remained unfixed for over two years.

Fix it by enabling the accounting by default, but allow users to disable
it with a kernel command-line option (flock_accounting=off).

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
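
For readers who want to check the option handling outside the kernel, here
is a minimal userspace sketch of the logic this patch adds. It is not the
kernel code: the sketch_* names and the SKETCH_SLAB_* flag values are
hypothetical stand-ins for flock_accounting_cmdline(), SLAB_PANIC and
SLAB_ACCOUNT, and the -1 return stands in for -EINVAL.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's slab flags; values are arbitrary. */
#define SKETCH_SLAB_PANIC   0x1u
#define SKETCH_SLAB_ACCOUNT 0x2u

/*
 * Mirrors the shape of flock_accounting_cmdline(): accept only "on" or
 * "off", leave *enabled untouched on bad input, and return -1 where the
 * kernel handler returns -EINVAL.
 */
static int sketch_parse_flock_accounting(const char *str, bool *enabled)
{
	if (!str)
		return -1;

	if (!strcmp(str, "off"))
		*enabled = false;
	else if (!strcmp(str, "on"))
		*enabled = true;
	else
		return -1;

	return 0;
}

/*
 * Mirrors the flag selection in filelock_init(): the panic-on-failure flag
 * is always set, and the accounting flag is added only when accounting is
 * enabled.
 */
static unsigned int sketch_filelock_cache_flags(bool enabled)
{
	unsigned int flags = SKETCH_SLAB_PANIC;

	if (enabled)
		flags |= SKETCH_SLAB_ACCOUNT;

	return flags;
}
```

Starting from the default of true, passing "off" clears the flag and the
caches are then created without the accounting bit. Any other string is
rejected and the default is kept.
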
 .../admin-guide/kernel-parameters.txt         | 17 +++++++++++
 fs/locks.c                                    | 30 +++++++++++++++++--
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6ee0f9a5da70..91987b06bc52 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1527,6 +1527,23 @@
 			See Documentation/admin-guide/sysctl/net.rst for
 			fb_tunnels_only_for_init_ns
 
+	flock_accounting=
+			[KNL] Enable/disable accounting for kernel
+			memory allocations related to file locks.
+			Format: { on | off }
+			Default: on
+			on:	Enable kernel memory accounting for file
+				locks.  This prevents task groups from
+				exceeding their memcg allocation limits.
+				However, it may cause slowdowns in the
+				flock() system call.
+			off:	Disable kernel memory accounting for
+				file locks.  This may allow a rogue task
+				to DoS the system by forcing the kernel
+				to allocate memory beyond the task
+				group's memcg limits.  Not recommended
+				unless you have trusted user space.
+
 	floppy=		[HW]
 			See Documentation/admin-guide/blockdev/floppy.rst.
 
diff --git a/fs/locks.c b/fs/locks.c
index cc7c117ee192..235ac56c557d 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2905,15 +2905,41 @@ static int __init proc_locks_init(void)
 fs_initcall(proc_locks_init);
 #endif
 
+static bool flock_accounting __ro_after_init = true;
+
+static int __init flock_accounting_cmdline(char *str)
+{
+	if (!str)
+		return -EINVAL;
+
+	if (!strcmp(str, "off"))
+		flock_accounting = false;
+	else if (!strcmp(str, "on"))
+		flock_accounting = true;
+	else
+		return -EINVAL;
+
+	return 0;
+}
+early_param("flock_accounting", flock_accounting_cmdline);
+
+#define FLOCK_ACCOUNTING_MSG "WARNING: File lock accounting is disabled, container-triggered host memory exhaustion possible!\n"
+
 static int __init filelock_init(void)
 {
 	int i;
+	slab_flags_t flags = SLAB_PANIC;
+
+	if (!flock_accounting)
+		pr_err(FLOCK_ACCOUNTING_MSG);
+	else
+		flags |= SLAB_ACCOUNT;
 
 	flctx_cache = kmem_cache_create("file_lock_ctx",
-			sizeof(struct file_lock_context), 0, SLAB_PANIC, NULL);
+			sizeof(struct file_lock_context), 0, flags, NULL);
 
 	filelock_cache = kmem_cache_create("file_lock_cache",
-			sizeof(struct file_lock), 0, SLAB_PANIC, NULL);
+			sizeof(struct file_lock), 0, flags, NULL);
 
 	for_each_possible_cpu(i) {
 		struct file_lock_list_struct *fll = per_cpu_ptr(&file_lock_list, i);
-- 
2.43.0



Thread overview: 19+ messages
2024-01-17 16:14 [PATCH RFC 0/4] " Josh Poimboeuf
2024-01-17 16:14 ` Josh Poimboeuf [this message]
2024-01-17 19:00   ` [PATCH RFC 1/4] fs/locks: " Jeff Layton
2024-01-17 19:39     ` Josh Poimboeuf
2024-01-17 20:20       ` Linus Torvalds
2024-01-17 21:02         ` Shakeel Butt
2024-01-17 22:20           ` Roman Gushchin
2024-01-17 22:56             ` Shakeel Butt
2024-01-22  5:10               ` Linus Torvalds
2024-01-22 17:38                 ` Shakeel Butt
2024-01-26  9:50                 ` Vlastimil Babka
2024-01-30 11:04                   ` Vlastimil Babka
2024-01-19  7:47             ` Shakeel Butt
2024-01-17 21:19         ` Vlastimil Babka
2024-01-17 21:50         ` Roman Gushchin
2024-01-18  9:49     ` Michal Hocko
2024-01-17 16:14 ` [PATCH RFC 2/4] fs/locks: Add CONFIG_FLOCK_ACCOUNTING Josh Poimboeuf
2024-01-17 16:14 ` [PATCH RFC 3/4] mitigations: Expand 'mitigations=off' to include optional software mitigations Josh Poimboeuf
     [not found] ` <3e803d5aee5dd1f4c738f0de1e839e6cfcb9dc41.1705507931.git.jpoimboe@kernel.org>
2024-01-18  9:04   ` [PATCH RFC 4/4] mitigations: Add flock cache accounting to 'mitigations=off' Michal Koutný
