From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: cgroups@vger.kernel.org, linux-mm@kvack.org
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Hocko" <mhocko@kernel.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Vladimir Davydov" <vdavydov.dev@gmail.com>,
"Waiman Long" <longman@redhat.com>
Subject: [PATCH 0/4] mm/memcg: Address PREEMPT_RT problems instead of disabling it.
Date: Tue, 25 Jan 2022 17:43:33 +0100
Message-ID: <20220125164337.2071854-1-bigeasy@linutronix.de>
Hi,
this series is a follow-up to the initial RFC
https://lore.kernel.org/all/20211222114111.2206248-1-bigeasy@linutronix.de
where it has been suggested that I should try again with memcg instead
of simply disabling it. The series therefore aims to enable MEMCG on
PREEMPT_RT.
Changes since the RFC:
- cgroup.event_control / memory.soft_limit_in_bytes are disabled on
PREEMPT_RT. They are deprecated v1 features. Fixing the signal path is
not worth it.
- The updates to per-CPU counters are usually synchronised by disabling
interrupts. There are a few spots where the assumption about disabled
interrupts does not hold on PREEMPT_RT, so preemption is disabled
there instead. This is okay since the counters are never written from
in_irq() context.
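To illustrate the second point, here is a minimal user-space sketch. It is
not code from the patches. All model_* names are hypothetical stand-ins:
model_lock_irqsave() mimics spin_lock_irqsave() on a spinlock_t, which
disables interrupts on !PREEMPT_RT but leaves them enabled on PREEMPT_RT,
and model_preempt_depth mimics the preempt counter. It only shows that the
read-modify-write of the counter needs an explicit preempt-disabled section
on PREEMPT_RT.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: user-space stand-ins, not the kernel's real primitives. */
bool model_preempt_rt;		/* models CONFIG_PREEMPT_RT */
bool model_irqs_off;		/* models the local interrupt state */
int model_preempt_depth;	/* models the preempt counter */
long model_counter;		/* models this CPU's slot of a per-CPU counter */

/* Models spin_lock_irqsave() on a spinlock_t. */
void model_lock_irqsave(void)
{
	/* On PREEMPT_RT the lock is a sleeping lock: interrupts stay on. */
	if (!model_preempt_rt)
		model_irqs_off = true;
}

void model_unlock_irqrestore(void)
{
	model_irqs_off = false;
}

/* Update the counter from a section the caller believes is IRQ-off. */
void model_counter_add(long val)
{
	if (model_preempt_rt)
		model_preempt_depth++;	/* preempt_disable() stand-in */
	else
		assert(model_irqs_off);	/* the !PREEMPT_RT assumption */

	/* Read-modify-write that must not interleave with another task. */
	model_counter += val;

	if (model_preempt_rt)
		model_preempt_depth--;	/* preempt_enable() stand-in */
}
```

In this model a caller takes model_lock_irqsave(), calls
model_counter_add() and drops the lock again. With model_preempt_rt set,
the update is covered by the preempt-disabled section only, since
interrupts were never turned off.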
Patch #2 deals with the counters.
Patch #3 is a follow up to
https://lkml.kernel.org/r/20211214144412.447035-1-longman@redhat.com
Patch #4 restricts the task_obj usage to !PREEMPTION kernels. Based on
the numbers in
https://lore.kernel.org/all/YdX+INO9gQje6d0S@linutronix.de
it seems to make sense to disable it not only on PREEMPT_RT but on all
PREEMPTION kernels (including PREEMPT_DYNAMIC).
I tested them on CONFIG_PREEMPT_NONE and CONFIG_PREEMPT_RT kernels with the
tools/testing/selftests/cgroup/* tests. It looked good except for the
following (which was also there before the patches):
- test_kmem sometimes complained about:
not ok 2 test_kmem_memcg_deletion
- test_memcontrol always complained about:
not ok 3 test_memcg_min
not ok 4 test_memcg_low
and did not finish.
- lockdep complaints were triggered by test_core and test_freezer (both
had to run):
======================================================
WARNING: possible circular locking dependency detected
5.17.0-rc1+ #2 Not tainted
------------------------------------------------------
test_core/4751 is trying to acquire lock:
ffffffff82a35018 (css_set_lock){..-.}-{2:2}, at: obj_cgroup_release+0x22/0x90
but task is already holding lock:
ffff88810ba6abd8 (&sighand->siglock){....}-{2:2}, at: __lock_task_sighand+0x60/0x170
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&sighand->siglock){....}-{2:2}:
_raw_spin_lock+0x2a/0x40
cgroup_post_fork+0x1f5/0x290
copy_process+0x1ac9/0x1fc0
kernel_clone+0x5a/0x400
__do_sys_clone3+0xb9/0x120
do_syscall_64+0x64/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #0 (css_set_lock){..-.}-{2:2}:
__lock_acquire+0x1275/0x22e0
lock_acquire+0xd0/0x2e0
_raw_spin_lock_irqsave+0x39/0x50
obj_cgroup_release+0x22/0x90
refill_obj_stock+0x3cd/0x410
obj_cgroup_charge+0x159/0x320
kmem_cache_alloc+0xa7/0x480
__sigqueue_alloc+0x129/0x2d0
__send_signal+0x87/0x550
do_send_specific+0x10f/0x1d0
do_tkill+0x83/0xb0
__x64_sys_tgkill+0x20/0x30
do_syscall_64+0x64/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&sighand->siglock);
                               lock(css_set_lock);
                               lock(&sighand->siglock);
  lock(css_set_lock);
*** DEADLOCK ***
3 locks held by test_core/4751:
#0: ffffffff829a3f60 (rcu_read_lock){....}-{1:2}, at: do_send_specific+0x0/0x1d0
#1: ffff88810ba6abd8 (&sighand->siglock){....}-{2:2}, at: __lock_task_sighand+0x60/0x170
#2: ffffffff829a3f60 (rcu_read_lock){....}-{1:2}, at: refill_obj_stock+0x1a4/0x410
stack backtrace:
CPU: 1 PID: 4751 Comm: test_core Not tainted 5.17.0-rc1+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x45/0x59
check_noncircular+0xfe/0x110
__lock_acquire+0x1275/0x22e0
lock_acquire+0xd0/0x2e0
_raw_spin_lock_irqsave+0x39/0x50
obj_cgroup_release+0x22/0x90
refill_obj_stock+0x3cd/0x410
obj_cgroup_charge+0x159/0x320
kmem_cache_alloc+0xa7/0x480
__sigqueue_alloc+0x129/0x2d0
__send_signal+0x87/0x550
do_send_specific+0x10f/0x1d0
do_tkill+0x83/0xb0
__x64_sys_tgkill+0x20/0x30
do_syscall_64+0x64/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
</TASK>
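The report above boils down to a cycle in the lock-order graph. The
following is a much simplified sketch of that idea, not lockdep's actual
implementation. The lock IDs, MAX_LOCKS and record_dependency() are made up
for illustration. An edge "held -> acquired" is recorded each time a lock
is taken while another one is held, and a new edge is rejected if the
reverse direction is already reachable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical IDs for the two locks named in the report. */
enum lock_id { CSS_SET_LOCK, SIGLOCK, MAX_LOCKS };

/* dep[a][b] is true once lock b has been taken while lock a was held. */
static bool dep[MAX_LOCKS][MAX_LOCKS];

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++) {
		if (dep[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record that 'acquired' was taken while 'held' was held.
 * Returns false, and records nothing, if the new edge would close a cycle.
 */
bool record_dependency(int held, int acquired)
{
	bool seen[MAX_LOCKS] = { false };

	assert(held >= 0 && held < MAX_LOCKS);
	assert(acquired >= 0 && acquired < MAX_LOCKS);

	if (reachable(acquired, held, seen))
		return false;
	dep[held][acquired] = true;
	return true;
}
```

With these definitions, record_dependency(CSS_SET_LOCK, SIGLOCK) models the
cgroup_post_fork() order from chain entry #1 and is accepted. A later
record_dependency(SIGLOCK, CSS_SET_LOCK), which models the tgkill path
ending in obj_cgroup_release(), is rejected because it would close the
cycle.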
Sebastian
Thread overview: 31+ messages
2022-01-25 16:43 Sebastian Andrzej Siewior [this message]
2022-01-25 16:43 ` [PATCH 1/4] mm/memcg: Disable threshold event handlers on PREEMPT_RT Sebastian Andrzej Siewior
2022-01-26 14:40 ` Michal Hocko
2022-01-26 14:45 ` Sebastian Andrzej Siewior
2022-01-26 15:04 ` Michal Koutný
2022-01-27 13:36 ` Sebastian Andrzej Siewior
2022-01-26 15:21 ` Michal Hocko
2022-01-25 16:43 ` [PATCH 2/4] mm/memcg: Protect per-CPU counter by disabling preemption on PREEMPT_RT where needed Sebastian Andrzej Siewior
2022-01-26 10:06 ` Vlastimil Babka
2022-01-26 11:24 ` Sebastian Andrzej Siewior
2022-01-26 14:56 ` Michal Hocko
2022-01-25 16:43 ` [PATCH 3/4] mm/memcg: Add a local_lock_t for IRQ and TASK object Sebastian Andrzej Siewior
2022-01-26 15:20 ` Michal Hocko
2022-01-27 11:53 ` Sebastian Andrzej Siewior
2022-02-01 12:04 ` Michal Hocko
2022-02-01 12:11 ` Sebastian Andrzej Siewior
2022-02-01 15:29 ` Michal Hocko
2022-02-03 9:54 ` Sebastian Andrzej Siewior
2022-02-03 10:09 ` Michal Hocko
2022-02-03 11:09 ` Sebastian Andrzej Siewior
2022-02-08 17:58 ` Shakeel Butt
2022-02-09 9:17 ` Michal Hocko
2022-01-26 16:57 ` Vlastimil Babka
2022-01-31 15:06 ` Sebastian Andrzej Siewior
2022-02-03 16:01 ` Vlastimil Babka
2022-02-08 17:17 ` Sebastian Andrzej Siewior
2022-02-08 17:28 ` Michal Hocko
2022-02-09 1:48 ` [mm/memcg] 86895e1e85: WARNING:possible_circular_locking_dependency_detected kernel test robot
2022-01-25 16:43 ` [PATCH 4/4] mm/memcg: Allow the task_obj optimization only on non-PREEMPTIBLE kernels Sebastian Andrzej Siewior
2022-01-25 23:21 ` [PATCH 0/4] mm/memcg: Address PREEMPT_RT problems instead of disabling it Andrew Morton
2022-01-26 7:30 ` Sebastian Andrzej Siewior