linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, alexander.mikhalitsyn@virtuozzo.com,
	avagin@gmail.com, dave@stgolabs.net, ebiederm@xmission.com,
	gregkh@linuxfoundation.org, linux-mm@kvack.org,
	manfred@colorfullife.com, mm-commits@vger.kernel.org,
	ptikhomirov@virtuozzo.com, stable@vger.kernel.org,
	torvalds@linux-foundation.org, vvs@virtuozzo.com
Subject: [patch 02/15] ipc: WARN if trying to remove ipc object which is absent
Date: Fri, 19 Nov 2021 16:43:18 -0800	[thread overview]
Message-ID: <20211120004318.MW2qxdhTI%akpm@linux-foundation.org> (raw)
In-Reply-To: <20211119164248.50feee07c5d2cc6cc4addf97@linux-foundation.org>

From: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Subject: ipc: WARN if trying to remove ipc object which is absent

Patch series "shm: shm_rmid_forced feature fixes".

Some time ago I met kernel crash after CRIU restore procedure,
fortunately, it was CRIU restore, so, I had dump files and could do
restore many times and crash reproduced easily.  After some investigation
I've constructed the minimal reproducer.  It was found that it's
use-after-free and it happens only if sysctl kernel.shm_rmid_forced = 1.

The key of the problem is that the exit_shm() function not handles shp's
object destroy when task->sysvshm.shm_clist contains items from different
IPC namespaces.  In most cases this list will contain only items from one
IPC namespace.

Why this list may contain object from different namespaces?  Function
exit_shm() designed to clean up this list always when process leaves IPC
namespace.  But we made a mistake a long time ago and not add exit_shm()
call into setns() syscall procedures.  1st second idea was just to add
this call to setns() syscall but it's obviously changes semantics of
setns() syscall and that's userspace-visible change.  So, I gave up this
idea.

First real attempt to address the issue was just to omit forced destroy if
we meet shp object not from current task IPC namespace [1].  But that was
not the best idea because task->sysvshm.shm_clist was protected by rwsem
which belongs to current task IPC namespace.  It means that list
corruption may occur.

Second approach is just extend exit_shm() to properly handle shp's from
different IPC namespaces [2].  This is really non-trivial thing, I've put
a lot of effort into that but not believed that it's possible to make it
fully safe, clean and clear.

Thanks to the efforts of Manfred Spraul working an elegant solution was
designed.  Thanks a lot, Manfred!

Eric also suggested the way to address the issue in ("[RFC][PATCH] shm: In
shm_exit destroy all created and never attached segments") Eric's idea was
to maintain a list of shm_clists one per IPC namespace, use lock-less
lists.  But there is some extra memory consumption-related concerns.

Alternative solution which was suggested by me was implemented in ("shm:
reset shm_clist on setns but omit forced shm destroy") Idea is pretty
simple, we add exit_shm() syscall to setns() but DO NOT destroy shm
segments even if sysctl kernel.shm_rmid_forced = 1, we just clean up the
task->sysvshm.shm_clist list.  This chages semantics of setns() syscall a
little bit but in comparision to "naive" solution when we just add
exit_shm() without any special exclusions this looks like a safer option.

[1] https://lkml.org/lkml/2021/7/6/1108
[2] https://lkml.org/lkml/2021/7/14/736


This patch (of 2):

Let's produce a warning if we trying to remove non-existing IPC object
from IPC namespace kht/idr structures.

This allows to catch possible bugs when ipc_rmid() function was called
with inconsistent struct ipc_ids*, struct kern_ipc_perm* arguments.

Link: https://lkml.kernel.org/r/20211027224348.611025-1-alexander.mikhalitsyn@virtuozzo.com
Link: https://lkml.kernel.org/r/20211027224348.611025-2-alexander.mikhalitsyn@virtuozzo.com
Co-developed-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 ipc/util.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/ipc/util.c~ipc-warn-if-trying-to-remove-ipc-object-which-is-absent
+++ a/ipc/util.c
@@ -447,8 +447,8 @@ static int ipcget_public(struct ipc_name
 static void ipc_kht_remove(struct ipc_ids *ids, struct kern_ipc_perm *ipcp)
 {
 	if (ipcp->key != IPC_PRIVATE)
-		rhashtable_remove_fast(&ids->key_ht, &ipcp->khtnode,
-				       ipc_kht_params);
+		WARN_ON_ONCE(rhashtable_remove_fast(&ids->key_ht, &ipcp->khtnode,
+				       ipc_kht_params));
 }
 
 /**
@@ -498,7 +498,7 @@ void ipc_rmid(struct ipc_ids *ids, struc
 {
 	int idx = ipcid_to_idx(ipcp->id);
 
-	idr_remove(&ids->ipcs_idr, idx);
+	WARN_ON_ONCE(idr_remove(&ids->ipcs_idr, idx) != ipcp);
 	ipc_kht_remove(ids, ipcp);
 	ids->in_use--;
 	ipcp->deleted = true;
_


  parent reply	other threads:[~2021-11-20  0:44 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-20  0:42 incoming Andrew Morton
2021-11-20  0:43 ` [patch 01/15] mm/swap.c:put_pages_list(): reinitialise the page list Andrew Morton
2021-11-20  0:43 ` Andrew Morton [this message]
2021-11-20  0:43 ` [patch 03/15] shm: extend forced shm destroy to support objects from several IPC nses Andrew Morton
2021-11-20  0:43 ` [patch 04/15] mm: emit the "free" trace report before freeing memory in kmem_cache_free() Andrew Morton
2021-11-20  0:43 ` [patch 05/15] hexagon: export raw I/O routines for modules Andrew Morton
2021-11-20  0:43 ` [patch 06/15] hexagon: clean up timer-regs.h Andrew Morton
2021-11-20  0:43 ` [patch 07/15] hexagon: ignore vmlinux.lds Andrew Morton
2021-11-20  0:43 ` [patch 08/15] mm: kmemleak: slob: respect SLAB_NOLEAKTRACE flag Andrew Morton
2021-11-20  0:43 ` [patch 09/15] hugetlb: fix hugetlb cgroup refcounting during mremap Andrew Morton
2021-11-20  0:43 ` [patch 10/15] hugetlb, userfaultfd: fix reservation restore on userfaultfd error Andrew Morton
2021-11-20  0:43 ` [patch 11/15] kasan: test: silence intentional read overflow warnings Andrew Morton
2021-11-20  0:43 ` [patch 12/15] mm/damon/dbgfs: use '__GFP_NOWARN' for user-specified size buffer allocation Andrew Morton
2021-11-20  0:43 ` [patch 13/15] mm/damon/dbgfs: fix missed use of damon_dbgfs_lock Andrew Morton
2021-11-20  0:43 ` [patch 14/15] kmap_local: don't assume kmap PTEs are linear arrays in memory Andrew Morton
2021-11-20  0:43 ` [patch 15/15] proc/vmcore: fix clearing user buffer by properly using clear_user() Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211120004318.MW2qxdhTI%akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=alexander.mikhalitsyn@virtuozzo.com \
    --cc=avagin@gmail.com \
    --cc=dave@stgolabs.net \
    --cc=ebiederm@xmission.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-mm@kvack.org \
    --cc=manfred@colorfullife.com \
    --cc=mm-commits@vger.kernel.org \
    --cc=ptikhomirov@virtuozzo.com \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vvs@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox