linux-mm.kvack.org archive mirror
From: Claudio Imbrenda <imbrenda@linux.ibm.com>
To: ebiederm@xmission.com (Eric W. Biederman)
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	thuth@redhat.com, frankja@linux.ibm.com, borntraeger@de.ibm.com,
	Ulrich.Weigand@de.ibm.com, david@redhat.com, ultrachin@163.com,
	akpm@linux-foundation.org, vbabka@suse.cz, brookxu.cn@gmail.com,
	xiaoggchen@tencent.com, linuszeng@tencent.com,
	yihuilu@tencent.com, mhocko@suse.com, daniel.m.jordan@oracle.com,
	axboe@kernel.dk, legion@kernel.org, peterz@infradead.org,
	aarcange@redhat.com, christian@brauner.io, tglx@linutronix.de
Subject: Re: [RFC v1 2/4] kernel/fork.c: implement new process_mmput_async syscall
Date: Fri, 12 Nov 2021 10:27:28 +0100
Message-ID: <20211112102728.47a9d1f2@p-imbrenda>
In-Reply-To: <874k8ixzx0.fsf@email.froward.int.ebiederm.org>

On Thu, 11 Nov 2021 13:20:11 -0600
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Claudio Imbrenda <imbrenda@linux.ibm.com> writes:
> 
> > The goal of this new syscall is to be able to asynchronously free the
> > mm of a dying process. This is especially useful for processes that use
> > huge amounts of memory (e.g. databases or KVM guests). The process is
> > allowed to terminate immediately, while its mm is cleaned/reclaimed
> > asynchronously.
> >
> > A separate process needs to use the process_mmput_async syscall to attach
> > itself to the mm of a running target process. The process will then
> > sleep until the last user of the target mm has gone.
> >
> > When the last user of the mm has gone, instead of synchronously freeing
> > the mm, the attached process is awoken. The syscall will then continue
> > and clean up the target mm.
> >
> > This solution has the advantage that the cleanup of the target mm can
> > be both asynchronous and properly accounted for (e.g. cgroups).
> >
> > Tested on s390x.
> >
> > A separate patch will actually wire up the syscall.  
> 
> I am a bit confused.
> 
> You want the process to report that it has finished immediately,
> and you want the cleanup work to continue on in the background.

yes

> Why do you need a separate process?
> 
> Why not just modify the process cleanup code to keep the task_struct
> running while allowing waitpid to reap the process (aka allowing
> release_task to run)?  All tasks can already be reaped after
> exit_notify in do_exit.
> 
> I can see some reasons for wanting an opt-in.  It is nice to know all of
> a process's resources have been freed when waitpid succeeds.
> 
> Still I don't see why this whole thing isn't exit_mm returning
> the mm_struct when a flag is set, and then having an exit_mm_late
> being called and passed the returned mm after exit_notify.

so if I understand correctly you are saying exit_mm would skip the
mmput, set a flag, then I should introduce a new function
"exit_mm_late" after exit_notify, to check the flag and do the mmput if
needed

and that would mean that the cleanup would still be done in the context
of the exiting process, but without holding back anyone waiting for the
process to terminate (so the process appears to exit immediately)

sounds clean, I will do it
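
To make the ordering concrete, here is a rough userspace model of it.
Nothing below is kernel code: the *_model structs and functions are
hypothetical stand-ins that only borrow the names from this thread, and
defer_mmput is an assumed opt-in flag. The sketch follows the variant
where exit_mm hands the mm back instead of dropping it, so that the
mmput happens after exit_notify but still in the exiting task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these structs stand in for the kernel's
 * mm_struct and task_struct and share nothing with them but the idea. */
struct mm_model {
    int users;        /* models the mm user count */
    bool torn_down;   /* set when the last user drops its reference */
};

struct task_model {
    struct mm_model *mm;
    bool defer_mmput; /* hypothetical opt-in flag */
    bool reapable;    /* set by the exit_notify step */
};

/* Drop one reference; tear down the address space on the last one. */
void mmput_model(struct mm_model *mm)
{
    if (--mm->users == 0)
        mm->torn_down = true;
}

/* Detach the mm from the task. With the opt-in flag set, skip the mmput
 * and hand the mm back to the caller; otherwise drop it right away. */
struct mm_model *exit_mm_model(struct task_model *tsk)
{
    struct mm_model *mm = tsk->mm;

    tsk->mm = NULL;
    if (mm == NULL)
        return NULL;
    if (tsk->defer_mmput)
        return mm;
    mmput_model(mm);
    return NULL;
}

/* Models exit_notify: from here on a waiter can reap the task. */
void exit_notify_model(struct task_model *tsk)
{
    tsk->reapable = true;
}

/* Hypothetical exit_mm_late: runs after exit_notify, still in the
 * context of the exiting task, and does the deferred mmput. */
void exit_mm_late_model(struct mm_model *mm)
{
    if (mm != NULL)
        mmput_model(mm);
}

/* Exit path in the proposed order. Returns true if the task was already
 * reapable while its mm was still intact. */
bool do_exit_model(struct task_model *tsk)
{
    struct mm_model *late = exit_mm_model(tsk);
    bool reapable_before_teardown;

    exit_notify_model(tsk);
    reapable_before_teardown = (late != NULL && !late->torn_down);
    exit_mm_late_model(late);
    return reapable_before_teardown;
}
```

With the flag set, do_exit_model reports that the task became reapable
before the teardown ran; without it, the teardown has already happened
by the time exit_notify_model is reached.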

> Or maybe something with schedule_work or task_work, instead of an
> exit_mm_late.  I don't see any practical difference.
> 
> I really don't see why this needs a whole other process to connect to
> the process you care about asynchronously.

accounting. workqueues or kernel threads are not properly accounted to
the right cgroups; by using a userspace process, things get accounted
properly.

this was a major point that was made last month when a similar
discussion came up

> This whole thing seems an exercise in spending lots of resources to free
> resources much later.

there are some use cases for this (huge processes like databases, or huge
secure VMs where the teardown is significantly slower than for normal
processes)

> 
> Eric


