Date: Fri, 12 Nov 2021 10:34:39 +0100
From: Claudio Imbrenda
To: ebiederm@xmission.com (Eric W. Biederman)
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, thuth@redhat.com,
 frankja@linux.ibm.com, borntraeger@de.ibm.com, Ulrich.Weigand@de.ibm.com,
 david@redhat.com, ultrachin@163.com, akpm@linux-foundation.org,
 vbabka@suse.cz, brookxu.cn@gmail.com, xiaoggchen@tencent.com,
 linuszeng@tencent.com, yihuilu@tencent.com, mhocko@suse.com,
 daniel.m.jordan@oracle.com, axboe@kernel.dk, legion@kernel.org,
 peterz@infradead.org, aarcange@redhat.com, christian@brauner.io,
 tglx@linutronix.de
Subject: Re: [RFC v1 2/4] kernel/fork.c: implement new process_mmput_async syscall
Message-ID: <20211112103439.441b4c12@p-imbrenda>
In-Reply-To: <874k8ixzx0.fsf@email.froward.int.ebiederm.org>
References: <20211111095008.264412-1-imbrenda@linux.ibm.com>
 <20211111095008.264412-4-imbrenda@linux.ibm.com>
 <874k8ixzx0.fsf@email.froward.int.ebiederm.org>
Organization: IBM

On Thu, 11 Nov 2021 13:20:11 -0600
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Claudio Imbrenda writes:
>
> > The goal of this new syscall is to be able to asynchronously free the
> > mm of a dying process. This is especially useful for processes that use
> > huge amounts of memory (e.g. databases or KVM guests). The process is
> > allowed to terminate immediately, while its mm is cleaned/reclaimed
> > asynchronously.
> >
> > A separate process needs to use the process_mmput_async syscall to
> > attach itself to the mm of a running target process. The process will
> > then sleep until the last user of the target mm has gone.
> >
> > When the last user of the mm has gone, instead of synchronously freeing
> > the mm, the attached process is awoken. The syscall will then continue
> > and clean up the target mm.
> >
> > This solution has the advantage that the cleanup of the target mm can
> > be both asynchronous and properly accounted for (e.g. cgroups).
> >
> > Tested on s390x.
> >
> > A separate patch will actually wire up the syscall.
>
> I am a bit confused.
>
> You want the process to report that it has finished immediately,
> and you want the cleanup work to continue on in the background.
>
> Why do you need a separate process?
>
> Why not just modify the process cleanup code to keep the task_struct
> running while allowing waitpid to reap the process (aka allowing
> release_task to run)? All tasks can already be reaped after
> exit_notify in do_exit.
>
> I can see some reasons for wanting an opt-in. It is nice to know all of
> a process's resources have been freed when waitpid succeeds.
>
> Still I don't see why this whole thing isn't exit_mm returning
> the mm_struct when a flag is set, and then having an exit_mm_late
> being called and passed the returned mm after exit_notify.

Never mind, exit_notify is done after cgroup_exit; the teardown would
then not be accounted properly.

> Or maybe something with schedule_work or task_work, instead of an
> exit_mm_late. I don't see any practical difference.
>
> I really don't see why this needs a whole other process to connect to
> the process you care about asynchronously.
>
> This whole thing seems an exercise in spending lots of resources to free
> resources much later.
>
> Eric
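
For readers following the thread, the hand-off described in the quoted
patch description can be sketched as a toy user-space model. This is not
kernel code and not the patch itself: ToyMM, attach_reaper and mmput are
hypothetical names invented for illustration, and the model only tracks
who ends up doing the teardown when the last user of the mm goes away.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyMM:
    """Toy stand-in for an mm: a user count plus a log of teardown events."""
    users: int = 1
    reaper: Optional[str] = None  # hypothetical: name of the attached helper
    log: List[str] = field(default_factory=list)


def attach_reaper(mm: ToyMM, name: str) -> None:
    """Model a helper attaching itself to a live mm (it would then sleep)."""
    if mm.users <= 0:
        raise ValueError("mm has no users left to wait for")
    mm.reaper = name


def mmput(mm: ToyMM, caller: str) -> str:
    """Drop one user and decide who tears the mm down at zero users."""
    if mm.users <= 0:
        raise ValueError("mmput on an mm with no users")
    mm.users -= 1
    if mm.users > 0:
        return "still in use"
    if mm.reaper is None:
        # No helper attached: the last user pays for the teardown itself.
        mm.log.append(f"teardown by {caller} (synchronous)")
        return "freed synchronously"
    # Helper attached: wake it and let it do the teardown in its own context.
    mm.log.append(f"wake {mm.reaper}")
    mm.log.append(f"teardown by {mm.reaper} (asynchronous)")
    return "handed to reaper"


if __name__ == "__main__":
    mm = ToyMM(users=2)
    attach_reaper(mm, "helper")
    print(mmput(mm, "thread-1"))
    print(mmput(mm, "thread-2"))
    print(mm.log)
```

In this model the log records which context performs the teardown. That
is the point the thread turns on: the work is done by whichever context
is still around to be charged for it, so the order of the exit steps
decides whether the teardown can be accounted properly.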