Message-ID: <527963a1-e7e5-78ab-99dd-2677de980eed@huawei.com>
Date: Mon, 18 Mar 2024 18:52:27 +0800
From: Jinjiang Tu
To: David Hildenbrand
Subject: Re: [PATCH v3] mm/ksm: fix ksm exec support for prctl
References: <20240318090441.179486-1-tujinjiang@huawei.com>

On 2024/3/18 17:36, David Hildenbrand wrote:
> On 18.03.24 10:04, Jinjiang Tu wrote:
>> commit 3c6f33b7273a ("mm/ksm: support fork/exec for prctl") inherits
>> MMF_VM_MERGE_ANY flag when a task calls execve(). However, it doesn't
>> create the mm_slot, so ksmd will not try to scan this task.
>>
>> To fix it, allocate and add the mm_slot to ksm_mm_head in
>> __bprm_mm_init() when the mm has MMF_VM_MERGE_ANY flag.
>
> That would mean that 3c6f33b7273a is effectively ineffective for
> fork+exec and only works with fork?
>

Yes. In my test case, the parent process calls prctl() with
PR_SET_MEMORY_MERGE, then forks and execve()s a new process. The new
process allocates 3 anonymous pages with the same content and loops
infinitely. However, the 3 pages are not merged.

>>
>> Fixes: 3c6f33b7273a ("mm/ksm: support fork/exec for prctl")
>> Signed-off-by: Jinjiang Tu
>> ---
>>   fs/exec.c           |  4 ++++
>>   include/linux/ksm.h | 13 +++++++++++++
>>   2 files changed, 17 insertions(+)
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index ff6f26671cfc..00f40163cc12 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -67,6 +67,7 @@
>>   #include
>>   #include
>>   #include
>> +#include
>>
>>   #include
>>   #include
>> @@ -267,6 +268,9 @@ static int __bprm_mm_init(struct linux_binprm *bprm)
>>           goto err_free;
>>       }
>>
>> +    if (ksm_execve(mm))
>> +        goto err;
>> +
>>       /*
>>        * Place the stack at the largest stack address the architecture
>>        * supports. Later, we'll move this to an appropriate place. We don't
>> diff --git a/include/linux/ksm.h b/include/linux/ksm.h
>> index 401348e9f92b..7e2b1de3996a 100644
>> --- a/include/linux/ksm.h
>> +++ b/include/linux/ksm.h
>> @@ -59,6 +59,14 @@ static inline int ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
>>       return 0;
>>   }
>>
>> +static inline int ksm_execve(struct mm_struct *mm)
>> +{
>> +    if (test_bit(MMF_VM_MERGE_ANY, &mm->flags))
>> +        return __ksm_enter(mm);
>
> As soon as we did the __ksm_enter(), we have to set MMF_VM_MERGEABLE.
> I don't think it would be set right now, because:
>
> mm_alloc()->mm_init() will initialize the flags using
>
>     mm->flags = mmf_init_flags(current->mm->flags);
>
> Whereby MMF_INIT_MASK contains only MMF_VM_MERGE_ANY_MASK.
>
>
> So we likely need a set_bit(MMF_VM_MERGEABLE, &mm->flags) here as
> well.
> Otherwise ksm_exit() wouldn't clean up, and we might call
> __ksm_enter() twice.

__ksm_enter() will set MMF_VM_MERGEABLE when it succeeds.

>
>
> And now I wonder, when would be the right point to call __ksm_enter().
>
> __mmput() calls ksm_exit(). But for example, if __bprm_mm_init() fails
> after __ksm_enter(), we will only call mmdrop(), not cleaning up ...
> so that looks wrong.

Yes, I forgot the cleanup in the error path. ksm_exit() should be called
when __bprm_mm_init() fails after __ksm_enter().

>
> We would have to make sure we call __ksm_enter() only once we know
> that callers will call mm_put(). That is the case once we return from
> bprm_mm_init() with success.
>
> Maybe move the ksm_execve() to bprm_mm_init(), adding a comment that
> cleanup will only happen during mm_put(), so it's harder to mess up in
> the future?
>

__ksm_enter() should be called under the mmap write lock to avoid racing
with ksmd. So we can't move ksm_execve() to bprm_mm_init().
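
To make the flag lifecycle discussed above easier to follow, here is a
small user-space toy model in C. It is only a sketch and not kernel code:
the toy_* names, the bit values and the slot counter are made up for
illustration. Only the behaviour it mimics is taken from this thread:
mm_init() keeps just MMF_VM_MERGE_ANY, __ksm_enter() sets
MMF_VM_MERGEABLE on success, ksm_exit() only cleans up when
MMF_VM_MERGEABLE is set, and mmdrop() does not call ksm_exit().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only. Bit positions are hypothetical, not the kernel's values. */
#define TOY_MMF_VM_MERGEABLE 0x1u /* mm has an mm_slot that ksmd scans */
#define TOY_MMF_VM_MERGE_ANY 0x2u /* set via prctl(PR_SET_MEMORY_MERGE) */
#define TOY_MMF_INIT_MASK    TOY_MMF_VM_MERGE_ANY /* only bit kept by mm_init() */

struct toy_mm {
    unsigned int flags;
};

/* Stand-in for the list of mm_slots hanging off ksm_mm_head. */
static int toy_nr_slots;

/* Models mm_init(): the new mm inherits only the bits in the init mask. */
static void toy_mm_init(struct toy_mm *mm, const struct toy_mm *oldmm)
{
    mm->flags = oldmm->flags & TOY_MMF_INIT_MASK;
}

/* Models __ksm_enter(): adds a slot and sets MERGEABLE on success. */
static int toy_ksm_enter(struct toy_mm *mm)
{
    /* Entering twice would add a second slot for the same mm. */
    assert(!(mm->flags & TOY_MMF_VM_MERGEABLE));
    toy_nr_slots++;
    mm->flags |= TOY_MMF_VM_MERGEABLE;
    return 0;
}

/* Models the proposed ksm_execve(). */
static int toy_ksm_execve(struct toy_mm *mm)
{
    if (mm->flags & TOY_MMF_VM_MERGE_ANY)
        return toy_ksm_enter(mm);
    return 0;
}

/* Models ksm_exit(): does nothing unless MERGEABLE is set. */
static void toy_ksm_exit(struct toy_mm *mm)
{
    if (mm->flags & TOY_MMF_VM_MERGEABLE) {
        toy_nr_slots--;
        mm->flags &= ~TOY_MMF_VM_MERGEABLE;
    }
}

/*
 * Simulates a later step of __bprm_mm_init() failing after ksm_execve().
 * Returns the number of slots left behind once the mm has been dropped.
 */
static int toy_failed_exec(bool cleanup_on_error)
{
    struct toy_mm parent = {
        .flags = TOY_MMF_VM_MERGE_ANY | TOY_MMF_VM_MERGEABLE,
    };
    struct toy_mm child;

    toy_nr_slots = 0;
    toy_mm_init(&child, &parent); /* MERGEABLE is dropped here */
    toy_ksm_execve(&child);       /* slot added, MERGEABLE set again */

    if (cleanup_on_error)
        toy_ksm_exit(&child);     /* the cleanup the error path needs */

    /* mmdrop() would free the mm here without calling ksm_exit(). */
    return toy_nr_slots;
}
```

In this model, toy_failed_exec(false) returns 1 (a stale slot is left
behind) and toy_failed_exec(true) returns 0, which matches the concern
about the error path raised above.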