From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH] mm/mempolicy: fix potential mpol_new leak in shared_policy_replace
From: Miaohe Lin
To: Michal Hocko, Andrew Morton
References: <20220311093624.39546-1-linmiaohe@huawei.com>
 <26577566-ae1e-801c-8c64-89c2c89a487d@huawei.com>
 <24b2a9ef-eea0-09bd-6842-121d8436e56a@huawei.com>
 <6ebebfd6-6356-e956-4fbc-0abaa58308ff@huawei.com>
 <207bbd69-6678-5120-3760-e2bcd9803a14@huawei.com>
Message-ID: <36b0ea44-39ab-bc52-1ae5-eca2cf832900@huawei.com>
Date: Sat, 19 Mar 2022 18:42:33 +0800
In-Reply-To: <207bbd69-6678-5120-3760-e2bcd9803a14@huawei.com>

On 2022/3/17 17:34, Miaohe Lin wrote:
> On 2022/3/17 17:03, Michal Hocko wrote:
>> On Thu 17-03-22 10:05:08, Miaohe Lin wrote:
>>> On 2022/3/16 17:56, Michal Hocko wrote:
>>>> On Wed 16-03-22 14:39:37, Miaohe Lin wrote:
>>>>> On 2022/3/15 23:27, Michal Hocko wrote:
>>>>>> On Tue 15-03-22 21:42:29, Miaohe Lin wrote:
>>>>>>> On 2022/3/15 0:44, Michal Hocko wrote:
>>>>>>>> On Fri 11-03-22 17:36:24, Miaohe Lin wrote:
>>>>>>>>> If mpol_new is allocated but not used in restart loop, mpol_new will be
>>>>>>>>> freed via mpol_put before returning to the caller. But refcnt is not
>>>>>>>>> initialized yet, so mpol_put could not do the right things and might
>>>>>>>>> leak the unused mpol_new.
>>>>>>>>
>>>>>>>> The code is really hideous but is there really any bug there? AFAICS the
>>>>>>>> new policy is only allocated in if (n->end > end) branch and that one
>>>>>>>> will set the reference count on the retry. Or am I missing something?
>>>>>>>>
>>>>>>>
>>>>>>> Many thanks for your comment.
>>>>>>> IIUC, new policy is allocated via the below code:
>>>>>>>
>>>>>>> shared_policy_replace:
>>>>>>> alloc_new:
>>>>>>> 	write_unlock(&sp->lock);
>>>>>>> 	ret = -ENOMEM;
>>>>>>> 	n_new = kmem_cache_alloc(sn_cache, GFP_KERNEL);
>>>>>>> 	if (!n_new)
>>>>>>> 		goto err_out;
>>>>>>> 	mpol_new = kmem_cache_alloc(policy_cache, GFP_KERNEL);
>>>>>>> 	if (!mpol_new)
>>>>>>> 		goto err_out;
>>>>>>> 	goto restart;
>>>>>>>
>>>>>>> And mpol_new's reference count will be set before it is used in the
>>>>>>> n->end > end case. But if that is "not" the case, i.e. mpol_new is not
>>>>>>> inserted into the rb_tree, mpol_new will be freed via mpol_put before return:
>>>>>>
>>>>>> One thing I have missed previously is that the lock is dropped during
>>>>>> the allocation so I guess the memory policy could have been changed
>>>>>> during that time. Is this possible? Have you explored this possibility?
>>>>>> Is this a theoretical problem or can it be triggered intentionally?
>>>>>>
>>>>>
>>>>> This was found via code investigation. I think this could be triggered if
>>>>> there are many concurrent mpol_set_shared_policy calls in place. But the
>>>>> user-visible effect might be obscure, as only sizeof(struct mempolicy)
>>>>> bytes can possibly leak each time.
>>>>>
>>>>>> These details would be really interesting for the changelog so that we
>>>>>> can judge how important this would be.
>>>>>
>>>>> This might not be that important as this issue should have been
>>>>> well-concealed for almost ten years (since commit 42288fe366c4 ("mm:
>>>>> mempolicy: Convert shared_policy mutex to spinlock")).
>>>>
>>>> I think it is really worth drilling down to the bottom of the issue.
>>>> While "theoretically possible" can be good enough to justify the change,
>>>> it is usually preferred to describe the underlying problem for future
>>>> maintainability.
>>>
>>> This issue mainly causes mpol_new memory leaks and this is pointed out in
>>> the commit log. Am I supposed to do something more to move this patch
>>> forward? Could you point that out for me?
>>
>> Sorry if I was not really clear. My main request is to have a clear
>> insight whether this is a theoretical issue or the leak could be really
>> triggered. If the latter we need to mark it properly and backport to
>> older kernels because memory leaks can lead to DoS when they are
>> reasonably easy to trigger.
>>
>> Is this more clear now?
>
> I see. Many thanks. I will try to trigger this. :)
>

This can be triggered easily with the code snippet below in my virtual machine:

	shmid = shmget((key_t)5566, 1024 * PAGE_SIZE, 0666|IPC_CREAT);
	shm = shmat(shmid, 0, 0);
	loop {
		mbind(shm, 1024 * PAGE_SIZE, MPOL_LOCAL, mask, maxnode, 0);
		mbind(shm + 128 * PAGE_SIZE, 128 * PAGE_SIZE, MPOL_DEFAULT, mask,
		      maxnode, 0);
	}

If many processes are doing the above work, mpol_new will be leaked easily.
So should I resend this patch with Cc stable? But it seems I am not supposed
to make this decision and the maintainer will take care of this?

Many thanks. :)

>>
>