From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.3 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E934C433DB for ; Tue, 16 Mar 2021 01:47:36 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6C66664FC2 for ; Tue, 16 Mar 2021 01:47:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6C66664FC2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D37D76B006C; Mon, 15 Mar 2021 21:47:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CE79C6B006E; Mon, 15 Mar 2021 21:47:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B88686B0070; Mon, 15 Mar 2021 21:47:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0120.hostedemail.com [216.40.44.120]) by kanga.kvack.org (Postfix) with ESMTP id 994666B006C for ; Mon, 15 Mar 2021 21:47:34 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 5333718034A46 for ; Tue, 16 Mar 2021 01:47:34 +0000 (UTC) X-FDA: 77924050428.09.AC2A0F5 Received: from szxga06-in.huawei.com (szxga06-in.huawei.com [45.249.212.32]) by imf17.hostedemail.com (Postfix) with ESMTP id 224944000340 for ; Tue, 16 Mar 2021 01:47:31 +0000 (UTC) Received: from DGGEMS408-HUB.china.huawei.com (unknown [172.30.72.59]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 4Dzx2W0T2vzkZRl; Tue, 16 Mar 2021 09:45:55 +0800 (CST) Received: from [10.174.177.131] (10.174.177.131) by DGGEMS408-HUB.china.huawei.com (10.3.19.208) with Microsoft SMTP Server id 14.3.498.0; Tue, 16 Mar 2021 09:47:23 +0800 Subject: Re: [PATCH v2] hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings To: Mike Kravetz , CC: , , , References: <20210301120540.37076-1-linmiaohe@huawei.com> <771ee69e-61d9-b1c9-b72d-3a50d2cbe4de@oracle.com> <7868ec9c-0762-2a78-2dfc-2bd07dca15f5@huawei.com> <2b23bbf7-bb32-4ed7-87ae-5e55b4bc6629@oracle.com> From: Miaohe Lin Message-ID: <004ebc95-314c-1fbf-be12-cdeeda2b84d7@huawei.com> Date: Tue, 16 Mar 2021 09:47:23 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.6.0 MIME-Version: 1.0 In-Reply-To: <2b23bbf7-bb32-4ed7-87ae-5e55b4bc6629@oracle.com> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.174.177.131] X-CFilter-Loop: Reflected X-Stat-Signature: c9fostag3zcaq8ibemym1o1buz1jhs3k X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 224944000340 Received-SPF: none (huawei.com>: No applicable sender policy available) receiver=imf17; identity=mailfrom; envelope-from=""; helo=szxga06-in.huawei.com; client-ip=45.249.212.32 X-HE-DKIM-Result: none/none X-HE-Tag: 1615859251-367610 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 2021/3/16 2:42, Mike Kravetz wrote: > On 3/12/21 7:11 PM, Miaohe Lin wrote: >> On 2021/3/13 3:09, Mike Kravetz wrote: >>> On 3/1/21 4:05 AM, Miaohe Lin wrote: >>>> The current implementation of hugetlb_cgroup for shared mappings could have >>>> different behavior. Consider the following two scenarios: >>>> >>>> 1.Assume initial css reference count of hugetlb_cgroup is 1: >>>> 1.1 Call hugetlb_reserve_pages with from = 1, to = 2. So css reference >>>> count is 2 associated with 1 file_region. >>>> 1.2 Call hugetlb_reserve_pages with from = 2, to = 3. So css reference >>>> count is 3 associated with 2 file_region. >>>> 1.3 coalesce_file_region will coalesce these two file_regions into one. >>>> So css reference count is 3 associated with 1 file_region now. >>>> >>>> 2.Assume initial css reference count of hugetlb_cgroup is 1 again: >>>> 2.1 Call hugetlb_reserve_pages with from = 1, to = 3. So css reference >>>> count is 2 associated with 1 file_region. >>>> >>>> Therefore, we might have one file_region while holding one or more css >>>> reference counts. This inconsistency could lead to imbalanced css_get() >>>> and css_put() pair. If we do css_put one by one (i.g. hole punch case), >>>> scenario 2 would put one more css reference. If we do css_put all together >>>> (i.g. truncate case), scenario 1 will leak one css reference. >>>> >>>> The imbalanced css_get() and css_put() pair would result in a non-zero >>>> reference when we try to destroy the hugetlb cgroup. The hugetlb cgroup >>>> directory is removed __but__ associated resource is not freed. This might >>>> result in OOM or can not create a new hugetlb cgroup in a busy workload >>>> ultimately. >>>> >>>> In order to fix this, we have to make sure that one file_region must hold >>>> and only hold one css reference. So in coalesce_file_region case, we should >>> >>> I think this would sound/read better if stated as: >>> ... one file_region must hold exactly one css reference ... >>> >>> This phrase is repeated in comments throughout the patch. >>> >>>> release one css reference before coalescence. Also only put css reference >>>> when the entire file_region is removed. >>>> >>>> The last thing to note is that the caller of region_add() will only hold >>>> one reference to h_cg->css for the whole contiguous reservation region. >>>> But this area might be scattered when there are already some file_regions >>>> reside in it. As a result, many file_regions may share only one h_cg->css >>>> reference. In order to ensure that one file_region must hold and only hold >>>> one h_cg->css reference, we should do css_get() for each file_region and >>>> release the reference held by caller when they are done. >>> >>> Thanks for adding this to the commit message. >>> >>>> >>>> Fixes: 075a61d07a8e ("hugetlb_cgroup: add accounting for shared mappings") >>>> Reported-by: kernel test robot (auto build test ERROR) >>>> Signed-off-by: Miaohe Lin >>>> Cc: stable@kernel.org >>>> --- >>>> v1->v2: >>>> Fix auto build test ERROR when CGROUP_HUGETLB is disabled. >>>> --- >>>> include/linux/hugetlb_cgroup.h | 15 ++++++++++-- >>>> mm/hugetlb.c | 42 ++++++++++++++++++++++++++++++---- >>>> mm/hugetlb_cgroup.c | 11 +++++++-- >>>> 3 files changed, 60 insertions(+), 8 deletions(-) >>> >>> Just a few minor nits below, all in comments. It is not required, but >>> would be nice to update these. Code looks good. >>> >> >> Many thanks for review! I will fix all this nits. Should I resend this patch or send another >> one to fix this? Just let me know which is the easiest one for you. >> >> Thanks again. :) >> > > This is really Andrew's call as it goes through his tree. > Sorry, I should have sought advice from Andrew explictly. > I would suggest you just update the comments and send another verion. > In that way, Andrew can do whatever is easiest for him. Will send v3. Many thanks. >