From: Muchun Song
Date: Thu, 29 Apr 2021 12:02:46 +0800
Subject: Re: [External] Re: [PATCH v21 0/9] Free some vmemmap pages of HugeTLB page
To: Mike Kravetz
Cc: Jonathan Corbet, Thomas Gleixner, Ingo Molnar, bp@alien8.de, X86 ML,
    hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
    Peter Zijlstra, Alexander Viro, Andrew Morton, paulmck@kernel.org,
    pawan.kumar.gupta@linux.intel.com, Randy Dunlap, oneukum@suse.com,
    anshuman.khandual@arm.com, jroedel@suse.de, Mina Almasry,
    David Rientjes, Matthew Wilcox, Oscar Salvador, Michal Hocko,
    "Song Bao Hua (Barry Song)", David Hildenbrand,
    HORIGUCHI NAOYA (堀口 直也), Joao Martins, Xiongchun duan,
    fam.zheng@bytedance.com, zhengqi.arch@bytedance.com,
    linux-doc@vger.kernel.org, LKML, Linux Memory Management List,
    linux-fsdevel
In-Reply-To: <98f191e8-b509-e541-9d9d-76029c74d241@oracle.com>
References: <20210425070752.17783-1-songmuchun@bytedance.com>
    <98f191e8-b509-e541-9d9d-76029c74d241@oracle.com>

On Thu, Apr 29, 2021 at 10:32 AM Mike Kravetz wrote:
>
> On 4/28/21 5:26 AM, Muchun Song wrote:
> > On Wed, Apr 28, 2021 at 7:47 AM Mike Kravetz wrote:
> >>
> >> Thanks! I will take a look at the modifications soon.
> >>
> >> I applied the patches to Andrew's mmotm-2021-04-21-23-03, ran some
> >> tests, and got the following warning. We may need to special-case the
> >> call to __prep_new_huge_page()/free_huge_page_vmemmap() from
> >> alloc_and_dissolve_huge_page(), as it is made while holding the
> >> hugetlb lock with IRQs disabled.
> >
> > Good catch. Thanks Mike. I will fix it in the next version. How about this:
> >
> > @@ -1618,7 +1617,8 @@ static void __prep_new_huge_page(struct hstate *h, struct page *page)
> >
> >  static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
> >  {
> > +       free_huge_page_vmemmap(h, page);
> >         __prep_new_huge_page(h, page);
> >         spin_lock_irq(&hugetlb_lock);
> >         __prep_account_new_huge_page(h, nid);
> >         spin_unlock_irq(&hugetlb_lock);
> >
> > @@ -2429,6 +2429,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
> >         if (!new_page)
> >                 return -ENOMEM;
> >
> > +       free_huge_page_vmemmap(h, new_page);
> >  retry:
> >         spin_lock_irq(&hugetlb_lock);
> >         if (!PageHuge(old_page)) {
> >
> > @@ -2489,7 +2490,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
> >
> >  free_new:
> >         spin_unlock_irq(&hugetlb_lock);
> > -       __free_pages(new_page, huge_page_order(h));
> > +       update_and_free_page(h, new_page, false);
> >
> >         return ret;
> >  }
>
> Another option would be to leave the prep* routines as they are and only
> modify alloc_and_dissolve_huge_page as follows:

OK. LGTM. I will use this. Thanks Mike.
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 9c617c19fc18..f8e5013a6b46 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2420,14 +2420,15 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>
>         /*
>          * Before dissolving the page, we need to allocate a new one for the
> -        * pool to remain stable. Using alloc_buddy_huge_page() allows us to
> -        * not having to deal with prep_new_huge_page() and avoids dealing of any
> -        * counters. This simplifies and let us do the whole thing under the
> -        * lock.
> +        * pool to remain stable. Here, we allocate the page and 'prep' it
> +        * by doing everything but actually updating counters and adding to
> +        * the pool. This simplifies things and lets us do most of the
> +        * processing under the lock.
>          */
>         new_page = alloc_buddy_huge_page(h, gfp_mask, nid, NULL, NULL);
>         if (!new_page)
>                 return -ENOMEM;
> +       __prep_new_huge_page(h, new_page);
>
>  retry:
>         spin_lock_irq(&hugetlb_lock);
>
> @@ -2473,7 +2474,6 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>          * Reference count trick is needed because allocator gives us
>          * referenced page but the pool requires pages with 0 refcount.
>          */
> -       __prep_new_huge_page(h, new_page);
>         __prep_account_new_huge_page(h, nid);
>         page_ref_dec(new_page);
>         enqueue_huge_page(h, new_page);
>
> @@ -2489,7 +2489,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>
>  free_new:
>         spin_unlock_irq(&hugetlb_lock);
> -       __free_pages(new_page, huge_page_order(h));
> +       update_and_free_page(h, new_page, false);
>
>         return ret;
>  }
>
> --
> Mike Kravetz
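
To spell out the constraint both versions have to respect:
free_huge_page_vmemmap() can allocate memory and therefore sleep, so it
must not run inside the hugetlb_lock critical section once
spin_lock_irq() has disabled interrupts. A minimal sketch of the safe
ordering (the function name below and the gfp_mask computation are made
up for illustration; this is not the real alloc_and_dissolve_huge_page,
and the retry/error handling is omitted):

static int prep_new_page_outside_lock_sketch(struct hstate *h, int nid)
{
        /* Assumed mask; the real caller computes its own gfp_mask. */
        gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
        struct page *new_page;

        /* May sleep; safe because no spinlock is held yet. */
        new_page = alloc_buddy_huge_page(h, gfp_mask, nid, NULL, NULL);
        if (!new_page)
                return -ENOMEM;

        /*
         * __prep_new_huge_page() calls free_huge_page_vmemmap(), which
         * can allocate and sleep, so it must also run before the lock.
         */
        __prep_new_huge_page(h, new_page);

        spin_lock_irq(&hugetlb_lock);           /* IRQs disabled from here */
        __prep_account_new_huge_page(h, nid);   /* counter updates only */
        page_ref_dec(new_page);                 /* pool wants 0 refcount */
        enqueue_huge_page(h, new_page);
        spin_unlock_irq(&hugetlb_lock);

        return 0;
}

Your version keeps exactly this ordering while leaving the prep*
routines untouched, which is why I prefer it.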