References: <20201026145114.59424-1-songmuchun@bytedance.com>
 <20201030091445.GF1478@dhcp22.suse.cz>
In-Reply-To: <20201030091445.GF1478@dhcp22.suse.cz>
From: Muchun Song
Date: Fri, 30 Oct 2020 18:24:25 +0800
Subject: Re: [External] Re: [PATCH v2 00/19] Free some vmemmap pages of
 hugetlb page
To: Michal Hocko
Cc: Jonathan Corbet, Mike Kravetz, Thomas Gleixner, mingo@redhat.com,
 bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com,
 luto@kernel.org, Peter Zijlstra, viro@zeniv.linux.org.uk, Andrew Morton,
 paulmck@kernel.org, mchehab+huawei@kernel.org,
 pawan.kumar.gupta@linux.intel.com, Randy Dunlap, oneukum@suse.com,
 anshuman.khandual@arm.com, jroedel@suse.de, Mina Almasry, David Rientjes,
 Matthew Wilcox, Xiongchun duan, linux-doc@vger.kernel.org, LKML,
 Linux Memory Management List, linux-fsdevel

On Fri, Oct 30, 2020 at 5:14 PM Michal Hocko wrote:
>
> On Mon 26-10-20 22:50:55, Muchun Song wrote:
> > If we uses the 1G hugetlbpage, we can save 4095 pages. This is a very
> > substantial gain. On our server, run some SPDK/QEMU applications which
> > will use 1000GB hugetlbpage. With this feature enabled, we can save
> > ~16GB(1G hugepage)/~11GB(2MB hugepage) memory.
> [...]
> >  15 files changed, 1091 insertions(+), 165 deletions(-)
> >  create mode 100644 include/linux/bootmem_info.h
> >  create mode 100644 mm/bootmem_info.c
>
> This is a neat idea but the code footprint is really non trivial. To a
> very tricky code which hugetlb is unfortunately.
>
> Saving 1,6% of memory is definitely interesting especially for 1GB pages
> which tend to be more static and where the savings are more visible.
>
> Anyway, I haven't seen any runtime overhead analysis here.
> What is the price to modify the vmemmap page tables and make them pte
> rather than pmd based (especially for 2MB hugetlb). Also, how expensive
> is the vmemmap page tables reconstruction on the freeing path?

Yeah, I haven't measured the remapping overhead of reserving a hugetlb
page yet. I can do that. But the overhead is not paid on each
allocation/free of a hugetlb page; it is paid only once, when we reserve
hugetlb pages through /proc/sys/vm/nr_hugepages. Once the reservation
succeeds, subsequent allocation, freeing and use are the same as before
(unpatched). So I think the overhead is acceptable.

Thanks.

>
> Thanks!
> --
> Michal Hocko
> SUSE Labs

-- 
Yours,
Muchun
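
The savings figures quoted in the thread can be sanity-checked with a rough calculation. The sketch below is not part of the patch series. It assumes 4 KiB base pages and a 64-byte struct page (neither is stated in the message), and the helper names are made up for illustration. The only inputs taken from the thread are the 4095 pages freed per 1G hugetlb page and the 1000 reserved huge pages.

```python
# Rough check of the vmemmap savings quoted in the thread.
# Assumptions (NOT stated in the message): 4 KiB base pages and a
# 64-byte struct page. Helper names are hypothetical.

BASE_PAGE = 4 * 1024      # bytes per base page (assumed)
STRUCT_PAGE = 64          # bytes per struct page (assumed)
MIB = 1024 ** 2
GIB = 1024 ** 3


def vmemmap_pages(hugepage_size):
    """Base pages needed to hold the struct page array of one huge page."""
    base_pages = hugepage_size // BASE_PAGE
    return base_pages * STRUCT_PAGE // BASE_PAGE


def saved_bytes(n_hugepages, freed_pages_per_hugepage):
    """Memory returned when each huge page frees that many vmemmap pages."""
    return n_hugepages * freed_pages_per_hugepage * BASE_PAGE


pages_1g = vmemmap_pages(GIB)        # 4096 vmemmap pages per 1G huge page
pages_2m = vmemmap_pages(2 * MIB)    # 8 vmemmap pages per 2MB huge page

# Figure from the thread: 4095 pages freed per 1G huge page, 1000 of them.
saved = saved_bytes(1000, 4095)
fraction = saved / (1000 * GIB)

print(f"vmemmap pages per 1G huge page: {pages_1g}")
print(f"vmemmap pages per 2MB huge page: {pages_2m}")
print(f"saved with 1000 x 1G: {saved / 1e9:.2f} GB ({fraction:.1%})")
```

Under these assumptions the 1G case frees all but one of the 4096 vmemmap pages per huge page, which is where the roughly 16GB and 1.6% figures come from. The 2MB figure is not reproduced here because the message does not say how many pages are freed per 2MB huge page.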