From: Muchun Song
Date: Wed, 9 Feb 2022 15:44:54 +0800
Subject: Re: [PATCH v7 0/5] Free the 2nd vmemmap page associated with each HugeTLB page
To: Andrew Morton, Mike Kravetz
Cc: Oscar Salvador, David Hildenbrand, Michal Hocko, Matthew Wilcox,
    Jonathan Corbet, Xiongchun duan, Fam Zheng, Muchun Song, Qi Zheng,
    Linux Doc Mailing List, LKML, Linux Memory Management List,
    "Song Bao Hua (Barry Song)", Barry Song <21cnbao@gmail.com>,
    "Bodeddula, Balasubramaniam", Jue Wang
References: <20211101031651.75851-1-songmuchun@bytedance.com>
    <35c5217d-eb8f-6f70-544a-a3e8bd009a46@oracle.com>
    <20211123190952.7d1e0cac2d72acacd2df016c@linux-foundation.org>

On Wed, Jan 26, 2022 at 4:04 PM Muchun Song wrote:
>
> On Wed, Nov 24, 2021 at 11:09 AM Andrew Morton wrote:
> >
> > On Mon, 22 Nov 2021 12:21:32 +0800 Muchun Song wrote:
> >
> > > On Wed, Nov 10, 2021 at 2:18 PM Muchun Song wrote:
> > > >
> > > > On Tue, Nov 9, 2021 at 3:33 AM Mike Kravetz wrote:
> > > > >
> > > > > On 11/8/21 12:16 AM, Muchun Song wrote:
> > > > > > On Mon, Nov 1, 2021 at 11:22 AM Muchun Song wrote:
> > > > > >>
> > > > > >> This series can minimize the overhead of struct page for 2MB HugeTLB pages
> > > > > >> significantly. It further reduces the overhead of struct page by 12.5% for
> > > > > >> a 2MB HugeTLB compared to the previous approach, which means 2GB per 1TB
> > > > > >> HugeTLB. It is a nice gain. Comments and reviews are welcome. Thanks.
> > > > > >>
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Ping guys. Does anyone have any comments or suggestions
> > > > > > on this series?
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > >
> > > > > I did look over the series earlier. I have no issue with the hugetlb and
> > > > > vmemmap modifications as they are enhancements to the existing
> > > > > optimizations. My primary concern is the (small) increased overhead
> > > > > for the helpers as outlined in your cover letter. Since these helpers
> > > > > are not limited to hugetlb and used throughout the kernel, I would
> > > > > really like to get comments from others with a better understanding of
> > > > > the potential impact.
> > > >
> > > > Thanks Mike. I'd like to hear others' comments about this as well.
> > > > From my point of view, maybe the (small) overhead is acceptable
> > > > since it only affects the head page, however Matthew Wilcox's folio
> > > > series could reduce this situation as well.
> >
> > I think Mike was inviting you to run some tests to quantify the
> > overhead ;)
>
> Hi Andrew,
>
> Sorry for the late reply.
>
> Specific overhead figures are already in the cover letter. Also,
> I did some other tests, e.g. kernel compilation, sysbench. I didn't
> see any regressions.

The overhead is introduced by page_fixed_fake_head(), which has an
"if" statement and an access to a possibly cold cache line. I think
the main overhead comes from the latter. However, probabilistically,
only 1/64 of the pages need that access. page_fixed_fake_head() is
already simple (I mean its overhead is small enough), and many
performance bottlenecks in mm are not in compound_head(). This also
matches the tests I did: I didn't see any regressions after enabling
this feature.

I know Mike's concern is the increased overhead for use cases beyond
HugeTLB. If we really want to avoid the access to a possibly cold
cache line, we can introduce a new page flag like PG_hugetlb and test
whether it is set in page->flags. If so, we return the real head page
struct. page_fixed_fake_head() would then look like this:

static __always_inline const struct page *page_fixed_fake_head(const struct page *page)
{
        if (!hugetlb_free_vmemmap_enabled())
                return page;

        if (test_bit(PG_hugetlb, &page->flags)) {
                unsigned long head = READ_ONCE(page[1].compound_head);

                if (likely(head & 1))
                        return (const struct page *)(head - 1);
        }
        return page;
}

But I don't think it's worth doing this.

Hi Mike and Andrew,

Since these helpers are not limited to hugetlb and used throughout the
kernel, I would really like to get comments from others with a better
understanding of the potential impact. Do you have any appropriate
reviewers to invite?

Thanks.

> > >
> > > Ping guys.
> > >
> > > Hi Andrew,
> > >
> > > Do you have any suggestions on this series to move it on?
> > >
> >
> > I tossed it in there for some testing but yes please, additional
> > reviewing?
>
> It's already been in the next-tree (also in our ByteDance servers)
> for several months, and I didn't receive any negative feedback.
>
> Do you think it is ready for 5.17?
>
> Thanks.
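
For readers who want to check the figures quoted in this thread (12.5% per 2MB HugeTLB page, 2GB per 1TB of HugeTLB, and the 1/64 ratio), here is a minimal arithmetic sketch. It assumes 4 KiB base pages and a 64-byte struct page, neither of which is stated in the thread, and every name in it is hypothetical rather than a kernel identifier.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions (not stated in the thread): common x86-64 defaults. */
#define BASE_PAGE_SIZE   4096ULL        /* 4 KiB base page */
#define STRUCT_PAGE_SIZE 64ULL          /* bytes per struct page */
#define HUGE_PAGE_SIZE   (2ULL << 20)   /* 2 MiB HugeTLB page */

/* How many struct pages fit in one vmemmap page. */
static uint64_t struct_pages_per_vmemmap_page(void)
{
	return BASE_PAGE_SIZE / STRUCT_PAGE_SIZE;
}

/* How many vmemmap pages describe one 2 MiB HugeTLB page. */
static uint64_t vmemmap_pages_per_huge_page(void)
{
	uint64_t base_pages = HUGE_PAGE_SIZE / BASE_PAGE_SIZE;

	return base_pages * STRUCT_PAGE_SIZE / BASE_PAGE_SIZE;
}

/* Bytes saved when one more vmemmap page is freed per HugeTLB page. */
static uint64_t extra_bytes_saved(uint64_t hugetlb_bytes)
{
	/* The model only handles whole HugeTLB pages. */
	assert(hugetlb_bytes % HUGE_PAGE_SIZE == 0);
	return hugetlb_bytes / HUGE_PAGE_SIZE * BASE_PAGE_SIZE;
}
```

Under these assumptions a 2MB HugeTLB page is described by 8 vmemmap pages, so freeing one more of them is 1/8 = 12.5%, and 1TB of HugeTLB contains 524288 such pages, which gives 2GB. One vmemmap page holds 64 struct pages, which appears consistent with the 1/64 figure if only the first struct page in each vmemmap page can need the extra access.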
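
The PG_hugetlb variant of page_fixed_fake_head() discussed in this message can be modelled in userspace to make its control flow easier to follow. This is only a sketch: the struct, the flag bit, and the global switch are hypothetical stand-ins for struct page, PG_hugetlb, and hugetlb_free_vmemmap_enabled(), and it leaves out READ_ONCE() and the likely() hint, so it says nothing about the real cache or concurrency behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit standing in for the proposed PG_hugetlb. */
#define MODEL_PG_HUGETLB (1UL << 0)

/* Hypothetical, much reduced stand-in for struct page. */
struct model_page {
	unsigned long flags;
	uintptr_t compound_head;   /* bit 0 set: tail page, rest is the head address */
};

/* Stand-in for hugetlb_free_vmemmap_enabled(). */
static bool model_vmemmap_opt_enabled = true;

static const struct model_page *
model_fixed_fake_head(const struct model_page *page)
{
	if (!model_vmemmap_opt_enabled)
		return page;

	/* Only flagged pages pay for the read of the neighbouring entry. */
	if (page->flags & MODEL_PG_HUGETLB) {
		uintptr_t head = page[1].compound_head;

		/* A tail entry encodes the real head address with bit 0 set. */
		if (head & 1)
			return (const struct model_page *)(head - 1);
	}
	return page;
}
```

In this model a page without the flag is returned after a single flags test, which is the point of the proposal: the read of page[1] is confined to pages that carry the flag.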