Subject: Re: [PATCH v2] mm/page_alloc: minor clean up for memmap_init_compound()
To: Muchun Song, Joao Martins
References: <20220611021352.13529-1-linmiaohe@huawei.com>
From: Miaohe Lin
Message-ID: <05a774de-12ea-e425-bd9d-b626aafa5831@huawei.com>
Date: Mon, 13 Jun 2022 09:51:26 +0800

On 2022/6/12 23:44, Muchun Song wrote:
> On Sat, Jun 11, 2022 at 10:13:52AM +0800, Miaohe Lin wrote:
>> Since commit 5232c63f46fd ("mm: Make compound_pincount always available"),
>> compound_pincount_ptr is stored at first tail page now. So we should call
>> prep_compound_head() after the first tail page is initialized to take
>> advantage of the likelihood of that tail struct page being cached given
>> that we will read them right after in prep_compound_head().
>>
>> Signed-off-by: Miaohe Lin
>> Cc: Joao Martins
>> ---
>> v2:
>>   Don't move prep_compound_head() outside loop per Joao.
>> ---
>>  mm/page_alloc.c | 17 +++++++++++------
>>  1 file changed, 11 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 4c7d99ee58b4..048df5d78add 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6771,13 +6771,18 @@ static void __ref memmap_init_compound(struct page *head,
>>  		set_page_count(page, 0);
>>
>>  		/*
>> -		 * The first tail page stores compound_mapcount_ptr() and
>> -		 * compound_order() and the second tail page stores
>> -		 * compound_pincount_ptr(). Call prep_compound_head() after
>> -		 * the first and second tail pages have been initialized to
>> -		 * not have the data overwritten.
>> +		 * The first tail page stores compound_mapcount_ptr(),
>> +		 * compound_order() and compound_pincount_ptr(). Call
>> +		 * prep_compound_head() after the first tail page have
>> +		 * been initialized to not have the data overwritten.
>> +		 *
>> +		 * Note the idea to make this right after we initialize
>> +		 * the offending tail pages is trying to take advantage
>> +		 * of the likelihood of those tail struct pages being
>> +		 * cached given that we will read them right after in
>> +		 * prep_compound_head().
>>  		 */
>> -		if (pfn == head_pfn + 2)
>> +		if (unlikely(pfn == head_pfn + 1))
>>  			prep_compound_head(head, order);
>
> For me it is weird not to put this out of the loop. I saw the reason
> is because of the caching suggested by Joao. But I think this is not
> a hot path and putting it out of the loop may be more intuitive at least
> for me. Maybe this optimization is unnecessary (maybe I am wrong).
> And it will be consistent with prep_compound_page() (at least it does
> not do the similar optimization) if we drop this optimization.

This is also what I thought in the first version. :)

>
> Hi Joao,
>
> I am wondering is it a significant optimization for zone device memory?
> I found this code existed from the 1st version you introduced. So
> I suspect maybe you have some numbers, would you like to share with us?

Those numbers would be really helpful.

>
> Thanks.

Thanks!

>
> .
>
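
For reference, the two placements under discussion can be sketched in userspace as below. This is only an illustration, not kernel code: struct mock_page, mock_init_tail() and mock_prep_head() are hypothetical stand-ins for struct page, the per-tail-page init step and prep_compound_head(), and the sketch assumes order >= 1 so that a first tail page exists.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct mock_page {
	int initialized;	/* set once the per-page init step has run */
	unsigned int order;	/* stand-in for metadata kept in the first tail page */
};

/* Stand-in for the per-tail-page init step, which resets the page's fields. */
static void mock_init_tail(struct mock_page *page)
{
	page->initialized = 1;
	page->order = 0;
}

/* Stand-in for prep_compound_head(): writes metadata into the first tail page. */
static void mock_prep_head(struct mock_page *head, unsigned int order)
{
	head[1].order = order;
}

/*
 * Variant A (as in the v2 patch): prepare the head inside the loop,
 * right after the first tail page has been initialized.
 */
static void init_compound_in_loop(struct mock_page *head, unsigned int order)
{
	size_t nr = (size_t)1 << order;

	for (size_t i = 1; i < nr; i++) {
		mock_init_tail(&head[i]);
		if (i == 1)
			mock_prep_head(head, order);
	}
}

/*
 * Variant B (as suggested in the review): prepare the head once,
 * after every tail page has been initialized. Assumes order >= 1.
 */
static void init_compound_after_loop(struct mock_page *head, unsigned int order)
{
	size_t nr = (size_t)1 << order;

	for (size_t i = 1; i < nr; i++)
		mock_init_tail(&head[i]);
	mock_prep_head(head, order);
}
```

Both variants leave the first tail page holding the same metadata, because in each case the write happens after that page's own init step. The sketch says nothing about cache behaviour, which is the open question the requested numbers would answer.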