From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [PATCH v16 18/22] mm/lru: replace pgdat lru_lock with lruvec lock
To: Alexander Duyck
Cc: Andrew Morton, Mel Gorman, Tejun Heo, Hugh Dickins,
 Konstantin Khlebnikov, Daniel Jordan, Yang Shi, Matthew Wilcox,
 Johannes Weiner, kbuild test robot, linux-mm, LKML,
 cgroups@vger.kernel.org, Shakeel Butt, Joonsoo Kim, Wei Yang,
 "Kirill A. Shutemov", Michal Hocko, Vladimir Davydov, Rong Chen
References: <1594429136-20002-1-git-send-email-alex.shi@linux.alibaba.com>
 <1594429136-20002-19-git-send-email-alex.shi@linux.alibaba.com>
From: Alex Shi
Message-ID: <62dfd262-a7ac-d18e-216a-2988c690b256@linux.alibaba.com>
Date: Sat, 18 Jul 2020 22:15:02 +0800

On 2020/7/18 5:38 AM, Alexander Duyck wrote:
>> +		return locked_lruvec;
>> +
>> +	if (locked_lruvec)
>> +		unlock_page_lruvec_irqrestore(locked_lruvec, *flags);
>> +
>> +	return lock_page_lruvec_irqsave(page, flags);
>> +}
>> +
> These relock functions have no users in this patch. It might make
> sense to push this code to patch 19 in your series, since that is
> where they are first used. In addition, they don't seem very
> efficient: you already had to call mem_cgroup_page_lruvec() once, so
> why do it again when you could just store the value and lock the new
> lruvec if needed?

Right, it's better to move them to the later patch. As for calling the
function again, it was mainly to keep the code neat. Thanks!

>
>>  #ifdef CONFIG_CGROUP_WRITEBACK
>>
>>  struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb);
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index 14c668b7e793..36c1680efd90 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -261,6 +261,8 @@ struct lruvec {
>>  	atomic_long_t			nonresident_age;
>>  	/* Refaults at the time of last reclaim cycle */
>>  	unsigned long			refaults;
>> +	/* per lruvec lru_lock for memcg */
>> +	spinlock_t			lru_lock;
>>  	/* Various lruvec state flags (enum lruvec_flags) */
>>  	unsigned long			flags;
> Any reason for placing this here instead of at the end of the
> structure? From what I can tell it looks like lruvec is already 128B
> long, so placing the lock at the end would put it into the next
> cacheline, which may provide some performance benefit since it is
> likely to be bounced quite a bit.

Rong Chen (Cc'ed) once reported a performance regression when the lock
was at the end of the struct, and moving it here removed that
regression. I can't reproduce the result myself, but I trust his
report.

...

>> putback:
>> -		spin_unlock_irq(&zone->zone_pgdat->lru_lock);
>>  		pagevec_add(&pvec_putback, pvec->pages[i]);
>>  		pvec->pages[i] = NULL;
>>  	}
>> -	/* tempary disable irq, will remove later */
>> -	local_irq_disable();
>>  	__mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
>> -	local_irq_enable();
>> +	if (lruvec)
>> +		unlock_page_lruvec_irq(lruvec);
> So I am not a fan of this change. You went to all the trouble of
> reducing the lock scope just to bring it back out here again. In
> addition it implies there is a path where you might try to update the
> page state without disabling interrupts.

Right, but is there any way to avoid that other than an extra
local_irq_disable()?
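(For illustration only, a sketch of that fallback, reusing the zone,
lruvec and delta_munlocked variables from the surrounding munlock code:
the explicit IRQ toggle is confined to the path where no lruvec lock is
held, so the non-atomic counter update always runs with interrupts
disabled.)

	if (lruvec) {
		/* lru_lock is held with IRQs disabled, so the
		 * non-atomic counter update is already safe here. */
		__mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
		unlock_page_lruvec_irq(lruvec);
	} else {
		unsigned long flags;

		/* No lruvec was locked: disable IRQs just for the
		 * counter update. */
		local_irq_save(flags);
		__mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
		local_irq_restore(flags);
	}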
...

>> 		if (PageLRU(page)) {
>> -			struct pglist_data *pgdat = page_pgdat(page);
>> +			struct lruvec *new_lruvec;
>>
>> -			if (pgdat != locked_pgdat) {
>> -				if (locked_pgdat)
>> -					spin_unlock_irqrestore(&locked_pgdat->lru_lock,
>> +			new_lruvec = mem_cgroup_page_lruvec(page,
>> +							page_pgdat(page));
>> +			if (new_lruvec != lruvec) {
>> +				if (lruvec)
>> +					unlock_page_lruvec_irqrestore(lruvec,
>> +									flags);
>> 				lock_batch = 0;
>> -				locked_pgdat = pgdat;
>> -				spin_lock_irqsave(&locked_pgdat->lru_lock, flags);
>> +				lruvec = lock_page_lruvec_irqsave(page, &flags);
>> 			}
> This just kind of seems ugly to me. I am not a fan of having to fetch
> the lruvec twice when you already have it in new_lruvec. I suppose it
> is fine though since you are just going to be replacing it later
> anyway.
>

Yes, it will be replaced later (see the sketch below).

Thanks
Alex
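P.S. To make the point above concrete, a hypothetical helper along the
lines Alexander suggests (the name and signature are invented here for
illustration; the posted series instead uses relock_page_lruvec_irqsave(),
which re-derives the lruvec from the page internally). It takes the
new_lruvec the caller has already computed, so mem_cgroup_page_lruvec()
is only called once per page:

	/*
	 * Hypothetical variant of the relock helper that reuses the
	 * lruvec the caller already looked up, instead of deriving it
	 * again from the page.
	 */
	static struct lruvec *relock_lruvec_irqsave(struct lruvec *new_lruvec,
						    struct lruvec *locked,
						    unsigned long *flags)
	{
		if (locked == new_lruvec)
			return locked;

		if (locked)
			spin_unlock_irqrestore(&locked->lru_lock, *flags);

		spin_lock_irqsave(&new_lruvec->lru_lock, *flags);
		return new_lruvec;
	}

The release_pages() loop above would then do
	lruvec = relock_lruvec_irqsave(new_lruvec, lruvec, &flags);
without the second lookup.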