From: Alexander Duyck
Date: Wed, 29 Jul 2020 10:52:01 -0700
Subject: Re: [PATCH v17 18/21] mm/lru: introduce the relock_page_lruvec function
To: Alex Shi
Cc: Andrew Morton, Mel Gorman, Tejun Heo, Hugh Dickins,
	Konstantin Khlebnikov, Daniel Jordan, Yang Shi, Matthew Wilcox,
	Johannes Weiner, kbuild test robot, linux-mm, LKML,
	cgroups@vger.kernel.org, Shakeel Butt, Joonsoo Kim, Wei Yang,
	"Kirill A. Shutemov", Rong Chen, Thomas Gleixner, Andrey Ryabinin
In-Reply-To: <1595681998-19193-19-git-send-email-alex.shi@linux.alibaba.com>
References: <1595681998-19193-1-git-send-email-alex.shi@linux.alibaba.com>
	<1595681998-19193-19-git-send-email-alex.shi@linux.alibaba.com>

On Sat, Jul 25, 2020 at 6:00 AM Alex Shi wrote:
>
> Use this new function to replace repeated same code, no func change.
>
> Signed-off-by: Alex Shi
> Cc: Johannes Weiner
> Cc: Andrew Morton
> Cc: Thomas Gleixner
> Cc: Andrey Ryabinin
> Cc: Matthew Wilcox
> Cc: Mel Gorman
> Cc: Konstantin Khlebnikov
> Cc: Hugh Dickins
> Cc: Tejun Heo
> Cc: linux-kernel@vger.kernel.org
> Cc: cgroups@vger.kernel.org
> Cc: linux-mm@kvack.org
> ---
>  include/linux/memcontrol.h | 40 ++++++++++++++++++++++++++++++++++++++++
>  mm/mlock.c                 |  9 +--------
>  mm/swap.c                  | 33 +++++++--------------------
>  mm/vmscan.c                |  8 +-------
>  4 files changed, 49 insertions(+), 41 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 258901021c6c..6e670f991b42 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -1313,6 +1313,46 @@ static inline void unlock_page_lruvec_irqrestore(struct lruvec *lruvec,
>          spin_unlock_irqrestore(&lruvec->lru_lock, flags);
>  }
>
> +/* Don't lock again iff page's lruvec locked */
> +static inline struct lruvec *relock_page_lruvec_irq(struct page *page,
> +                struct lruvec *locked_lruvec)
> +{
> +        struct pglist_data *pgdat = page_pgdat(page);
> +        bool locked;
> +
> +        rcu_read_lock();
> +        locked = mem_cgroup_page_lruvec(page, pgdat) == locked_lruvec;
> +        rcu_read_unlock();
> +
> +        if (locked)
> +                return locked_lruvec;
> +
> +        if (locked_lruvec)
> +                unlock_page_lruvec_irq(locked_lruvec);
> +
> +        return lock_page_lruvec_irq(page);
> +}
> +
> +/* Don't lock again iff page's lruvec locked */
> +static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page,
> +                struct lruvec *locked_lruvec, unsigned long *flags)
> +{
> +        struct pglist_data *pgdat = page_pgdat(page);
> +        bool locked;
> +
> +        rcu_read_lock();
> +        locked = mem_cgroup_page_lruvec(page, pgdat) == locked_lruvec;
> +        rcu_read_unlock();
> +
> +        if (locked)
> +                return locked_lruvec;
> +
> +        if (locked_lruvec)
> +                unlock_page_lruvec_irqrestore(locked_lruvec, *flags);
> +
> +        return lock_page_lruvec_irqsave(page, flags);
> +}
> +

So looking these over they seem to be pretty inefficient for what they
do. Basically in the worst case (locked_lruvec == NULL) you end up
calling mem_cgroup_page_lruvec, and the rcu_read_lock/unlock around it,
twice for a single page. It might make more sense to structure this
like:

        if (locked_lruvec) {
                if (lruvec_holds_page_lru_lock(page, locked_lruvec))
                        return locked_lruvec;

                unlock_page_lruvec_irqrestore(locked_lruvec, *flags);
        }

        return lock_page_lruvec_irqsave(page, flags);

The other piece that has me scratching my head is that I wonder if we
couldn't do this without needing the rcu_read_lock at all. For example,
what if we were to compare the page's mem_cgroup pointer to the memcg
back pointer stored in the mem_cgroup_per_node that contains the
lruvec? It seems like ordering things this way would significantly
reduce the overhead from the pointer chasing needed to check whether
the page is in the locked lruvec or not.
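Something like the following is roughly what I was thinking of for that
check. Completely untested, and just meant to illustrate the idea:
lruvec_holds_page_lru_lock is the made-up name from the snippet above,
and I am assuming we can get from the lruvec back to its
mem_cgroup_per_node via container_of():

/*
 * Check if the lruvec we already hold covers this page by comparing
 * page->mem_cgroup against the memcg back pointer in the
 * mem_cgroup_per_node embedding the lruvec, instead of calling
 * mem_cgroup_page_lruvec() under rcu_read_lock() again.
 */
static inline bool lruvec_holds_page_lru_lock(struct page *page,
                struct lruvec *lruvec)
{
        pg_data_t *pgdat = page_pgdat(page);
        struct mem_cgroup_per_node *mz;
        struct mem_cgroup *memcg;

        if (mem_cgroup_disabled())
                return lruvec == &pgdat->__lruvec;

        mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
        memcg = page->mem_cgroup;
        if (!memcg)
                memcg = root_mem_cgroup;

        return lruvec->pgdat == pgdat && mz->memcg == memcg;
}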
>  #ifdef CONFIG_CGROUP_WRITEBACK
>
>  struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb);
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 5d40d259a931..bc2fb3bfbe7a 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -303,17 +303,10 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
>          /* Phase 1: page isolation */
>          for (i = 0; i < nr; i++) {
>                  struct page *page = pvec->pages[i];
> -                struct lruvec *new_lruvec;
>
>                  /* block memcg change in mem_cgroup_move_account */
>                  lock_page_memcg(page);
> -                new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> -                if (new_lruvec != lruvec) {
> -                        if (lruvec)
> -                                unlock_page_lruvec_irq(lruvec);
> -                        lruvec = lock_page_lruvec_irq(page);
> -                }
> -
> +                lruvec = relock_page_lruvec_irq(page, lruvec);
>                  if (TestClearPageMlocked(page)) {
>                          /*
>                           * We already have pin from follow_page_mask()
> diff --git a/mm/swap.c b/mm/swap.c
> index 09edac441eb6..6d9c7288f7de 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -209,19 +209,12 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
>
>          for (i = 0; i < pagevec_count(pvec); i++) {
>                  struct page *page = pvec->pages[i];
> -                struct lruvec *new_lruvec;
>
>                  /* block memcg migration during page moving between lru */
>                  if (!TestClearPageLRU(page))
>                          continue;
>
> -                new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> -                if (lruvec != new_lruvec) {
> -                        if (lruvec)
> -                                unlock_page_lruvec_irqrestore(lruvec, flags);
> -                        lruvec = lock_page_lruvec_irqsave(page, &flags);
> -                }
> -
> +                lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
>                  (*move_fn)(page, lruvec);
>
>                  SetPageLRU(page);
> @@ -864,17 +857,12 @@ void release_pages(struct page **pages, int nr)
>                  }
>
>                  if (PageLRU(page)) {
> -                        struct lruvec *new_lruvec;
> -
> -                        new_lruvec = mem_cgroup_page_lruvec(page,
> -                                                        page_pgdat(page));
> -                        if (new_lruvec != lruvec) {
> -                                if (lruvec)
> -                                        unlock_page_lruvec_irqrestore(lruvec,
> -                                                                        flags);
> +                        struct lruvec *prev_lruvec = lruvec;
> +
> +                        lruvec = relock_page_lruvec_irqsave(page, lruvec,
> +                                                                        &flags);
> +                        if (prev_lruvec != lruvec)
>                                  lock_batch = 0;
> -                                lruvec = lock_page_lruvec_irqsave(page, &flags);
> -                        }
>
>                          __ClearPageLRU(page);
>                          del_page_from_lru_list(page, lruvec, page_off_lru(page));
> @@ -980,15 +968,8 @@ void __pagevec_lru_add(struct pagevec *pvec)
>
>          for (i = 0; i < pagevec_count(pvec); i++) {
>                  struct page *page = pvec->pages[i];
> -                struct lruvec *new_lruvec;
> -
> -                new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> -                if (lruvec != new_lruvec) {
> -                        if (lruvec)
> -                                unlock_page_lruvec_irqrestore(lruvec, flags);
> -                        lruvec = lock_page_lruvec_irqsave(page, &flags);
> -                }
>
> +                lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
>                  __pagevec_lru_add_fn(page, lruvec);
>          }
>          if (lruvec)
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 168c1659e430..bdb53a678e7e 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -4292,15 +4292,9 @@ void check_move_unevictable_pages(struct pagevec *pvec)
>
>          for (i = 0; i < pvec->nr; i++) {
>                  struct page *page = pvec->pages[i];
> -                struct lruvec *new_lruvec;
>
>                  pgscanned++;
> -                new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> -                if (lruvec != new_lruvec) {
> -                        if (lruvec)
> -                                unlock_page_lruvec_irq(lruvec);
> -                        lruvec = lock_page_lruvec_irq(page);
> -                }
> +                lruvec = relock_page_lruvec_irq(page, lruvec);
>
>                  if (!PageLRU(page) || !PageUnevictable(page))
>                          continue;
> --
> 1.8.3.1
>
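Just to tie the two pieces together: with a helper like the one I
sketched above, the irqsave variant would then collapse down to
something like the following, again untested, and the _irq version
would be identical minus the flags handling:

static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page,
                struct lruvec *locked_lruvec, unsigned long *flags)
{
        /* Reuse the held lock when it already covers this page */
        if (locked_lruvec) {
                if (lruvec_holds_page_lru_lock(page, locked_lruvec))
                        return locked_lruvec;

                unlock_page_lruvec_irqrestore(locked_lruvec, *flags);
        }

        return lock_page_lruvec_irqsave(page, flags);
}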