From: Aaron Lu <aaron.lu@intel.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Vlastimil Babka <vbabka@suse.cz>,
Michal Hocko <mhocko@kernel.org>,
Jesper Dangaard Brouer <brouer@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH 5/6] mm/page_alloc: Free pages in a single pass during bulk free
Date: Fri, 18 Feb 2022 20:13:04 +0800
Message-ID: <Yg+NUEazDNNtpVzv@ziqianlu-nuc9qn>
In-Reply-To: <20220218094716.GY3366@techsingularity.net>
On Fri, Feb 18, 2022 at 09:47:16AM +0000, Mel Gorman wrote:
> On Fri, Feb 18, 2022 at 02:07:42PM +0800, Aaron Lu wrote:
> > > @@ -1498,12 +1508,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> > > if (bulkfree_pcp_prepare(page))
> > > continue;
> > >
> > > - /* Encode order with the migratetype */
> > > - page->index <<= NR_PCP_ORDER_WIDTH;
> > > - page->index |= order;
> > > -
> > > - list_add_tail(&page->lru, &head);
> > > -
> > > /*
> > > * We are going to put the page back to the global
> > > * pool, prefetch its buddy to speed up later access
> > > @@ -1517,36 +1521,18 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> > > prefetch_buddy(page, order);
> > > prefetch_nr--;
> > > }
> >
> > The comment above 'if (prefetch_nr)' says: "We are going to put the page
> > back to the global pool, prefetch its buddy to speed up later access
> > under zone->lock..." will have to be modified as the prefetch is now
> > done inside the lock.
> >
>
> Yes, that was my understanding.
>
> > I remember prefetch_buddy()'s original intent was to fetch the buddy
> > page's 'struct page' before acquiring the zone lock to speed up
> > operations inside the locked region. Now that the zone lock is acquired
> > early, whether to still keep the prefetch_buddy() inside the lock
> > becomes questionable.
> >
>
> I agree. I wanted to take it out but worried it might stall (drumroll)
> the rest of the series as evaluating prefetch is machine specific.

Understood.

> Before this series I thought it was possible that the prefetched lines
> would be flushed if the lists were large enough. Due to free_factor,
> it's possible we are freeing tens of thousands of pages and the
> prefetched pages would be evicted. It would require a fairly small
> cache though.

Makes sense.
>
> There are still two reasons why I thought it should go away as a
> follow-up to the series.
>
> 1. There is a guaranteed cost to calculating the buddy which definitely
> has to be calculated again. However, as the zone lock is held and
> there is no deferring of buddy merging, there is no guarantee that the
> prefetch will have completed when the second buddy calculation takes
> place and buddies are being merged. With or without the prefetch, there
> may be further stalls depending on how many pages get merged. In other
> words, a stall due to merging is inevitable and at best only one stall
> might be avoided at the cost of calculating the buddy location twice.
>
> 2. As the zone lock is held, prefetch_nr makes less sense as once
> prefetch_nr expires, the cache lines of interest have already been
> merged.
>
> It's point 1 that was my main concern. We are paying a guaranteed cost for
> a maybe win if prefetching is fast enough and it would be very difficult to
> spot what percentage of prefetches actually helped. It was more clear-cut
> when the buddy freeing was deferred as there was more time for the prefetch
> to complete.
Both points make sense to me.
I'm also thinking that since zone lock contention is much better now
(presumably thanks to your free_factor patchset) than before, these
techniques (picking pages to free before acquiring the lock and
prefetching the buddy on the free path) make less sense now.
>
> > After the nr_task=4/16/64 tests finished, I'll also test the effect of
> > removing prefetch_buddy() here.
> >
>
> I'd appreciate it. I think the patch is this (build tested only);
>
Looks good to me, thanks!
> --8<--
> mm/page_alloc: Do not prefetch buddies during bulk free
>
> free_pcppages_bulk() has taken two passes through the pcp lists since
> commit 0a5f4e5b4562 ("mm/free_pcppages_bulk: do not hold lock when picking
> pages to free") due to deferring the cost of selecting PCP lists until
> the zone lock is held.
>
> As the list processing now takes place under the zone lock, it's less
> clear that this is always a benefit, for two reasons.
>
> 1. There is a guaranteed cost to calculating the buddy which definitely
> has to be calculated again. However, as the zone lock is held and
> there is no deferring of buddy merging, there is no guarantee that the
> prefetch will have completed when the second buddy calculation takes
> place and buddies are being merged. With or without the prefetch, there
> may be further stalls depending on how many pages get merged. In other
> words, a stall due to merging is inevitable and at best only one stall
> might be avoided at the cost of calculating the buddy location twice.
>
> 2. As the zone lock is held, prefetch_nr makes less sense as once
> prefetch_nr expires, the cache lines of interest have already been
> merged.
>
> The main concern is that there is a definite cost to calculating the
> buddy location early for the prefetch and it is a "maybe win" depending
> on whether the CPU prefetch logic and memory is fast enough. Remove the
> prefetch logic on the basis that reduced instructions in a path is always
> a saving whereas the prefetch might save one memory stall depending on
> the CPU and memory.
>
> Suggested-by: Aaron Lu <aaron.lu@intel.com>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
> mm/page_alloc.c | 24 ------------------------
> 1 file changed, 24 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index de9f072d23bd..2d5cc098136d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1432,15 +1432,6 @@ static bool bulkfree_pcp_prepare(struct page *page)
> }
> #endif /* CONFIG_DEBUG_VM */
>
> -static inline void prefetch_buddy(struct page *page, unsigned int order)
> -{
> - unsigned long pfn = page_to_pfn(page);
> - unsigned long buddy_pfn = __find_buddy_pfn(pfn, order);
> - struct page *buddy = page + (buddy_pfn - pfn);
> -
> - prefetch(buddy);
> -}
> -
> /*
> * Frees a number of pages from the PCP lists
> * Assumes all pages on list are in same zone.
> @@ -1453,7 +1444,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> int min_pindex = 0;
> int max_pindex = NR_PCP_LISTS - 1;
> unsigned int order;
> - int prefetch_nr = READ_ONCE(pcp->batch);
> bool isolated_pageblocks;
> struct page *page;
>
> @@ -1508,20 +1498,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> if (bulkfree_pcp_prepare(page))
> continue;
>
> - /*
> - * We are going to put the page back to the global
> - * pool, prefetch its buddy to speed up later access
> - * under zone->lock. It is believed the overhead of
> - * an additional test and calculating buddy_pfn here
> - * can be offset by reduced memory latency later. To
> - * avoid excessive prefetching due to large count, only
> - * prefetch buddy for the first pcp->batch nr of pages.
> - */
> - if (prefetch_nr) {
> - prefetch_buddy(page, order);
> - prefetch_nr--;
> - }
> -
> /* MIGRATE_ISOLATE page should not go to pcplists */
> VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
> /* Pageblock could have been isolated meanwhile */