linux-mm.kvack.org archive mirror
From: John Hubbard <jhubbard@nvidia.com>
To: David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: LKML <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
	Vivek Kasireddy <vivek.kasireddy@intel.com>,
	Dave Airlie <airlied@redhat.com>,
	Gerd Hoffmann <kraxel@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Christoph Hellwig <hch@infradead.org>,
	Jason Gunthorpe <jgg@nvidia.com>, Peter Xu <peterx@redhat.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	Dongwon Kim <dongwon.kim@intel.com>,
	Hugh Dickins <hughd@google.com>,
	Junxiao Chang <junxiao.chang@intel.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Oscar Salvador <osalvador@suse.de>,
	<linux-stable@vger.kernel.org>
Subject: Re: [PATCH v2 0/1] mm/gup: avoid an unnecessary allocation call for FOLL_LONGTERM cases
Date: Wed, 6 Nov 2024 20:57:25 -0800
Message-ID: <8943a8bd-644f-48fe-8502-6150c993c445@nvidia.com>
In-Reply-To: <f747223e-042f-40f4-841c-1c8019dc8510@redhat.com>

On 11/5/24 12:42 AM, David Hildenbrand wrote:
> On 05.11.24 04:29, John Hubbard wrote:
...
> Yeah, I was only adding it because I stumbled over it. It might not be a problem, because we simply "skip" if we find a folio that was already isolated (possibly by us). What might happen is that we unnecessarily drain the LRU.
> 
> __collapse_huge_page_isolate() scans the compound_pagelist list before try-locking and isolating. But it also just "fails" instead of retrying forever.
> 
> Imagine the page tables looking like the following (e.g., COW in a MAP_PRIVATE file mapping that supports large folios)
> 
>                        ------ F0P2 was replaced by a new (small) folio
>                       |
> [ F0P0 ] [ F0P1 ] [ F1P0 ] [ F0P3 ]
> 
> F0P0: Folio 0, page 0
> 
> Assume we try pinning that range and end up in collect_longterm_unpinnable_folios() with:
> 
> F0, F0, F1, F0
> 
> 
> Assume F0 and F1 are not long-term pinnable.
> 
> i = 0: We isolate F0
> i = 1: We see that it is the same F0 and skip
> i = 2: We isolate F1
> i = 3: We see !folio_test_lru() and do a lru_add_drain_all() to then
>         fail folio_isolate_lru()
> 
> So the drain in i=3 could be avoided by scanning the list, if we already isolated that one. Working better than I originally thought.

Thanks for spelling out that case. I was having trouble visualizing it,
but now it's clear.

OK, so looking at this, I think it could be extended to more than just
"skip the drain". It seems like we should also avoid counting the folio
a second time (the existing code seems wrong there).

So I think this approach would be correct. Does it seem accurate to
you as well? Here:

diff --git a/mm/gup.c b/mm/gup.c
index ad0c8922dac3..ab8e706b52f0 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2324,11 +2324,27 @@ static unsigned long collect_longterm_unpinnable_folios(
  
  	for (i = 0; i < pofs->nr_entries; i++) {
  		struct folio *folio = pofs_get_folio(pofs, i);
+		struct folio *tmp_folio;
+		bool already_collected = false;
  
+		/*
+		 * Two checks to see if this folio has already been collected.
+		 * The first check is quick, and the second check is thorough.
+		 */
  		if (folio == prev_folio)
  			continue;
  		prev_folio = folio;
  
+		/* A bare "continue" here would only advance the list walk. */
+		list_for_each_entry(tmp_folio, movable_folio_list, lru) {
+			if (folio == tmp_folio) {
+				already_collected = true;
+				break;
+			}
+		}
+		if (already_collected)
+			continue;
+
  		if (folio_is_longterm_pinnable(folio))
  			continue;



I need to test this more thoroughly, though, with a directed gup test (I'm not sure we
have one yet).
  

thanks,
-- 
John Hubbard



