linux-mm.kvack.org archive mirror
From: Christoph Hellwig <hch@infradead.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: linux-mm@kvack.org, linux-block@vger.kernel.org,
	Muchun Song <muchun.song@linux.dev>,
	Jane Chu <jane.chu@oracle.com>,
	David Hildenbrand <david@redhat.com>
Subject: Re: Direct I/O performance problems with 1GB pages
Date: Mon, 27 Jan 2025 21:56:45 -0800	[thread overview]
Message-ID: <Z5hxnRqbvi7KiXBW@infradead.org> (raw)
In-Reply-To: <Z5WF9cA-RZKZ5lDN@casper.infradead.org>

I read through this and unfortunately have nothing useful to contribute
to the actual lock sharding.  Just two semi-related bits:

On Sun, Jan 26, 2025 at 12:46:45AM +0000, Matthew Wilcox wrote:
> Postgres are experimenting with doing direct I/O to 1GB hugetlb pages.
> Andres has gathered some performance data showing significantly worse
> performance with 1GB pages compared to 2MB pages.  I sent a patch
> recently which improves matters [1], but problems remain.
> 
> The primary problem we've identified is contention of folio->_refcount
> with a strong secondary contention on folio->_pincount.  This is coming
> from the call chain:
> 
> iov_iter_extract_pages ->
> gup_fast_fallback ->
> try_grab_folio_fast

Eww, gup_fast_fallback sent me down the wrong path, as the name suggests
it is the fallback slow path, but it's not.  It got renamed from
internal_get_user_pages_fast in commit 23babe1934d7 ("mm/gup:
consistently name GUP-fast functions").  While the old name wasn't all
that great, the new one including "fallback" is just horrible.  Can we
please fix this up?

The other thing is that the whole GUP machinery takes a reference per
page fragment it touches rather than just once per folio.
I'm not sure how fair access to the atomics is, but I suspect the
multiple increments/decrements of them per operation probably
don't make the contention any better.
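To make the per-page-versus-per-folio point concrete, here is a small
userspace sketch (not kernel code; PAGES_PER_FOLIO and the counters are
illustrative stand-ins, not mm API) contrasting one atomic increment per
page fragment with one batched add per folio-sized run:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace model, not kernel code: names below are made up. */
#define PAGES_PER_FOLIO 512        /* e.g. a 2MB folio of 4KB pages */

static atomic_long folio_refcount; /* stand-in for folio->_refcount */
static long rmw_ops;               /* atomic read-modify-writes issued */

/* Current scheme: one atomic increment per page fragment touched. */
static void grab_per_page(size_t npages)
{
	for (size_t i = 0; i < npages; i++) {
		atomic_fetch_add(&folio_refcount, 1);
		rmw_ops++;
	}
}

/* Batched scheme: one atomic add per folio-sized run of pages. */
static void grab_per_folio(size_t npages)
{
	while (npages) {
		size_t run = npages < PAGES_PER_FOLIO ?
			     npages : PAGES_PER_FOLIO;
		atomic_fetch_add(&folio_refcount, (long)run);
		rmw_ops++;
		npages -= run;
	}
}
```

For a 4MB extraction (1024 pages of 4KB) the per-page scheme issues 1024
atomic RMWs on the shared counter while the batched one issues two; the
final refcount is the same either way, so only the number of contended
operations changes.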




Thread overview: 18+ messages
2025-01-26  0:46 Matthew Wilcox
2025-01-27 14:09 ` David Hildenbrand
2025-01-27 16:02   ` Matthew Wilcox
2025-01-27 16:09     ` David Hildenbrand
2025-01-27 16:20       ` David Hildenbrand
2025-01-27 16:56         ` Matthew Wilcox
2025-01-27 16:59           ` David Hildenbrand
2025-01-27 18:21       ` Andres Freund
2025-01-27 18:54         ` Jens Axboe
2025-01-27 19:07           ` David Hildenbrand
2025-01-27 21:32           ` Pavel Begunkov
2025-01-27 16:24     ` Keith Busch
2025-01-27 17:25   ` Andres Freund
2025-01-27 19:20     ` David Hildenbrand
2025-01-27 19:36       ` Andres Freund
2025-01-28  5:56 ` Christoph Hellwig [this message]
2025-01-28  9:47   ` David Hildenbrand
2025-01-29  6:03     ` Christoph Hellwig
