Re: [PATCH v2 2/8] fs/buffer: try to use folio lock for pagecache lookups

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Jan Kara <jack@suse.cz>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: brauner@kernel.org, jack@suse.cz, tytso@mit.edu,
	 adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org,
	riel@surriel.com, dave@stgolabs.net,  willy@infradead.org,
	hannes@cmpxchg.org, oliver.sang@intel.com, david@redhat.com,
	 axboe@kernel.dk, hare@suse.de, david@fromorbit.com,
	djwong@kernel.org,  ritesh.list@gmail.com,
	linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	 linux-mm@kvack.org, gost.dev@samsung.com, p.raghav@samsung.com,
	da.gomez@samsung.com
Subject: Re: [PATCH v2 2/8] fs/buffer: try to use folio lock for pagecache lookups
Date: Thu, 10 Apr 2025 16:38:35 +0200	[thread overview]
Message-ID: <plt72kbiee2sz32mqslvhmmlny6dqfeccnf2d325cus45qpo3t@m6t563ijkvr5> (raw)
In-Reply-To: <20250410014945.2140781-3-mcgrof@kernel.org>

On Wed 09-04-25 18:49:39, Luis Chamberlain wrote:
> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> Callers of __find_get_block() may or may not allow for blocking
> semantics, and is currently assumed that it will not. Layout
> two paths based on this. Ultimately the i_private_lock scheme will
> be used as a fallback in non-blocking contexts. Otherwise
> always take the folio lock instead. The suggested trylock idea
> is implemented, thereby potentially reducing i_private_lock
> contention in addition to later enabling future migration support
> around with large folios and noref migration.
> 
> No change in semantics. All lookup users are non-blocking.
> 
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

...

> @@ -204,7 +195,19 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
>  	if (IS_ERR(folio))
>  		goto out;
>  
> -	spin_lock(&bd_mapping->i_private_lock);
> +	/*
> +	 * Folio lock protects the buffers. Callers that cannot block
> +	 * will fallback to serializing vs try_to_free_buffers() via
> +	 * the i_private_lock.
> +	 */
> +	if (!folio_trylock(folio)) {
> +		if (atomic) {
> +			spin_lock(&bd_mapping->i_private_lock);
> +			folio_locked = false;
> +		} else
> +			folio_lock(folio);
> +	}

Ewww, this is going to be pain. You will mostly use the folio_trylock() for
protecting the lookup, except when some insane workload / fuzzer manages to
trigger the other path which will lead to completely unreproducible bugs...
I'd rather do:

	if (atomic) {
		spin_lock(&bd_mapping->i_private_lock);
		folio_locked = false;
	} else {
		folio_lock(folio);
	}

I'd actually love to do something like:

	if (atomic) {
		if (!folio_trylock(folio))
			bail...
	} else {
		folio_lock(folio);
	}

but that may be just too radical this point and would need some serious
testing how frequent the trylock failures are. No point in blocking this
series with it. So just go with the deterministic use of i_private_lock for
atomic users for now.

								Honza

> +
>  	head = folio_buffers(folio);
>  	if (!head)
>  		goto out_unlock;
> @@ -236,7 +239,10 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
>  		       1 << blkbits);
>  	}
>  out_unlock:
> -	spin_unlock(&bd_mapping->i_private_lock);
> +	if (folio_locked)
> +		folio_unlock(folio);
> +	else
> +		spin_unlock(&bd_mapping->i_private_lock);
>  	folio_put(folio);
>  out:
>  	return ret;
> @@ -1388,14 +1394,15 @@ lookup_bh_lru(struct block_device *bdev, sector_t block, unsigned size)
>   * it in the LRU and mark it as accessed.  If it is not present then return
>   * NULL
>   */
> -struct buffer_head *
> -__find_get_block(struct block_device *bdev, sector_t block, unsigned size)
> +static struct buffer_head *
> +find_get_block_common(struct block_device *bdev, sector_t block,
> +			unsigned size, bool atomic)
>  {
>  	struct buffer_head *bh = lookup_bh_lru(bdev, block, size);
>  
>  	if (bh == NULL) {
>  		/* __find_get_block_slow will mark the page accessed */
> -		bh = __find_get_block_slow(bdev, block);
> +		bh = __find_get_block_slow(bdev, block, atomic);
>  		if (bh)
>  			bh_lru_install(bh);
>  	} else
> @@ -1403,6 +1410,12 @@ __find_get_block(struct block_device *bdev, sector_t block, unsigned size)
>  
>  	return bh;
>  }
> +
> +struct buffer_head *
> +__find_get_block(struct block_device *bdev, sector_t block, unsigned size)
> +{
> +	return find_get_block_common(bdev, block, size, true);
> +}
>  EXPORT_SYMBOL(__find_get_block);
>  
>  /**
> -- 
> 2.47.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

next prev parent reply	other threads:[~2025-04-10 14:38 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-10  1:49 [PATCH v2 0/8] mm: enhance migration work around on noref buffer-heads Luis Chamberlain
2025-04-10  1:49 ` [PATCH v2 1/8] migrate: fix skipping metadata buffer heads on migration Luis Chamberlain
2025-04-10  3:18   ` Matthew Wilcox
2025-04-10 12:05   ` Jan Kara
2025-04-14 21:09     ` Luis Chamberlain
2025-04-14 22:19       ` Luis Chamberlain
2025-04-15  9:05         ` Christian Brauner
2025-04-15 15:47           ` Luis Chamberlain
2025-04-15 16:23             ` Jan Kara
2025-04-15 21:06               ` Luis Chamberlain
2025-04-16  2:02                 ` Davidlohr Bueso
2025-04-15 11:17         ` Jan Kara
2025-04-15 11:23       ` Jan Kara
2025-04-15 16:18     ` Luis Chamberlain
2025-04-15 16:28       ` Jan Kara
2025-04-16 16:58         ` Luis Chamberlain
2025-04-23 17:09           ` Jan Kara
2025-04-23 20:30             ` Luis Chamberlain
2025-04-25 22:51               ` Luis Chamberlain
2025-04-28 23:08                 ` Luis Chamberlain
2025-04-29  9:32                   ` Jan Kara
2025-04-15  1:36   ` Davidlohr Bueso
2025-04-15 11:25     ` Jan Kara
2025-04-15 18:14       ` Davidlohr Bueso
2025-04-10  1:49 ` [PATCH v2 2/8] fs/buffer: try to use folio lock for pagecache lookups Luis Chamberlain
2025-04-10 14:38   ` Jan Kara [this message]
2025-04-10 17:38     ` Davidlohr Bueso
2025-04-10  1:49 ` [PATCH v2 3/8] fs/buffer: introduce __find_get_block_nonatomic() Luis Chamberlain
2025-04-10  1:49 ` [PATCH v2 4/8] fs/ocfs2: use sleeping version of __find_get_block() Luis Chamberlain
2025-04-10  1:49 ` [PATCH v2 5/8] fs/jbd2: " Luis Chamberlain
2025-04-10  1:49 ` [PATCH v2 6/8] fs/ext4: " Luis Chamberlain
2025-04-10 13:36   ` Jan Kara
2025-04-10 17:32     ` Davidlohr Bueso
2025-04-10  1:49 ` [PATCH v2 7/8] mm/migrate: enable noref migration for jbd2 Luis Chamberlain
2025-04-10 13:40   ` Jan Kara
2025-04-10 17:30     ` Davidlohr Bueso
2025-04-14 12:12       ` Jan Kara
2025-04-10  1:49 ` [PATCH v2 8/8] mm: add migration buffer-head debugfs interface Luis Chamberlain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=plt72kbiee2sz32mqslvhmmlny6dqfeccnf2d325cus45qpo3t@m6t563ijkvr5 \
    --to=jack@suse.cz \
    --cc=adilger.kernel@dilger.ca \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=da.gomez@samsung.com \
    --cc=dave@stgolabs.net \
    --cc=david@fromorbit.com \
    --cc=david@redhat.com \
    --cc=djwong@kernel.org \
    --cc=gost.dev@samsung.com \
    --cc=hannes@cmpxchg.org \
    --cc=hare@suse.de \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mcgrof@kernel.org \
    --cc=oliver.sang@intel.com \
    --cc=p.raghav@samsung.com \
    --cc=riel@surriel.com \
    --cc=ritesh.list@gmail.com \
    --cc=tytso@mit.edu \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox