linux-mm.kvack.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: hch@lst.de, hare@suse.de, david@fromorbit.com, djwong@kernel.org,
	john.g.garry@oracle.com, ritesh.list@gmail.com,
	kbusch@kernel.org, linux-fsdevel@vger.kernel.org,
	linux-xfs@vger.kernel.org, linux-mm@kvack.org,
	linux-block@vger.kernel.org, gost.dev@samsung.com,
	p.raghav@samsung.com, da.gomez@samsung.com,
	kernel@pankajraghav.com
Subject: Re: [RFC 3/8] fs/buffer: restart block_read_full_folio() to avoid array overflow
Date: Wed, 13 Nov 2024 18:50:08 +0000
Message-ID: <ZzT04C0iwlkxg6aL@casper.infradead.org>
In-Reply-To: <20241113094727.1497722-4-mcgrof@kernel.org>

On Wed, Nov 13, 2024 at 01:47:22AM -0800, Luis Chamberlain wrote:
> +++ b/fs/buffer.c
> @@ -2366,7 +2366,7 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
>  {
>  	struct inode *inode = folio->mapping->host;
>  	sector_t iblock, lblock;
> -	struct buffer_head *bh, *head, *arr[MAX_BUF_PER_PAGE];
> +	struct buffer_head *bh, *head, *restart_bh = NULL, *arr[MAX_BUF_PER_PAGE];

MAX_BUF_PER_PAGE is a pain.  There are configs like hexagon which have
256kB pages, so this array ends up being 512 entries * 8 bytes = 4kB in
size, which spews stack growth warnings.  Can we just make the array 8
entries?
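
To make the arithmetic above concrete, here is a minimal user-space
sketch. It assumes MAX_BUF_PER_PAGE is PAGE_SIZE / 512, which is how I
understand the kernel header defines it; the helper names
max_buf_per_page() and arr_stack_bytes() are hypothetical and exist only
for this illustration.

```c
#include <stddef.h>

/* Smallest block size assumed per buffer_head: one 512-byte sector. */
#define MIN_BLOCK_SIZE 512UL

/* Assumed to mirror MAX_BUF_PER_PAGE: PAGE_SIZE / 512 (hypothetical helper). */
static size_t max_buf_per_page(size_t page_size)
{
	return page_size / MIN_BLOCK_SIZE;
}

/* Stack bytes used by 'struct buffer_head *arr[MAX_BUF_PER_PAGE]'. */
static size_t arr_stack_bytes(size_t page_size, size_t ptr_size)
{
	return max_buf_per_page(page_size) * ptr_size;
}
```

With 256kB pages and 8-byte pointers, arr_stack_bytes(262144, 8) gives
4096 bytes, matching the 4kB figure; an array fixed at 8 entries would
take 64 bytes regardless of page size.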

> @@ -2385,6 +2385,7 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
>  	iblock = div_u64(folio_pos(folio), blocksize);
>  	lblock = div_u64(limit + blocksize - 1, blocksize);
>  	bh = head;
> +restart:
>  	nr = 0;
>  	i = 0;
>  
> @@ -2417,7 +2418,12 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
>  				continue;
>  		}
>  		arr[nr++] = bh;
> -	} while (i++, iblock++, (bh = bh->b_this_page) != head);
> +	} while (i++, iblock++, (bh = bh->b_this_page) != head && nr < MAX_BUF_PER_PAGE);
> +
> +	if (nr == MAX_BUF_PER_PAGE && bh != head)
> +		restart_bh = bh;
> +	else
> +		restart_bh = NULL;
>  
>  	if (fully_mapped)
>  		folio_set_mappedtodisk(folio);
> @@ -2450,6 +2456,15 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
>  		else
>  			submit_bh(REQ_OP_READ, bh);
>  	}
> +
> +	/*
> +	 * Found more buffers than 'arr' could hold,
> +	 * restart to submit the remaining ones.
> +	 */
> +	if (restart_bh) {
> +		bh = restart_bh;
> +		goto restart;
> +	}
>  	return 0;

This isn't right.

Let's assume we need 16 blocks to fill in this folio and we have 8
entries in 'arr'.

        nr = 0;
        i = 0;

        do {
                if (buffer_uptodate(bh))
                        continue;
...
                arr[nr++] = bh;
        } while (i++, iblock++, (bh = bh->b_this_page) != head);

        for (i = 0; i < nr; i++) {
                bh = arr[i];
                        submit_bh(REQ_OP_READ, bh);

OK, so first time round, we've submitted 8 I/Os.  Now we see that
restart_bh is not NULL and so we go round again.

This time, we happen to find that the last 8 BHs are uptodate.
And so we take this path:

        if (!nr) {
                /*
                 * All buffers are uptodate or get_block() returned an
                 * error when trying to map them - we can finish the read.
                 */
                folio_end_read(folio, !page_error);

Oops, we forgot about the 8 buffers we already submitted for read;
folio_end_read() gets called while those reads are still in flight.
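
The scenario above can be reproduced with a toy user-space model. This
is only a sketch under stated assumptions: buffers are reduced to an
array of "uptodate" flags, ARR_SIZE stands in for an 'arr' capped at 8
entries, and struct sim_result and simulate_restart() are hypothetical
names, not kernel code. It models only the collect/submit/restart flow,
not locking or I/O completion.

```c
#include <stdbool.h>
#include <stddef.h>

#define ARR_SIZE 8	/* hypothetical capped size of 'arr' */

struct sim_result {
	int submitted;		/* reads submitted across all passes */
	bool ended_early;	/* folio ended while reads were in flight */
};

/*
 * Toy model of the patched loop. uptodate[i] says whether buffer i
 * already holds valid data. Each pass collects up to ARR_SIZE buffers
 * that need reading, "submits" them, and restarts if buffers remain.
 */
static struct sim_result simulate_restart(const bool *uptodate, size_t nblocks)
{
	struct sim_result res = { 0, false };
	size_t pos = 0;

	while (pos < nblocks) {
		int nr = 0;

		/* Collect phase: stop once 'arr' is full, as in the patch. */
		while (pos < nblocks && nr < ARR_SIZE) {
			if (!uptodate[pos])
				nr++;
			pos++;
		}

		if (nr == 0) {
			/*
			 * The '!nr' path: the folio read is ended here with
			 * no regard for reads submitted by earlier passes.
			 */
			if (res.submitted > 0)
				res.ended_early = true;
			return res;
		}

		/* Submit phase: all collected buffers go out for read. */
		res.submitted += nr;
	}
	return res;
}
```

Feeding it 16 buffers where only the last 8 are uptodate yields
submitted == 8 and ended_early == true, which is the case described
above. With all 16 buffers needing a read, both passes submit 8 and the
early-end path is never taken.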

