From: Hsin-Yi Wang <hsinyi@chromium.org>
To: Xiongwei Song <sxwjean@gmail.com>
Cc: Phillip Lougher <phillip@squashfs.org.uk>,
Matthew Wilcox <willy@infradead.org>,
Zheng Liang <zhengliang6@huawei.com>,
Zhang Yi <yi.zhang@huawei.com>, Hou Tao <houtao1@huawei.com>,
Miao Xie <miaoxie@huawei.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
"Song, Xiongwei" <Xiongwei.Song@windriver.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"squashfs-devel@lists.sourceforge.net"
<squashfs-devel@lists.sourceforge.net>
Subject: Re: squashfs performance regression and readahead
Date: Mon, 16 May 2022 19:01:08 +0800 [thread overview]
Message-ID: <CAJMQK-ieZfaeTR15dgpmHHTtsQZFu6u=jjtLtf7RqBBzVj_9bA@mail.gmail.com> (raw)
In-Reply-To: <CAEVVKH-d1nS=uzFNpt3MNbxMnn5qoRsu5Ztj+Y080_q5BBGPHg@mail.gmail.com>
On Mon, May 16, 2022 at 5:35 PM Xiongwei Song <sxwjean@gmail.com> wrote:
>
> On Mon, May 16, 2022 at 5:10 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
> >
> > On Mon, May 16, 2022 at 5:00 PM Xiongwei Song <sxwjean@gmail.com> wrote:
> > >
> > > You cannot just add your own signature and ignore others' contributions.
> > > This is unacceptable.
> > >
> >
> > Hi,
> >
> > Please don't be angry. My thinking is that since this is still under
> > review, and I'm not sure the performance issue is settled, it's more
> > important to make sure the patch is ready first.
> >
> > Once it's ready, I'll send it to the list formally so it's easier for
> > the maintainers (Matthew) to pick up. At that point I'll add your
> > Tested-by (again, I'm not sure whether the current performance is
> > acceptable to you, and it would be incorrect to add your tag if it
> > isn't) and Phillip's Reviewed-by or Signed-off-by (likewise, I don't
> > know yet whether Phillip or Matthew have further comments, so I can't
> > add their tags now). The maintainers will probably add their
> > Signed-off-by when they pick up the patch.
> >
> > I'm sorry if not adding the tags to this WIP patch offended you.
>
> Your apology is not sincere. I told you you should release @pages in the
> exit path, you didn't even mention it.
This is normally mentioned in the per-version changelog when a patch is
sent to the list. See for example:
https://patchwork.kernel.org/project/dri-devel/patch/20220516085258.1227691-3-cyndis@kapsi.fi/
https://patchwork.kernel.org/project/dri-devel/patch/20220213103437.3363848-3-hsinyi@chromium.org/
Each version lists its changes there. I didn't include a changelog in
the attachments in this thread.
Now I think it's better to send the patch directly to the list formally
to avoid making you angry, though the patch is still a work in progress.
https://lore.kernel.org/lkml/20220516105100.1412740-1-hsinyi@chromium.org/T/#md06bd638bcdd985766ee75d17fbaa548218e0038
> I told you patch v2 made ~6s
> difference, you didn't provide any response.
Currently I don't have any better ideas to improve the performance.
I'm sorry about that.
> I told you the 9eec1d897139
> ("squashfs: provide backing_dev_info in order to disable read-ahead")
> should be reverted, you didn't reply.
Again, I didn't send the series formally in this thread since my focus
is still on getting the code right. So I didn't include the revert in
my attachment, only the readahead patch itself.
Speaking of the revert: I haven't sent a revert patch before, and there
is a conflict between
124cfc154f6c ("squashfs: Convert squashfs to read_folio") and reverting
9eec1d897139 ("squashfs: provide backing_dev_info in order to disable
read-ahead").
For now I just removed the comments in squashfs_fill_super(), but I'm
not sure that's the correct way to resolve it.
Hi Matthew, will you handle the revert? Thanks.
> I think you don't know what respect is.
>
> >
> >
> >
> > > On Mon, May 16, 2022 at 4:23 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
> > > >
> > > > On Sun, May 15, 2022 at 8:55 AM Phillip Lougher <phillip@squashfs.org.uk> wrote:
> > > > >
> > > > > On 13/05/2022 07:35, Hsin-Yi Wang wrote:
> > > > > > On Fri, May 13, 2022 at 1:33 PM Phillip Lougher <phillip@squashfs.org.uk> wrote:
> > > > > >>
> > > > > >> My understanding is that this call will fully populate the
> > > > > >> pages array with page references without any holes. That
> > > > > >> is none of the pages array entries will be NULL, meaning
> > > > > >> there isn't a page for that entry. In other words, if the
> > > > > >> pages array has 32 pages, each of the 32 entries will
> > > > > >> reference a page.
> > > > > >>
> > > > > > I noticed that if nr_pages < max_pages, calling read_blocklist() results
> > > > > > in SQUASHFS errors:
> > > > > >
> > > > > > SQUASHFS error: Failed to read block 0x125ef7d: -5
> > > > > > SQUASHFS error: zlib decompression failed, data probably corrupt
> > > > > >
> > > > > > so I added a check for nr_pages < max_pages before squashfs_read_data(),
> > > > > > skipping the remaining pages and letting readpage handle them.
> > > > > >
> > > > >
> > > > > Yes, that avoids passing the decompressor a too-small page range,
> > > > > so extending the decompressor code isn't necessary.
> > > > >
> > > > > Testing your patch I discovered a number of cases where
> > > > > the decompressor still failed as above.
> > > > >
> > > > > I traced this to "sparse blocks": zero-filled blocks, which are
> > > > > indicated/stored as a block length of 0 (bsize == 0). Skipping
> > > > > the sparse block and letting it be handled by readpage fixes the
> > > > > issue.
> > > > >
> > > > Ack. Thanks for testing this.
> > > >
> > > > > I also noticed a potential performance improvement. You check for
> > > > > "(pages[nr_pages - 1]->index >> shift) == index" after calling
> > > > > squashfs_read_data(). But this information is known before
> > > > > calling squashfs_read_data(), and moving the check earlier saves
> > > > > the cost of a redundant block decompression when the result
> > > > > would be discarded anyway.
> > > > >
> > > > After applying this, the performance becomes:
> > > > 2.73s
> > > > 2.76s
> > > > 2.73s
> > > >
> > > > Original:
> > > > 2.76s
> > > > 2.79s
> > > > 2.77s
> > > >
> > > > (The pack file is different from my previous testing in this email thread.)
> > > >
> > > > > Finally I noticed that if nr_pages grows after the __readahead_batch
> > > > > call, then the pages array and the page actor will be too small, and
> > > > > it will cause the decompressor to fail. Changing the allocation to
> > > > > max_pages fixes this.
> > > > >
> > > > Ack.
> > > >
> > > > I've applied this fix along with the previous fixes.
> > > > > I have rolled these fixes into the patch below (also attached in
> > > > > case it gets garbled).
> > > > >
> > > > > diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> > > > > index 7cd57e0d88de..14485a7af5cf 100644
> > > > > --- a/fs/squashfs/file.c
> > > > > +++ b/fs/squashfs/file.c
> > > > > @@ -518,13 +518,11 @@ static void squashfs_readahead(struct readahead_control *ractl)
> > > > > file_end == 0)
> > > > > return;
> > > > >
> > > > > - nr_pages = min(readahead_count(ractl), max_pages);
> > > > > -
> > > > > - pages = kmalloc_array(nr_pages, sizeof(void *), GFP_KERNEL);
> > > > > + pages = kmalloc_array(max_pages, sizeof(void *), GFP_KERNEL);
> > > > > if (!pages)
> > > > > return;
> > > > >
> > > > > - actor = squashfs_page_actor_init_special(pages, nr_pages, 0);
> > > > > + actor = squashfs_page_actor_init_special(pages, max_pages, 0);
> > > > > if (!actor)
> > > > > goto out;
> > > > >
> > > > > @@ -538,11 +536,18 @@ static void squashfs_readahead(struct readahead_control *ractl)
> > > > > goto skip_pages;
> > > > >
> > > > > index = pages[0]->index >> shift;
> > > > > +
> > > > > + if ((pages[nr_pages - 1]->index >> shift) != index)
> > > > > + goto skip_pages;
> > > > > +
> > > > > bsize = read_blocklist(inode, index, &block);
> > > > > + if (bsize == 0)
> > > > > + goto skip_pages;
> > > > > +
> > > > > res = squashfs_read_data(inode->i_sb, block, bsize, NULL,
> > > > > actor);
> > > > >
> > > > > - if (res >= 0 && (pages[nr_pages - 1]->index >> shift) == index)
> > > > > + if (res >= 0)
> > > > > for (i = 0; i < nr_pages; i++)
> > > > > SetPageUptodate(pages[i]);
> > > > >
> > > > > --
> > > > > 2.34.1
> > > > >
> > > > >
> > > > >
> > > > > Phillip
> > > > >
> > > > >
> > > > > >> This is important for the decompression code, because it
> > > > > >> expects each pages array entry to reference a page, which
> > > > > >> can be kmapped to an address. If an entry in the pages
> > > > > >> array is NULL, this will break.
> > > > > >>
> > > > > >> If the pages array can have holes (NULL pointers), I have
> > > > > >> written an update patch which allows the decompression code
> > > > > >> to handle these NULL pointers.
> > > > > >>
> > > > > >> If the pages array can have NULL pointers, I can send you
> > > > > >> the patch which will deal with this.
> > > > > >
> > > > > > Sure, if there are better ways to deal with this.
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > >>
> > > > > >> Thanks
> > > > > >>
> > > > > >> Phillip
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>>
> > > > > >>>>>
> > > > > >>>>> I also noticed that when the crash happened, nr_pages obtained by
> > > > > >>>>> readahead_count() was 512.
> > > > > >>>>> nr_pages = readahead_count(ractl); // this line
> > > > > >>>>>
> > > > > >>>>> 2) Normal cases that won't crash:
> > > > > >>>>> [ 22.651750] Block @ 0xb3bbca6, compressed size 42172, src size 262144
> > > > > >>>>> [ 22.653580] Block @ 0xb3c6162, compressed size 29815, src size 262144
> > > > > >>>>> [ 22.656692] Block @ 0xb4a293f, compressed size 17484, src size 131072
> > > > > >>>>> [ 22.666099] Block @ 0xb593881, compressed size 39742, src size 262144
> > > > > >>>>> [ 22.668699] Block @ 0xb59d3bf, compressed size 37841, src size 262144
> > > > > >>>>> [ 22.695739] Block @ 0x13698673, compressed size 65907, src size 131072
> > > > > >>>>> [ 22.698619] Block @ 0x136a87e6, compressed size 3155, src size 131072
> > > > > >>>>> [ 22.703400] Block @ 0xb1babe8, compressed size 99391, src size 131072
> > > > > >>>>> [ 22.706288] Block @ 0x1514abc6, compressed size 4627, src size 131072
> > > > > >>>>>
> > > > > >>>>> nr_pages was observed to be 32, 64, 256... These don't cause a crash.
> > > > > >>>>> Other values (max_pages, bsize, block...) look normal.
> > > > > >>>>>
> > > > > >>>>> I'm not sure why the crash happened, but I tried modifying the
> > > > > >>>>> mask. After changing the mask value as below, the crash is gone
> > > > > >>>>> (nr_pages is <= 256).
> > > > > >>>>> Based on my testing on a 300K pack file, there's no performance change.
> > > > > >>>>>
> > > > > >>>>> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> > > > > >>>>> index 20ec48cf97c5..f6d9b6f88ed9 100644
> > > > > >>>>> --- a/fs/squashfs/file.c
> > > > > >>>>> +++ b/fs/squashfs/file.c
> > > > > >>>>> @@ -499,8 +499,8 @@ static void squashfs_readahead(struct readahead_control *ractl)
> > > > > >>>>> {
> > > > > >>>>> struct inode *inode = ractl->mapping->host;
> > > > > >>>>> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> > > > > >>>>> - size_t mask = (1UL << msblk->block_log) - 1;
> > > > > >>>>> size_t shift = msblk->block_log - PAGE_SHIFT;
> > > > > >>>>> + size_t mask = (1UL << shift) - 1;
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> Any pointers are appreciated. Thanks!
> > > > > >>>>
> > > > > >>