From: Xiongwei Song <sxwjean@gmail.com>
To: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Phillip Lougher <phillip@squashfs.org.uk>,
Matthew Wilcox <willy@infradead.org>,
Zheng Liang <zhengliang6@huawei.com>,
Zhang Yi <yi.zhang@huawei.com>, Hou Tao <houtao1@huawei.com>,
Miao Xie <miaoxie@huawei.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
"Song, Xiongwei" <Xiongwei.Song@windriver.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"squashfs-devel@lists.sourceforge.net"
<squashfs-devel@lists.sourceforge.net>
Subject: Re: squashfs performance regression and readahea
Date: Mon, 16 May 2022 17:00:30 +0800 [thread overview]
Message-ID: <CAEVVKH8tQeBH-hPyq69ncJPXKw=o_q0e0f+K9n8psXX7jXq_vA@mail.gmail.com> (raw)
In-Reply-To: <CAJMQK-g9tA4Ev_fkFTvW=CbpZRTgBLE4ynfGNUAi7pOPfpeTXg@mail.gmail.com>
You can not just sign your signature. You ignored others' contributions.
This is unacceptable.
On Mon, May 16, 2022 at 4:23 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
>
> On Sun, May 15, 2022 at 8:55 AM Phillip Lougher <phillip@squashfs.org.uk> wrote:
> >
> > On 13/05/2022 07:35, Hsin-Yi Wang wrote:
> > > On Fri, May 13, 2022 at 1:33 PM Phillip Lougher <phillip@squashfs.org.uk> wrote:
> > >>
> > >> My understanding is that this call will fully populate the
> > >> pages array with page references without any holes. That
> > >> is none of the pages array entries will be NULL, meaning
> > >> there isn't a page for that entry. In other words, if the
> > >> pages array has 32 pages, each of the 32 entries will
> > >> reference a page.
> > >>
> > > I noticed that if nr_pages < max_pages, calling read_blocklist() will
> > > have SQUASHFS errors,
> > >
> > > SQUASHFS error: Failed to read block 0x125ef7d: -5
> > > SQUASHFS error: zlib decompression failed, data probably corrupt
> > >
> > > so I did a check if nr_pages < max_pages before squashfs_read_data(),
> > > just skip the remaining pages and let them be handled by readpage.
> > >
> >
> > Yes that avoids passing the decompressor code a too small page range.
> > As such extending the decompressor code isn't necessary.
> >
> > Testing your patch I discovered a number of cases where
> > the decompressor still failed as above.
> >
> > This I traced to "sparse blocks", these are zero filled blocks, and
> > are indicated/stored as a block length of 0 (bsize == 0). Skipping
> > this sparse block and letting it be handled by readpage fixes this
> > issue.
> >
> Ack. Thanks for testing this.
>
> > I also noticed a potential performance improvement. You check for
> > "pages[nr_pages - 1]->index >> shift) == index" after calling
> > squashfs_read_data. But this information is known before
> > calling squashfs_read_data and moving the check to before
> > squashfs_read_data saves the cost of doing a redundant block
> > decompression.
> >
> After applying this, The performance becomes:
> 2.73s
> 2.76s
> 2.73s
>
> Original:
> 2.76s
> 2.79s
> 2.77s
>
> (The pack file is different from my previous testing in this email thread.)
>
> > Finally I noticed that if nr_pages grows after the __readahead_batch
> > call, then the pages array and the page actor will be too small, and
> > it will cause the decompressor to fail. Changing the allocation to
> > max_pages fixes this.
> >
> Ack.
>
> I've added the fixes patch and previous fixes.
> > I have rolled these fixes into the patch below (also attached in
> > case it gets garbled).
> >
> > diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> > index 7cd57e0d88de..14485a7af5cf 100644
> > --- a/fs/squashfs/file.c
> > +++ b/fs/squashfs/file.c
> > @@ -518,13 +518,11 @@ static void squashfs_readahead(struct
> > readahead_control *ractl)
> > file_end == 0)
> > return;
> >
> > - nr_pages = min(readahead_count(ractl), max_pages);
> > -
> > - pages = kmalloc_array(nr_pages, sizeof(void *), GFP_KERNEL);
> > + pages = kmalloc_array(max_pages, sizeof(void *), GFP_KERNEL);
> > if (!pages)
> > return;
> >
> > - actor = squashfs_page_actor_init_special(pages, nr_pages, 0);
> > + actor = squashfs_page_actor_init_special(pages, max_pages, 0);
> > if (!actor)
> > goto out;
> >
> > @@ -538,11 +536,18 @@ static void squashfs_readahead(struct
> > readahead_control *ractl)
> > goto skip_pages;
> >
> > index = pages[0]->index >> shift;
> > +
> > + if ((pages[nr_pages - 1]->index >> shift) != index)
> > + goto skip_pages;
> > +
> > bsize = read_blocklist(inode, index, &block);
> > + if (bsize == 0)
> > + goto skip_pages;
> > +
> > res = squashfs_read_data(inode->i_sb, block, bsize, NULL,
> > actor);
> >
> > - if (res >= 0 && (pages[nr_pages - 1]->index >> shift) == index)
> > + if (res >= 0)
> > for (i = 0; i < nr_pages; i++)
> > SetPageUptodate(pages[i]);
> >
> > --
> > 2.34.1
> >
> >
> >
> > Phillip
> >
> >
> > >> This is important for the decompression code, because it
> > >> expects each pages array entry to reference a page, which
> > >> can be kmapped to an address. If an entry in the pages
> > >> array is NULL, this will break.
> > >>
> > >> If the pages array can have holes (NULL pointers), I have
> > >> written an update patch which allows the decompression code
> > >> to handle these NULL pointers.
> > >>
> > >> If the pages array can have NULL pointers, I can send you
> > >> the patch which will deal with this.
> > >
> > > Sure, if there are better ways to deal with this.
> > >
> > > Thanks.
> > >
> > >>
> > >> Thanks
> > >>
> > >> Phillip
> > >>
> > >>
> > >>
> > >>>
> > >>>>>
> > >>>>> It's also noticed that when the crash happened, nr_pages obtained by
> > >>>>> readahead_count() is 512.
> > >>>>> nr_pages = readahead_count(ractl); // this line
> > >>>>>
> > >>>>> 2) Normal cases that won't crash:
> > >>>>> [ 22.651750] Block @ 0xb3bbca6, compressed size 42172, src size 262144
> > >>>>> [ 22.653580] Block @ 0xb3c6162, compressed size 29815, src size 262144
> > >>>>> [ 22.656692] Block @ 0xb4a293f, compressed size 17484, src size 131072
> > >>>>> [ 22.666099] Block @ 0xb593881, compressed size 39742, src size 262144
> > >>>>> [ 22.668699] Block @ 0xb59d3bf, compressed size 37841, src size 262144
> > >>>>> [ 22.695739] Block @ 0x13698673, compressed size 65907, src size 131072
> > >>>>> [ 22.698619] Block @ 0x136a87e6, compressed size 3155, src size 131072
> > >>>>> [ 22.703400] Block @ 0xb1babe8, compressed size 99391, src size 131072
> > >>>>> [ 22.706288] Block @ 0x1514abc6, compressed size 4627, src size 131072
> > >>>>>
> > >>>>> nr_pages are observed to be 32, 64, 256... These won't cause a crash.
> > >>>>> Other values (max_pages, bsize, block...) looks normal
> > >>>>>
> > >>>>> I'm not sure why the crash happened, but I tried to modify the mask
> > >>>>> for a bit. After modifying the mask value to below, the crash is gone
> > >>>>> (nr_pages are <=256).
> > >>>>> Based on my testing on a 300K pack file, there's no performance change.
> > >>>>>
> > >>>>> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> > >>>>> index 20ec48cf97c5..f6d9b6f88ed9 100644
> > >>>>> --- a/fs/squashfs/file.c
> > >>>>> +++ b/fs/squashfs/file.c
> > >>>>> @@ -499,8 +499,8 @@ static void squashfs_readahead(struct
> > >>>>> readahead_control *ractl)
> > >>>>> {
> > >>>>> struct inode *inode = ractl->mapping->host;
> > >>>>> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> > >>>>> - size_t mask = (1UL << msblk->block_log) - 1;
> > >>>>> size_t shift = msblk->block_log - PAGE_SHIFT;
> > >>>>> + size_t mask = (1UL << shift) - 1;
> > >>>>>
> > >>>>>
> > >>>>> Any pointers are appreciated. Thanks!
> > >>>>
> > >>
next prev parent reply other threads:[~2022-05-16 9:00 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-08 14:46 Song, Xiongwei
2022-05-08 16:44 ` Matthew Wilcox
2022-05-09 7:05 ` Hsin-Yi Wang
2022-05-09 7:10 ` Hsin-Yi Wang
2022-05-09 9:42 ` Song, Xiongwei
2022-05-09 12:43 ` Xiongwei Song
2022-05-09 13:21 ` Matthew Wilcox
2022-05-09 14:29 ` Hsin-Yi Wang
2022-05-10 12:30 ` Xiongwei Song
2022-05-10 12:47 ` Hsin-Yi Wang
2022-05-10 13:18 ` Xiongwei Song
2022-05-11 15:12 ` Hsin-Yi Wang
2022-05-11 19:13 ` Phillip Lougher
2022-05-12 6:23 ` Hsin-Yi Wang
2022-05-13 5:33 ` Phillip Lougher
2022-05-13 6:35 ` Hsin-Yi Wang
2022-05-15 0:54 ` Phillip Lougher
2022-05-16 8:23 ` Hsin-Yi Wang
2022-05-16 9:00 ` Xiongwei Song [this message]
2022-05-16 9:10 ` Hsin-Yi Wang
2022-05-16 9:34 ` Xiongwei Song
2022-05-16 11:01 ` Hsin-Yi Wang
2022-05-13 13:09 ` Matthew Wilcox
2022-05-15 0:05 ` Phillip Lougher
2022-05-13 12:16 ` Xiongwei Song
2022-05-13 12:29 ` Xiongwei Song
2022-05-13 16:43 ` Hsin-Yi Wang
2022-05-13 18:12 ` Matthew Wilcox
2022-05-13 23:12 ` Xiongwei Song
2022-05-14 11:51 ` Hsin-Yi Wang
2022-05-10 1:11 ` Phillip Lougher
2022-05-10 2:35 ` Matthew Wilcox
2022-05-10 3:20 ` Phillip Lougher
2022-05-10 3:41 ` Phillip Lougher
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAEVVKH8tQeBH-hPyq69ncJPXKw=o_q0e0f+K9n8psXX7jXq_vA@mail.gmail.com' \
--to=sxwjean@gmail.com \
--cc=Xiongwei.Song@windriver.com \
--cc=akpm@linux-foundation.org \
--cc=houtao1@huawei.com \
--cc=hsinyi@chromium.org \
--cc=linux-mm@kvack.org \
--cc=miaoxie@huawei.com \
--cc=phillip@squashfs.org.uk \
--cc=squashfs-devel@lists.sourceforge.net \
--cc=torvalds@linux-foundation.org \
--cc=willy@infradead.org \
--cc=yi.zhang@huawei.com \
--cc=zhengliang6@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox