On 13/05/2022 07:35, Hsin-Yi Wang wrote:
> On Fri, May 13, 2022 at 1:33 PM Phillip Lougher <phillip@squashfs.org.uk> wrote:
>>
>> My understanding is that this call will fully populate the
>> pages array with page references without any holes.  That
>> is none of the pages array entries will be NULL, meaning
>> there isn't a page for that entry.  In other words, if the
>> pages array has 32 pages, each of the 32 entries will
>> reference a page.
>>
> I noticed that if nr_pages < max_pages, calling read_blocklist() will
> have SQUASHFS errors,
> 
> SQUASHFS error: Failed to read block 0x125ef7d: -5
> SQUASHFS error: zlib decompression failed, data probably corrupt
> 
> so I did a check if nr_pages < max_pages before squashfs_read_data(),
> just skip the remaining pages and let them be handled by readpage.
> 

Yes that avoids passing the decompressor code a too small page range.
As such extending the decompressor code isn't necessary.

Testing your patch I discovered a number of cases where
the decompressor still failed as above.

This I traced to "sparse blocks", these are zero filled blocks, and
are indicated/stored as a block length of 0 (bsize == 0).  Skipping
this sparse block and letting it be handled by readpage fixes this
issue.

I also noticed a potential performance improvement.  You check for
"pages[nr_pages - 1]->index >> shift) == index" after calling
squashfs_read_data.  But this information is known before
calling squashfs_read_data and moving the check to before
squashfs_read_data saves the cost of doing a redundant block
decompression.

Finally I noticed that if nr_pages grows after the __readahead_batch
call, then the pages array and the page actor will be too small, and
it will cause the decompressor to fail.  Changing the allocation to 
max_pages fixes this.

I have rolled these fixes into the patch below (also attached in
case it gets garbled).

diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index 7cd57e0d88de..14485a7af5cf 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -518,13 +518,11 @@ static void squashfs_readahead(struct 
readahead_control *ractl)
  	    file_end == 0)
  		return;

-	nr_pages = min(readahead_count(ractl), max_pages);
-
-	pages = kmalloc_array(nr_pages, sizeof(void *), GFP_KERNEL);
+	pages = kmalloc_array(max_pages, sizeof(void *), GFP_KERNEL);
  	if (!pages)
  		return;

-	actor = squashfs_page_actor_init_special(pages, nr_pages, 0);
+	actor = squashfs_page_actor_init_special(pages, max_pages, 0);
  	if (!actor)
  		goto out;

@@ -538,11 +536,18 @@ static void squashfs_readahead(struct 
readahead_control *ractl)
  			goto skip_pages;

  		index = pages[0]->index >> shift;
+
+		if ((pages[nr_pages - 1]->index >> shift) != index)
+			goto skip_pages;
+
  		bsize = read_blocklist(inode, index, &block);
+		if (bsize == 0)
+			goto skip_pages;
+
  		res = squashfs_read_data(inode->i_sb, block, bsize, NULL,
  					 actor);

-		if (res >= 0 && (pages[nr_pages - 1]->index >> shift) == index)
+		if (res >= 0)
  			for (i = 0; i < nr_pages; i++)
  				SetPageUptodate(pages[i]);

-- 
2.34.1


Phillip


>> This is important for the decompression code, because it
>> expects each pages array entry to reference a page, which
>> can be kmapped to an address.  If an entry in the pages
>> array is NULL, this will break.
>>
>> If the pages array can have holes (NULL pointers), I have
>> written an update patch which allows the decompression code
>> to handle these NULL pointers.
>>
>> If the pages array can have NULL pointers, I can send you
>> the patch which will deal with this.
> 
> Sure, if there are better ways to deal with this.
> 
> Thanks.
> 
>>
>> Thanks
>>
>> Phillip
>>
>>
>>
>>>
>>>>>
>>>>> It's also noticed that when the crash happened, nr_pages obtained by
>>>>> readahead_count() is 512.
>>>>> nr_pages = readahead_count(ractl); // this line
>>>>>
>>>>> 2) Normal cases that won't crash:
>>>>> [   22.651750] Block @ 0xb3bbca6, compressed size 42172, src size 262144
>>>>> [   22.653580] Block @ 0xb3c6162, compressed size 29815, src size 262144
>>>>> [   22.656692] Block @ 0xb4a293f, compressed size 17484, src size 131072
>>>>> [   22.666099] Block @ 0xb593881, compressed size 39742, src size 262144
>>>>> [   22.668699] Block @ 0xb59d3bf, compressed size 37841, src size 262144
>>>>> [   22.695739] Block @ 0x13698673, compressed size 65907, src size 131072
>>>>> [   22.698619] Block @ 0x136a87e6, compressed size 3155, src size 131072
>>>>> [   22.703400] Block @ 0xb1babe8, compressed size 99391, src size 131072
>>>>> [   22.706288] Block @ 0x1514abc6, compressed size 4627, src size 131072
>>>>>
>>>>> nr_pages are observed to be 32, 64, 256... These won't cause a crash.
>>>>> Other values (max_pages, bsize, block...) looks normal
>>>>>
>>>>> I'm not sure why the crash happened, but I tried to modify the mask
>>>>> for a bit. After modifying the mask value to below, the crash is gone
>>>>> (nr_pages are <=256).
>>>>> Based on my testing on a 300K pack file, there's no performance change.
>>>>>
>>>>> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
>>>>> index 20ec48cf97c5..f6d9b6f88ed9 100644
>>>>> --- a/fs/squashfs/file.c
>>>>> +++ b/fs/squashfs/file.c
>>>>> @@ -499,8 +499,8 @@ static void squashfs_readahead(struct
>>>>> readahead_control *ractl)
>>>>>     {
>>>>>            struct inode *inode = ractl->mapping->host;
>>>>>            struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
>>>>> -       size_t mask = (1UL << msblk->block_log) - 1;
>>>>>            size_t shift = msblk->block_log - PAGE_SHIFT;
>>>>> +       size_t mask = (1UL << shift) - 1;
>>>>>
>>>>>
>>>>> Any pointers are appreciated. Thanks!
>>>>
>>