From: Phillip Lougher <phillip@squashfs.org.uk>
Date: Sun, 15 May 2022 01:54:58 +0100
Subject: Re: squashfs performance regression and readahead
To: Hsin-Yi Wang
Cc: Xiongwei Song, Matthew Wilcox, Zheng Liang, Zhang Yi, Hou Tao,
    Miao Xie, Andrew Morton, Linus Torvalds, "Song, Xiongwei",
    linux-mm@kvack.org, squashfs-devel@lists.sourceforge.net

On 13/05/2022 07:35, Hsin-Yi Wang wrote:
> On Fri, May 13, 2022 at 1:33 PM Phillip Lougher wrote:
>>
>> My understanding is that this call will fully populate the
>> pages array with page references without any holes.
>> That is, none of the pages array entries will be NULL,
>> meaning there isn't a page for that entry. In other words,
>> if the pages array has 32 pages, each of the 32 entries
>> will reference a page.
>>
> I noticed that if nr_pages < max_pages, calling read_blocklist() will
> have SQUASHFS errors:
>
> SQUASHFS error: Failed to read block 0x125ef7d: -5
> SQUASHFS error: zlib decompression failed, data probably corrupt
>
> so I did a check if nr_pages < max_pages before squashfs_read_data(),
> just skip the remaining pages and let them be handled by readpage.
>

Yes, that avoids passing the decompressor code a too-small page range,
so extending the decompressor code isn't necessary.

Testing your patch, I discovered a number of cases where the
decompressor still failed as above. I traced this to "sparse blocks":
zero-filled blocks, which are indicated/stored as a block length of 0
(bsize == 0). Skipping such a sparse block and letting it be handled
by readpage fixes the issue.

I also noticed a potential performance improvement. You check
"(pages[nr_pages - 1]->index >> shift) == index" after calling
squashfs_read_data(), but this is known before calling
squashfs_read_data(), and moving the check earlier saves the cost of a
redundant block decompression.

Finally, I noticed that if nr_pages grows after the __readahead_batch()
call, then the pages array and the page actor will be too small, and
this will cause the decompressor to fail. Changing the allocation to
max_pages fixes this.

I have rolled these fixes into the patch below (also attached in case
it gets garbled).
diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index 7cd57e0d88de..14485a7af5cf 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -518,13 +518,11 @@ static void squashfs_readahead(struct readahead_control *ractl)
 	    file_end == 0)
 		return;
 
-	nr_pages = min(readahead_count(ractl), max_pages);
-
-	pages = kmalloc_array(nr_pages, sizeof(void *), GFP_KERNEL);
+	pages = kmalloc_array(max_pages, sizeof(void *), GFP_KERNEL);
 	if (!pages)
 		return;
 
-	actor = squashfs_page_actor_init_special(pages, nr_pages, 0);
+	actor = squashfs_page_actor_init_special(pages, max_pages, 0);
 	if (!actor)
 		goto out;
 
@@ -538,11 +536,18 @@ static void squashfs_readahead(struct readahead_control *ractl)
 			goto skip_pages;
 
 		index = pages[0]->index >> shift;
+
+		if ((pages[nr_pages - 1]->index >> shift) != index)
+			goto skip_pages;
+
 		bsize = read_blocklist(inode, index, &block);
+		if (bsize == 0)
+			goto skip_pages;
+
 		res = squashfs_read_data(inode->i_sb, block, bsize, NULL,
 					 actor);
 
-		if (res >= 0 && (pages[nr_pages - 1]->index >> shift) == index)
+		if (res >= 0)
 			for (i = 0; i < nr_pages; i++)
 				SetPageUptodate(pages[i]);
-- 
2.34.1

Phillip

>> This is important for the decompression code, because it
>> expects each pages array entry to reference a page, which
>> can be kmapped to an address. If an entry in the pages
>> array is NULL, this will break.
>>
>> If the pages array can have holes (NULL pointers), I have
>> written an updated patch which allows the decompression
>> code to handle these NULL pointers.
>>
>> If the pages array can have NULL pointers, I can send you
>> the patch which will deal with this.
>
> Sure, if there are better ways to deal with this.
>
> Thanks.
>
>>
>> Thanks
>>
>> Phillip
>>
>>>
>>>>>
>>>>> It's also noticed that when the crash happened, nr_pages obtained by
>>>>> readahead_count() is 512.
>>>>> nr_pages = readahead_count(ractl); // this line
>>>>>
>>>>> 2) Normal cases that won't crash:
>>>>> [ 22.651750] Block @ 0xb3bbca6, compressed size 42172, src size 262144
>>>>> [ 22.653580] Block @ 0xb3c6162, compressed size 29815, src size 262144
>>>>> [ 22.656692] Block @ 0xb4a293f, compressed size 17484, src size 131072
>>>>> [ 22.666099] Block @ 0xb593881, compressed size 39742, src size 262144
>>>>> [ 22.668699] Block @ 0xb59d3bf, compressed size 37841, src size 262144
>>>>> [ 22.695739] Block @ 0x13698673, compressed size 65907, src size 131072
>>>>> [ 22.698619] Block @ 0x136a87e6, compressed size 3155, src size 131072
>>>>> [ 22.703400] Block @ 0xb1babe8, compressed size 99391, src size 131072
>>>>> [ 22.706288] Block @ 0x1514abc6, compressed size 4627, src size 131072
>>>>>
>>>>> nr_pages is observed to be 32, 64, 256... These won't cause a crash.
>>>>> Other values (max_pages, bsize, block...) look normal.
>>>>>
>>>>> I'm not sure why the crash happened, but I tried modifying the mask.
>>>>> After changing the mask value as below, the crash is gone
>>>>> (nr_pages is <= 256).
>>>>> Based on my testing on a 300K pack file, there's no performance change.
>>>>>
>>>>> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
>>>>> index 20ec48cf97c5..f6d9b6f88ed9 100644
>>>>> --- a/fs/squashfs/file.c
>>>>> +++ b/fs/squashfs/file.c
>>>>> @@ -499,8 +499,8 @@ static void squashfs_readahead(struct
>>>>> readahead_control *ractl)
>>>>>  {
>>>>>  	struct inode *inode = ractl->mapping->host;
>>>>>  	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
>>>>> -	size_t mask = (1UL << msblk->block_log) - 1;
>>>>>  	size_t shift = msblk->block_log - PAGE_SHIFT;
>>>>> +	size_t mask = (1UL << shift) - 1;
>>>>>
>>>>> Any pointers are appreciated. Thanks!