From: Hsin-Yi Wang <hsinyi@chromium.org>
Date: Mon, 16 May 2022 19:01:08 +0800
Subject: Re: squashfs performance regression and readahead
To: Xiongwei Song
Cc: Phillip Lougher, Matthew Wilcox, Zheng Liang, Zhang Yi, Hou Tao, Miao Xie, Andrew Morton, Linus Torvalds, "Song, Xiongwei", linux-mm@kvack.org, squashfs-devel@lists.sourceforge.net
On Mon, May 16, 2022 at 5:35 PM Xiongwei Song wrote:
>
> On Mon, May 16, 2022 at 5:10 PM Hsin-Yi Wang wrote:
> >
> > On Mon, May 16, 2022 at 5:00 PM Xiongwei Song wrote:
> > >
> > > You can not just sign your signature. You ignored others' contributions.
> > > This is unacceptable.
> > >
> >
> > Hi,
> >
> > Please don't be angry. My thinking is: since this is still under review
> > and I'm not sure the performance issue is settled, it's more important
> > to make sure the patch is ready first.
> >
> > Once it's ready, I'll send it to the list formally, so it's easier for
> > the maintainers (Matthew) to pick up. At that time, I'll add your
> > Tested-by (again, I'm not sure whether the current performance is
> > acceptable to you, so I didn't add your tag; it would be incorrect to
> > add it if the performance is still not what you want) and Phillip's
> > Reviewed-by (or Signed-off-by). I'm also not sure whether Phillip or
> > Matthew have other comments, so I can't add their signatures now. The
> > maintainers will probably add their Signed-off-by when they pick up
> > the patch.
> >
> > I'm sorry if not adding the tags to this WIP patch offended you.
>
> Your apology is not sincere. I told you that you should release @pages
> in the exit path, and you didn't even mention it.

This is normally mentioned in the patch changelog when the patch is sent
to the list. See for example:

https://patchwork.kernel.org/project/dri-devel/patch/20220516085258.1227691-3-cyndis@kapsi.fi/
https://patchwork.kernel.org/project/dri-devel/patch/20220213103437.3363848-3-hsinyi@chromium.org/

Each version lists its changes. I didn't include a changelog in the
attachments in this thread. Now I think it's better to send the patch
directly to the list formally to avoid making you angry, though the
patch is still work-in-progress:

https://lore.kernel.org/lkml/20220516105100.1412740-1-hsinyi@chromium.org/T/#md06bd638bcdd985766ee75d17fbaa548218e0038

> I told you patch v2 made a ~6s difference, and you didn't provide any
> response.

Currently I don't have any better ideas to improve the performance. I'm
sorry about that.

> I told you that 9eec1d897139 ("squashfs: provide backing_dev_info in
> order to disable read-ahead") should be reverted, and you didn't reply.

Again, I didn't send the series formally in this thread since my focus
is still on getting the code right, so I included only the readahead
patch itself in my attachment, not the revert.

Speaking of the revert: since I haven't sent a revert patch before, note
that there is a conflict between 124cfc154f6c ("squashfs: Convert
squashfs to read_folio") and reverting 9eec1d897139 ("squashfs: provide
backing_dev_info in order to disable read-ahead"). Currently I just
remove the comments in squashfs_fill_super(), but I'm not sure if this
is the correct way to do it.

Hi Matthew, will you handle the reverts? Thanks.

> I think you don't know what respect is.
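For context on the revert discussed above: what 9eec1d897139 introduced,
and what the revert removes, is a per-superblock BDI with read-ahead
disabled. Below is a simplified sketch of that idea, not the exact code
from the commit; super_setup_bdi_name() and ra_pages are the real kernel
interfaces, but the naming and details here are illustrative:

/*
 * Simplified sketch of the idea behind 9eec1d897139 ("squashfs:
 * provide backing_dev_info in order to disable read-ahead"); the
 * actual commit differs in detail.  Giving squashfs a private BDI
 * and zeroing ra_pages makes the VFS skip readahead, so reads go
 * through ->read_folio() one page at a time -- which is why the
 * commit needs to be reverted once a real ->readahead() exists.
 */
static int squashfs_bdi_init(struct super_block *sb)
{
	int err;

	/* allocate a BDI private to this superblock */
	err = super_setup_bdi_name(sb, "squashfs_%u_%u",
				   MAJOR(sb->s_dev), MINOR(sb->s_dev));
	if (err)
		return err;

	sb->s_bdi->ra_pages = 0;	/* disable read-ahead on this fs */
	return 0;
}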
> > >
> > > On Mon, May 16, 2022 at 4:23 PM Hsin-Yi Wang wrote:
> > > >
> > > > On Sun, May 15, 2022 at 8:55 AM Phillip Lougher wrote:
> > > > >
> > > > > On 13/05/2022 07:35, Hsin-Yi Wang wrote:
> > > > > > On Fri, May 13, 2022 at 1:33 PM Phillip Lougher wrote:
> > > > > >>
> > > > > >> My understanding is that this call will fully populate the
> > > > > >> pages array with page references without any holes. That
> > > > > >> is, none of the pages array entries will be NULL, meaning
> > > > > >> there isn't a page for that entry. In other words, if the
> > > > > >> pages array has 32 pages, each of the 32 entries will
> > > > > >> reference a page.
> > > > > >>
> > > > > > I noticed that if nr_pages < max_pages, calling read_blocklist()
> > > > > > will produce SQUASHFS errors:
> > > > > >
> > > > > > SQUASHFS error: Failed to read block 0x125ef7d: -5
> > > > > > SQUASHFS error: zlib decompression failed, data probably corrupt
> > > > > >
> > > > > > so I added a check for nr_pages < max_pages before
> > > > > > squashfs_read_data(), to skip the remaining pages and let them
> > > > > > be handled by readpage.
> > > > > >
> > > > >
> > > > > Yes, that avoids passing the decompressor code a too-small page
> > > > > range. As such, extending the decompressor code isn't necessary.
> > > > >
> > > > > Testing your patch, I discovered a number of cases where the
> > > > > decompressor still failed as above.
> > > > >
> > > > > I traced this to "sparse blocks": these are zero-filled blocks,
> > > > > indicated/stored as a block length of 0 (bsize == 0). Skipping
> > > > > the sparse block and letting it be handled by readpage fixes the
> > > > > issue.
> > > > >
> > > > Ack. Thanks for testing this.
> > > >
> > > > > I also noticed a potential performance improvement. You check for
> > > > > "(pages[nr_pages - 1]->index >> shift) == index" after calling
> > > > > squashfs_read_data(). But this information is known before
> > > > > calling squashfs_read_data(), and moving the check to before
> > > > > squashfs_read_data() saves the cost of a redundant block
> > > > > decompression.
> > > > >
> > > > After applying this, the performance becomes:
> > > > 2.73s
> > > > 2.76s
> > > > 2.73s
> > > >
> > > > Original:
> > > > 2.76s
> > > > 2.79s
> > > > 2.77s
> > > >
> > > > (The pack file is different from my previous testing in this email
> > > > thread.)
> > > >
> > > > > Finally, I noticed that if nr_pages grows after the
> > > > > __readahead_batch call, then the pages array and the page actor
> > > > > will be too small, and it will cause the decompressor to fail.
> > > > > Changing the allocation to max_pages fixes this.
> > > > >
> > > > Ack.
> > > >
> > > > I've added this fix on top of the previous fixes.
> > > >
> > > > > I have rolled these fixes into the patch below (also attached in
> > > > > case it gets garbled).
> > > > >
> > > > > diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> > > > > index 7cd57e0d88de..14485a7af5cf 100644
> > > > > --- a/fs/squashfs/file.c
> > > > > +++ b/fs/squashfs/file.c
> > > > > @@ -518,13 +518,11 @@ static void squashfs_readahead(struct readahead_control *ractl)
> > > > >  		    file_end == 0)
> > > > >  		return;
> > > > >
> > > > > -	nr_pages = min(readahead_count(ractl), max_pages);
> > > > > -
> > > > > -	pages = kmalloc_array(nr_pages, sizeof(void *), GFP_KERNEL);
> > > > > +	pages = kmalloc_array(max_pages, sizeof(void *), GFP_KERNEL);
> > > > >  	if (!pages)
> > > > >  		return;
> > > > >
> > > > > -	actor = squashfs_page_actor_init_special(pages, nr_pages, 0);
> > > > > +	actor = squashfs_page_actor_init_special(pages, max_pages, 0);
> > > > >  	if (!actor)
> > > > >  		goto out;
> > > > >
> > > > > @@ -538,11 +536,18 @@ static void squashfs_readahead(struct readahead_control *ractl)
> > > > >  			goto skip_pages;
> > > > >
> > > > >  		index = pages[0]->index >> shift;
> > > > > +
> > > > > +		if ((pages[nr_pages - 1]->index >> shift) != index)
> > > > > +			goto skip_pages;
> > > > > +
> > > > >  		bsize = read_blocklist(inode, index, &block);
> > > > > +		if (bsize == 0)
> > > > > +			goto skip_pages;
> > > > > +
> > > > >  		res = squashfs_read_data(inode->i_sb, block, bsize, NULL,
> > > > >  					 actor);
> > > > >
> > > > > -		if (res >= 0 && (pages[nr_pages - 1]->index >> shift) == index)
> > > > > +		if (res >= 0)
> > > > >  			for (i = 0; i < nr_pages; i++)
> > > > >  				SetPageUptodate(pages[i]);
> > > > >
> > > > > --
> > > > > 2.34.1
> > > > >
> > > > >
> > > > > Phillip
> > > > >
> > > > > >> This is important for the decompression code, because it
> > > > > >> expects each pages array entry to reference a page, which
> > > > > >> can be kmapped to an address. If an entry in the pages
> > > > > >> array is NULL, this will break.
> > > > > >>
> > > > > >> If the pages array can have holes (NULL pointers), I have
> > > > > >> written an update patch which allows the decompression code
> > > > > >> to handle these NULL pointers.
> > > > > >>
> > > > > >> If the pages array can have NULL pointers, I can send you
> > > > > >> the patch which will deal with this.
> > > > > >
> > > > > > Sure, if there are better ways to deal with this.
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > >>
> > > > > >> Thanks
> > > > > >>
> > > > > >> Phillip
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>>
> > > > > >>>>>
> > > > > >>>>> It's also noticed that when the crash happened, nr_pages obtained by
> > > > > >>>>> readahead_count() is 512.
> > > > > >>>>> nr_pages = readahead_count(ractl); // this line
> > > > > >>>>>
> > > > > >>>>> 2) Normal cases that won't crash:
> > > > > >>>>> [ 22.651750] Block @ 0xb3bbca6, compressed size 42172, src size 262144
> > > > > >>>>> [ 22.653580] Block @ 0xb3c6162, compressed size 29815, src size 262144
> > > > > >>>>> [ 22.656692] Block @ 0xb4a293f, compressed size 17484, src size 131072
> > > > > >>>>> [ 22.666099] Block @ 0xb593881, compressed size 39742, src size 262144
> > > > > >>>>> [ 22.668699] Block @ 0xb59d3bf, compressed size 37841, src size 262144
> > > > > >>>>> [ 22.695739] Block @ 0x13698673, compressed size 65907, src size 131072
> > > > > >>>>> [ 22.698619] Block @ 0x136a87e6, compressed size 3155, src size 131072
> > > > > >>>>> [ 22.703400] Block @ 0xb1babe8, compressed size 99391, src size 131072
> > > > > >>>>> [ 22.706288] Block @ 0x1514abc6, compressed size 4627, src size 131072
> > > > > >>>>>
> > > > > >>>>> nr_pages is observed to be 32, 64, 256... These won't cause a
> > > > > >>>>> crash. Other values (max_pages, bsize, block...) look normal.
> > > > > >>>>>
> > > > > >>>>> I'm not sure why the crash happened, but I tried modifying the
> > > > > >>>>> mask a bit. After changing the mask value as below, the crash
> > > > > >>>>> is gone (nr_pages stays <= 256).
> > > > > >>>>> Based on my testing on a 300K pack file, there's no performance
> > > > > >>>>> change.
> > > > > >>>>>
> > > > > >>>>> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> > > > > >>>>> index 20ec48cf97c5..f6d9b6f88ed9 100644
> > > > > >>>>> --- a/fs/squashfs/file.c
> > > > > >>>>> +++ b/fs/squashfs/file.c
> > > > > >>>>> @@ -499,8 +499,8 @@ static void squashfs_readahead(struct readahead_control *ractl)
> > > > > >>>>>  {
> > > > > >>>>>  	struct inode *inode = ractl->mapping->host;
> > > > > >>>>>  	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> > > > > >>>>> -	size_t mask = (1UL << msblk->block_log) - 1;
> > > > > >>>>>  	size_t shift = msblk->block_log - PAGE_SHIFT;
> > > > > >>>>> +	size_t mask = (1UL << shift) - 1;
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> Any pointers are appreciated. Thanks!
> > > > > >>>>
> > > > > >>
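As a footnote to the mask change quoted above, here is the arithmetic
worked through as a standalone userspace program, with assumed values
(block_log = 17, i.e. 128 KiB blocks, and PAGE_SHIFT = 12; neither value
is stated in the thread, so they are purely illustrative):

/*
 * Worked example of the two masks discussed above.  block_log and
 * page_shift are assumed values, not taken from the thread.
 */
#include <stdio.h>

int main(void)
{
	unsigned long block_log = 17;                      /* assumed: 128 KiB blocks */
	unsigned long page_shift = 12;                     /* assumed: 4 KiB pages */
	unsigned long shift = block_log - page_shift;      /* 5 */

	/* original: mask over byte offsets within a block */
	unsigned long byte_mask = (1UL << block_log) - 1;  /* 0x1ffff */
	/* proposed: mask over page indexes within a block */
	unsigned long page_mask = (1UL << shift) - 1;      /* 0x1f */

	/*
	 * One squashfs block covers 1 << shift = 32 pages, so a readahead
	 * batch of 512 pages spans 16 blocks.  A mask applied to page
	 * indexes therefore has to be built from shift, not block_log:
	 * the byte-offset mask is far too wide to trim a page count.
	 */
	printf("pages per block = %lu\n", 1UL << shift);
	printf("byte_mask = %#lx, page_mask = %#lx\n", byte_mask, page_mask);
	return 0;
}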