From: Xiongwei Song <sxwjean@gmail.com>
Date: Fri, 13 May 2022 20:16:47 +0800
Subject: Re: squashfs performance regression and readahead
To: Hsin-Yi Wang
Cc: Phillip Lougher, Matthew Wilcox, Zheng Liang, Zhang Yi, Hou Tao,
 Miao Xie, Andrew Morton, Linus Torvalds, "Song, Xiongwei",
 linux-mm@kvack.org, squashfs-devel@lists.sourceforge.net
Hello,

On Thu, May 12, 2022 at 2:23 PM Hsin-Yi Wang wrote:
>
> On Thu, May 12, 2022 at 3:13 AM Phillip Lougher wrote:
> >
> > On 11/05/2022 16:12, Hsin-Yi Wang wrote:
> > > On Tue, May 10, 2022 at 9:19 PM Xiongwei Song wrote:
> > >>
> > >> On Tue, May 10, 2022 at 8:47 PM Hsin-Yi Wang wrote:
> > >>>
> > >>> On Tue, May 10, 2022 at 8:31 PM Xiongwei Song wrote:
> > >>>>
> > >>>> Hi Hsin-Yi,
> > >>>>
> > >>>> On Mon, May 9, 2022 at 10:29 PM Hsin-Yi Wang wrote:
> > >>>>>
> > >>>>> On Mon, May 9, 2022 at 9:21 PM Matthew Wilcox wrote:
> > >>>>>>
> > >>>>>> On Mon, May 09, 2022 at 08:43:45PM +0800, Xiongwei Song wrote:
> > >>>>>>> Hi Hsin-Yi and Matthew,
> > >>>>>>>
> > >>>>>>> With the patch from the attachment on linux 5.10, I ran the command as I
> > >>>>>>> mentioned earlier and got the results below:
> > >>>>>>> 1:40.65 (1m + 40.65s)
> > >>>>>>> 1:10.12
> > >>>>>>> 1:11.10
> > >>>>>>> 1:11.47
> > >>>>>>> 1:11.59
> > >>>>>>> 1:11.94
> > >>>>>>> 1:11.86
> > >>>>>>> 1:12.04
> > >>>>>>> 1:12.21
> > >>>>>>> 1:12.06
> > >>>>>>>
> > >>>>>>> The performance has improved obviously, but compared to linux 4.18 the
> > >>>>>>> performance is still not so good.
> > >>>>>>>
> > >>>>> I think you shouldn't compare the performance with 4.18 directly,
> > >>>>> since there might be other factors that impact the performance.
> > >>>>
> > >>>> Makes sense.
> > >>>>
> > >>>>> I'd suggest comparing the same kernel version with:
> > >>>>> a) this patch applied
> > >>>>> b) c1f6925e1091 ("mm: put readahead pages in cache earlier") reverted.
> > >>>>
> > >>>> With 9eec1d897139 ("squashfs: provide backing_dev_info in order to disable
> > >>>> read-ahead") reverted and 0001-WIP-squashfs-implement-readahead.patch applied,
> > >>>> the test results on linux 5.18 are:
> > >>>> 1:41.51 (1m + 41.51s)
> > >>>> 1:08.11
> > >>>> 1:10.37
> > >>>> 1:11.17
> > >>>> 1:11.32
> > >>>> 1:11.59
> > >>>> 1:12.23
> > >>>> 1:12.08
> > >>>> 1:12.76
> > >>>> 1:12.51
> > >>>>
> > >>>> The performance is 1~2s worse than vanilla linux 5.18.
> > >>>>
> > >>>
> > >>> Can you share the pack file you used for testing? Thanks
> > >>
> > >> You mean the files that are put in the squashfs partitions? If yes, I can tell
> > >> you I just put some dynamic libraries in the partitions:
> > >> -rwxr-xr-x 1 root root  200680 Apr 20 03:57 ld-2.33.so
> > >> lrwxrwxrwx 1 root root      10 Apr 20 03:57 ld-linux-x86-64.so.2 -> ld-2.33.so
> > >> -rwxr-xr-x 1 root root   18776 Apr 20 03:57 libanl-2.33.so
> > >> lrwxrwxrwx 1 root root      14 Apr 20 03:57 libanl.so.1 -> libanl-2.33.so
> > >> lrwxrwxrwx 1 root root      17 Apr 20 04:08 libblkid.so.1 -> libblkid.so.1.1.0
> > >> -rwxr-xr-x 1 root root  330776 Apr 20 04:08 libblkid.so.1.1.0
> > >> -rwxr-xr-x 1 root root 1823192 Apr 20 03:57 libc-2.33.so
> > >> ...... snip ......
> > >>
> > >> The number of files is 110 (55 libs + 55 soft links to libs). I have 90 squashfs
> > >> partitions which contain identical files. The space of each partition is 11M,
> > >> nothing special.
> > >>
> > >> Thanks.
> > >>
> > >
> > > I noticed that there's a crash at
> > > https://elixir.bootlin.com/linux/latest/source/lib/lzo/lzo1x_decompress_safe.c#L218
> > > when testing on my system.
> > > (I have CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS enabled)
> > >
> > > Full logs:
> > > [ 119.062420] Unable to handle kernel paging request at virtual
> > > address ffffffc017337000
> > > [ 119.062437] Mem abort info:
> > > [ 119.062442]   ESR = 0x96000047
> > > [ 119.062447]   EC = 0x25: DABT (current EL), IL = 32 bits
> > > [ 119.062451]   SET = 0, FnV = 0
> > > [ 119.062454]   EA = 0, S1PTW = 0
> > > [ 119.062457] Data abort info:
> > > [ 119.062460]   ISV = 0, ISS = 0x00000047
> > > [ 119.062464]   CM = 0, WnR = 1
> > > [ 119.062469] swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000041099000
> > > [ 119.062473] [ffffffc017337000] pgd=000000010014a003,
> > > p4d=000000010014a003, pud=000000010014a003, pmd=000000010ba59003,
> > > pte=0000000000000000
> > > [ 119.062489] Internal error: Oops: 96000047 [#1] PREEMPT SMP
> > > [ 119.062494] Modules linked in: vhost_vsock vhost vhost_iotlb
> > > vmw_vsock_virtio_transport_common vsock rfcomm algif_hash
> > > algif_skcipher af_alg veth uinput xt_cgroup mtk_dip mtk_cam_isp
> > > mtk_vcodec_enc mtk_vcodec_dec hci_uart mtk_fd mtk_mdp3 v4l2_h264
> > > mtk_vcodec_common mtk_vpu xt_MASQUERADE mtk_jpeg cros_ec_rpmsg btqca
> > > videobuf2_dma_contig v4l2_fwnode v4l2_mem2mem btrtl elants_i2c mtk_scp
> > > mtk_rpmsg rpmsg_core mtk_scp_ipi mt8183_cci_devfreq ip6table_nat fuse
> > > 8021q bluetooth ecdh_generic ecc iio_trig_sysfs cros_ec_lid_angle
> > > cros_ec_sensors cros_ec_sensors_core industrialio_triggered_buffer
> > > kfifo_buf cros_ec_sensorhub cros_ec_typec typec hid_google_hammer
> > > ath10k_sdio lzo_rle lzo_compress ath10k_core ath mac80211 zram
> > > cfg80211 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2
> > > videobuf2_common cdc_ether usbnet r8152 mii joydev
> > > [ 119.062613] CPU: 4 PID: 4161 Comm: chrome Tainted: G        W
> > > 5.10.112 #105 39f11bffda227eaae4c704733b9bf01db22d8b4d
> > > [ 119.062617] Hardware name: Google burnet board (DT)
> > > [ 119.062623] pstate: 20400005 (nzCv daif +PAN -UAO -TCO BTYPE=--)
> > > [ 119.062636] pc : lzo1x_decompress_safe+0x1dc/0x564
> > > [ 119.062643] lr : lzo_uncompress+0x134/0x1f0
> > > [ 119.062647] sp : ffffffc01837b860
> > > [ 119.062650] x29: ffffffc01837b860 x28: 0000000000000000
> > > [ 119.062656] x27: 0000000000005451 x26: ffffffc0171c9445
> > > [ 119.062662] x25: 0000000000000000 x24: ffffffc017437000
> > > [ 119.062668] x23: ffffffc0171c944f x22: ffffffc017136000
> > > [ 119.062673] x21: ffffffc017336ff1 x20: ffffffc017237000
> > > [ 119.062679] x19: ffffffc01837b8d0 x18: 0000000000000000
> > > [ 119.062684] x17: 00000000000001eb x16: 0000000000000012
> > > [ 119.062689] x15: 000000000010000f x14: d600120202000001
> > > [ 119.062695] x13: ffffffc017336ff1 x12: ffffffc017336ff4
> > > [ 119.062700] x11: 0000000000000002 x10: 01010101010100ff
> > > [ 119.062705] x9 : ffffffffffffffff x8 : ffffffc0171c944d
> > > [ 119.062710] x7 : d15d3aaabd294330 x6 : 0206397115fe28ab
> > > [ 119.062715] x5 : ffffffc0171c944f x4 : 000000000009344f
> > > [ 119.062721] x3 : ffffffc01837b8d0 x2 : ffffffc017237000
> > > [ 119.062726] x1 : 000000000009344f x0 : ffffffc0171c9447
> > > [ 119.062731] Call trace:
> > > [ 119.062738]  lzo1x_decompress_safe+0x1dc/0x564
> > > [ 119.062742]  lzo_uncompress+0x134/0x1f0
> > > [ 119.062746]  squashfs_decompress+0x6c/0xb4
> > > [ 119.062753]  squashfs_read_data+0x1a8/0x298
> > > [ 119.062758]  squashfs_readahead+0x308/0x474
> > > [ 119.062765]  read_pages+0x74/0x280
> > > [ 119.062769]  page_cache_ra_unbounded+0x1d0/0x228
> > > [ 119.062773]  do_page_cache_ra+0x44/0x50
> > > [ 119.062779]  do_sync_mmap_readahead+0x188/0x1a0
> > > [ 119.062783]  filemap_fault+0x100/0x350
> > > [ 119.062789]  __do_fault+0x48/0x10c
> > > [ 119.062793]  do_cow_fault+0x58/0x12c
> > > [ 119.062797]  handle_mm_fault+0x544/0x904
> > > [ 119.062804]  do_page_fault+0x260/0x384
> > > [ 119.062809]  do_translation_fault+0x44/0x5c
> > > [ 119.062813]  do_mem_abort+0x48/0xb4
> > > [ 119.062819]  el0_da+0x28/0x34
> > > [ 119.062824]  el0_sync_compat_handler+0xb8/0xcc
> > > [ 119.062829]  el0_sync_compat+0x188/0x1c0
> > > [ 119.062837] Code: f94001ae f90002ae f94005ae 910041ad (f90006ae)
> > > [ 119.062842] ---[ end trace 3e9828c7360fd7be ]---
> > > [ 119.090436] Kernel panic - not syncing: Oops: Fatal exception
> > > [ 119.090455] SMP: stopping secondary CPUs
> > > [ 119.090467] Kernel Offset: 0x2729c00000 from 0xffffffc010000000
> > > [ 119.090471] PHYS_OFFSET: 0xffffffd880000000
> > > [ 119.090477] CPU features: 0x08240002,2188200c
> > >
> > > 1) Traces near when the crash happened:
> > > [  79.495580] Block @ 0x60eea9c, compressed size 65744, src size 1048576
> > > [  80.363573] Block @ 0x1f9f000, compressed size 200598, src size 1048576
> > > [  80.371256] Block @ 0x1fcff96, compressed size 80772, src size 1048576
> > > [  80.428388] Block @ 0x1fe3b1a, compressed size 83941, src size 1048576
> > > [  80.435319] Block @ 0x1ff82ff, compressed size 77936, src size 1048576
> > > [  80.724331] Block @ 0x4501000, compressed size 364069, src size 1048576
> > > [  80.738683] Block @ 0x4dccf28, compressed size 603215, src size 2097152
> >
> > Src size 2097152 is clearly wrong, as the maximum data block size is 1
> > Mbyte or 1048576.
> >
> > That debug line comes from
> >
> > https://elixir.bootlin.com/linux/latest/source/fs/squashfs/block.c#L156
> >
> > ----
> > TRACE("Block @ 0x%llx, %scompressed size %d, src size %d\n",
> > 	index, compressed ? "" : "un", length, output->length);
> > ----
> >
> > Which indicates your code has created a page_actor of 2 Mbytes in size
> > (output->length).
> >
> > This is completely incorrect, as the page_actor should never be larger
> > than the size of the block to be read in question.  In most cases that
> > will be msblk->block_size, but it may be less at the end of the file.
> >
> > You appear to be trying to read the amount of readahead requested.  But
> > you should always be trying to read the lesser of the readahead amount
> > and the size of the block in question.
> >
> > Hope that helps.
> >
> > Phillip
> >
> Hi Phillip,
> Thanks for the explanation. After restricting the size fed to the
> page_actor, the crash no longer happens.
>
> Hi Xiongwei,
> Can you test this version (sent as an attachment) again? I've tested it on
> my platform:
> - arm64
> - kernel 5.10
> - pack_data size ~ 300K
> - time ureadahead pack_data
> 1. with c1f6925e1091 ("mm: put readahead pages in cache earlier") reverted:
> 0.633s
> 0.755s
> 0.804s
>
> 2. with the patch applied:
> 0.625s
> 0.656s
> 0.768s
>

Thanks for sharing. I have done the tests on 5.10 and 5.18. The results
are a little worse than with patch v1 in my test environment.
On linux 5.10:

With c1f6925e1091 ("mm: put readahead pages in cache earlier") reverted:
1:37.16 (1m + 37.16s)
1:04.18
1:05.28
1:06.07
1:06.31
1:06.58
1:06.80
1:06.79
1:06.95
1:06.61

With your patch v2:
2:04.27 (2m + 4.27s)
1:14.95
1:14.56
1:15.75
1:16.55
1:16.87
1:16.74
1:17.36
1:17.50
1:17.32

On linux 5.18:

With readahead disabled by default:
1:12.82
1:07.68
1:08.94
1:09.65
1:09.87
1:10.32
1:10.47
1:10.34
1:10.24
1:10.34

With your patch v2:
2:00.14 (2m + 0.14s)
1:13.46
1:14.62
1:15.02
1:15.78
1:16.01
1:16.03
1:16.24
1:16.44
1:16.16

As you can see, there is an extra ~6s increase on both 5.10 and 5.18.
I don't know whether the change in the number of pages causes the overhead.

One question about the code below from your patch:

+	}
+
+	kfree(actor);
+	return;
+
+skip_pages:

When is the page pointer array released after the pages are cached? I don't
see any chance for that to happen.

Regards,
Xiongwei

> Hi Matthew,
> Thanks for reviewing the patch previously. Does this version look good
> to you? If so, I can send it to the list.
>
>
> Thanks for all of your help.
> >
> >
> > > It's also noticed that when the crash happened, nr_pages obtained by
> > > readahead_count() is 512.
> > > nr_pages = readahead_count(ractl); // this line
> > >
> > > 2) Normal cases that won't crash:
> > > [  22.651750] Block @ 0xb3bbca6, compressed size 42172, src size 262144
> > > [  22.653580] Block @ 0xb3c6162, compressed size 29815, src size 262144
> > > [  22.656692] Block @ 0xb4a293f, compressed size 17484, src size 131072
> > > [  22.666099] Block @ 0xb593881, compressed size 39742, src size 262144
> > > [  22.668699] Block @ 0xb59d3bf, compressed size 37841, src size 262144
> > > [  22.695739] Block @ 0x13698673, compressed size 65907, src size 131072
> > > [  22.698619] Block @ 0x136a87e6, compressed size 3155, src size 131072
> > > [  22.703400] Block @ 0xb1babe8, compressed size 99391, src size 131072
> > > [  22.706288] Block @ 0x1514abc6, compressed size 4627, src size 131072
> > >
> > > nr_pages is observed to be 32, 64, 256... These won't cause a crash.
> > > Other values (max_pages, bsize, block...) look normal.
> > >
> > > I'm not sure why the crash happened, but I tried to modify the mask
> > > a bit. After modifying the mask value as below, the crash is gone
> > > (nr_pages is <= 256).
> > > Based on my testing on a 300K pack file, there's no performance change.
> > >
> > > diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> > > index 20ec48cf97c5..f6d9b6f88ed9 100644
> > > --- a/fs/squashfs/file.c
> > > +++ b/fs/squashfs/file.c
> > > @@ -499,8 +499,8 @@ static void squashfs_readahead(struct readahead_control *ractl)
> > >  {
> > >  	struct inode *inode = ractl->mapping->host;
> > >  	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> > > -	size_t mask = (1UL << msblk->block_log) - 1;
> > >  	size_t shift = msblk->block_log - PAGE_SHIFT;
> > > +	size_t mask = (1UL << shift) - 1;
> > >
> > >
> > > Any pointers are appreciated. Thanks!
> >
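
To make sure I understand Phillip's point about sizing, here is a small
stand-alone sketch (not the actual patch; the block_log, file size and
readahead window below are made-up values) of how a 512-page (2 MiB)
readahead request would be split into per-block reads, each capped at the
lesser of the block size and the bytes left in the file:

----
#include <stdio.h>
#include <stdint.h>

/* Made-up geometry for illustration: 1 MiB squashfs blocks, 4 KiB pages. */
#define BLOCK_LOG 20
#define PAGE_BITS 12

int main(void)
{
	const uint64_t block_size = 1ULL << BLOCK_LOG;
	const uint64_t page_size = 1ULL << PAGE_BITS;
	const unsigned long pages_per_block = 1UL << (BLOCK_LOG - PAGE_BITS);

	uint64_t file_size = 5 * block_size + 300 * 1024;  /* made-up file size */
	uint64_t pos = 4 * block_size;                     /* readahead start, block aligned */
	unsigned long nr_pages = 512;                      /* 2 MiB readahead window */
	uint64_t ra_end = pos + nr_pages * page_size;

	while (pos < ra_end && pos < file_size) {
		/* Never ask the decompressor for more than one block at a
		 * time, and never read past the end of the file. */
		uint64_t expected = block_size;
		unsigned long pages;

		if (file_size - pos < expected)
			expected = file_size - pos;

		pages = (unsigned long)((expected + page_size - 1) >> PAGE_BITS);
		if (pages > pages_per_block)
			pages = pages_per_block;

		printf("block at 0x%llx: output length %llu bytes, %lu pages\n",
		       (unsigned long long)pos, (unsigned long long)expected, pages);
		pos += block_size;
	}
	return 0;
}
----

With these assumed values the 2 MiB request is serviced as one full 1 MiB
block (256 pages) plus a 300 KiB tail, so the output length handed to the
decompressor never reaches 2 MiB. That also seems consistent with the mask
derived from (block_log - PAGE_SHIFT) in the diff above keeping nr_pages at
or below the 256 pages that make up one block.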