From: Bernd Schubert
To: "Matthew Wilcox (Oracle)"
Cc: "linux-fsdevel@vger.kernel.org", Miklos Szeredi, Andrew Morton, linux-mm@kvack.org
Date: Thu, 13 Apr 2023 23:33:09 +0200
Subject: Re: sequential 1MB mmap read ends in 1 page sync read-ahead

Sorry, forgot to add Andrew and linux-mm into CC.

On 4/13/23 23:27, Bernd Schubert wrote:
> Hello,
>
> I found a weird mmap read behavior while benchmarking the fuse-uring
> patches.
> I have not verified this yet, but it does not look fuse-specific.
> Basically, I started to check because fio results were much lower
> than expected (better with the new code, though).
>
> fio cmd line:
> fio --size=1G --numjobs=1 --ioengine=mmap --output-format=normal,terse --directory=/scratch/dest/ --rw=read multi-file.fio
>
>
> bernd@squeeze1 test2>cat multi-file.fio
> [global]
> group_reporting
> bs=1M
> runtime=300
>
> [test]
>
> This sequential fio job sets POSIX_MADV_SEQUENTIAL and then does memcpy
> beginning at offset 0 in 1MB steps (verified with additional
> logging in fio's engines/mmap.c).
>
> An additional log in fuse_readahead() gives
>
> [ 1396.215084] fuse: 000000003fdec504 inode=00000000be0f29d3 count=64 index=0
> [ 1396.237466] fuse: 000000003fdec504 inode=00000000be0f29d3 count=64 index=255
> [ 1396.263175] fuse: 000000003fdec504 inode=00000000be0f29d3 count=1 index=254
> [ 1396.282055] fuse: 000000003fdec504 inode=00000000be0f29d3 count=1 index=253
> ...
> [ 1496.353745] fuse: 000000003fdec504 inode=00000000be0f29d3 count=1 index=64
> [ 1496.381105] fuse: 000000003fdec504 inode=00000000be0f29d3 count=64 index=511
> [ 1496.397487] fuse: 000000003fdec504 inode=00000000be0f29d3 count=1 index=510
> [ 1496.416385] fuse: 000000003fdec504 inode=00000000be0f29d3 count=1 index=509
> ...
>
> Logging in do_sync_mmap_readahead()
>
> [ 1493.130764] do_sync_mmap_readahead:3015 ino=132 index=0 count=0 ras_start=0 ras_size=0 ras_async=0 ras_ra_pages=64 ras_mmap_miss=0 ras_prev_pos=-1
> [ 1493.147173] do_sync_mmap_readahead:3015 ino=132 index=255 count=0 ras_start=0 ras_size=64 ras_async=32 ras_ra_pages=64 ras_mmap_miss=0 ras_prev_pos=-1
> [ 1493.165952] do_sync_mmap_readahead:3015 ino=132 index=254 count=0 ras_start=0 ras_size=64 ras_async=32 ras_ra_pages=64 ras_mmap_miss=0 ras_prev_pos=-1
> [ 1493.185566] do_sync_mmap_readahead:3015 ino=132 index=253 count=0 ras_start=0 ras_size=64 ras_async=32 ras_ra_pages=64 ras_mmap_miss=0 ras_prev_pos=-1
> ...
> [ 1496.341890] do_sync_mmap_readahead:3015 ino=132 index=64 count=0 ras_start=0 ras_size=64 ras_async=32 ras_ra_pages=64 ras_mmap_miss=0 ras_prev_pos=-1
> [ 1496.361385] do_sync_mmap_readahead:3015 ino=132 index=511 count=0 ras_start=96 ras_size=64 ras_async=64 ras_ra_pages=64 ras_mmap_miss=0 ras_prev_pos=-1
>
>
> So we can see from fuse that it starts to read at page index 0, wants
> 64 pages (which is actually double the bdi read_ahead_kb), then
> skips indices 64...254 and immediately goes to index 255. For the
> mmapped memcpy, pages are then missing, and it goes back in 1-page
> steps to get them.
>
> A workaround here is to set read_ahead_kb in the bdi to a larger
> value; another workaround might be (untested) to increase the read-ahead
> window. Either of these seems to be a workaround for the index order
> above.
>
> I understand that read-ahead gets limited by the bdi value (although
> exceeded above), but why does it go back in 1-page steps? My expectation
> would have been
>
> index=0  count=32 (128kb read-ahead)
> index=32 count=32
> index=64 count=32
> ...
>
>
> This is with plain 6.2 + fuse-uring patches.
>
> Thanks,
> Bernd
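
The access pattern described for the fio mmap engine (map the file, hint sequential access, copy from offset 0 in 1MB steps) can be approximated without fio. The following is a minimal sketch, not fio's implementation: the names `sequential_mmap_read` and `BLOCK_SIZE` are hypothetical, Python's `mmap.madvise(MADV_SEQUENTIAL)` stands in for `POSIX_MADV_SEQUENTIAL`, and whether it triggers the same read-ahead order as in the logs above has not been verified.

```python
import mmap
import os

BLOCK_SIZE = 1 << 20  # 1 MiB, matching bs=1M in the fio job file


def sequential_mmap_read(path, block_size=BLOCK_SIZE):
    """Map `path` read-only and copy it out in block_size steps from offset 0.

    Returns the number of bytes copied.
    """
    total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Hint sequential access where the platform exposes it
            # (assumption: this is comparable to POSIX_MADV_SEQUENTIAL).
            if hasattr(mm, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                mm.madvise(mmap.MADV_SEQUENTIAL)
            for offset in range(0, size, block_size):
                # Slicing copies the bytes out of the mapping, which
                # faults in the underlying pages.
                chunk = mm[offset:offset + block_size]
                total += len(chunk)
    return total
```

Calling `sequential_mmap_read()` on a file below the mount point under test, with a cold page cache, should produce page faults in the same block order; the kernel-side logging would still be needed to see the resulting read-ahead requests.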
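
The page arithmetic behind the logged indices can be sketched as follows. The 4 KiB page size is an assumption, though it is consistent with "count=32 (128kb read-ahead)" in the mail; all function names are hypothetical helpers for illustration only.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def kb_to_pages(kb, page_size=PAGE_SIZE):
    """Convert a size in KiB (as in read_ahead_kb) to a page count."""
    return kb * 1024 // page_size


def block_page_range(block_no, block_size=1 << 20, page_size=PAGE_SIZE):
    """Return (first, last) page index covered by the block_no-th block."""
    pages = block_size // page_size
    first = block_no * pages
    return first, first + pages - 1


def expected_requests(total_pages, window):
    """Back-to-back (index, count) requests, as the mail expected them."""
    return [(index, min(window, total_pages - index))
            for index in range(0, total_pages, window)]
```

Under that assumption, `block_page_range(0)` is `(0, 255)` and `block_page_range(1)` is `(256, 511)`, so the indices 255 and 511 in the logs coincide with the last page of the first and second 1MB block. `expected_requests(96, kb_to_pages(128))` gives `[(0, 32), (32, 32), (64, 32)]`, the sequence listed as the expectation.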