From: "Bhatnagar, Rishabh"
Subject: EXT4 IOPS degradation between 4.14 and 5.10
Date: Wed, 25 Jan 2023 16:33:54 -0800
Message-ID: <053b60a6-133e-5d59-0732-464d5160772a@amazon.com>
To: Jan Kara

Hi Jan

As discussed in the previous thread I'm chasing IOPS regression between 4.14 -> 5.10 kernels.
https://lore.kernel.org/lkml/20230112113820.hjwvieq3ucbwreql@quack3/T/

The last issue we discussed was difficult to resolve, so I'm keeping it on the
back burner for now.

I did some more bisecting and found another patch series that potentially
impacts the IOPS score:
72b045aecdd856b083521f2a963705b4c2e59680 (mm: implement find_get_pages_range_tag())

Running fio tests with tip at 9c19a9cb1642c074aa8bc7693cd4c038643960ae
(including the 16-patch series) vs. tip at
6b4c54e3787bc03e810062bd257a3b05fd9c72d6 (without the above series) shows an
IOPS jump.

Fio with buffered io/fsync=1/randwrite

With HEAD as 9c19a9cb1642c074aa8bc7693cd4c038643960ae (with the above series)

write: io=445360KB, bw=7418.6KB/s, *iops=463*, runt= 60033msec
clat (usec): min=4, max=32132, avg=311.90, stdev=1812.74
lat (usec): min=5, max=32132, avg=312.28, stdev=1812.74
clat percentiles (usec):
| 1.00th=[ 8], 5.00th=[ 10], 10.00th=[ 16], 20.00th=[ 25],
| 30.00th=[ 36], 40.00th=[ 47], 50.00th=[ 60], 60.00th=[ 71],
| 70.00th=[ 84], 80.00th=[ 97], 90.00th=[ 111], 95.00th=[ 118],
| 99.00th=[11840], 99.50th=[15936], 99.90th=[21888], 99.95th=[23936],

With HEAD as 6b4c54e3787bc03e810062bd257a3b05fd9c72d6 (without the above series)

write: io=455184KB, bw=7583.4KB/s, *iops=473*, runt= 60024msec
clat (usec): min=6, max=24325, avg=319.72, stdev=1694.52
lat (usec): min=6, max=24326, avg=319.99, stdev=1694.53
clat percentiles (usec):
| 1.00th=[ 9], 5.00th=[ 11], 10.00th=[ 17], 20.00th=[ 26],
| 30.00th=[ 38], 40.00th=[ 50], 50.00th=[ 60], 60.00th=[ 73],
| 70.00th=[ 85], 80.00th=[ 98], 90.00th=[ 111], 95.00th=[ 118],
| 99.00th=[ 9792], 99.50th=[14016], 99.90th=[21888], 99.95th=[22400],
| 99.99th=[24192]

I also see that the number of handles per transaction was much higher before
this patch series:

0ms waiting for transaction
0ms request delay
20ms running transaction
0ms transaction was being locked
0ms flushing data (in ordered mode)
10ms logging transaction
*13524us average transaction commit time*
*73 handles per transaction*
0 blocks per transaction
1 logged blocks per transaction

vs. after the patch series:

0ms waiting for transaction
0ms request delay
20ms running transaction
0ms transaction was being locked
0ms flushing data (in ordered mode)
20ms logging transaction
*21468us average transaction commit time*
*66 handles per transaction*
1 blocks per transaction
1 logged blocks per transaction

This is probably again helping to bunch the writeback transactions and
increase throughput.

I looked at the code to understand what might be going on. It seems like
commit 72b045aecdd856b083521f2a963705b4c2e59680 changes the behavior of
find_get_pages_range_tag. Before this commit, if find_get_pages_tag cannot
find nr_pages (PAGEVEC_SIZE) pages, it returns the number of pages found as
ret and sets *index to the last page it found + 1. After the commit, the
behavior changes such that if we don’t find nr_pages pages, we set the index
to end and not to the last found page. (I've added the diff from the above
commit below.)

Since pagevec_lookup_range_tag is always called in a while loop
(index <= end), the code before the commit helps in coalescing writeback of
pages when multiple threads are writing, as it might keep finding new dirty
(tagged) pages since it doesn’t set index to end.

+	/*
+	 * We come here when we got at @end. We take care to not overflow the
+	 * index @index as it confuses some of the callers. This breaks the
+	 * iteration when there is page at index -1 but that is already broken
+	 * anyway.
+	 */
+	if (end == (pgoff_t)-1)
+		*index = (pgoff_t)-1;
+	else
+		*index = end + 1;
+out:
 	rcu_read_unlock();

-	if (ret)
-		*index = pages[ret - 1]->index + 1;
-

I didn't see any mention of this functional change in the description of the
patch. Was this change intentional, and did it help some use case or bring a
general performance improvement?

Thanks
Rishabh
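
To make the index-handling difference described above concrete, here is a minimal user-space sketch in C. It is a toy model, not kernel code: toy_mapping, toy_init, toy_lookup and toy_writeback are hypothetical names, BATCH merely stands in for PAGEVEC_SIZE, and the "concurrent writer" is simulated by dirtying one page right after the last page written in each batch. All of these are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 64   /* size of the toy "file" in pages (assumption) */
#define BATCH  15   /* stands in for PAGEVEC_SIZE (assumption) */

/* Toy stand-in for the page cache: dirty[i] means page i is tagged dirty. */
struct toy_mapping {
	bool dirty[NPAGES];
};

/* Mark the first ndirty pages dirty and the rest clean. */
static void toy_init(struct toy_mapping *m, size_t ndirty)
{
	assert(ndirty <= NPAGES);
	for (size_t i = 0; i < NPAGES; i++)
		m->dirty[i] = i < ndirty;
}

/*
 * Collect up to nr dirty page indices from [*index, end] into out[].
 * If nr pages are found, *index becomes last found + 1 in both variants.
 * Otherwise: advance_to_end == true models the post-commit behaviour
 * (*index = end + 1); false models the pre-commit behaviour
 * (*index = last found + 1, or unchanged if nothing was found).
 */
static unsigned toy_lookup(const struct toy_mapping *m, size_t *index,
			   size_t end, unsigned nr, size_t *out,
			   bool advance_to_end)
{
	unsigned ret = 0;

	for (size_t i = *index; i <= end && i < NPAGES; i++) {
		if (!m->dirty[i])
			continue;
		out[ret++] = i;
		if (ret == nr) {
			*index = i + 1;
			return ret;
		}
	}
	if (advance_to_end)
		*index = end + 1;
	else if (ret)
		*index = out[ret - 1] + 1;
	return ret;
}

/*
 * One writeback pass over [0, end], shaped like the while (index <= end)
 * loop mentioned above. After each batch, a simulated concurrent writer
 * dirties the page following the last one written, at most extra times.
 * Returns how many pages this single pass wrote.
 */
static unsigned toy_writeback(struct toy_mapping *m, size_t end,
			      bool advance_to_end, unsigned extra)
{
	size_t index = 0, batch[BATCH];
	unsigned written = 0;

	while (index <= end) {
		unsigned nr = toy_lookup(m, &index, end, BATCH, batch,
					 advance_to_end);
		if (nr == 0)
			break;
		for (unsigned i = 0; i < nr; i++) {
			m->dirty[batch[i]] = false;	/* "write" the page */
			written++;
		}
		size_t next = batch[nr - 1] + 1;
		if (extra > 0 && next < NPAGES) {
			m->dirty[next] = true;		/* simulated writer */
			extra--;
		}
	}
	return written;
}
```

With five initially dirty pages (toy_init(&m, 5)) and extra = 3, calling toy_writeback(&m, NPAGES - 1, false, 3) writes 8 pages in one pass, because each lookup resumes right after the last page found and picks up the newly dirtied one. The same call with advance_to_end = true writes 5 pages and leaves page 5 dirty for a later pass. Whether this toy effect accounts for the measured IOPS and handles-per-transaction difference is exactly the open question in this mail.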