From: Kalesh Singh
Date: Tue, 25 Feb 2025 16:49:14 -0800
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Optimizing Page Cache Readahead Behavior
To: Jan Kara
Cc: Lorenzo Stoakes, lsf-pc@lists.linux-foundation.org,
 "open list:MEMORY MANAGEMENT", linux-fsdevel, Suren Baghdasaryan,
 David Hildenbrand, "Liam R. Howlett", Juan Yescas, android-mm,
 Matthew Wilcox, Vlastimil Babka, Michal Hocko, "Cc: Android Kernel"

On Tue, Feb 25, 2025 at 8:36 AM Jan Kara wrote:
>
> On Mon 24-02-25 13:36:50, Kalesh Singh wrote:
> > On Mon, Feb 24, 2025 at 8:52 AM Lorenzo Stoakes wrote:
> > > > > > OK, I agree the
> > > > > > behavior you describe exists. But do you have some
> > > > > > real-world numbers showing its extent? I'm not looking for some
> > > > > > artificial numbers - sure bad cases can be constructed - but how
> > > > > > big a practical problem is this? If you can show that the average
> > > > > > Android phone has 10% of these useless pages in memory then that's
> > > > > > one thing and we should be looking for some general solution. If
> > > > > > it is more like 0.1%, then why bother?
> > > > > >
> >
> > Once I revert a workaround that we currently have to avoid
> > fault-around for these regions (we don't have an out-of-tree solution
> > to prevent the page cache population), our CI, which checks memory
> > usage after performing some common app user journeys, reports
> > regressions as shown in the snippet below. Note that the increases
> > here are only for the populated PTEs (bounded by VMA), so the actual
> > pollution is theoretically larger.
> >
> > Metric: perfetto_media.extractor#file-rss-avg
> > Increased by 7.495 MB (32.7%)
> >
> > Metric: perfetto_/system/bin/audioserver#file-rss-avg
> > Increased by 6.262 MB (29.8%)
> >
> > Metric: perfetto_/system/bin/mediaserver#file-rss-max
> > Increased by 8.325 MB (28.0%)
> >
> > Metric: perfetto_/system/bin/mediaserver#file-rss-avg
> > Increased by 8.198 MB (28.4%)
> >
> > Metric: perfetto_media.extractor#file-rss-max
> > Increased by 7.95 MB (33.6%)
> >
> > Metric: perfetto_/system/bin/incidentd#file-rss-avg
> > Increased by 0.896 MB (20.4%)
> >
> > Metric: perfetto_/system/bin/audioserver#file-rss-max
> > Increased by 6.883 MB (31.9%)
> >
> > Metric: perfetto_media.swcodec#file-rss-max
> > Increased by 7.236 MB (34.9%)
> >
> > Metric: perfetto_/system/bin/incidentd#file-rss-max
> > Increased by 1.003 MB (22.7%)
> >
> > Metric: perfetto_/system/bin/cameraserver#file-rss-avg
> > Increased by 6.946 MB (34.2%)
> >
> > Metric: perfetto_/system/bin/cameraserver#file-rss-max
> > Increased by 7.205 MB (33.8%)
> >
> > Metric: perfetto_com.android.nfc#file-rss-max
> > Increased by 8.525 MB (9.8%)
> >
> > Metric: perfetto_/system/bin/surfaceflinger#file-rss-avg
> > Increased by 3.715 MB (3.6%)
> >
> > Metric: perfetto_media.swcodec#file-rss-avg
> > Increased by 5.096 MB (27.1%)
> >
> > [...]
> >
> > The issue is widespread across processes because, in order to support
> > larger page sizes, Android requires that ELF segments be at least
> > 16KB aligned, which leads to the padding regions (never accessed).
>
> Thanks for the numbers! It's much more than I'd expect. So you apparently
> have a lot of relatively small segments?

Hi Jan,

Yeah, you are right, the segments can be relatively small. I took one
app on my device as an example:

adb shell 'cat /proc/$(pidof com.google.android.youtube)/maps' | grep '.so$' | tee youtube_so_segments.txt

cat youtube_so_segments.txt | ./total_mapped_size.sh
Total mapping length: 147980288 bytes

cat youtube_so_segments.txt | wc -l
1148

147980288/1148/1024 = 125.88 KB

Let's say very roughly that on average it's 128KB per segment; the
padding region can be anywhere from 0 to 60KB of that.

--Kalesh

>
> > Another possible way we can look at this: in the regressions shared
> > above, caused by the ELF padding regions, we are able to make these
> > regions sparse (for *almost* all cases), so solving the shared-zero
> > page problem for file mappings would also eliminate much of this
> > overhead. So perhaps we should tackle this angle, if that's a more
> > tangible solution?
> >
> > From the previous discussions that Matthew shared [7], it seems like
> > Dave proposed an alternative to moving the extents to the VFS layer to
> > invert the IO read path operations [8]. Maybe this is a more
> > approachable solution since there is precedent for the same in the
> > write path?
>
> Yeah, so I certainly wouldn't be opposed to this. What Dave suggests makes
> a lot of sense. In principle we did something similar for DAX. But it
> won't be a trivial change so details matter...
>
> 								Honza
> --
> Jan Kara
> SUSE Labs, CR
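
For anyone wanting to reproduce the per-segment average computed in this message: the `total_mapped_size.sh` script was not posted, so the following is only a guess at what such a helper might look like, not the script actually used. It assumes bash and awk are available, and that the input is in the usual `/proc/<pid>/maps` format, where the first field of each line is a `start-end` address range in hex. The function name, the extra "Mappings" and "Average" output lines, and the sample mappings are all made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for total_mapped_size.sh (the original was not posted).
# Reads /proc/<pid>/maps lines on stdin and sums the length of every mapping.

total_mapped_size() {
    local total=0 count=0 range rest start end
    while read -r range rest || [[ -n $range ]]; do
        # First field looks like "7f0000000000-7f0000004000"; skip anything else.
        [[ $range =~ ^([0-9a-fA-F]+)-([0-9a-fA-F]+)$ ]] || continue
        start=$((16#${BASH_REMATCH[1]}))
        end=$((16#${BASH_REMATCH[2]}))
        total=$((total + end - start))
        count=$((count + 1))
    done
    echo "Total mapping length: $total bytes"
    echo "Mappings: $count"
    if (( count > 0 )); then
        # Same arithmetic as in the message: bytes / mappings / 1024.
        awk -v t="$total" -v c="$count" \
            'BEGIN { printf "Average per mapping: %.2f KB\n", t / c / 1024 }'
    fi
}

# Example with two made-up mappings (16 KiB and 64 KiB):
total_mapped_size <<'EOF'
7f0000000000-7f0000004000 r-xp 00000000 fe:00 1234 /system/lib64/libfoo.so
7f0000004000-7f0000014000 r--p 00004000 fe:00 1234 /system/lib64/libfoo.so
EOF
# Total mapping length: 81920 bytes
# Mappings: 2
# Average per mapping: 40.00 KB
```

Fed the 1148 lines of `youtube_so_segments.txt`, a helper like this should report the same 147980288 bytes and the 125.88 KB average, provided it counts mappings the same way the original script did. It measures mapping length only; it says nothing about how much of each mapping is padding.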