From: "Huang, Ying"
To: Yosry Ahmed
Cc: Minchan Kim, Chris Li, Michal Hocko, Liu Shixin, Yu Zhao,
    Andrew Morton, Sachin Sant, Johannes Weiner, Kefeng Wang,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10] mm: vmscan: try to reclaim swapcache pages if no swap space
In-Reply-To: (Yosry Ahmed's message of "Mon, 27 Nov 2023 13:56:26 -0800")
References: <87msv58068.fsf@yhuang6-desk2.ccr.corp.intel.com>
    <87h6l77wl5.fsf@yhuang6-desk2.ccr.corp.intel.com>
    <87bkbf7gz6.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Tue, 28 Nov 2023 11:19:20 +0800
Message-ID: <87msuy5zuv.fsf@yhuang6-desk2.ccr.corp.intel.com>

Yosry Ahmed writes:

> On Mon, Nov 27, 2023 at 1:32 PM Minchan Kim wrote:
>>
>> On Mon, Nov 27, 2023 at 12:22:59AM -0800, Chris Li wrote:
>> > On Mon, Nov 27, 2023 at 12:14 AM Huang, Ying wrote:
>> > > > I agree with Ying that anonymous pages typically have different page
>> > > > access patterns than file pages, so we might want to treat them
>> > > > differently to reclaim them effectively.
>> > > > One random idea:
>> > > > How about we put the anonymous page in a swap cache in a different LRU
>> > > > than the rest of the anonymous pages. Then shrinking against those
>> > > > pages in the swap cache would be more effective. Instead of having
>> > > > [anon, file] LRU, now we have [anon not in swap cache, anon in swap
>> > > > cache, file] LRU
>> > >
>> > > I don't think that it is necessary.  The patch is only for a special use
>> > > case.  Where the swap device is used up while some pages are in swap
>> > > cache.  The patch will kill performance, but it is used to avoid OOM
>> > > only, not to improve performance.  Per my understanding, we will not use
>> > > up swap device space in most cases.  This may be true for ZRAM, but will
>> > > we keep pages in swap cache for long when we use ZRAM?
>> >
>> > I ask the question regarding how many pages can be freed by this patch
>> > in this email thread as well, but haven't got the answer from the
>> > author yet. That is one important aspect to evaluate how valuable is
>> > that patch.
>>
>> Exactly. Since swap cache has different life time with page cache, they
>> would be usually dropped when pages are unmapped(unless they are shared
>> with others but anon is usually exclusive private) so I wonder how much
>> memory we can save.
>
> I think the point of this patch is not saving memory, but rather
> avoiding an OOM condition that will happen if we have no swap space
> left, but some pages left in the swap cache. Of course, the OOM
> avoidance will come at the cost of extra work in reclaim to swap those
> pages out.
>
> The only case where I think this might be harmful is if there's plenty
> of pages to reclaim on the file LRU, and instead we opt to chase down
> the few swap cache pages. So perhaps we can add a check to only set
> sc->swapcache_only if the number of pages in the swap cache is more
> than the number of pages on the file LRU or similar?
> Just make sure we don't chase the swapcache pages down if there's plenty to
> scan on the file LRU?

The swap cache pages can be divided into three groups.

- group 1: pages that have been written out and are at the tail of the
  inactive LRU, but have not been reclaimed yet.

- group 2: pages that have been written out, but failed to be reclaimed
  (e.g., were accessed before reclaiming).

- group 3: pages that have been swapped in, but were kept in the swap
  cache.  These pages may be in the active LRU.

The main target of the original patch should be group 1.  These pages
may be cheaper to reclaim than file pages.

Group 2 pages are hard to reclaim if swap_count() isn't 0.

Group 3 pages should be reclaimed in theory, but the overhead may be
high.  And we may need to reclaim the swap entries instead of the pages
if the pages are hot.  But we can start to reclaim the swap entries
before the swap space runs out.

So, if we can count group 1, we may use that as an indicator to scan
anon pages.  And we may add code to reclaim group 3 earlier.

>> > Regarding running out of swap space. That is a good point, in server
>> > workload we don't typically run out of swap device space anyway.

Thinking about this again: in a server workload, if we set a swap usage
limit for a memcg, we may hit that limit.  Is it common for a server
workload to run out of the memcg's swap usage limit?

>> > Android uses ZRAM, the story might be different. Adding Minchan here.
>>
>> Swap is usually almost full in Android since it compacts(i.e., swapout)
>> background apps aggressively.

If my understanding is correct, because ZRAM has SWP_SYNCHRONOUS_IO set,
anonymous pages will only be put in the swap cache temporarily during
swap out.  So, the remaining swap cache pages in the anon LRU should not
be a problem for ZRAM.

--
Best Regards,
Huang, Ying
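
For illustration only, the grouping and the "count group 1" indicator
discussed above could be modelled roughly as below.  This is a
user-space sketch, not kernel code and not part of the patch:
struct model_page, classify(), count_group1() and
should_scan_swapcache() are hypothetical names, and comparing the
group 1 count against the file LRU size is just one assumed way to
combine the two suggestions in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a swap cache page; not a kernel structure. */
struct model_page {
	bool written_out;	/* contents already written to swap */
	bool accessed;		/* referenced again after writeback */
	bool swapped_in;	/* swapped in, but kept in swap cache */
};

enum swapcache_group { GROUP_1 = 1, GROUP_2, GROUP_3 };

/*
 * Map a page to one of the three groups described in the mail.
 * Assumption: anything that is neither group 3 nor group 1 is
 * treated as group 2 in this simplified model.
 */
static enum swapcache_group classify(const struct model_page *p)
{
	if (p->swapped_in)
		return GROUP_3;
	if (p->written_out && !p->accessed)
		return GROUP_1;
	return GROUP_2;
}

/* Count the pages that are expected to be cheap to reclaim (group 1). */
static size_t count_group1(const struct model_page *pages, size_t n)
{
	size_t i, count = 0;

	assert(pages != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (classify(&pages[i]) == GROUP_1)
			count++;
	return count;
}

/*
 * Assumed policy: chase swap cache pages only when swap is full and
 * there are more group 1 pages than pages on the file LRU.
 */
static bool should_scan_swapcache(bool swap_full, size_t group1,
				  size_t file_lru_pages)
{
	return swap_full && group1 > file_lru_pages;
}
```

With such a model, a caller would compute count_group1() over the swap
cache pages and pass the result to should_scan_swapcache() together
with the file LRU size.  Whether group 1 can be counted cheaply in the
real reclaim path is the open question raised above.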