From: "Huang, Ying"
To: Yosry Ahmed
Cc: Minchan Kim, Chris Li, Michal Hocko, Liu Shixin, Yu Zhao,
 Andrew Morton, Sachin Sant, Johannes Weiner, Kefeng Wang,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10] mm: vmscan: try to reclaim swapcache pages if no swap space
In-Reply-To: (Yosry Ahmed's message of "Mon, 27 Nov 2023 19:27:36 -0800")
References: <87msv58068.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87h6l77wl5.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87bkbf7gz6.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87msuy5zuv.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Tue, 28 Nov 2023 12:03:49 +0800
Message-ID: <87fs0q5xsq.fsf@yhuang6-desk2.ccr.corp.intel.com>

Yosry Ahmed writes:

> On Mon, Nov 27, 2023 at 7:21 PM Huang, Ying wrote:
>>
>> Yosry Ahmed writes:
>>
>> > On Mon, Nov 27, 2023 at 1:32 PM Minchan Kim wrote:
>> >>
>> >> On Mon, Nov 27, 2023 at 12:22:59AM -0800, Chris Li wrote:
>> >> > On Mon, Nov 27, 2023 at 12:14 AM Huang, Ying wrote:
>> >> > > > I agree with Ying that anonymous pages typically have different page
>> >> > > > access patterns than file pages, so we might want to treat them
>> >> > > > differently to reclaim them effectively.
>> >> > > > One random idea:
>> >> > > > How about we put the anonymous page in a swap cache in a different LRU
>> >> > > > than the rest of the anonymous pages. Then shrinking against those
>> >> > > > pages in the swap cache would be more effective.Instead of having
>> >> > > > [anon, file] LRU, now we have [anon not in swap cache, anon in swap
>> >> > > > cache, file] LRU
>> >> > >
>> >> > > I don't think that it is necessary. The patch is only for a special use
>> >> > > case. Where the swap device is used up while some pages are in swap
>> >> > > cache. The patch will kill performance, but it is used to avoid OOM
>> >> > > only, not to improve performance. Per my understanding, we will not use
>> >> > > up swap device space in most cases. This may be true for ZRAM, but will
>> >> > > we keep pages in swap cache for long when we use ZRAM?
>> >> >
>> >> > I ask the question regarding how many pages can be freed by this patch
>> >> > in this email thread as well, but haven't got the answer from the
>> >> > author yet. That is one important aspect to evaluate how valuable is
>> >> > that patch.
>> >>
>> >> Exactly. Since swap cache has different life time with page cache, they
>> >> would be usually dropped when pages are unmapped(unless they are shared
>> >> with others but anon is usually exclusive private) so I wonder how much
>> >> memory we can save.
>> >
>> > I think the point of this patch is not saving memory, but rather
>> > avoiding an OOM condition that will happen if we have no swap space
>> > left, but some pages left in the swap cache. Of course, the OOM
>> > avoidance will come at the cost of extra work in reclaim to swap those
>> > pages out.
>> >
>> > The only case where I think this might be harmful is if there's plenty
>> > of pages to reclaim on the file LRU, and instead we opt to chase down
>> > the few swap cache pages. So perhaps we can add a check to only set
>> > sc->swapcache_only if the number of pages in the swap cache is more
>> > than the number of pages on the file LRU or similar? Just make sure we
>> > don't chase the swapcache pages down if there's plenty to scan on the
>> > file LRU?
>>
>> The swap cache pages can be divided to 3 groups.
>>
>> - group 1: pages have been written out, at the tail of inactive LRU, but
>>   not reclaimed yet.
>>
>> - group 2: pages have been written out, but were failed to be reclaimed
>>   (e.g., were accessed before reclaiming)
>>
>> - group 3: pages have been swapped in, but were kept in swap cache. The
>>   pages may be in active LRU.
>>
>> The main target of the original patch should be group 1. And the pages
>> may be cheaper to reclaim than file pages.
>>
>> Group 2 are hard to be reclaimed if swap_count() isn't 0.
>>
>> Group 3 should be reclaimed in theory, but the overhead may be high.
>> And we may need to reclaim the swap entries instead of pages if the pages
>> are hot. But we can start to reclaim the swap entries before the swap
>> space is run out.
>>
>> So, if we can count group 1, we may use that as indicator to scan anon
>> pages. And we may add code to reclaim group 3 earlier.
>>
>
> My point was not that reclaiming the pages in the swap cache is more
> expensive that reclaiming the pages in the file LRU. In a lot of
> cases, as you point out, the pages in the swap cache can just be
> dropped, so they may be as cheap or cheaper to reclaim than the pages
> in the file LRU.
>
> My point was that scanning the anon LRU when swap space is exhausted
> to get to the pages in the swap cache may be much more expensive,
> because there may be a lot of pages on the anon LRU that are not in
> the swap cache, and hence are not reclaimable, unlike pages in the
> file LRU, which should mostly be reclaimable.
>
> So what I am saying is that maybe we should not do the effort of
> scanning the anon LRU in the swapcache_only case unless there aren't a
> lot of pages to reclaim on the file LRU (relatively). For example, if
> we have a 100 pages in the swap cache out of 10000 pages in the anon
> LRU, and there are 10000 pages in the file LRU, it's probably not
> worth scanning the anon LRU.

Group 1 pages are at the tail of the anon inactive LRU, so the scan
overhead is low too. For example, if the number of group 1 pages is
100, we just need to scan 100 pages to reclaim them. We can choose to
stop scanning when the number of non-group-1 pages reaches some
threshold.

--
Best Regards,
Huang, Ying
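
The bounded tail scan described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not
kernel code: the Page fields, is_group1(), scan_tail() and the
max_misses threshold are all hypothetical names chosen for the example,
and the anon inactive LRU is modelled as a plain list whose last
element is the tail.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Page:
    """Toy stand-in for a page on the anon inactive LRU (hypothetical model)."""
    in_swap_cache: bool  # page still has a swap cache entry
    written_out: bool    # contents already written to the swap device
    referenced: bool     # accessed again after being written out


def is_group1(page: Page) -> bool:
    # "Group 1" in the thread: written out, in swap cache, not yet reclaimed.
    return page.in_swap_cache and page.written_out and not page.referenced


def scan_tail(lru: List[Page], max_misses: int) -> Tuple[int, int]:
    """Scan from the tail; stop once max_misses non-group-1 pages were seen.

    Returns (reclaimed, scanned). Reclaimed pages are removed from lru,
    skipped pages are put back in their original order.
    """
    reclaimed = scanned = misses = 0
    kept: List[Page] = []
    while lru and misses < max_misses:
        page = lru.pop()      # take the page at the tail
        scanned += 1
        if is_group1(page):
            reclaimed += 1    # clean swap cache page, can simply be dropped
        else:
            misses += 1
            kept.append(page)  # not reclaimable without free swap space
    lru.extend(reversed(kept))
    return reclaimed, scanned


# 9900 anon pages that are not in the swap cache, then 100 group 1 pages
# at the tail, as in the example numbers used in the thread.
lru = [Page(False, False, False)] * 9900 + [Page(True, True, False)] * 100
reclaimed, scanned = scan_tail(lru, max_misses=32)
print(reclaimed, scanned)
```

In this model the scan cost is the number of group 1 pages plus the
threshold, independent of how many unreclaimable pages sit further up
the list. The value 32 is arbitrary and only there to make the example
concrete.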