From: "Huang, Ying"
To: Yosry Ahmed
Cc: Liu Shixin, Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, wangkefeng.wang@huawei.com, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2] mm: vmscan: reclaim anon pages if there are swapcache pages
References: <20230822024901.2412520-1-liushixin2@huawei.com>
Date: Thu, 24 Aug 2023 16:48:53 +0800
In-Reply-To: (Yosry Ahmed's message of "Tue, 22 Aug 2023 09:35:44 -0700")
Message-ID: <87wmxk6d1m.fsf@yhuang6-desk2.ccr.corp.intel.com>

Yosry Ahmed writes:

> On Mon, Aug 21, 2023 at 6:54 PM Liu Shixin wrote:
>>
>> When spaces of swap devices are exhausted, only file pages can be reclaimed.
>> But there are still some swapcache pages in anon lru list. This can lead
>> to a premature out-of-memory.
>>
>> This problem can be fixed by checking number of swapcache pages in
>> can_reclaim_anon_pages(). For memcg v2, there are swapcache stat that can
>> be used directly. For memcg v1, use total_swapcache_pages() instead, which
>> may not accurate but can solve the problem.
>
> Interesting find. I wonder if we really don't have any handling of
> this situation.
>
>>
>> Signed-off-by: Liu Shixin
>> ---
>>  include/linux/swap.h |  6 ++++++
>>  mm/memcontrol.c      |  8 ++++++++
>>  mm/vmscan.c          | 12 ++++++++----
>>  3 files changed, 22 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index 456546443f1f..0318e918bfa4 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -669,6 +669,7 @@ static inline void mem_cgroup_uncharge_swap(swp_entry_t entry, unsigned int nr_p
>>  }
>>
>>  extern long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg);
>> +extern long mem_cgroup_get_nr_swapcache_pages(struct mem_cgroup *memcg);
>>  extern bool mem_cgroup_swap_full(struct folio *folio);
>>  #else
>>  static inline void mem_cgroup_swapout(struct folio *folio, swp_entry_t entry)
>> @@ -691,6 +692,11 @@ static inline long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg)
>>  	return get_nr_swap_pages();
>>  }
>>
>> +static inline long mem_cgroup_get_nr_swapcache_pages(struct mem_cgroup *memcg)
>> +{
>> +	return total_swapcache_pages();
>> +}
>> +
>>  static inline bool mem_cgroup_swap_full(struct folio *folio)
>>  {
>>  	return vm_swap_full();
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index e8ca4bdcb03c..3e578f41023e 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -7567,6 +7567,14 @@ long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg)
>>  	return nr_swap_pages;
>>  }
>>
>> +long mem_cgroup_get_nr_swapcache_pages(struct mem_cgroup *memcg)
>> +{
>> +	if (mem_cgroup_disabled() || do_memsw_account())
>> +		return total_swapcache_pages();
>> +
>> +	return memcg_page_state(memcg, NR_SWAPCACHE);
>> +}
>
> Is there a reason why we cannot use NR_SWAPCACHE for cgroup v1? Isn't
> that being maintained regardless of cgroup version? It is not exposed
> in cgroup v1's memory.stat, but I don't think there is a reason we
> can't do that -- if only to document that it is being used with cgroup
> v1.
>
>
>> +
>>  bool mem_cgroup_swap_full(struct folio *folio)
>>  {
>>  	struct mem_cgroup *memcg;
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 7c33c5b653ef..bcb6279cbae7 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -609,13 +609,17 @@ static inline bool can_reclaim_anon_pages(struct mem_cgroup *memcg,
>>  	if (memcg == NULL) {
>>  		/*
>>  		 * For non-memcg reclaim, is there
>> -		 * space in any swap device?
>> +		 * space in any swap device or swapcache pages?
>>  		 */
>> -		if (get_nr_swap_pages() > 0)
>> +		if (get_nr_swap_pages() + total_swapcache_pages() > 0)
>>  			return true;
>>  	} else {
>> -		/* Is the memcg below its swap limit? */
>> -		if (mem_cgroup_get_nr_swap_pages(memcg) > 0)
>> +		/*
>> +		 * Is the memcg below its swap limit or is there swapcache
>> +		 * pages can be freed?
>> +		 */
>> +		if (mem_cgroup_get_nr_swap_pages(memcg) +
>> +		    mem_cgroup_get_nr_swapcache_pages(memcg) > 0)
>>  			return true;
>>  	}
>
> I wonder if it would be more efficient to set a bit in struct
> scan_control if we only are out of swap spaces but have swap cache
> pages, and only isolate anon pages that are in the swap cache, instead
> of isolating random anon pages. We may end up isolating pages that are
> not in the swap cache for a few iterations and wasting cycles.

Scanning the swap cache directly will make the code more complex. IIUC,
the swap device is unlikely to be used up. If so, I prefer the simpler
implementation in this series.

--
Best Regards,
Huang, Ying
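
For readers following the thread, the behavioural change in the swap-related
check of can_reclaim_anon_pages() can be modelled in userspace roughly as
below. This is only a sketch, not kernel code: struct swap_state and both
helper functions are hypothetical names made up for illustration, and the
two fields merely stand in for the values the quoted patch reads through
get_nr_swap_pages()/mem_cgroup_get_nr_swap_pages() and
total_swapcache_pages()/mem_cgroup_get_nr_swapcache_pages(). Anything else
the real function considers is left out.

```c
#include <stdbool.h>

/*
 * Userspace model only. The fields are hypothetical stand-ins for the
 * counters the kernel reads in the quoted patch; they are not kernel types.
 */
struct swap_state {
	long nr_free_swap;	/* swap slots still available */
	long nr_swapcache;	/* anon pages currently in the swap cache */
};

/* Check before the patch: anon reclaim requires free swap space. */
static bool can_reclaim_anon_old(const struct swap_state *s)
{
	return s->nr_free_swap > 0;
}

/* Check after the patch: swapcache pages also make anon reclaim worthwhile. */
static bool can_reclaim_anon_new(const struct swap_state *s)
{
	return s->nr_free_swap + s->nr_swapcache > 0;
}
```

With nr_free_swap == 0 and nr_swapcache > 0, the old check reports that anon
pages cannot be reclaimed while the new one reports that they can, which is
the case the patch description ties to a premature out-of-memory.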