From: Yu Zhao <yuzhao@google.com>
Date: Tue, 14 Nov 2023 22:29:31 -0700
Subject: Re: [PATCH v9] mm: vmscan: try to reclaim swapcache pages if no swap space
To: Liu Shixin <liushixin2@huawei.com>
Cc: Andrew Morton, Yosry Ahmed, Huang Ying, Sachin Sant, Michal Hocko,
 Johannes Weiner, Kefeng Wang, linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <20231115050123.982876-1-liushixin2@huawei.com>

On Tue, Nov 14, 2023 at 9:24 PM Liu Shixin <liushixin2@huawei.com> wrote:
>
> When the space of the swap devices is exhausted, only file pages can be
> reclaimed, but there may still be swapcache pages on the anon LRU list.
> This can lead to a premature out-of-memory.
>
> The problem can be reproduced with the following steps:
>
> First, set up a 9MB disk swap space, then create a cgroup with a 10MB
> memory limit, then run a program that allocates about 15MB of memory.
>
> The problem occurs only occasionally and may take about 100 runs to
> trigger [1].
>
> Fix it by checking the number of swapcache pages in can_reclaim_anon_pages().
> If the number is not zero, return true and set swapcache_only to 1.
> When scanning the anon LRU list in swapcache_only mode, non-swapcache
> pages are skipped during isolation to improve reclaim efficiency.
>
> However, in swapcache_only mode the scan count is still incremented for
> non-swapcache pages, because in this mode non-swapcache pages are
> plentiful and swapcache pages are rare; if skipped non-swapcache pages
> were not counted, the page scan in isolate_lru_folios() could eventually
> lead to a hung task, as Sachin reported [2].
>
> Also, since there are enough reclaim attempts before OOM, there is no
> need to isolate too many swapcache pages in a single pass.
>
> [1]. https://lore.kernel.org/lkml/CAJD7tkZAfgncV+KbKr36=eDzMnT=9dZOT0dpMWcurHLr6Do+GA@mail.gmail.com/
> [2]. https://lore.kernel.org/linux-mm/CAJD7tkafz_2XAuqE8tGLPEcpLngewhUo=5US14PAtSM9tLBUQg@mail.gmail.com/
>
> Signed-off-by: Liu Shixin
> Tested-by: Yosry Ahmed
> Reviewed-by: "Huang, Ying"
> Reviewed-by: Yosry Ahmed
> ---
> v8->v9: Move the swapcache check after can_demote() and refactor
> can_reclaim_anon_pages() a bit.
> v7->v8: Reset swapcache_only at the beginning of can_reclaim_anon_pages().
> v6->v7: Reset swapcache_only to zero after there are swap spaces.
> v5->v6: Fix NULL pointer dereference and hung task problem reported by Sachin.
>
>  include/linux/swap.h |  6 ++++++
>  mm/memcontrol.c      |  8 ++++++++
>  mm/vmscan.c          | 47 ++++++++++++++++++++++++++++++++------------
>  3 files changed, 48 insertions(+), 13 deletions(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index f6dd6575b905..3ba146ae7cf5 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -659,6 +659,7 @@ static inline void mem_cgroup_uncharge_swap(swp_entry_t entry, unsigned int nr_p
>  }
>
>  extern long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg);
> +extern long mem_cgroup_get_nr_swapcache_pages(struct mem_cgroup *memcg);
>  extern bool mem_cgroup_swap_full(struct folio *folio);
>  #else
>  static inline void mem_cgroup_swapout(struct folio *folio, swp_entry_t entry)
> @@ -681,6 +682,11 @@ static inline long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg)
>  	return get_nr_swap_pages();
>  }
>
> +static inline long mem_cgroup_get_nr_swapcache_pages(struct mem_cgroup *memcg)
> +{
> +	return total_swapcache_pages();
> +}
> +
>  static inline bool mem_cgroup_swap_full(struct folio *folio)
>  {
>  	return vm_swap_full();
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 774bd6e21e27..a76ec540d4a3 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -7865,6 +7865,14 @@ long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg)
>  	return nr_swap_pages;
>  }
>
> +long mem_cgroup_get_nr_swapcache_pages(struct mem_cgroup *memcg)
> +{
> +	if (mem_cgroup_disabled())
> +		return total_swapcache_pages();
> +
> +	return memcg_page_state(memcg, NR_SWAPCACHE);
> +}
> +
>  bool mem_cgroup_swap_full(struct folio *folio)
>  {
>  	struct mem_cgroup *memcg;
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 506f8220c5fe..62a1c75f74ad 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -136,6 +136,9 @@ struct scan_control {
>  	/* Always discard instead of demoting to lower tier memory */
>  	unsigned int no_demotion:1;
>
> +	/* Swap space is exhausted, only reclaim swapcache for anon LRU */
> +	unsigned int swapcache_only:1;
> +
>  	/* Allocation order */
>  	s8 order;
>
> @@ -312,25 +315,34 @@ static inline bool can_reclaim_anon_pages(struct mem_cgroup *memcg,
>  					  int nid,
>  					  struct scan_control *sc)
>  {
> -	if (memcg == NULL) {
> -		/*
> -		 * For non-memcg reclaim, is there
> -		 * space in any swap device?
> -		 */
> -		if (get_nr_swap_pages() > 0)
> -			return true;
> -	} else {
> -		/* Is the memcg below its swap limit? */
> -		if (mem_cgroup_get_nr_swap_pages(memcg) > 0)
> -			return true;
> -	}
> +	if (sc)
> +		sc->swapcache_only = 0;
> +
> +	/*
> +	 * For non-memcg reclaim, is there space in any swap device?
> +	 * Or is the memcg below its swap limit?
> +	 */
> +	if ((!memcg && get_nr_swap_pages() > 0) ||
> +	    (memcg && mem_cgroup_get_nr_swap_pages(memcg) > 0))
> +		return true;
>
>  	/*
>  	 * The page can not be swapped.
>  	 *
>  	 * Can it be reclaimed from this node via demotion?
>  	 */
> -	return can_demote(nid, sc);
> +	if (can_demote(nid, sc))
> +		return true;
> +
> +	/* Is there any swapcache pages to reclaim? */
> +	if ((!memcg && total_swapcache_pages() > 0) ||
> +	    (memcg && mem_cgroup_get_nr_swapcache_pages(memcg) > 0)) {

The above can return false positives if there are multiple nodes, so it
needs to be per node or lruvec, i.e., node_page_state() or
lruvec_page_state_local().
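Something along these lines is what I have in mind -- only a rough,
untested sketch to illustrate, not a replacement patch; the helper name
is made up, and it assumes NR_SWAPCACHE can be read as a per-node /
per-lruvec stat (CONFIG_SWAP):

/* Sketch only: check swapcache pages on the node being reclaimed. */
static bool nid_has_swapcache_pages(struct mem_cgroup *memcg, int nid)
{
	struct pglist_data *pgdat = NODE_DATA(nid);

	/* Global reclaim: only the swapcache pages on this node count. */
	if (!memcg)
		return node_page_state(pgdat, NR_SWAPCACHE) > 0;

	/* Memcg reclaim: only this memcg's swapcache pages on this node. */
	return lruvec_page_state_local(mem_cgroup_lruvec(memcg, pgdat),
				       NR_SWAPCACHE) > 0;
}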