Re: [PATCH v10] mm: vmscan: try to reclaim swapcache pages if no swap space

From: Michal Hocko <mhocko@suse.com>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Liu Shixin <liushixin2@huawei.com>, Yu Zhao <yuzhao@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Huang Ying <ying.huang@intel.com>,
	Sachin Sant <sachinp@linux.ibm.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10] mm: vmscan: try to reclaim swapcache pages if no swap space
Date: Wed, 22 Nov 2023 14:19:37 +0100
Message-ID: <ZV3_6UH28KMt0ZDb@tiehlicka>
In-Reply-To: <CAJD7tka0=JR1s0OzQ0+H8ksFhvB2aBHXx_2-hVc97Enah9DqGQ@mail.gmail.com>

On Wed 22-11-23 02:39:15, Yosry Ahmed wrote:
> On Wed, Nov 22, 2023 at 2:09 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Wed 22-11-23 09:52:42, Michal Hocko wrote:
> > > On Tue 21-11-23 22:44:32, Yosry Ahmed wrote:
> > > > On Tue, Nov 21, 2023 at 10:41 PM Liu Shixin <liushixin2@huawei.com> wrote:
> > > > >
> > > > >
> > > > > On 2023/11/21 21:00, Michal Hocko wrote:
> > > > > > On Tue 21-11-23 17:06:24, Liu Shixin wrote:
> > > > > >
> > > > > > > > However, in swapcache_only mode, the scan count still increased when scan
> > > > > > > > non-swapcache pages because there are large number of non-swapcache pages
> > > > > > > > and rare swapcache pages in swapcache_only mode, and if the non-swapcache
> > > > > > > > is skipped and do not count, the scan of pages in isolate_lru_folios() can
> > > > > > > > eventually lead to hung task, just as Sachin reported [2].
> > > > > > I find this paragraph really confusing! I guess what you meant to say is
> > > > > > that a real swapcache_only is problematic because it can end up not
> > > > > > making any progress, correct?
> > > > > > This paragraph is meant to explain why swapcache_only is checked after scan += nr_pages;
> > > > > >
> > > > > > AFAIU you have addressed that problem by making swapcache_only anon LRU
> > > > > > specific, right? That would be certainly more robust as you can still
> > > > > > reclaim from file LRUs. I cannot say I like that because swapcache_only
> > > > > > is a bit confusing and I do not think we want to grow more special
> > > > > > purpose reclaim types. Would it be possible/reasonable to instead put
> > > > > > swapcache pages on the file LRU instead?
> > > > > It looks like a good idea, but I'm not sure if it's possible. I can try it, is there anything to
> > > > > pay attention to?
> > > >
> > > > I think this might be more intrusive than we think. Every time a page
> > > > is added to or removed from the swap cache, we will need to move it
> > > > between LRUs. All pages on the anon LRU will need to go through the
> > > > file LRU before being reclaimed. I think this might be too big of a
> > > > change to achieve this patch's goal.
> > >
> > > TBH I am not really sure how complex that might turn out to be.
> > > Swapcache tends to be full of subtle issues. So you might be right but
> > > it would be better to know _why_ this is not possible before we end up
> > > fishing for a couple of swapcache pages on a potentially huge anon LRU to
> > > isolate them. Think of TB sized machines in this context.
> >
> > Forgot to mention that comparing this to MADV_FREE pages is not really
> > far fetched. Those are anonymous but we do not want to keep them
> > on the anon LRU because we want to age them independently of swap
> > availability as they are just dropped during reclaim. Not too much
> > different from swapcache pages. There are more constraints on those but
> > fundamentally this is the same problem, no?
> 
> I agree it's not a first, but swap cache pages are more complicated
> because they can go back and forth, unlike MADV_FREE pages which
> usually go on a one way ticket AFAICT.

Yes, swapcache pages are indeed more complicated but most of the time
they just go away as well, no? MADV_FREE pages can also become regular
anonymous pages again if they are written to. So fundamentally they are
not that different.

> Also pages going into the swap
> cache can be much more common than MADV_FREE pages for a lot of
> workloads. I am not sure how different reclaim heuristics will react
> to such mobility between the LRUs, and the fact that all pages will
> now only get evicted through the file LRU. The anon LRU will
> essentially become an LRU that feeds the file LRU. Also, the more
> pages we move between LRUs, the more ordering violations we introduce,
> as we may put colder pages in front of hotter pages or vice versa.

Well, traditionally the file LRU has been maintaining page cache or
easily disposable pages like MADV_FREE (which can be considered a cache
as well). Swapcache is a form of a page cache as well.

> All in all, I am not saying it's a bad idea or not possible, I am just
> saying it's probably more complicated than MADV_FREE, and adding more
> cases where pages move between LRUs could introduce problems (or make
> existing problems more visible).

Do we want to start adding a filtered anon scan for a certain type of
page? Because this is the question here AFAICS. This might seem an
easier solution but I would argue that it is a less predictable one.
It is not unusual that a huge anon LRU would contain only very few
swapcache pages.
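
To put a rough number on that concern, here is a toy model, not kernel
code. It assumes a hypothetical anon LRU of one million pages with 100
swapcache pages scattered at random, and compares a filtered walk over
that mixed list with a walk over a list that holds only the swapcache
pages. The names filtered_scan and build_lru and all the sizes are made
up for illustration; real reclaim batches its scans and is far more
involved.

```python
import random


def filtered_scan(lru, nr_to_reclaim):
    """Walk a list from its cold end, isolating only swapcache pages.

    Each entry is True if the page is in the swap cache. Every page
    visited counts as scanned, whether or not it passes the filter.
    Returns (scanned, isolated).
    """
    scanned = 0
    isolated = 0
    for in_swapcache in lru:
        if isolated >= nr_to_reclaim:
            break
        scanned += 1
        if in_swapcache:
            isolated += 1
    return scanned, isolated


def build_lru(nr_pages, nr_swapcache, seed=0):
    """Build a list of nr_pages entries with nr_swapcache True entries
    placed at random positions (fixed seed for repeatability)."""
    rng = random.Random(seed)
    lru = [False] * nr_pages
    for idx in rng.sample(range(nr_pages), nr_swapcache):
        lru[idx] = True
    return lru


# Hypothetical sizes, chosen only to make the skew visible.
NR_PAGES = 1_000_000
NR_SWAPCACHE = 100
NR_TO_RECLAIM = 32

# Case 1: swapcache pages mixed into a large anon LRU, scan is filtered.
mixed_lru = build_lru(NR_PAGES, NR_SWAPCACHE)
mixed_scanned, mixed_isolated = filtered_scan(mixed_lru, NR_TO_RECLAIM)

# Case 2: swapcache pages kept on a list of their own.
separate_lru = [True] * NR_SWAPCACHE
sep_scanned, sep_isolated = filtered_scan(separate_lru, NR_TO_RECLAIM)

print(f"filtered scan of mixed LRU: scanned={mixed_scanned} isolated={mixed_isolated}")
print(f"scan of separate list:      scanned={sep_scanned} isolated={sep_isolated}")
```

In this model the work done by the filtered walk depends on where the
swapcache pages happen to sit in the list, while the separate list
costs exactly as many scans as pages isolated. That is the
predictability argument in miniature; it says nothing about the cost
of moving pages between lists.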

That being said, I might be missing some obvious or less obvious reasons
why this is a completely bad idea. Swapcache is indeed subtle.
-- 
Michal Hocko
SUSE Labs
