From: Yosry Ahmed
Date: Wed, 22 Nov 2023 02:39:15 -0800
Subject: Re: [PATCH v10] mm: vmscan: try to reclaim swapcache pages if no swap space
To: Michal Hocko
Cc: Liu Shixin, Yu Zhao, Andrew Morton, Huang Ying, Sachin Sant,
 Johannes Weiner, Kefeng Wang, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
References: <20231121090624.1814733-1-liushixin2@huawei.com>
 <32fe518a-e962-14ae-badc-719390386db9@huawei.com>

On Wed, Nov 22, 2023 at 2:09 AM Michal Hocko wrote:
>
> On Wed 22-11-23 09:52:42, Michal Hocko wrote:
> > On Tue 21-11-23 22:44:32, Yosry Ahmed wrote:
> > > On Tue, Nov 21, 2023 at 10:41 PM Liu Shixin wrote:
> > > >
> > > >
> > > > On 2023/11/21 21:00, Michal Hocko wrote:
> > > > > On Tue 21-11-23 17:06:24, Liu Shixin wrote:
> > > > >
> > > > > However, in swapcache_only mode, the scan count still increased when scan
> > > > > non-swapcache pages because there are large number of non-swapcache pages
> > > > > and rare swapcache pages in swapcache_only mode, and if the non-swapcache
> > > > > is skipped and do not count, the scan of pages in isolate_lru_folios() can
> > > > > eventually lead to hung task, just as Sachin reported [2].
> > > > > I find this paragraph really confusing! I guess what you meant to say is
> > > > > that a real swapcache_only is problematic because it can end up not
> > > > > making any progress, correct?
> > > > This paragraph is going to explain why checking swapcache_only after scan += nr_pages;
> > > > >
> > > > > AFAIU you have addressed that problem by making swapcache_only anon LRU
> > > > > specific, right? That would be certainly more robust as you can still
> > > > > reclaim from file LRUs. I cannot say I like that because swapcache_only
> > > > > is a bit confusing and I do not think we want to grow more special
> > > > > purpose reclaim types. Would it be possible/reasonable to instead put
> > > > > swapcache pages on the file LRU instead?
> > > > It looks like a good idea, but I'm not sure if it's possible. I can try it, is there anything to
> > > > pay attention to?
> > >
> > > I think this might be more intrusive than we think. Every time a page
> > > is added to or removed from the swap cache, we will need to move it
> > > between LRUs. All pages on the anon LRU will need to go through the
> > > file LRU before being reclaimed. I think this might be too big of a
> > > change to achieve this patch's goal.
> >
> > TBH I am not really sure how complex that might turn out to be.
> > Swapcache tends to be full of subtle issues. So you might be right but
> > it would be better to know _why_ this is not possible before we end up
> > fishing for a couple of swapcache pages on potentially huge anon LRU to
> > isolate them. Think of TB sized machines in this context.
>
> Forgot to mention that it is not really far fetched from comparing this
> to MADV_FREE pages. Those are anonymous but we do not want to keep them
> on anon LRU because we want to age them independent of the swap
> availability as they are just dropped during reclaim. Not too much
> different from swapcache pages. There are more constraints on those but
> fundamentally this is the same problem, no?

I agree it's not a first, but swap cache pages are more complicated
because they can go back and forth, unlike MADV_FREE pages which
usually go on a one-way ticket AFAICT. Also pages going into the swap
cache can be much more common than MADV_FREE pages for a lot of
workloads. I am not sure how different reclaim heuristics will react
to such mobility between the LRUs, and the fact that all pages will
now only get evicted through the file LRU. The anon LRU will
essentially become an LRU that feeds the file LRU. Also, the more
pages we move between LRUs, the more ordering violations we introduce,
as we may put colder pages in front of hotter pages or vice versa.

All in all, I am not saying it's a bad idea or not possible, I am just
saying it's probably more complicated than MADV_FREE, and adding more
cases where pages move between LRUs could introduce problems (or make
existing problems more visible).

> --
> Michal Hocko
> SUSE Labs
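
To illustrate the ordering concern raised in the reply above, here is a
toy userspace sketch. It is not kernel code. The names (count_inversions,
move_to_head, the page tuples) are made up for illustration, and inserting
at the head of the destination list is an assumption, not something the
patch or the thread specifies. Each LRU is modelled as a list ordered from
head (expected hottest) to tail (expected coldest), and an "inversion" is
a colder page sitting in front of a hotter one.

```python
from collections import deque


def count_inversions(lru):
    """Count pairs where a colder page sits in front of a hotter page.

    lru is ordered head (index 0, expected hottest) to tail (expected
    coldest). Each entry is (name, last_access); larger means hotter.
    """
    ages = [last_access for _, last_access in lru]
    return sum(
        1
        for i in range(len(ages))
        for j in range(i + 1, len(ages))
        if ages[i] < ages[j]
    )


def move_to_head(src, dst, name):
    """Move the named page from src to the head of dst.

    Head insertion is a hypothetical policy chosen for this sketch.
    """
    page = next((p for p in src if p[0] == name), None)
    if page is None:
        raise KeyError(name)
    src.remove(page)
    dst.appendleft(page)


# Both toy LRUs start out perfectly ordered by last access time.
anon_lru = deque([("a1", 90), ("a2", 50), ("a3", 10)])
file_lru = deque([("f1", 80), ("f2", 60), ("f3", 40)])

before = count_inversions(file_lru)

# The coldest anon page enters the swap cache and is moved to the file LRU.
move_to_head(anon_lru, file_lru, "a3")

after = count_inversions(file_lru)
print(f"inversions on file LRU: before={before}, after={after}")
```

In this toy setup a single cross-LRU move puts the coldest page in front
of three hotter file pages. Inserting at the tail instead would avoid the
inversion here but would create one whenever the moved page is hotter than
the file pages in front of it, which is the "or vice versa" case.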