Date: Mon, 7 Mar 2022 15:50:36 -0500
From: Johannes Weiner
To: David Rientjes
Cc: Andrew Morton, Michal Hocko, Yu Zhao, Dave Hansen, linux-mm@kvack.org, Yosry Ahmed, Wei Xu, Shakeel Butt, Greg Thelen
Subject: Re: [RFC] Mechanism to induce memory reclaim
In-Reply-To: <5df21376-7dd1-bf81-8414-32a73cea45dd@google.com>

On Sun, Mar 06, 2022 at 03:11:23PM -0800, David Rientjes wrote:
> Hi everybody,
>
> We'd like to discuss formalizing a mechanism to induce memory reclaim by
> the kernel.
>
> The current multigenerational LRU proposal introduces a debugfs
> mechanism[1] for this. The "TMO: Transparent Memory Offloading in
> Datacenters" paper also discusses a per-memcg mechanism[2]. While the
> former can be used for debugging of MGLRU, both can quite powerfully be
> used for proactive reclaim.
>
> Google's datacenters use a similar per-memcg mechanism for the same
> purpose.
> Thus, formalizing the mechanism would allow our userspace to use
> an upstream supported interface that will be stable and consistent.
>
> This could be an incremental addition to MGLRU's lru_gen debugfs mechanism
> but, since the concept has no direct dependency on the work, we believe it
> is useful independent of the reclaim mechanism in use (both with and
> without CONFIG_LRU_GEN).
>
> Idea: introduce a per-node sysfs mechanism for inducing memory reclaim
> that can be useful for global (non-memcg constrained) reclaim and possible
> even if memcg is not enabled in the kernel or mounted. This could
> optionally take a memcg id to induce reclaim for a memcg hierarchy.
>
> IOW, this would be a /sys/devices/system/node/nodeN/reclaim mechanism for
> each NUMA node N on the system. (It would be similar to the existing
> per-node sysfs "compact" mechanism used to trigger compaction from
> userspace.)

I generally think a proactive reclaim interface is a good idea.

A per-cgroup control knob would make more sense to me, as cgroupfs
takes care of delegation, namespacing etc. and so would permit
self-directed proactive reclaim inside containers.

> Userspace would write the following to this file:
> - nr_to_reclaim pages

This makes sense, although (and you hinted at this below), I'm
thinking it should be in bytes, especially if part of cgroupfs.

> - swappiness factor

This I'm not sure about.

Mostly because I'm not sure about swappiness in general. It balances
between anon and file, but both of them are aged according to the same
LRU rules. The only reason to prefer one over the other seems to be
when the cost of reloading one (refault vs swapin) isn't the same as
the other. That's usually a hardware property, which in a perfect
world we'd auto-tune inside the kernel based on observed IO
performance. Not sure why you'd want this per reclaim request.
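
As a rough illustration of the cost argument above (a sketch only, not
how the kernel balances the lists today): if both types age under the
same LRU rules, the only input that should skew the split is the
relative cost of bringing a page back. The function name and both cost
parameters below are hypothetical, invented for this example.

```python
def split_scan_target(nr_to_scan, anon_reload_cost, file_reload_cost):
    """Split a scan target between anon and file pages.

    Hypothetical illustration: pressure on each type is inversely
    proportional to its reload cost (swapin for anon, refault for file).
    Costs are in arbitrary but identical units, e.g. observed IO latency.
    """
    if anon_reload_cost <= 0 or file_reload_cost <= 0:
        raise ValueError("reload costs must be positive")
    total = anon_reload_cost + file_reload_cost
    # anon share = (1/anon_cost) / (1/anon_cost + 1/file_cost)
    #            = file_cost / (anon_cost + file_cost)
    anon = int(nr_to_scan * file_reload_cost / total)
    return anon, nr_to_scan - anon
```

With equal costs the split is even; it only tilts when swapin and
refault differ in cost, which is a property of the hardware rather than
of any single reclaim request.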
> - flags to specify context, if any[**]
>
> [**] this is offered for extensibility to specify the context in which
>      reclaim is being done (clean file pages only, demotion for memory
>      tiering vs eviction, etc), otherwise 0

This one is curious. I don't understand the use cases for either of
these examples, and I can't think of other flags a user may pass on a
per-invocation basis. Would you care to elaborate some?
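
To make the per-cgroup, byte-based variant suggested earlier in this
reply concrete, here is a minimal userspace sketch. It assumes a
per-cgroup file that accepts a plain byte count; the file name
"memory.reclaim" and the write format are assumptions for illustration,
not something this thread has settled on.

```python
import os


def pages_to_bytes(nr_pages, page_size=None):
    """Convert a page count to bytes, using the system page size by default."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    return nr_pages * page_size


def request_reclaim(cgroup_dir, nr_bytes, knob="memory.reclaim"):
    """Write a reclaim request, in bytes, to a per-cgroup control file.

    `knob` is a hypothetical file name; the interface discussed in this
    thread is still a proposal.
    """
    if nr_bytes <= 0:
        raise ValueError("nr_bytes must be positive")
    path = os.path.join(cgroup_dir, knob)
    with open(path, "w") as f:
        f.write(f"{nr_bytes}\n")
    return path
```

Expressing the request in bytes keeps callers independent of the page
size, and because the file lives in the cgroup directory, a delegated
container could issue the request for itself without any node-level
access.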