From: Wei Xu
Date: Thu, 10 Mar 2022 09:33:48 -0800
Subject: Re: [RFC] Mechanism to induce memory reclaim
To: Johannes Weiner
Cc: David Rientjes, Michal Hocko, Shakeel Butt, Andrew Morton, Yu Zhao, Dave Hansen, Linux MM, Yosry Ahmed, Greg Thelen
References: <5df21376-7dd1-bf81-8414-32a73cea45dd@google.com> <20220307183141.npa4627fpbsbgwvv@google.com> <63fcd044-7c87-63f3-391e-3b32f8feaae@google.com>

On Thu, Mar 10, 2022 at 8:58 AM Johannes Weiner wrote:
>
> On Wed, Mar 09, 2022 at 02:03:21PM -0800, David Rientjes wrote:
> > On Tue, 8 Mar 2022, Michal Hocko wrote:
> >
> > > > Let me take a stab at this. The specific reasons why high limit is not a
> > > > good interface to implement proactive reclaim:
> > > >
> > > > 1) It can cause allocations from the target application to get
> > > > throttled.
> > > >
> > > > 2) It leaves a state (high limit) in the kernel which needs to be reset
> > > > by the userspace part of proactive reclaimer.
> > > >
> > > > If I remember correctly, Facebook actually tried to use high limit to
> > > > implement the proactive reclaim but due to exactly these limitations [1]
> > > > they went the route [2] aligned with this proposal.
> > >
> > > I do remember we have discussed this in the past. There were proposals
> > > for an additional limit to trigger a background reclaim [3] or to add a
> > > pressure based memcg knob [4]. For the nr_to_reclaim based interface
> > > there were some challenges outlined in that email thread. I do
> > > understand that practical experience could have confirmed or diminished
> > > those concerns.
> > >
> > > I am definitely happy to restart those discussion but it would be really
> > > great to summarize existing options and why they do not work in
> > > practice. It would be also great to mention why concerns about nr_to_reclaim
> > > based interface expressed in the past are not standing out anymore wrt.
> > > other proposals.
> > >
> >
> > Johannes, since you had pointed out that the current approach used at Meta
> > and described in the TMO paper works well in practice and is based on
> > prior discussions of memory.reclaim[1], do you have any lingering concerns
> > from that 2020 thread?
>
> I'd be okay with merging the interface proposed in that thread as-is.

We will need a nodemask argument for the memory tiering use case. We
can add it as an optional argument to memory.reclaim later. Or do you
think we should add a different interface (e.g. memory.demote) for
memory tiering instead?

> > My first email in this thread proposes something that can still do memcg
> > based reclaim but is also possible even without CONFIG_MEMCG enabled.
> > That's particularly helpful for configs used by customers that don't use
> > memcg, namely Chrome OS. I assume we're not losing any functionality that
> > your use case depends on if we are to introduce a per-node sysfs mechanism
> > for this as an alternative since you can still specify a memcg id?
>
> We'd lose the delegation functionality with this proposal.
>
> But per the other thread, I wouldn't be opposed to adding a global
> per-node interface in addition to the cgroupfs one.