From: Wei Xu
Date: Tue, 8 Mar 2022 09:21:44 -0800
Subject: Re: [RFC] Mechanism to induce memory reclaim
To: Michal Hocko
Cc: Dan Schatzberg, Johannes Weiner, Shakeel Butt, David Rientjes,
    Andrew Morton, Yu Zhao, Dave Hansen, Linux MM, Yosry Ahmed,
    Greg Thelen

On Tue, Mar 8, 2022 at 8:05 AM Michal Hocko wrote:
>
> On Tue 08-03-22 09:44:35, Dan Schatzberg wrote:
> > On Tue, Mar 08, 2022 at 01:53:19PM +0100, Michal Hocko wrote:
> > > On Mon 07-03-22 15:26:18, Johannes Weiner wrote:
> [...]
> > > > A mechanism to request a fixed number of pages to reclaim turned out
> > > > to work much, much better in practice. We've been using a simple
> > > > per-cgroup knob (like here: https://lkml.org/lkml/2020/9/9/1094).
> > >
> > > Could you share more details here please? How have you managed to find
> > > the reclaim target and how have you overcome challenges to react in time
> > > to have some head room for the actual reclaim?
> >
> > We have a userspace agent that just repeatedly triggers proactive
> > reclaim and monitors PSI metrics to maintain some constant but low
> > pressure. In the complete absence of pressure we will reclaim some
> > configurable percentage of the workload's memory. This reclaim amount
> > tapers down to zero as PSI approaches the target threshold.
> >
> > I don't follow your question regarding head-room. Could you elaborate?
>
> One of the concerns that was expressed in the past is how effectively
> can pro-active userspace reclaimer act on memory demand transitions. It
> takes some time to get refaults/PSI changes and then you should
> be acting rather swiftly. At least if you aim at somehow smooth
> transition. Tuning this up to work reliably seems to be far
> from trivial. Not to mention that changes in the memory reclaim
> implementation could make the whole tuning rather fragile.

The userspace reclaimer is not a complete replacement for the kernel
memory reclaim (kswapd or direct reclaim). At least in Google's use
cases, its purpose is to proactively identify memory-saving
opportunities and to reclaim an amount of cold pages set by policy,
freeing up memory for more demanding jobs or for scheduling new jobs.
If a job (container) has a rapid increase in memory demand, that just
means less proactive savings from that job. With the proposed
nr_bytes_to_reclaim interface, the userspace reclaimer doesn't have to
act much more swiftly for such jobs. If the userspace reclaim interface
were memory.high-based, then such jobs would indeed be a serious
problem.
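
For illustration, the policy Dan describes above (reclaim a configurable
percentage of the workload's memory when there is no pressure, tapering
to zero as PSI approaches the target) could be sketched roughly as
below. This is only a sketch under assumptions, not the agent anyone
runs in production: the linear taper, the choice of the "some avg10"
PSI field, and the names parse_psi_some_avg10() and
reclaim_target_bytes() are all hypothetical. The code only computes a
byte count; how that count is handed to the kernel (for example via the
proposed nr_bytes_to_reclaim knob) is deliberately left out.

```python
def parse_psi_some_avg10(text: str) -> float:
    """Extract the 'some' avg10 value from PSI-formatted text.

    PSI files contain lines such as:
        some avg10=0.12 avg60=0.05 avg300=0.01 total=12345
    """
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "some":
            for field in fields[1:]:
                key, _, value = field.partition("=")
                if key == "avg10":
                    return float(value)
    raise ValueError("no 'some avg10' field found")


def reclaim_target_bytes(usage_bytes: int, psi: float,
                         psi_target: float, max_fraction: float) -> int:
    """Return how many bytes to ask for in one reclaim round.

    Assumption: the taper is linear. At psi == 0 the result is
    usage_bytes * max_fraction; it shrinks to 0 as psi reaches
    psi_target and stays at 0 above it.
    """
    if psi_target <= 0:
        raise ValueError("psi_target must be positive")
    # Fraction of the pressure budget still unused, clamped at zero.
    headroom = max(0.0, 1.0 - psi / psi_target)
    return int(usage_bytes * max_fraction * headroom)
```

With a sketch like this, a job whose memory demand rises quickly shows
higher PSI, so the computed target drops toward zero by itself, which
matches the "less proactive savings from that job" behavior described
above.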