Date: Thu, 21 Dec 2023 20:38:40 -0800 (PST)
From: David Rientjes
To: Dan Schatzberg
cc: Johannes Weiner, Roman Gushchin, Yosry Ahmed, Huan Yang,
    linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org,
    Tejun Heo, Zefan Li, Jonathan Corbet, Michal Hocko, Shakeel Butt,
    Muchun Song, Andrew Morton, Kefeng Wang, SeongJae Park,
    "Vishal Moola (Oracle)", Nhat Pham, Yue Zhao
Subject: Re: [PATCH v5 2/2] mm: add swapiness= arg to memory.reclaim
In-Reply-To: <20231220152653.3273778-3-schatzberg.dan@gmail.com>
References: <20231220152653.3273778-1-schatzberg.dan@gmail.com>
    <20231220152653.3273778-3-schatzberg.dan@gmail.com>

On Wed, 20 Dec 2023, Dan Schatzberg wrote:

> Allow proactive reclaimers to submit an additional swappiness=
> argument to memory.reclaim. This overrides the global or per-memcg
> swappiness setting for that reclaim attempt.
>
> For example:
>
>     echo "2M swappiness=0" > /sys/fs/cgroup/memory.reclaim
>
> will perform reclaim on the rootcg with a swappiness setting of 0 (no
> swap) regardless of the vm.swappiness sysctl setting.
>
> Userspace proactive reclaimers use the memory.reclaim interface to
> trigger reclaim. The memory.reclaim interface does not allow for any way
> to affect the balance of file vs anon during proactive reclaim.
> The only approach is to adjust the vm.swappiness setting. However, there
> are a few reasons we look to control the balance of file vs anon during
> proactive reclaim, separately from reactive reclaim:
>
> * Swap-out should be limited to manage SSD write endurance. In near-OOM
> situations we are fine with lots of swap-out to avoid OOMs. As these are
> typically rare events, they have relatively little impact on write
> endurance. However, proactive reclaim runs continuously and so its
> impact on SSD write endurance is more significant. Therefore it is
> desirable to control swap-out for proactive reclaim separately from
> reactive reclaim.
>
> * Some userspace OOM killers like systemd-oomd[1] support OOM killing on
> swap exhaustion. This makes sense if the swap exhaustion is triggered
> due to reactive reclaim but less so if it is triggered due to proactive
> reclaim (e.g. one could see OOMs when free memory is ample but anon is
> just particularly cold). Therefore, it's desirable to have proactive
> reclaim reduce or stop swap-out before the threshold at which OOM
> killing occurs.
>
> In the case of Meta's Senpai proactive reclaimer, we adjust
> vm.swappiness before writes to memory.reclaim[2]. This has been in
> production for nearly two years and has addressed our needs to control
> proactive vs reactive reclaim behavior but is still not ideal for a
> number of reasons:
>
> * vm.swappiness is a global setting; adjusting it can race/interfere
> with other system administration that wishes to control vm.swappiness.
> In our case, we need to disable Senpai before adjusting vm.swappiness.
>
> * vm.swappiness is stateful - so a crash or restart of Senpai can leave
> a misconfigured setting. This requires some additional management to
> record the "desired" setting and ensure Senpai always adjusts to it.
>
> With this patch, we avoid these downsides of adjusting vm.swappiness
> globally.
>
> [1]https://www.freedesktop.org/software/systemd/man/latest/systemd-oomd.service.html
> [2]https://github.com/facebookincubator/oomd/blob/main/src/oomd/plugins/Senpai.cpp#L585-L598
>
> Signed-off-by: Dan Schatzberg

Acked-by: David Rientjes
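
To illustrate how a userspace proactive reclaimer might use the interface described in the quoted patch, here is a minimal Python sketch. It only formats the request string and writes it to memory.reclaim, so no global vm.swappiness state is touched. The helper names (build_reclaim_request, proactive_reclaim) and the 0..200 range check, borrowed from the documented range of the vm.swappiness sysctl, are assumptions made for illustration and are not taken from the patch.

```python
import os

# Assumption: the per-request value is checked against the same 0..200 range
# that the vm.swappiness sysctl documents. The quoted patch description does
# not state which range the kernel accepts for memory.reclaim.
MIN_SWAPPINESS = 0
MAX_SWAPPINESS = 200


def build_reclaim_request(amount, swappiness=None):
    """Format a memory.reclaim request string such as "2M swappiness=0".

    `amount` is passed through unchanged (e.g. "2M" or a byte count).
    With swappiness=None the request carries no override, so the kernel
    falls back to the global or per-memcg swappiness setting.
    """
    if swappiness is None:
        return str(amount)
    if not MIN_SWAPPINESS <= swappiness <= MAX_SWAPPINESS:
        raise ValueError(f"swappiness out of range: {swappiness}")
    return f"{amount} swappiness={swappiness}"


def proactive_reclaim(cgroup_dir, amount, swappiness=None):
    """Write one reclaim request to <cgroup_dir>/memory.reclaim.

    The override applies to this single write only, so a crash of the
    caller leaves no stale setting behind. Any OSError raised while
    writing is propagated to the caller.
    """
    request = build_reclaim_request(amount, swappiness)
    path = os.path.join(cgroup_dir, "memory.reclaim")
    with open(path, "w") as f:
        f.write(request)
    return request
```

Under these assumptions, proactive_reclaim("/sys/fs/cgroup", "2M", swappiness=0) would issue the same request as the echo example in the quoted text. It needs sufficient privileges and a kernel that accepts the swappiness= argument.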