From: Matthew Cassell
Date: Fri, 12 Apr 2024 15:48:16 -0500
Subject: Re: [PATCH] Documentation/admin-guide/sysctl/vm.rst adding the importance of NUMA-node count to documentation
To: Vratislav Bendel
Cc: corbet@lwn.net, akpm@linux-foundation.org, rppt@kernel.org, linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20240405164920.2844-1-mcassell411@gmail.com>

Thanks for the feedback. Here is a quick outline I came up with on your advice:

[...] (original content)

Keep in mind enabling bits in zone_reclaim_mode makes the most sense for
topologies consisting of multiple NUMA nodes.
In addition to vanilla zone_reclaim (clean and unmapped pages), there exist
additional bits that expand which pages are eligible to be reclaimed and
dictate scan_control policy during the reclaim process. The page allocator
will attempt to reclaim memory locally in accordance with these bits before
attempting to allocate on remote nodes.

Allow dirty pages to become candidates for memory reclaim::

    echo 2 > /proc/sys/vm/zone_reclaim_mode

[...] (original content)

Allow mapped pages to become candidates for memory reclaim::

    echo 4 > /proc/sys/vm/zone_reclaim_mode

[...] (original content)

I'm trying to balance between keeping the original content, being descriptive,
and not going into encyclopedia-mode. My motivation was to stress the importance
of NUMA-node count and describe the additional bits more per your advice.
I added the echo snippets to better segue the aggressive options. Any thoughts
on the above?

On Thu, Apr 11, 2024 at 2:54 AM Vratislav Bendel wrote:
>
> On Fri, Apr 5, 2024 at 6:49 PM Matthew Cassell wrote:
> >
> > If any bits are set in node_reclaim_mode (tunable via
> > /proc/sys/vm/zone_reclaim_mode) within get_pages_from_freelist(), then
> > page allocations start getting early access to reclaim via the
> > node_reclaim() code path when memory pressure increases. This behavior
> > provides the most optimization for multiple NUMA node machines. The above
> > is mentioned in:
> >
> > Commit 9eeff2395e3cfd05c9b2e6 ("[PATCH] Zone reclaim: Reclaim logic")
> > states "Zone reclaim is of particular importance for NUMA machines. It
> > can be more beneficial to reclaim a page than taking the performance
> > penalties that come with allocating a page on a REMOTE zone."
> >
> > While the pros/cons of staying on node versus allocating remotely are
> > mentioned in commit histories and mailing lists. It isn't specifically
> > mentioned in Documentation/ and isn't possible with a lone node. Imagine a
> > situation where CONFIG_NUMA=y (the default on most major distributions)
> > and only a single NUMA node exists. The latter is an oxymoron
> > (single-node == uniform memory access). Informing the user via vm.rst that
> > the most bang for their buck is when multiple nodes exist seems helpful.
> >
>
> I agree that the documentation could be improved to better express the
> implications
> and relevance of setting zone_reclaim_mode bits.
>
> Though I would suggest to go a step further and also elaborate on
> those "additional actions",
> for example something like:
> "The page allocator will attempt to reclaim memory within the zone,
> depending on the bits set,
> before looking for free pages in other zones, namely on remote memory nodes."
>
> > Signed-off-by: Matthew Cassell
> > ---
> >  Documentation/admin-guide/sysctl/vm.rst | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> > index c59889de122b..10270548af2a 100644
> > --- a/Documentation/admin-guide/sysctl/vm.rst
> > +++ b/Documentation/admin-guide/sysctl/vm.rst
> > @@ -1031,7 +1031,8 @@ Consider enabling one or more zone_reclaim mode bits if it's known that the
> >  workload is partitioned such that each partition fits within a NUMA node
> >  and that accessing remote memory would cause a measurable performance
> >  reduction. The page allocator will take additional actions before
> > -allocating off node pages.
> > +allocating off node pages. Keep in mind enabling bits in zone_reclaim_mode
> > +makes the most sense for topologies consisting of multiple NUMA nodes.
> >
> >  Allowing zone reclaim to write out pages stops processes that are
> >  writing large amounts of data from dirtying pages on other nodes. Zone
> > --
> > 2.34.1
> >
>
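
For illustration only (not part of the patch): a minimal userspace sketch of
the point made in this thread, namely that zone_reclaim_mode is a bitmask and
that setting its bits is mostly useful when more than one NUMA node is online.
The bit values 1, 2 and 4 are the ones vm.rst documents for zone_reclaim_mode.
The helper names decode_zone_reclaim_mode() and count_nodes() are hypothetical,
and reading the node list from /sys/devices/system/node/online is an assumption
about the running system, so the script simply reports when either file is
missing.

```python
from pathlib import Path

# Bit values as documented for zone_reclaim_mode in vm.rst.
ZONE_RECLAIM_BITS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def decode_zone_reclaim_mode(value: int) -> list[str]:
    """Return a description for every documented bit set in `value`."""
    return [desc for bit, desc in ZONE_RECLAIM_BITS.items() if value & bit]


def count_nodes(online: str) -> int:
    """Count the nodes in a sysfs-style range list such as "0-1,4"."""
    total = 0
    for part in online.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A single entry like "4" has no upper bound, so it counts as one node.
        total += int(hi or lo) - int(lo) + 1
    return total


def main() -> None:
    # Both paths are assumptions about the host; bail out if either is absent.
    mode_path = Path("/proc/sys/vm/zone_reclaim_mode")
    nodes_path = Path("/sys/devices/system/node/online")
    if not (mode_path.exists() and nodes_path.exists()):
        print("zone_reclaim_mode or NUMA node information not available here")
        return
    mode = int(mode_path.read_text())
    nodes = count_nodes(nodes_path.read_text())
    print(f"zone_reclaim_mode={mode}: {decode_zone_reclaim_mode(mode) or 'off'}")
    print(f"online NUMA nodes: {nodes}")
    if mode and nodes < 2:
        print("note: reclaim bits are set, but only one node is online")


if __name__ == "__main__":
    main()
```

With a value of 6, for example, the decoder reports both the dirty-page and
the swap bits, which matches combining the two echo snippets above into one
write.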