From mboxrd@z Thu Jan 1 00:00:00 1970
From: Efly Young
Subject: Re: [PATCH] mm: memcg: Use larger chunks for proactive reclaim
Date: Fri, 2 Feb 2024 13:02:47 +0800
Message-ID: <20240202050247.45167-1-yangyifei03@kuaishou.com>
In-Reply-To: <20240201153428.GA307226@cmpxchg.org>
References: <20240201153428.GA307226@cmpxchg.org>

> On Thu, Feb 01, 2024 at 02:57:22PM +0100, Michal Koutný wrote:
> > Hello.
> >
> > On Wed, Jan 31, 2024 at 04:24:41PM +0000, "T.J.
Mercier" wrote:
> > >         reclaimed = try_to_free_mem_cgroup_pages(memcg,
> > > -               min(nr_to_reclaim - nr_reclaimed, SWAP_CLUSTER_MAX),
> > > +               max((nr_to_reclaim - nr_reclaimed) / 4,
> > > +                   (nr_to_reclaim - nr_reclaimed) % 4),
> >
> > The 1/4 factor looks like magic.
>
> It's just cutting the work into quarters to balance throughput with
> goal accuracy. It's no more or less magic than DEF_PRIORITY being 12,
> or SWAP_CLUSTER_MAX being 32.
>
> > Commit 0388536ac291 says:
> > | In theory, the amount of reclaimed would be in [request, 2 * request).
>
> Looking at the code, I'm not quite sure if this can be read this
> literally. Efly might be able to elaborate, but we do a full loop of
> all nodes and cgroups in the tree before checking nr_to_reclaimed, and
> rely on priority level for granularity. So request size and complexity
> of the cgroup tree play a role. I don't know where the exact factor
> two would come from.

I'm sorry, that conclusion may have been arbitrary. It may only fit my
case. In my case, tracing showed that reclaim looped twice each time
before checking nr_reclaimed, and it reclaimed less than my request
size (1G) each time. So I assumed the upper bound was 2 * request. But
it now seems that this depends on the cgroup tree I constructed, my
system state, and my request size (a relatively large chunk). With so
many influencing factors, stating a specific upper bound is not
accurate.

> IMO it's more accurate to phrase it like this:
>
> Reclaim tries to balance nr_to_reclaim fidelity with fairness across
> nodes and cgroups over which the pages are spread. As such, the bigger
> the request, the bigger the absolute overreclaim error. Historic
> in-kernel users of reclaim have used fixed, small request batches to
> approach an appropriate reclaim rate over time. When we reclaim a user
> request of arbitrary size, use decaying batches to manage error while
> maintaining reasonable throughput.
>
> > Doesn't this suggest 1/2 as a better option? (I didn't pursue the
> > theory.)
>
> That was TJ's first suggestion as well, but as per above I suggested
> quartering as a safer option.
>
> > Also IMO importantly, when nr_to_reclaim - nr_reclaimed is less than 8,
> > the formula gives arbitrary (unrelated to delta's magnitude) values.
>
> try_to_free_mem_cgroup_pages() rounds up to SWAP_CLUSTER_MAX. So the
> error margin is much higher at the smaller end of requests anyway.
> But practically speaking, users care much less if you reclaim 32 pages
> when 16 were requested than if you reclaim 2G when 1G was requested.

Yes, I completely agree that the bigger the request, the bigger the
absolute overreclaim error. The focus now is the tradeoff between
accurate reclaim and efficient reclaim. I think TJ's test is suggestive.
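
To make the accuracy/efficiency tradeoff concrete, here is a rough
userspace sketch (not kernel code) that compares the two batch-size
policies from the quoted diff. The names batch_fixed(), batch_quarter(),
simulate() and struct sim_result are hypothetical, and the model assumes
each call reclaims exactly its batch rounded up to SWAP_CLUSTER_MAX.
Real reclaim can overshoot by far more than that, as discussed above, so
this only illustrates the call count, not the actual error.

```c
#include <assert.h>
#include <stddef.h>

#define SWAP_CLUSTER_MAX 32UL	/* value quoted in this thread */

/* Old policy: min(remaining, SWAP_CLUSTER_MAX). */
static unsigned long batch_fixed(unsigned long remaining)
{
	return remaining < SWAP_CLUSTER_MAX ? remaining : SWAP_CLUSTER_MAX;
}

/* Proposed policy from the diff: max(remaining / 4, remaining % 4). */
static unsigned long batch_quarter(unsigned long remaining)
{
	unsigned long quarter = remaining / 4;
	unsigned long rest = remaining % 4;

	return quarter > rest ? quarter : rest;
}

struct sim_result {
	unsigned long calls;		/* number of reclaim calls issued */
	unsigned long reclaimed;	/* total pages reclaimed in the model */
};

/*
 * ASSUMPTION (illustrative only): every call reclaims exactly the
 * requested batch, rounded up to SWAP_CLUSTER_MAX. This is a
 * simplification, not how the kernel behaves under real load.
 */
static struct sim_result simulate(unsigned long target,
				  unsigned long (*policy)(unsigned long))
{
	struct sim_result res = { 0, 0 };

	assert(policy != NULL);

	while (res.reclaimed < target) {
		unsigned long batch = policy(target - res.reclaimed);

		if (batch < SWAP_CLUSTER_MAX)
			batch = SWAP_CLUSTER_MAX;
		res.reclaimed += batch;
		res.calls++;
	}
	return res;
}
```

For a 1G request with 4K pages (262144 pages), simulate() with
batch_fixed needs 8192 calls in this model, while batch_quarter needs
only a few dozen. For deltas below 8 the quartering formula returns
values such as 3 for a delta of 7 and 1 for a delta of 4, which is the
irregularity Michal pointed out; the round-up to SWAP_CLUSTER_MAX hides
it in practice.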