Date: Tue, 18 Apr 2023 15:53:35 -0400
From: Johannes Weiner <hannes@cmpxchg.org>
To: Douglas Anderson
Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Yu Zhao, Ying,
	"Peter Zijlstra (Intel)", linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [RFC PATCH] mm, compaction: kcompactd work shouldn't count towards memory PSI
Message-ID: <20230418195335.GA268630@cmpxchg.org>
References: <20230418095852.RFC.1.I53bf7f0c7d48fe7af13c5dd3ad581d3bcfd9d1bd@changeid>
In-Reply-To: <20230418095852.RFC.1.I53bf7f0c7d48fe7af13c5dd3ad581d3bcfd9d1bd@changeid>

On Tue, Apr 18, 2023 at 09:58:54AM -0700, Douglas Anderson wrote:
> When the main kcompactd thread is doing compaction then it's always
> proactive compaction. This is a little confusing because kcompactd has
> two phases and one of them is called the "proactive" phase.
> Specifically:
> * Phase 1 (the "non-proactive" phase): we've been told by someone else
>   that it would be a good idea to try to compact memory.
> * Phase 2 (the "proactive" phase): we analyze memory fragmentation
>   ourselves and compact if it looks fragmented.
>
> From the context of kcompactd, the above naming makes sense. However,
> from the context of the kernel as a whole both phases are "proactive",
> because in both cases we're trying to compact memory ahead of time and
> we're not actually blocking (stalling) any task that is trying to use
> memory.
>
> Specifically, if any task is actually blocked needing memory to be
> compacted then it will be in direct compaction. It won't block waiting
> on the kcompactd task; instead it will call try_to_compact_pages()
> directly. The caller of that direct compaction,
> __alloc_pages_direct_compact(), already marks itself as counting
> towards PSI.
>
> Sanity checking by looking at this from another perspective, we can
> look at all the places that explicitly ask kcompactd to compact memory
> by calling wakeup_kcompactd(). That leads us to 3 places in vmscan.c.
> Those are all requests from kswapd, which is also a "proactive"
> mechanism in the kernel (tasks aren't blocked waiting for it).

There is a reason behind annotating kswapd/kcompactd like this; it's in
the longish comment in psi.c:

 * The time in which a task can execute on a CPU is our baseline for
 * productivity. Pressure expresses the amount of time in which this
 * potential cannot be realized due to resource contention.
 *
 * This concept of productivity has two components: the workload and
 * the CPU. To measure the impact of pressure on both, we define two
 * contention states for a resource: SOME and FULL.
 *
 * In the SOME state of a given resource, one or more tasks are
 * delayed on that resource. This affects the workload's ability to
 * perform work, but the CPU may still be executing other tasks.
 *
 * In the FULL state of a given resource, all non-idle tasks are
 * delayed on that resource such that nobody is advancing and the CPU
 * goes idle. This leaves both workload and CPU unproductive.
 *
 * SOME = nr_delayed_tasks != 0
 * FULL = nr_delayed_tasks != 0 && nr_productive_tasks == 0
 *
 * What it means for a task to be productive is defined differently
 * for each resource. For IO, productive means a running task. For
 * memory, productive means a running task that isn't a reclaimer. For
 * CPU, productive means an oncpu task.

So when you have a CPU that's running reclaim/compaction work, that CPU
isn't available to execute the workload.
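To spell the quoted rules out as code (an illustrative simplification,
not the actual psi.c implementation):

	#include <stdbool.h>

	/* Illustrative only: the per-CPU SOME/FULL test from the comment above. */
	static bool mem_some(unsigned int nr_delayed_tasks)
	{
		return nr_delayed_tasks != 0;
	}

	static bool mem_full(unsigned int nr_delayed_tasks,
			     unsigned int nr_productive_tasks)
	{
		/*
		 * A running reclaimer doesn't count as "productive" for the
		 * memory resource, so time when the only running tasks are
		 * kswapd/kcompactd while someone is delayed on memory is
		 * FULL, not just SOME.
		 */
		return nr_delayed_tasks != 0 && nr_productive_tasks == 0;
	}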
Say you only have one CPU shared between an allocating thread and
kswapd. Even if the allocating thread never has to do reclaim on its
own, if it has to wait for the CPU behind kswapd 50% of the time, that
workload is positively under memory pressure.

I don't think the distinction between proactive and reactive is all
that meaningful. It's generally assumed that all the work done by these
background threads is work that later doesn't have to be done by an
allocating thread. It might matter from a latency perspective, but
otherwise the work is fungible as it relates to memory pressure.

HTH
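P.S. For reference, the annotation being discussed is the usual
memstall bracket around the background pass; roughly (an illustrative
sketch, not a verbatim copy of mm/compaction.c or mm/vmscan.c):

	unsigned long pflags;

	psi_memstall_enter(&pflags);
	/* the background reclaim/compaction pass runs here */
	psi_memstall_leave(&pflags);

The open question in this thread is just whether kcompactd's proactive
phase should run inside that window.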