Date: Mon, 31 Oct 2022 12:00:05 -0400
From: Johannes Weiner
To: Yosry Ahmed
Cc: Yang Shi, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Eric Bergen
Subject: Re: [PATCH] mm: vmscan: split khugepaged stats from direct reclaim stats

On Fri, Oct 28, 2022 at 10:41:17AM -0700, Yosry Ahmed wrote:
> On Fri, Oct 28, 2022 at 7:39 AM Johannes Weiner wrote:
> > pgscan_user: User-requested reclaim. Could be confusing if we ever
> > have an in-kernel proactive reclaim driver - unless that would then go
> > to another counter (new or kswapd).
> >
> > pgscan_ext: Reclaim activity from extraordinary/external
> > requests.
> > External as in: outside the allocation context.
>
> I imagine if the kernel is doing proactive reclaim on its own, we
> might want a separate counter for that anyway to monitor what the
> kernel is doing. So maybe pgscan_user sounds nice for now, but I also
> like that the latter explicitly says "this is external to the
> allocation context". But we can just go with pgscan_user and document
> it properly.

Yes, I think you're right. pgscan_user sounds good to me.

> How would khugepaged fit in this story? Seems like it would be part of
> pgscan_ext but not pgscan_user. I imagine we also don't want to
> pollute proactive reclaim counters with khugepaged reclaim (or other
> non-direct reclaim).
>
> Maybe pgscan_user and pgscan_kernel/pgscan_indirect for things like khugepaged?
> The problem with pgscan_kernel/indirect is that if we add a proactive
> reclaim kthread in the future it would technically fit there but we
> would want a separate counter for it.
>
> I am honestly not sure where to put khugepaged. The reasons I don't
> like a dedicated counter for khugepaged are:
> - What if other kthreads like khugepaged start doing the same, do we
> add one counter per-thread?

It's unlikely there will be more. The reason khugepaged doesn't rely
on kswapd is unique to THP allocations: they can require an exorbitant
amount of work to assemble, but due to fragmentation those requests
may fail permanently. We don't want to burden a shared facility like
kswapd with large amounts of speculative work on behalf of what are
(still*) cornercase requests.

This isn't true for other allocations. We do have __GFP_NORETRY sites
here and there that rather fall back early than put in the full amount
of work; but overall we expect allocations to succeed - and kswapd to
be able to balance for them!!** - because the alternative tends to be
OOMs, or drivers and workloads aborting on -ENOMEM.

(* As we evolve the allocator and normalize huge page requests
(folios), kswapd may also eventually balance for THPs again. IOW,
it's more likely for this exception to disappear again than it is
that we'll see more of them.)

(** This is also why it's no big deal if other kthreads that rely on
kswapd contribute to direct reclaim stats. First, it's highly error
prone to determine on a case by case basis whether userspace could
be waiting behind that direct reclaim - as Yang Shi's writeback
example demonstrates. Second, if kswapd is overwhelmed, it's likely
to impact userspace *anyway*! The benefit of this classification
work is questionable.)

> - What if we deprecate khugepaged (or such threads)? Seems more likely
> than deprecating kswapd.

If that happens, we can remove the counter again. The bar isn't as
high for vmstat as it is for other ABI, and we've updated it plenty of
times to reflect changes in the MM implementation.

> Looks like we want a stat that would group all of this reclaim coming
> from non-direct kthreads, but would not include a future proactive
> reclaim kthread.

I think the desire to generalize overcomplicates things here in a way
that isn't actually meaningful. Think of direct reclaim stats as a
signal that either a) kswapd is broken or b) memory pressure is high
enough to cause latencies in the class of requests that are of
interest to userspace. This is true for all cases but khugepaged.
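
To illustrate reading direct reclaim stats as a signal, here is a
minimal userspace sketch that splits the pgscan counters in
/proc/vmstat by reclaim source. It is an illustration only, not part
of the patch. The counter name pgscan_khugepaged is an assumption
based on the patch subject, and kernels without the split will simply
report 0 for it. The helper names (parse_vmstat, pgscan_by_source,
direct_share) are hypothetical.

```python
from pathlib import Path

# Reclaim sources looked up as "pgscan_<source>" in /proc/vmstat.
# ASSUMPTION: "pgscan_khugepaged" is the name suggested by the patch under
# discussion; it is treated as 0 when the running kernel does not export it.
SOURCES = ("kswapd", "direct", "khugepaged")


def parse_vmstat(text):
    """Parse 'name value' lines into a dict of ints, skipping malformed lines."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def pgscan_by_source(counters):
    """Return pages scanned per reclaim source; missing counters count as 0."""
    return {src: counters.get(f"pgscan_{src}", 0) for src in SOURCES}


def direct_share(counters):
    """Fraction of kswapd + direct scanning done in direct reclaim.

    khugepaged is deliberately left out of the ratio, so its speculative
    THP work does not inflate the signal. Returns 0.0 if nothing was scanned.
    """
    scans = pgscan_by_source(counters)
    total = scans["kswapd"] + scans["direct"]
    return scans["direct"] / total if total else 0.0


if __name__ == "__main__":
    vmstat_path = Path("/proc/vmstat")
    if vmstat_path.exists():
        vmstat = parse_vmstat(vmstat_path.read_text())
        print(pgscan_by_source(vmstat))
        print(f"direct reclaim share: {direct_share(vmstat):.1%}")
```

With khugepaged accounted separately, a rising direct share would
point at either kswapd falling behind or real memory pressure, rather
than at background THP collapse attempts.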