Date: Wed, 17 Aug 2022 02:59:45 +0000
From: CGEL
To: Johannes Weiner
Cc: akpm@linux-foundation.org, tj@kernel.org, axboe@kernel.dk,
 vdavydov.dev@gmail.com, ran.xiaokai@zte.com.cn, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, cgel, Peter Zijlstra,
 yang.yang29@zte.com.cn
Subject: Re: [RFC PATCH 1/2] psi: introduce memory.pressure.stat
Message-ID: <62fc59a3.a70a0220.b9a52.01f3@mx.google.com>
References: <20220801004205.1593100-1-ran.xiaokai@zte.com.cn>

On Wed, Aug 03, 2022 at 09:55:39AM -0400, Johannes Weiner wrote:
> On Mon, Aug 01, 2022 at 12:42:04AM +0000, cgel.zte@gmail.com wrote:
> > From: cgel
> >
> > For now psi memory pressure account for all the mem stall in the
> > system, And didnot provide a detailed information why the stall
> > happens.
> > This patch introduce a cgroupu knob memory.pressure.stat,
> > it tells the detailed stall information of all memory events and it
> > format and the corresponding proc interface.
> >
> > for the cgroup, add memory.pressure.stat and it shows:
> > kswapd: avg10=0.00 avg60=0.00 avg300=0.00 total=0
> > direct reclaim: avg10=0.00 avg60=0.00 avg300=0.12 total=42356
> > kcompacted: avg10=0.00 avg60=0.00 avg300=0.00 total=0
> > direct compact: avg10=0.00 avg60=0.00 avg300=0.00 total=0
> > cgroup reclaim: avg10=0.00 avg60=0.00 avg300=0.00 total=0
> > workingset thrashing: avg10=0.00 avg60=0.00 avg300=0.00 total=0
> >
> > for the system wide, a proc file introduced as pressure/memory_stat
> > and the format is the same as the cgroup interface.
> >
> > With this detaled information, for example, if the system is stalled
> > because of kcompacted, compaction_proactiveness can be promoted so
> > pro-compaction can be involved earlier.
> >
> > Signed-off-by: cgel
>
> > @@ -64,9 +91,11 @@ struct psi_group_cpu {
> >
> >  /* Aggregate pressure state derived from the tasks */
> >  u32 state_mask;
> > + u32 state_memstall;
> >
> >  /* Period time sampling buckets for each state of interest (ns) */
> >  u32 times[NR_PSI_STATES];
> > + u32 times_mem[PSI_MEM_STATES];
>
> This doubles the psi cache footprint on every context switch, wakeup,
> sleep, etc. in the scheduler. You're also adding more branches to
> those same paths. It'll measurably affect everybody who is using psi.
>
> Yet, in the years of using psi in production myself, I've never felt
> the need for what this patch provides. There are event counters for
> everything that contributes to pressure, and it's never been hard to
> rootcause spikes. There are also things like bpftrace that let you
> identify who is stalling for how long in order to do one-off tuning
> and systems introspection.
>

This patch is not meant for root-causing spikes. It is meant for
automatic memory optimization beyond what oomd does, especially sysctl
adjustment.
For example, if we see a lot of direct reclaim pressure, an automatic
tuning program might raise watermark_scale_factor.

The basic idea is that this patch gives users a concise interface that
shows what kind of memory pressure the system is suffering from, so the
system can be tuned at a finer grain. It could provide the data needed
to adjust watermark_boost_factor, extfrag_threshold,
compaction_proactiveness, transparent_hugepage/defrag, swappiness,
vfs_cache_pressure and madvise() usage, which was not easy to do before.

It is not easy for an automatic tuning program to use tools like
bpftrace or ftrace for this.

We could use a CONFIG_PSI_XX option or a boot parameter to turn this
feature on or off, so that users who do not need it avoid the
additional footprint.
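
To make the intended consumer concrete, here is a minimal userspace
sketch of how a tuning program might read one line of the proposed
memory.pressure.stat output. It assumes the line format shown in the
patch description above, which is only an RFC and may change. The
struct, the function names and the threshold policy are hypothetical
and exist only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of the proposed (RFC-only) memory.pressure.stat format. */
struct psi_stat_line {
	char name[32];               /* e.g. "direct reclaim" */
	double avg10, avg60, avg300; /* stall averages as printed by the patch */
	unsigned long long total;    /* cumulative stall counter */
};

/*
 * Parse "name: avg10=X avg60=Y avg300=Z total=N".
 * Returns 0 on success, -1 if the line does not match that shape.
 */
static int parse_psi_stat_line(const char *line, struct psi_stat_line *out)
{
	const char *colon;
	size_t len;

	assert(line && out);

	colon = strchr(line, ':');
	if (!colon)
		return -1;

	len = (size_t)(colon - line);
	if (len == 0 || len >= sizeof(out->name))
		return -1;
	memcpy(out->name, line, len);
	out->name[len] = '\0';

	if (sscanf(colon + 1, " avg10=%lf avg60=%lf avg300=%lf total=%llu",
		   &out->avg10, &out->avg60, &out->avg300, &out->total) != 4)
		return -1;

	return 0;
}

/*
 * Hypothetical policy: report that watermark_scale_factor is a candidate
 * for raising when the 10s direct reclaim average exceeds a threshold
 * chosen by the caller. The threshold is an assumption, not a
 * recommendation from the patch.
 */
static int should_raise_watermark_scale(const struct psi_stat_line *s,
					double threshold)
{
	return strcmp(s->name, "direct reclaim") == 0 && s->avg10 > threshold;
}
```

A tuning daemon would call parse_psi_stat_line() on each line it reads
from the file, then apply its own policy per stall type. Writing the
sysctl itself is left out on purpose, since how far and how often to
adjust it is a policy decision for that daemon.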