From: Yosry Ahmed
Date: Wed, 26 Jun 2024 15:07:55 -0700
Subject: Re: [PATCH V2] cgroup/rstat: Avoid thundering herd problem by kswapd across NUMA nodes
To: Jesper Dangaard Brouer
Cc: "Christoph Lameter (Ampere)", Shakeel Butt, tj@kernel.org,
 cgroups@vger.kernel.org, hannes@cmpxchg.org, lizefan.x@bytedance.com,
 longman@redhat.com, kernel-team@cloudflare.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
References: <43732a44-1f90-4119-9e52-000b5a6a2f99@kernel.org>
In-Reply-To: <43732a44-1f90-4119-9e52-000b5a6a2f99@kernel.org>


On Wed, Jun 26, 2024 at 2:35 PM Jesper Dangaard Brouer wrote:
>
>
>
> On 26/06/2024 00.59, Yosry Ahmed wrote:
> > On Tue, Jun 25, 2024 at 3:35 PM Christoph Lameter (Ampere) wrote:
> >>
> >> On Tue, 25 Jun 2024, Yosry Ahmed wrote:
> >>
> >>>> In my reply above, I am not arguing to go back to the older
> >>>> stats_flush_ongoing situation. Rather I am discussing what should be the
> >>>> best eventual solution. From the vmstats infra, we can learn that
> >>>> frequent async flushes along with no sync flush, users are fine with the
> >>>> 'non-determinism'. Of course cgroup stats are different from vmstats
> >>>> i.e. are hierarchical but I think we can try out this approach and see
> >>>> if this works or not.
> >>>
> >>> If we do not do sync flushing, then the same problem that happened
> >>> with stats_flush_ongoing could occur again, right? Userspace could
> >>> read the stats after an event, and get a snapshot of the system before
> >>> that event.
> >>>
> >>> Perhaps this is fine for vmstats if it has always been like that (I
> >>> have no idea), or if no users make assumptions about this. But for
> >>> cgroup stats, we have use cases that rely on this behavior.
> >>
> >> vmstat updates are triggered initially as needed by the shepherd task and
> >> there is no requirement that this is triggered simultaneously. We
> >> could actually randomize the intervals in vmstat_update() a bit if this
> >> will help.
> >
> > The problem is that for cgroup stats, the behavior has been that a
> > userspace read will trigger a flush (i.e. propagating updates). We
> > have use cases that depend on this. If we switch to the vmstat model
> > where updates are triggered independently from user reads, it
> > constitutes a behavioral change.
>
> I implemented a variant using completions as Yosry asked for:
>
> https://lore.kernel.org/all/171943668946.1638606.1320095353103578332.stgit@firesoul/

Thanks! I will take a look at this a little bit later. I am wondering
if you could verify if that solution fixes the problem with kswapd
flushing?
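
To make the behavioral-change concern in the quoted discussion concrete,
here is a toy userspace model in Python. It is not kernel code, and the
class and function names are made up for illustration; they correspond to
nothing in the rstat implementation. It only shows that a read which
flushes first sees an event that happened just before it, while a read
that relies on an earlier periodic flush returns a snapshot from before
that event.

```python
# Toy userspace model, not kernel code. All names here are hypothetical.


class ToyCgroupStats:
    def __init__(self, ncpus):
        self.pending = [0] * ncpus  # per-CPU deltas not yet propagated
        self.total = 0              # aggregated value visible to readers

    def update(self, cpu, delta):
        """Record an event on one CPU; touches only that CPU's slot."""
        self.pending[cpu] += delta

    def flush(self):
        """Propagate all pending per-CPU deltas into the aggregate."""
        self.total += sum(self.pending)
        self.pending = [0] * len(self.pending)

    def read(self, sync_flush):
        """Return the aggregate, flushing first only if sync_flush is set."""
        if sync_flush:
            self.flush()
        return self.total


def read_after_event(sync_flush):
    stats = ToyCgroupStats(ncpus=4)
    stats.update(cpu=0, delta=10)
    stats.flush()                 # stands in for an earlier periodic flush
    stats.update(cpu=2, delta=5)  # the event userspace wants to observe
    return stats.read(sync_flush)


if __name__ == "__main__":
    print("read with sync flush:   ", read_after_event(True))
    print("read without sync flush:", read_after_event(False))
```

With the sync flush the reader observes both updates; without it the
reader gets only what the earlier flush had already propagated, which is
the stale-snapshot case described above.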