Date: Wed, 19 Apr 2023 10:48:03 -0300
From: Marcelo Tosatti
To: Frederic Weisbecker
Cc: Andrew Morton, Christoph Lameter, Aaron Tomlin, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Russell King, Huacai Chen, Heiko Carstens, x86@kernel.org, Vlastimil Babka, Michal Hocko
Subject: Re: [PATCH v7 00/13] fold per-CPU vmstats remotely
References: <20230320180332.102837832@redhat.com> <20230418150200.027528c155853fea8e4f58b2@linux-foundation.org>

On Wed, Apr 19, 2023 at 02:24:01PM +0200, Frederic Weisbecker wrote:
> On Wed, Apr 19, 2023 at 08:59:28AM -0300, Marcelo Tosatti wrote:
> > On Wed, Apr 19, 2023 at 08:29:47AM -0300, Marcelo Tosatti wrote:
> > > On Wed, Apr 19, 2023 at 08:14:09AM -0300, Marcelo Tosatti wrote:
> > > > This was tried before:
> > > > https://lore.kernel.org/lkml/20220127173037.318440631@fedora.localdomain/
> > > >
> > > > My conclusion from that discussion (and work) is that a special system
> > > > call:
> > > >
> > > > 1) Does not allow the benefits to be widely applied (only modified
> > > > applications will benefit). Is not portable across different operating systems.
> > > >
> > > > Removing the vmstat_work interruption is a benefit for HPC workloads,
> > > > for example (in fact, it is a benefit for any kind of application,
> > > > since the interruption causes cache misses).
> > > >
> > > > 2) Increases the system call cost for applications which would use
> > > > the interface.
> > > >
> > > > So avoiding the vmstat_update interruption, without userspace
> > > > knowledge and modifications, is a better solution than a modified
> > > > userspace.
> > >
> > > Another important point is this: if an application dirties
> > > its own per-CPU vmstat cache, while performing a system call,
> >
> > Or while handling a VM-exit from a vCPU.
> >
> > These are, in my mind, sufficient reasons to discard the "flush per-cpu
> > caches" idea. This is also why I chose to abandon the prctl interface
> > patchset.
>
> If you're running your isolated workloads on guests, which sounds quite
> challenging but I guess you guys managed, I'd expect that VMEXITs are
> absolutely out of the question while the task runs critical code, so I'm not
> sure why you would care. I guess not only your guests but also your hosts
> run nohz_full, right?

The answer is: there are VM-exits. For example, to write MSRs to program
the LAPIC timer. Yes, both host and guest are nohz_full (but, for example,
cyclictest or a PLC program can call nanosleep in the guest, which
translates to MSR writes to program the LAPIC timer, which cause a VM-exit).

> I can't tell if the prctl solution which quiesces everything is the solution
> for you, I don't know your workloads well enough, but I would expect that
> the pattern is as follows:
>
> 1) Arrange for full isolation (no more interrupts/exceptions/VMEXITs)

Yes, this is the general scheme. Full isolation is automated by tuned
(realtime-virtual-host/realtime-virtual-guest profiles). There are VM-exits in our use-case.
There might be use-cases where interrupts are desired. For more details:
https://www.youtube.com/watch?v=SyhfctYqjc8

> 2) Run critical code
> 3) Optionally do something once you're done
>
> If vmstat is going to be the only thing to wait for on 1), then the remote
> solution looks good enough (although I leave that to -mm guys as I'm too
> clueless about those matters),

I am mostly clueless too, but I don't see a problem with the proposed patch
(and no one has pointed out any problem either).

> if there is more to be expected, I guess the
> quiescing prctl (or whatever syscall) is something to consider.
>
> Thanks.

I don't know of anything else to consider ATM, and for all cases we have
analyzed so far there has always been the possibility to do the work
remotely, via RCU or some other locking scheme, rather than requiring the
application to be modified (which decreases the number of userspace
applications that can benefit).
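
To illustrate the "do the work remotely" idea above, here is a toy userspace
model in Python. It is only a sketch and not the code from the patchset: the
class name PerCpuCounter, the threshold value, and the use of per-slot locks
in place of the kernel's atomic operations are all assumptions made for
illustration. The point it shows is that a housekeeping CPU can drain another
CPU's pending delta without the isolated CPU having to run any extra work.

```python
import threading


class PerCpuCounter:
    """Toy model: one global total plus a small pending delta per CPU."""

    def __init__(self, num_cpus, threshold=32):
        self.total = 0
        self.threshold = threshold
        self.deltas = [0] * num_cpus
        # Hypothetical stand-in for atomic per-CPU updates: one lock per slot.
        self.locks = [threading.Lock() for _ in range(num_cpus)]
        self.total_lock = threading.Lock()

    def add(self, cpu, value):
        """Local path: update this CPU's delta, spill to total at threshold."""
        with self.locks[cpu]:
            self.deltas[cpu] += value
            pending = self.deltas[cpu]
            if abs(pending) < self.threshold:
                return
            self.deltas[cpu] = 0
        with self.total_lock:
            self.total += pending

    def fold_remote(self, cpu):
        """Remote path: a housekeeping CPU drains another CPU's delta."""
        with self.locks[cpu]:
            pending, self.deltas[cpu] = self.deltas[cpu], 0
        if pending:
            with self.total_lock:
                self.total += pending
        return pending


counter = PerCpuCounter(num_cpus=4)
counter.add(cpu=2, value=5)    # isolated CPU dirties its own delta
assert counter.total == 0      # below threshold, not yet folded
counter.fold_remote(cpu=2)     # housekeeping CPU folds it remotely
assert counter.total == 5
```

In this model the isolated CPU only ever executes add(); the fold happens on
whichever thread calls fold_remote(), so the application itself needs no
modification.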