Date: Wed, 19 Apr 2023 16:15:50 -0300
From: Marcelo Tosatti
To: Vlastimil Babka
Cc: Andrew Morton, Christoph Lameter, Aaron Tomlin, Frederic Weisbecker,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Russell King,
 Huacai Chen, Heiko Carstens, x86@kernel.org, Michal Hocko
Subject: Re: [PATCH v7 00/13] fold per-CPU vmstats remotely
References: <20230320180332.102837832@redhat.com>
 <20230418150200.027528c155853fea8e4f58b2@linux-foundation.org>
 <1a481d68-930e-9418-a9aa-befdcfe36928@suse.cz>
In-Reply-To: <1a481d68-930e-9418-a9aa-befdcfe36928@suse.cz>

On Wed, Apr 19, 2023 at 06:47:30PM +0200, Vlastimil Babka wrote:
> On 4/19/23 13:29, Marcelo Tosatti wrote:
> > On Wed, Apr 19, 2023 at 08:14:09AM -0300, Marcelo Tosatti wrote:
> >> This was tried before:
> >> https://lore.kernel.org/lkml/20220127173037.318440631@fedora.localdomain/
> >>
> >> My conclusion from that discussion (and work) is that a special system
> >> call:
> >>
> >> 1) Does not allow the benefits to be widely applied (only modified
> >> applications will benefit). Is not portable across different operating systems.
> >>
> >> Removing the vmstat_work interruption is a benefit for HPC workloads,
> >> for example (in fact, it is a benefit for any kind of application,
> >> since the interruption causes cache misses).
> >>
> >> 2) Increases the system call cost for applications which would use
> >> the interface.
> >>
> >> So avoiding the vmstat_update update interruption, without userspace
> >> knowledge and modifications, is a better solution than a modified
> >> userspace.
> >
> > Another important point is this: if an application dirties
> > its own per-CPU vmstat cache, while performing a system call,
> > and a vmstat sync event is triggered on a different CPU, you'd have to:
> >
> > 1) Wait for that CPU to return to userspace and sync its stats
> > (unfeasible).
> >
> > 2) Queue work to execute on that CPU (undesirable, as that causes
> > an interruption).
>
> So you're saying the application might do a syscall from the isolcpu, so
> IIUC it cannot expect any latency guarantees at that very moment,

Why not? cyclictest uses nanosleep and it's the main tool for measuring
latency.

> but then
> it immediately starts expecting them again after returning to userspace,

No, the expectation more generally is this:

For certain types of applications (for example PLC software or
RAN processing), upon occurrence of an event, it is necessary to
complete a certain task in a maximum amount of time (deadline).

One way to express this requirement is with a pair of numbers,
deadline time and execution time, where:

       * deadline time: length of time between event and deadline.
       * execution time: length of time it takes for processing of event
                         to occur on a particular hardware platform
                         (uninterrupted).

The particular values depend on use-case.
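
As a rough illustration of the deadline time / execution time pair, here is
a minimal Python sketch. The function name and the example values are
hypothetical, chosen only to show the definition; they are not measurements
from this thread.

```python
def slack_us(deadline_us: float, execution_us: float) -> float:
    """Return the time budget left for interruptions, in microseconds.

    deadline_us:  length of time between event and deadline.
    execution_us: uninterrupted processing time on the given hardware.
    """
    if execution_us > deadline_us:
        raise ValueError("task cannot meet its deadline even uninterrupted")
    return deadline_us - execution_us


# Hypothetical values: a 1000us deadline with 960us of uninterrupted work
# leaves 40us that interruptions may consume before the deadline is missed.
print(slack_us(1000.0, 960.0))
```

Whatever is left after subtracting the execution time is the whole budget
available for every interruption that lands between event and deadline.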

For the case where the realtime application executes in a virtualized
guest, an interruption which must be serviced in the host will cause
the following sequence of events:

     1) VM-exit
     2) execution of IPI (and function call) (or switch to kworker
        thread to execute some work item).
     3) VM-entry

Which causes an excess of 50us latency as observed by cyclictest
(this violates the latency requirement of vRAN application with 1ms TTI,
for example).

> and
> a single interruption for a one-time flush after the syscall would be too
> intrusive?

Generally, if you can't complete the task (which involves executing a
number of instructions) before the deadline, then it's a problem.

One-time flush? You mean to switch between:

      rt-app -> kworker (to execute vmstat_update flush) -> rt-app

My measurement, which probably had vmstat_update code/data in cache, took 7us.
It might be the case that the code to execute must be brought in from
memory, which takes even longer.

> (elsewhere in the thread you described an RT app initialization that may
> generate vmstats to flush and then entry userspace loop, again, would a
> single interruption soon after entering the loop be so critical?)

1) It depends on the application. For the use-case above, where < 50us
interruption is desired, yes it is critical.

2) The interruptions can come from different sources.

    Time
    0           rt-app executing instruction 1
    1           rt-app executing instruction 2
    2           scheduler switches between rt-app and kworker
    3           kworker runs vmstat_work
    4           scheduler switches between kworker and rt-app
    5           rt-app executing instruction 3
    6           IPI to handle a KVM request
    7           fill in your preferred IPI handler

So the argument "a single interruption might not cause your deadline
to be exceeded" fails (because the time to handle the
different interruptions might sum).

Does that make sense?

> > 3) Remotely sync the vmstat for that CPU.
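
The summing argument can be sketched in a few lines of Python. The 7us and
50us costs are borrowed loosely from the figures mentioned above; the 1000us
deadline, the 950us execution time and all names are hypothetical and only
serve to illustrate how individually tolerable interruptions add up.

```python
def total_interruption_us(interruptions):
    """Sum the cost, in microseconds, of (source, cost_us) pairs."""
    return sum(cost_us for _source, cost_us in interruptions)


def deadline_missed(deadline_us, execution_us, interruptions):
    """True if execution time plus all interruptions exceeds the deadline."""
    return execution_us + total_interruption_us(interruptions) > deadline_us


# Hypothetical task: 1000us deadline, 950us of uninterrupted execution.
DEADLINE_US = 1000.0
EXECUTION_US = 950.0

vmstat_flush = ("kworker running vmstat_update", 7.0)
kvm_ipi = ("IPI serviced in the host", 50.0)

# Each interruption on its own still fits within the deadline ...
print(deadline_missed(DEADLINE_US, EXECUTION_US, [vmstat_flush]))
print(deadline_missed(DEADLINE_US, EXECUTION_US, [kvm_ipi]))
# ... but both landing in the same period do not.
print(deadline_missed(DEADLINE_US, EXECUTION_US, [vmstat_flush, kvm_ipi]))
```

With these assumed numbers, neither interruption alone causes a miss, while
the two together do.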