Date: Fri, 28 Feb 2025 08:43:55 +0000
From: Brendan Jackman
To: bp@alien8.de
Cc: akpm@linux-foundation.org, dave.hansen@linux.intel.com,
    jackmanb@google.com, yosryahmed@google.com, kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, peterz@infradead.org,
    seanjc@google.com, tglx@linutronix.de, x86@kernel.org
Subject: Re: [PATCH RFC v2 03/29] mm: asi: Introduce ASI core API
Message-ID: <20250228084355.2061899-1-jackmanb@google.com>
In-Reply-To: <20250227120607.GPZ8BVL2762we1j3uE@fat_crate.local>
References: <20250227120607.GPZ8BVL2762we1j3uE@fat_crate.local>

> > OK, sounds like I need to rewrite this explanation!
> > It's only been read before by people who already knew how this thing
> > worked, so this might take a few attempts to make it clear.
> >
> > Maybe the best way to make it clear is to explain this with reference
> > to KVM. At a super high level, that looks like:
> >
> > ioctl(KVM_RUN) {
> >     enter_from_user_mode()
> >     while !need_userspace_handling() {
> >         asi_enter();  // part 1
> >         vmenter();    // part 2
> >         asi_relax();  // part 3
> >     }
> >     asi_exit();       // part 4b
> >     exit_to_user_mode()
> > }
> >
> > So part 4a is just referring to continuation of the loop.
> >
> > This explanation was written when that was the only user of this API
> > so it was probably clearer, now we have userspace it seems a bit odd.
> >
> > With my pseudocode above, does it make more sense? If so I'll try to
> > think of a better way to explain it.
>
> Well, it is still confusing. I would expect to see:
>
> ioctl(KVM_RUN) {
>     enter_from_user_mode()
>     while !need_userspace_handling() {
>         asi_enter();  // part 1
>         vmenter();    // part 2
>         asi_exit();   // part 3
>     }
>     asi_switch();     // part 4b
>     exit_to_user_mode()
> }
>
> Because then it is balanced: you enter the restricted address space, do stuff
> and then you exit it without switching address space. But then you need to
> switch address space so you have to do asi_exit or asi_switch or whatnot. And
> that's still unbalanced.
>
> So from *only* looking at the usage, it'd be a lot more balanced if all calls
> were paired:
>
> ioctl(KVM_RUN) {
>     enter_from_user_mode()
>     asi_switch_to();                    <-------+
>     while !need_userspace_handling() {          |
>         asi_enter();  // part 1         <---+   |
>         vmenter();    // part 2             |   |
>         asi_exit();   // part 3         <---+   |
>     }                                           |
>     asi_switch_back(); // part 4b       <-------+
>     exit_to_user_mode()
> }
>
> (look at me doing ascii painting :-P)
>
> Naming is awful but it should illustrate what I mean:
>
> asi_switch_to
>   asi_enter
>   asi_exit
> asi_switch_back
>
> Does that make more sense?

Yeah I see what you mean. I think the issues are:

1. We're mixing up two different aspects in the API:

   a. Starting and finishing "critical sections" (i.e. the region
      between asi_enter() and asi_relax())

   b. Actually triggering address space transitions.

2. There is a fundamental asymmetry at play here: asi_enter() and
   asi_exit() can both be NOPs (when we're already in the relevant
   address space), and asi_enter() being a NOP is really the _whole
   point of ASI_.

   The ideal world is where asi_exit() is very very rare, so
   asi_enter() is almost always a NOP.

So we could disentangle point 1 by just rejigging things as you suggest,
and I think the naming would be like:

asi_enter
  asi_start_critical
  asi_end_critical
asi_exit

But the issue with that is that asi_start_critical() _must_ imply
asi_enter(). Otherwise, if we get an NMI between asi_enter() and
asi_start_critical(), and that causes a #PF, we will start the critical
section in the wrong address space and ASI won't do its job. So, we are
somewhat forced to mix up a. and b. from above.

BTW, there is another thing complicating this picture a little: ASI
"clients" (really just meaning KVM code at this point) are not really
supposed to care at all about the actual address space. The fact that
they currently have to call asi_exit() in part 4b is just a temporary
thing to simplify the initial implementation. It has a performance cost
(not enormous, serious KVM platforms try pretty hard to avoid returning
to user space, but it does still matter), so Google's internal version
has already got rid of it and that's where I expect this thing to evolve
too. But for now it just lets us keep things simple since e.g. we never
have to think about context switching in the restricted address space.

With that in mind, what if it looked like this:

ioctl(KVM_RUN) {
    enter_from_user_mode()
    while !need_userspace_handling() {
        // This implies asi_enter(), but this code "doesn't care"
        // about that.
        asi_start_critical();
        vmenter();
        asi_end_critical();
    }
    // TODO: This is temporary, it should not be needed.
    asi_exit();
    exit_to_user_mode()
}

Once the asi_exit() call disappears, it will be symmetrical from the
"client API"'s point of view. And while we still mix up address space
switching with critical section boundaries, the address space switching
is "just an implementation detail" and not really visible as part of the
API.

> Documentation/process/email-clients.rst

I have now set up Mutt. But, for now I am replying with plain vim +
git-send-email, because I also sent this RFC to a ridiculous CC list (I
just blindly used the get_maintainer.pl output, I don't know why I
thought that was a reasonable approach) and it turns out this is the
easiest way to trim it in a reply! Hopefully I can get the headers
right...