From: Jann Horn
Date: Mon, 14 Oct 2024 22:07:52 +0200
Subject: Re: [RFC PATCH v3 00/10] Add support for shared PTEs across processes
To: Anthony Yznaga
Cc: akpm@linux-foundation.org, willy@infradead.org, markhemm@googlemail.com,
 viro@zeniv.linux.org.uk, david@redhat.com, khalid@kernel.org,
 andreyknvl@gmail.com, dave.hansen@intel.com, luto@kernel.org,
 brauner@kernel.org, arnd@arndb.de, ebiederm@xmission.com,
 catalin.marinas@arm.com, linux-arch@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, mhiramat@kernel.org,
 rostedt@goodmis.org, vasily.averin@linux.dev, xhao@linux.alibaba.com,
 pcc@google.com, neilb@suse.de, maz@kernel.org
References: <20240903232241.43995-1-anthony.yznaga@oracle.com>
In-Reply-To: <20240903232241.43995-1-anthony.yznaga@oracle.com>

On Wed, Sep 4, 2024 at 1:22 AM Anthony Yznaga wrote:
> One major issue to address for this series to function correctly
> is how to ensure proper TLB flushing when a page in a shared
> region is unmapped. For example, since the rmaps for pages in a
> shared region map back to host vmas which point to a host mm, TLB
> flushes won't be directed to the CPUs the sharing processes have
> run on. I am by no means an expert in this area. One idea is to
> install a mmu_notifier on the host mm that can gather the necessary
> data and do flushes similar to the batch flushing.

The mmu_notifier API has two ways you can use it:

First, there is the classic mode, where before you start modifying
PTEs in some range, you remove mirrored PTEs from some other context,
and until you're done with your PTE modification, you don't allow
creation of new mirrored PTEs. This is intended for cases where
individual PTE entries are copied over to some other context (such as
EPT tables for virtualization). When I last looked at that code, it
looked fine, and this is what KVM uses. But it probably doesn't match
your usecase, since you wouldn't want removal of a single page to
cause the entire page table containing it to be temporarily unmapped
from the processes that use it?

Second, there is a newer mode for IOMMUv2 stuff (using the
mmu_notifier_ops::invalidate_range callback), where the idea is that
you have secondary MMUs that share the normal page tables, and so you
basically send them invalidations at the same time you invalidate the
primary MMU for the process. I think that's the right fit for this
usecase; however, last I looked, this code was extremely broken (see
https://lore.kernel.org/lkml/CAG48ez2NQKVbv=yG_fq_jtZjf8Q=+Wy54FxcFrK_OujFg5BwSQ@mail.gmail.com/
for context). Unless that's changed in the meantime, I think someone
would have to fix that code before it can be relied on for new
usecases.
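
For readers who want the ordering of the first ("classic") mode spelled out, here is a minimal userspace sketch. It is a toy model, not kernel code: `struct toy_mirror` and every `toy_*` function are hypothetical names invented for illustration, and the only link to the real API is that `toy_invalidate_start()` and `toy_invalidate_end()` stand in for the roles of the `invalidate_range_start` and `invalidate_range_end` callbacks. Locking and retry logic are left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a "primary" PTE array and a mirror holding copies of
 * individual entries (think EPT-style secondary tables). All names here
 * are hypothetical and exist purely for illustration. */
#define TOY_NPAGES 8

struct toy_mirror {
    int primary[TOY_NPAGES];  /* 0 = not present, otherwise a fake frame number */
    int mirror[TOY_NPAGES];   /* mirrored copies of primary entries */
    int in_progress;          /* > 0 while primary PTEs are being modified */
};

/* Stand-in for the role of invalidate_range_start: remove mirrored entries
 * in [start, end) and block creation of new ones until the matching end. */
static void toy_invalidate_start(struct toy_mirror *t, int start, int end)
{
    t->in_progress++;
    for (int i = start; i < end; i++)
        t->mirror[i] = 0;
}

/* Stand-in for the role of invalidate_range_end: mirroring may resume. */
static void toy_invalidate_end(struct toy_mirror *t)
{
    assert(t->in_progress > 0);
    t->in_progress--;
}

/* Secondary-side fault: copy one entry into the mirror, but refuse while an
 * invalidation is in progress or when the primary entry is not present. */
static bool toy_mirror_fault(struct toy_mirror *t, int idx)
{
    if (t->in_progress > 0 || t->primary[idx] == 0)
        return false;  /* a real user would retry later */
    t->mirror[idx] = t->primary[idx];
    return true;
}

/* Unmapping one page from the primary side, bracketed by start/end. */
static void toy_unmap(struct toy_mirror *t, int idx)
{
    toy_invalidate_start(t, idx, idx + 1);
    t->primary[idx] = 0;
    toy_invalidate_end(t);
}
```

The point of the sketch is only the bracketing: mirrored entries disappear before the primary entry changes, and no new mirrored entry can appear until the change is complete. In this model the thing being mirrored is a single entry, which is why the email notes that the mode is a poor fit when the shared object is a whole page table.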