From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 2 Nov 2023 08:56:43 -0700
In-Reply-To: <64e3764e36ba7a00d94cc7db1dea1ef06b620aaf.camel@intel.com>
Mime-Version: 1.0
References: <20231027182217.3615211-1-seanjc@google.com>
 <20231027182217.3615211-10-seanjc@google.com>
 <482bfea6f54ea1bb7d1ad75e03541d0ba0e5be6f.camel@intel.com>
 <64e3764e36ba7a00d94cc7db1dea1ef06b620aaf.camel@intel.com>
Subject: Re: [PATCH v13 09/35] KVM: Add KVM_EXIT_MEMORY_FAULT exit to report faults to userspace
From: Sean Christopherson
To: Kai Huang
Cc: Xiaoyao Li, "kvm-riscv@lists.infradead.org", "mic@digikod.net",
 "liam.merwick@oracle.com", Isaku Yamahata, "kvm@vger.kernel.org",
 "pbonzini@redhat.com", "kirill.shutemov@linux.intel.com",
 "david@redhat.com", "linux-fsdevel@vger.kernel.org",
 "amoorthy@google.com", "linuxppc-dev@lists.ozlabs.org",
 "tabba@google.com", "kvmarm@lists.linux.dev",
 "linux-kernel@vger.kernel.org", "michael.roth@amd.com",
 "viro@zeniv.linux.org.uk", "oliver.upton@linux.dev",
 "chao.p.peng@linux.intel.com", "palmer@dabbelt.com",
 "chenhuacai@kernel.org", "aou@eecs.berkeley.edu",
 "linux-mips@vger.kernel.org", "mpe@ellerman.id.au", Vishal Annapurve,
 "vbabka@suse.cz", "mail@maciej.szmigiero.name",
 "linux-riscv@lists.infradead.org", "maz@kernel.org",
 "willy@infradead.org", "dmatlack@google.com", "anup@brainfault.org",
 "yu.c.zhang@linux.intel.com", Yilun Xu, "qperret@google.com",
 "brauner@kernel.org", "isaku.yamahata@gmail.com",
 "ackerleytng@google.com", "jarkko@kernel.org",
 "paul.walmsley@sifive.com", "linux-arm-kernel@lists.infradead.org",
 "linux-mm@kvack.org", Wei W Wang, "akpm@linux-foundation.org"
Content-Type: text/plain; charset="utf-8"

On Thu, Nov 02, 2023, Kai Huang wrote:
> On Wed, 2023-11-01 at 10:36 -0700, Sean Christopherson wrote:
> > On Wed, Nov 01, 2023, Kai Huang wrote:
> > >
> > > > +7.34 KVM_CAP_MEMORY_FAULT_INFO
> > > > +------------------------------
> > > > +
> > > > +:Architectures: x86
> > > > +:Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
> > > > +
> > > > +The presence of this capability indicates that KVM_RUN will fill
> > > > +kvm_run.memory_fault if KVM cannot resolve a guest page fault VM-Exit, e.g. if
> > > > +there is a valid memslot but no backing VMA for the corresponding host virtual
> > > > +address.
> > > > +
> > > > +The information in kvm_run.memory_fault is valid if and only if KVM_RUN returns
> > > > +an error with errno=EFAULT or errno=EHWPOISON *and* kvm_run.exit_reason is set
> > > > +to KVM_EXIT_MEMORY_FAULT.
> > >
> > > IIUC returning -EFAULT or whatever -errno is sort of KVM internal
> > > implementation.
> >
> > The errno that is returned to userspace is ABI.  In KVM, it's a _very_ poorly
> > defined ABI for the vast majority of ioctls(), but it's still technically ABI.
> > KVM gets away with being cavalier with errno because the vast majority of errors
> > are considered fatal by userspace, i.e. in most cases, userspace simply doesn't
> > care about the exact errno.
> >
> > A good example is KVM_RUN with -EINTR; if KVM were to return something other than
> > -EINTR on a pending signal or vcpu->run->immediate_exit, userspace would fall over.
> >
> > > Is it better to relax the validity of kvm_run.memory_fault when
> > > KVM_RUN returns any -errno?
> >
> > Not unless there's a need to do so, and if there is then we can update the
> > documentation accordingly.  If KVM's ABI is that kvm_run.memory_fault is valid
> > for any errno, then KVM would need to purge kvm_run.exit_reason super early in
> > KVM_RUN, e.g. to prevent an -EINTR return due to immediate_exit from being
> > misinterpreted as KVM_EXIT_MEMORY_FAULT.  And purging exit_reason super early is
> > subtly tricky because KVM's (again, poorly documented) ABI is that *some* exit
> > reasons are preserved across KVM_RUN with vcpu->run->immediate_exit (or with a
> > pending signal).
> >
> > https://lore.kernel.org/all/ZFFbwOXZ5uI%2Fgdaf@google.com
> >
> >
>
> Agreed with not to relax to any errno.  However using -EFAULT as part of ABI
> definition seems a little bit dangerous, e.g., someone could accidentally or
> mistakenly return -EFAULT in KVM_RUN at early time and/or in a completely
> different code path, etc.  -EINTR has well defined meaning, but -EFAULT (which
> is "Bad address") seems doesn't but I am not sure either. :-)

KVM has returned -EFAULT since forever, i.e. it's effectively already part of
the ABI.
I doubt there's a userspace that relies precisely on -EFAULT, but userspace
definitely will be confused if KVM returns '0' where KVM used to return -EFAULT.
And so if we want to return '0', it needs to be opt-in, which means forcing
userspace to enable a capability *and* requiring code in KVM to conditionally
return '0' instead of -EFAULT/-EHWPOISON.

> One example is, for backing VMA with VM_IO | VM_PFNMAP, hva_to_pfn() returns
> KVM_PFN_ERR_FAULT when the kernel cannot get a valid PFN (e.g. when SGX vepc
> fault handler failed to allocate EPC) and kvm_handle_error_pfn() will just
> return -EFAULT.  If kvm_run.exit_reason isn't purged early then is it possible
> to have some issue here?

Well, yeah, but that's exactly why this series has a patch to reset exit_reason.
The solution to "if KVM is buggy then bad things happen" is to not have KVM bugs :-)
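
For reference, the validity rule from the quoted documentation could be checked
on the userspace side roughly as below.  This is only a sketch under stated
assumptions: memory_fault_info_valid() and SKETCH_EXIT_MEMORY_FAULT are
hypothetical names, not part of the series, and the numeric values are
stand-ins for what <linux/kvm.h> and <errno.h> provide on a real build.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Assumption: local stand-in for KVM_EXIT_MEMORY_FAULT.  Real code should
 * include <linux/kvm.h> and use the constant defined there.
 */
#define SKETCH_EXIT_MEMORY_FAULT 39u

/* EHWPOISON is Linux-specific; the fallback is the asm-generic value. */
#ifndef EHWPOISON
#define EHWPOISON 133
#endif

/*
 * Hypothetical helper.  ret is the return value of ioctl(vcpu_fd, KVM_RUN, 0),
 * err is errno sampled immediately after the call, and exit_reason is
 * kvm_run.exit_reason read from the shared run structure.
 */
bool memory_fault_info_valid(int ret, int err, unsigned int exit_reason)
{
	/* The fault info is only defined when KVM_RUN reports an error. */
	if (ret >= 0)
		return false;

	/* Only the two documented errnos qualify; -EINTR and friends do not. */
	if (err != EFAULT && err != EHWPOISON)
		return false;

	/* The errno alone is not enough: exit_reason must agree as well. */
	return exit_reason == SKETCH_EXIT_MEMORY_FAULT;
}
```

Checking both conditions is what keeps an -EINTR return with a preserved
exit_reason, or an -EFAULT from an unrelated path after exit_reason has been
reset, from being treated as a memory fault exit.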