From: Jue Wang
Date: Wed, 14 Apr 2021 07:46:49 -0700
Subject: Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison
In-Reply-To: <20210414131018.GG10709@zn.tnic>
To: Borislav Petkov
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, luto@kernel.org,
    HORIGUCHI NAOYA(堀口 直也), "Luck, Tony", x86, yaoaili@kingsoft.com

On Wed, Apr 14, 2021 at 6:10 AM Borislav Petkov wrote:
>
> On Tue, Apr 13, 2021 at 10:47:21PM -0700, Jue Wang wrote:
> > This path is when EPT #PF finds accesses to a hwpoisoned page and
> > sends SIGBUS to user space (KVM exits into user space) with the same
> > semantic as if regular #PF found access to a hwpoisoned page.
> >
> > The KVM_X86_SET_MCE ioctl actually injects a machine check into the guest.
> >
> > We are in process to launch a product with MCE recovery capability in
> > a KVM based virtualization product and plan to expand the scope of the
> > application of it in the near future.
>
> Any pointers to code or is this all non-public? Any text on what that
> product does with the MCEs?

These are non-public at this point. User-facing docs and blog post are
expected to be released towards the launch (i.e., in 3-4 months from now).

>
> > The in-memory database and analytical domain are definitely using it.
> > A couple examples:
> > SAP HANA - as we've tested and planned to launch as a strategic
> > enterprise use case with MCE recovery capability in our product
> > SQL server - https://support.microsoft.com/en-us/help/2967651/inf-sql-server-may-display-memory-corruption-and-recovery-errors
>
> Aha, so they register callbacks for the processes to exec on a memory
> error. Good to know, thanks for those.

My other 2 cents:

I can see this being useful in other domains too, e.g., on multi-tenant
cloud servers where many VMs are co-located on the same host: with proper
recovery plus live migration, a single MCE would affect at most one VM.

Another generic use case is services that can tolerate an abrupt crash,
i.e., services that periodically save checkpoints to persistent storage,
or that are stateless by nature, and are managed by a process manager
that automatically restarts them so they resume from where the work left
off at the time of the crash.

Thanks,
-Jue

>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
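
P.S. To make the checkpoint-and-restart use case above concrete, here is a
minimal sketch of a worker that survives an abrupt kill. It is only an
illustration, not code from any product mentioned in this thread: the
names save_checkpoint, load_checkpoint and run, the JSON state layout, and
the crash_at parameter (a stand-in for the process being killed, e.g. by
SIGBUS on a poisoned page) are all assumptions made for the example.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Persist state atomically: write a temp file, fsync it, rename over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        # os.replace is atomic on POSIX when both paths are on one filesystem,
        # so a crash leaves either the old or the new checkpoint, never a mix.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if none exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_index": 0, "total": 0}


def run(items, path, checkpoint_every=2, crash_at=None):
    """Sum items, checkpointing periodically.

    crash_at (hypothetical, for demonstration) raises before processing
    that index to simulate the process dying abruptly.
    """
    state = load_checkpoint(path)
    for i in range(state["next_index"], len(items)):
        if crash_at is not None and i == crash_at:
            raise RuntimeError("simulated abrupt crash")
        state["total"] += items[i]
        state["next_index"] = i + 1
        if state["next_index"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

A process manager would simply call run() again after the crash; work done
since the last checkpoint is redone, so the per-item step has to be safe to
repeat from the saved state.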