From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 998ECC433DB for ; Tue, 12 Jan 2021 17:16:38 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3478323121 for ; Tue, 12 Jan 2021 17:16:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3478323121 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 711146B006E; Tue, 12 Jan 2021 12:16:37 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 69A986B0071; Tue, 12 Jan 2021 12:16:37 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 514DB6B0072; Tue, 12 Jan 2021 12:16:37 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0087.hostedemail.com [216.40.44.87]) by kanga.kvack.org (Postfix) with ESMTP id 346806B006E for ; Tue, 12 Jan 2021 12:16:37 -0500 (EST) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 5375E362C for ; Tue, 12 Jan 2021 17:16:34 +0000 (UTC) X-FDA: 77697777108.19.gold97_140817527517 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin19.hostedemail.com (Postfix) with ESMTP id 0707F1ACEA2 for ; Tue, 12 Jan 2021 17:16:34 +0000 (UTC) X-HE-Tag: gold97_140817527517 X-Filterd-Recvd-Size: 4463 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Tue, 12 Jan 2021 17:16:32 +0000 (UTC) IronPort-SDR: rcC1SgRRnLUZslAY2PbtxSmuvLE+n2Pkgyb/0dE/h3xu6Nf6sWwWUaeXjIlmlLq7f9EVWuoteq XUeNdg82mgMg== X-IronPort-AV: E=McAfee;i="6000,8403,9862"; a="196702365" X-IronPort-AV: E=Sophos;i="5.79,341,1602572400"; d="scan'208";a="196702365" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jan 2021 09:16:30 -0800 IronPort-SDR: vpimsz8mu85mY/VhZ8BIsMQYhxifw2vOGImwkg53nHRI2ExwVgofbs4i2tC/j7jIwT1GpJtDWx vKAsEzaVeUUg== X-IronPort-AV: E=Sophos;i="5.79,341,1602572400"; d="scan'208";a="353104671" Received: from agluck-desk2.sc.intel.com (HELO agluck-desk2.amr.corp.intel.com) ([10.3.52.68]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Jan 2021 09:16:30 -0800 Date: Tue, 12 Jan 2021 09:16:28 -0800 From: "Luck, Tony" To: Andy Lutomirski Cc: Borislav Petkov , X86 ML , Andrew Morton , Peter Zijlstra , Darren Hart , LKML , linux-edac , Linux-MM Subject: Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery Message-ID: <20210112171628.GA15664@agluck-desk2.amr.corp.intel.com> References: <20210111214452.1826-2-tony.luck@intel.com> <20210111222057.GA2369@agluck-desk2.amr.corp.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Jan 12, 2021 at 09:00:14AM -0800, Andy Lutomirski wrote: > > On Jan 11, 2021, at 2:21 PM, Luck, Tony wrote: > > > > =EF=BB=BFOn Mon, Jan 11, 2021 at 02:11:56PM -0800, Andy Lutomirski wr= ote: > >> > >>>> On Jan 11, 2021, at 1:45 PM, Tony Luck wrote= : > >>> > >>> =EF=BB=BFRecovery action when get_user() triggers a machine check u= ses the fixup > >>> path to make get_user() return -EFAULT. Also queue_task_work() set= s up > >>> so that kill_me_maybe() will be called on return to user mode to se= nd a > >>> SIGBUS to the current process. > >>> > >>> But there are places in the kernel where the code assumes that this > >>> EFAULT return was simply because of a page fault. The code takes so= me > >>> action to fix that, and then retries the access. This results in a = second > >>> machine check. > >>> > >>> While processing this second machine check queue_task_work() is cal= led > >>> again. But since this uses the same callback_head structure that > >>> was used in the first call, the net result is an entry on the > >>> current->task_works list that points to itself. > >> > >> Is this happening in pagefault_disable context or normal sleepable f= ault context? If the latter, maybe we should reconsider finding a way fo= r the machine check code to do its work inline instead of deferring it. > > > > The first machine check is in pagefault_disable() context. > > > > static int get_futex_value_locked(u32 *dest, u32 __user *from) > > { > > int ret; > > > > pagefault_disable(); > > ret =3D __get_user(*dest, from); >=20 > I have very mixed feelings as to whether we should even try to recover > from the first failure like this. If we actually want to try to > recover, perhaps we should instead arrange for the second MCE to > recover successfully instead of panicking. Well we obviously have to "recover" from the first machine check in order to get to the second. Are you saying you don't like the different return value from get_user()? In my initial playing around with this I just had the second machine check simply skip the task_work_add(). This worked for this case, but only because there wasn't a third, fourth, etc. access to the poisoned data. If the caller keeps peeking, then we'll keep taking more machine checks - possibly forever. Even if we do recover with just one extra machine check ... that's one more than was necessary. -Tony