Date: Mon, 24 Mar 2025 09:54:02 -0700
From: "Luck, Tony"
To: Tong Tiangen
Cc: Catalin Marinas, Mark Rutland, Jonathan Cameron,
	Mauro Carvalho Chehab, Will Deacon, Andrew Morton, James Morse,
	Robin Murphy, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino,
	Michael Ellerman, Nicholas Piggin, Andrey Ryabinin,
	Alexander Potapenko, Christophe Leroy, "Aneesh Kumar K.V",
	"Naveen N. Rao", Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86@kernel.org, "H. Peter Anvin", Madhavan Srinivasan,
	linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	kasan-dev@googlegroups.com, wangkefeng.wang@huawei.com, Guohanjun
Subject: Re: [PATCH v13 2/5] arm64: add support for ARCH_HAS_COPY_MC
References: <20241209024257.3618492-1-tongtiangen@huawei.com>
	<20241209024257.3618492-3-tongtiangen@huawei.com>

On Fri, Feb 14, 2025 at 09:44:02AM +0800, Tong Tiangen wrote:
>
>
> On 2025/2/13 0:21, Catalin Marinas wrote:
> > (catching up with old threads)
> >
> > On Mon, Dec 09, 2024 at 10:42:54AM +0800, Tong Tiangen wrote:
> > > For the arm64 kernel, when it processes hardware memory errors for
> > > synchronize notifications(do_sea()), if the errors is consumed within the
> > > kernel, the current processing is panic. However, it is not optimal.
> > >
> > > Take copy_from/to_user for example, If ld* triggers a memory error, even in
> > > kernel mode, only the associated process is affected. Killing the user
> > > process and isolating the corrupt page is a better choice.
> >
> > I agree that killing the user process and isolating the page is a better
> > choice but I don't see how the latter happens after this patch. Which
> > page would be isolated?
>
> The SEA is triggered when the page with hardware error is read. After
> that, the page is isolated in memory_failure() (mf). The processing of
> mf is mentioned in the comments of do_sea().
>
> /*
>  * APEI claimed this as a firmware-first notification.
>  * Some processing deferred to task_work before ret_to_user().
>  */
>
> Some processing include mf.
>
> >
> > > Add new fixup type EX_TYPE_KACCESS_ERR_ZERO_MEM_ERR to identify insn
> > > that can recover from memory errors triggered by access to kernel memory,
> > > and this fixup type is used in __arch_copy_to_user(), This make the regular
> > > copy_to_user() will handle kernel memory errors.
> >
> > Is the assumption that the error on accessing kernel memory is
> > transient? There's no way to isolate the kernel page and also no point
> > in isolating the destination page either.
>
> Yes, it's transient, the kernel page in mf can't be isolated, the
> transient access (ld) of this kernel page is currently expected to kill
> the user-mode process to avoid error spread.
>
>
> The SEA processes synchronization errors. Only hardware errors on the
> source page can be detected (Through synchronous ld insn) and processed.
> The destination page cannot be processed.

I've considered the copy_to_user() case as only partially fixable. There
are lots of cases to consider:

1) Many places where drivers copy to user in ioctl(2) calls.
   Killing the application solves the immediate problem, but if
   the problem with kernel memory is not transient, then you
   may run into it again.

2) Copy from Linux page cache to user for a read(2) system call.
   This one is a candidate for recovery. Might need help from the
   file system code. If the kernel page is a clean copy of data in
   the file system, then drop this page and re-read from storage
   into a new page. Then resume the copy_to_user().
   If the page is modified, then some file system action is needed
   to mark this range of addresses in the file as lost forever.
   First step in tackling this case is identifying that the source
   address is a page cache page.

3) Probably many other places where the kernel copies to user for
   other system calls. Would need to look at these on a case-by-case
   basis. Likely most have the same issue as ioctl(2) above.

-Tony
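
For reference, the three cases above can be modelled as a small decision
function. This is only a userspace sketch under stated assumptions, not
kernel code and not part of the patch: every name in it (src_kind,
mc_action, triage_copy_mc) is hypothetical, and it assumes that case 3
is handled like the ioctl(2) case, which the mail only calls likely.

```c
#include <stdbool.h>

/* Hypothetical model: none of these names exist in the kernel. */

/* Which kind of kernel page was the source of the copy_to_user(). */
enum src_kind {
	SRC_DRIVER_BUFFER,	/* case 1: driver copy in an ioctl(2) call */
	SRC_PAGE_CACHE,		/* case 2: page cache page read via read(2) */
	SRC_OTHER		/* case 3: some other system call path */
};

/* Possible reactions to a memory error on the source page. */
enum mc_action {
	ACT_KILL_TASK,		/* kill the task; error may recur if not transient */
	ACT_DROP_AND_REREAD,	/* clean page: drop, re-read, resume the copy */
	ACT_MARK_RANGE_LOST	/* modified page: file system marks range lost */
};

/*
 * Pick an action for a memory error hit while reading the source page.
 * 'dirty' only matters for page cache pages. Treating SRC_OTHER like the
 * driver case is an assumption made for this sketch.
 */
enum mc_action triage_copy_mc(enum src_kind kind, bool dirty)
{
	if (kind != SRC_PAGE_CACHE)
		return ACT_KILL_TASK;

	return dirty ? ACT_MARK_RANGE_LOST : ACT_DROP_AND_REREAD;
}
```

In this model, triage_copy_mc(SRC_PAGE_CACHE, false) selects the
drop-and-re-read path, and any source that is not a page cache page
falls back to killing the task. The hard part noted in case 2,
identifying that the source address is a page cache page, is exactly
what the 'kind' argument takes for granted here.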