Subject: Re: [PATCH 3/3] mm/uffd: Detect pgtable allocation failures
From: Nadav Amit
Date: Wed, 4 Jan 2023 19:10:55 -0800
To: Peter Xu
Cc: Linux-MM, kernel list, Mike Kravetz, Muchun Song, Andrea Arcangeli,
 David Hildenbrand, James Houghton, Axel Rasmussen, Andrew Morton
In-Reply-To: <20230104225207.1066932-4-peterx@redhat.com>
References: <20230104225207.1066932-1-peterx@redhat.com>
 <20230104225207.1066932-4-peterx@redhat.com>

> On Jan 4, 2023, at 2:52 PM, Peter Xu wrote:
>
> Before this patch, when there's any pgtable allocation issues happened
> during change_protection(), the error will be ignored from the syscall.
> For shmem, there will be an error dumped into the host dmesg.  Two issues
> with that:
>
>   (1) Doing a trace dump when allocation fails is not anything close to
>       grace..
>
>   (2) The user should be notified with any kind of such error, so the user
>       can trap it and decide what to do next, either by retrying, or stop
>       the process properly, or anything else.
>
> For userfault users, this will change the API of UFFDIO_WRITEPROTECT when
> pgtable allocation failure happened.  It should not normally break anyone,
> though.  If it breaks, then in good ways.
>
> One man-page update will be on the way to introduce the new -ENOMEM for
> UFFDIO_WRITEPROTECT.  Not marking stable so we keep the old behavior on the
> 5.19-till-now kernels.

I understand that the current assumption is that change_protection() should
fully succeed or fail, and I guess this is the current behavior.

However, to be more "future-proof", perhaps this needs to be revisited.

For instance, UFFDIO_WRITEPROTECT can benefit from the ability to (based on
userspace request) prevent write-protection of pages that are pinned. This is
necessary to allow a userspace uffd monitor to avoid write-protection of
O_DIRECT'd memory, for instance, that might change even if a uffd monitor
considers it write-protected.

In such a case, a "partial failure" is possible, since only part of the
memory was write-protected. The uffd monitor should be allowed to continue
execution, but it has to know the part of the memory that was successfully
write-protected.

To support "partial failure", the kernel should return to
UFFDIO_WRITEPROTECT users the number of pages/bytes that were not
successfully write-protected, unless no memory was successfully
write-protected. (Unlike NUMA, pages that were skipped should be accounted
as "successfully write-protected".)

I am only raising this subject to avoid multiple API changes.
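
To make the return convention suggested above concrete, here is a minimal
sketch as a plain C model, not kernel code. Every name in it
(wp_outcome, wp_model_result, MODEL_PAGE_SIZE) is hypothetical, and the
choice of -EBUSY for the "nothing was write-protected" case is only an
assumption; the thread does not fix an error code.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096L	/* assumed page size, for the model only */

/* Hypothetical per-page outcome of a write-protect attempt. */
enum wp_outcome {
	WP_PROTECTED,	/* page was write-protected */
	WP_SKIPPED,	/* nothing to do; accounted as success */
	WP_PINNED,	/* left writable because the page is pinned */
};

/*
 * Model of the proposed result:
 *   0        every page succeeded (skipped pages count as success)
 *   > 0      bytes that were NOT write-protected (partial failure)
 *   < 0      negative errno when no page succeeded at all
 */
static long wp_model_result(const enum wp_outcome *pages, size_t npages)
{
	size_t ok = 0, failed = 0;

	for (size_t i = 0; i < npages; i++) {
		if (pages[i] == WP_PINNED)
			failed++;
		else
			ok++;
	}

	if (failed && !ok)
		return -EBUSY;	/* assumed error code, not from the patch */

	return (long)failed * MODEL_PAGE_SIZE;
}
```

With this shape, a monitor could treat a positive result as "continue, but
track that many bytes as still writable" and a negative one as a plain
failure. Whether the count should be in pages or bytes is left open above;
the model uses bytes arbitrarily.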