Date: Thu, 5 Jan 2023 09:59:28 +0100
To: Nadav Amit, Peter Xu
Cc: Linux-MM, kernel list, Mike Kravetz, Muchun Song, Andrea Arcangeli, James Houghton, Axel Rasmussen, Andrew Morton
References: <20230104225207.1066932-1-peterx@redhat.com> <20230104225207.1066932-4-peterx@redhat.com>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [PATCH 3/3] mm/uffd: Detect pgtable allocation failures

On 05.01.23 04:10, Nadav Amit wrote:
>
>> On Jan 4, 2023, at 2:52 PM, Peter Xu wrote:
>>
>> Before this patch, when there's any pgtable allocation issues happened
>> during change_protection(), the error will be ignored from the syscall.
>> For shmem, there will be an error dumped into the host dmesg. Two issues
>> with that:
>>
>> (1) Doing a trace dump when allocation fails is not anything close to
>>     grace..
>>
>> (2) The user should be notified with any kind of such error, so the user
>>     can trap it and decide what to do next, either by retrying, or stop
>>     the process properly, or anything else.
>>
>> For userfault users, this will change the API of UFFDIO_WRITEPROTECT when
>> pgtable allocation failure happened. It should not normally break anyone,
>> though. If it breaks, then in good ways.
>>
>> One man-page update will be on the way to introduce the new -ENOMEM for
>> UFFDIO_WRITEPROTECT. Not marking stable so we keep the old behavior on the
>> 5.19-till-now kernels.
>
> I understand that the current assumption is that change_protection() should
> fully succeed or fail, and I guess this is the current behavior.
>
> However, to be more “future-proof” perhaps this needs to be revisited.
>
> For instance, UFFDIO_WRITEPROTECT can benefit from the ability to (based on
> userspace request) prevent write-protection of pages that are pinned. This is
> necessary to allow userspace uffd monitor to avoid write-protection of
> O_DIRECT’d memory, for instance, that might change even if a uffd monitor
> considers it write-protected.

Just a note that this is pretty tricky IMHO, because:

a) We cannot distinguish "pinned readable" from "pinned writable".
b) We can have false positives ("pinned") even for compound pages due to
   concurrent GUP-fast.
c) Synchronizing against GUP-fast is pretty tricky ... as we learned.
   Concurrent pinning is usually problematic.
d) O_DIRECT still uses FOLL_GET and we cannot identify that (at least that
   should be figured out at one point).

I have had a patch lying around for a very long time that removes the
special pinned-page handling from the softdirty code, for the above reasons
(and because it forgets THP). So far I haven't sent it because, for
softdirty, it's acceptable to over-indicate and it hasn't been reported to
be an actual problem.

For existing UFFDIO_WRITEPROTECT users, however, it might be very harmful to
get false protection errors. Failing due to ENOMEM is different from failing
due to some temporary concurrency issue.
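
A rough illustration of a) and d), as a toy userspace model in C (not kernel code). It assumes the small-page scheme in which a pin adds a fixed bias to the page's ordinary reference count and the "maybe pinned" check simply compares that count against the bias, in the spirit of GUP_PIN_COUNTING_BIAS (1024) and page_maybe_dma_pinned(). Every toy_* name is made up for this sketch, and it does not model compound pages or the GUP-fast races from b) and c).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: same value as the kernel's GUP_PIN_COUNTING_BIAS (1 << 10). */
#define TOY_PIN_BIAS 1024

/* Hypothetical stand-in for a small (order-0) page: one shared counter. */
struct toy_page {
	int refcount;
};

/* FOLL_GET-style reference: +1, indistinguishable from any other reference. */
void toy_get(struct toy_page *p)
{
	p->refcount += 1;
}

void toy_put(struct toy_page *p)
{
	assert(p->refcount >= 1);
	p->refcount -= 1;
}

/*
 * FOLL_PIN-style pin: +BIAS on the same counter. The 'write' intent is
 * accepted but there is nowhere to record it, which is point a).
 */
void toy_pin(struct toy_page *p, bool write)
{
	(void)write;
	p->refcount += TOY_PIN_BIAS;
}

void toy_unpin(struct toy_page *p)
{
	assert(p->refcount >= TOY_PIN_BIAS);
	p->refcount -= TOY_PIN_BIAS;
}

/* Heuristic check: "maybe pinned" if the counter reached the bias. */
bool toy_maybe_pinned(const struct toy_page *p)
{
	return p->refcount >= TOY_PIN_BIAS;
}
```

Starting from a page with refcount 1, a single toy_get() leaves toy_maybe_pinned() false (the FOLL_GET case in d)), while toy_pin(p, false) and toy_pin(p, true) leave identical state (case a)). 1024 toy_get() calls make the check return true without any pin at all, which is the small-page flavour of over-indication.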

That said, I started thinking about alternative ways of detecting that in
the past, without much outcome so far: the latest idea was to indicate
"this MM has had pinned pages at one point, be careful because any
techniques that use write-protection (softdirty, mprotect, uffd-wp) won't be
able to catch writes via pinned pages reliably".

Hm.

-- 
Thanks,

David / dhildenb