Date: Tue, 27 Apr 2021 14:54:10 -0400
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Nadav Amit, Miaohe Lin, Mike Rapoport, Andrea Arcangeli, Hugh Dickins,
    Jerome Glisse, Mike Kravetz, Jason Gunthorpe, Matthew Wilcox,
    Andrew Morton, Axel Rasmussen, "Kirill A . Shutemov"
Subject: Re: [PATCH v2 05/24] shmem/userfaultfd: Handle uffd-wp special pte in page fault handler
Message-ID: <20210427185410.GE6820@xz-x1>
References: <20210427161317.50682-1-peterx@redhat.com> <20210427161317.50682-6-peterx@redhat.com>
In-Reply-To: <20210427161317.50682-6-peterx@redhat.com>

On Tue, Apr 27, 2021 at 12:12:58PM -0400, Peter Xu wrote:
> File-backed memories are prone to unmap/swap so the ptes are always unstable.
> This could lead to userfaultfd-wp information got lost when unmapped or swapped
> out on such types of memory, for example, shmem.
> To keep such an information
> persistent, we will start to use the newly introduced swap-like special ptes to
> replace a null pte when those ptes were removed.
>
> Prepare this by handling such a special pte first before it is applied. Here
> a new fault flag FAULT_FLAG_UFFD_WP is introduced. When this flag is set, it

FAULT_FLAG_UFFD_WP does not exist any more. Obviously I should have touched up
the commit message when touching up the code...

> means the current fault is to resolve a page access (either read or write) to
> the uffd-wp special pte.
>
> The handling of this special pte page fault is similar to missing fault, but it
> should happen after the pte missing logic since the special pte is designed to
> be a swap-like pte. Meanwhile it should be handled before do_swap_page() so
> that the swap core logic won't be confused to see such an illegal swap pte.
>
> This is a slow path of uffd-wp handling, because unmap of wr-protected shmem
> ptes should be rare. So far it should only trigger in two conditions:
>
>   (1) When trying to punch holes in shmem_fallocate(), there will be a
>       pre-unmap optimization before evicting the page. That will create
>       unmapped shmem ptes with wr-protected pages covered.
>
>   (2) Swapping out of shmem pages
>
> Because of this, the page fault handling is simplified too by not sending the
> wr-protect message in the 1st page fault, instead the page will be installed
> read-only, so the message will be generated until the next do_wp_page() call.
>
> Disable fault-around for such a special page fault, because the introduced new
> flag (FAULT_FLAG_UFFD_WP) only applies to current pte rather than all the pages

Same here.

> around it. Doing fault-around with the new flag could confuse all the rest of
> pages when installing ptes from page cache when there's a cache hit.
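
For illustration only, here is a minimal user-space sketch of the two points
in the quoted text: the special pte is checked after the pte-missing case but
before the swap path, and fault-around is skipped for a VMA registered with
uffd-wp or minor mode. Every sketch_* name, the pte/path enums and the flag
bit values are assumptions made up for this example; this is not the kernel
code from the patch, only a model of the ordering it describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; the kernel's VM_UFFD_* values differ. */
#define SKETCH_VM_UFFD_WP    (1UL << 0)
#define SKETCH_VM_UFFD_MINOR (1UL << 1)

/* Stand-in for struct vm_area_struct, reduced to the one field used here. */
struct sketch_vma {
	unsigned long vm_flags;
};

/* Same test as the quoted helper: uffd-wp or minor registration disables fault-around. */
static bool sketch_disable_fault_around(const struct sketch_vma *vma)
{
	return (vma->vm_flags & (SKETCH_VM_UFFD_WP | SKETCH_VM_UFFD_MINOR)) != 0;
}

/* Hypothetical pte states, just enough to show the ordering. */
enum sketch_pte {
	SKETCH_PTE_NONE,            /* nothing mapped yet */
	SKETCH_PTE_UFFD_WP_SPECIAL, /* swap-like marker that keeps the wr-protect bit */
	SKETCH_PTE_SWAP,            /* ordinary swap entry */
	SKETCH_PTE_PRESENT,
};

/* Hypothetical names for the fault paths a pte state is routed to. */
enum sketch_path {
	SKETCH_PATH_MISSING,
	SKETCH_PATH_UFFD_WP_SPECIAL,
	SKETCH_PATH_SWAP,
	SKETCH_PATH_PRESENT,
};

/* Order matters: missing first, then the special pte, and only then swap. */
static enum sketch_path sketch_dispatch(enum sketch_pte pte)
{
	if (pte == SKETCH_PTE_NONE)
		return SKETCH_PATH_MISSING;
	if (pte == SKETCH_PTE_UFFD_WP_SPECIAL)
		return SKETCH_PATH_UFFD_WP_SPECIAL; /* never reaches the swap path */
	if (pte == SKETCH_PTE_SWAP)
		return SKETCH_PATH_SWAP;
	return SKETCH_PATH_PRESENT;
}

/* Small usage example exercising both helpers. */
static void sketch_self_check(void)
{
	struct sketch_vma plain = { .vm_flags = 0 };
	struct sketch_vma wp = { .vm_flags = SKETCH_VM_UFFD_WP };

	assert(!sketch_disable_fault_around(&plain));
	assert(sketch_disable_fault_around(&wp));
	assert(sketch_dispatch(SKETCH_PTE_UFFD_WP_SPECIAL) == SKETCH_PATH_UFFD_WP_SPECIAL);
	assert(sketch_dispatch(SKETCH_PTE_SWAP) == SKETCH_PATH_SWAP);
}
```

Note the fault-around check is per-VMA while the wr-protect state is per-pte,
which is why the sketch keys it off vm_flags rather than off the pte state.
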
>
> Signed-off-by: Peter Xu
> ---
>  include/linux/userfaultfd_k.h | 11 +++++
>  mm/memory.c                   | 80 ++++++++++++++++++++++++++++++++---
>  2 files changed, 86 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
> index bc733512c6905..fefebe6e96560 100644
> --- a/include/linux/userfaultfd_k.h
> +++ b/include/linux/userfaultfd_k.h
> @@ -89,6 +89,17 @@ static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
>  	return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
>  }
>
> +/*
> + * Don't do fault around for FAULT_FLAG_UFFD_WP because it means we want to

Same here...

> + * recover a previously wr-protected pte. This flag is a per-pte information,
> + * so it could confuse all the pages around the current page when faulted in.
> + * Similar reason for MINOR mode faults.
> + */
> +static inline bool uffd_disable_fault_around(struct vm_area_struct *vma)
> +{
> +	return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
> +}

-- 
Peter Xu