Message-ID: <787ea67e-c48d-c88e-c233-a231b7a101e5@collabora.com>
Date: Mon, 20 Feb 2023 13:36:10 +0500
From: Muhammad Usama Anjum
To: Mike Rapoport
Cc: Muhammad Usama Anjum, Peter Xu, David Hildenbrand, Andrew Morton,
 Michał Mirosław, Andrei Vagin, Danylo Mocherniuk, Paul Gofman,
 Cyrill Gorcunov, Alexander Viro, Shuah Khan, Christian Brauner, Yang Shi,
 Vlastimil Babka, "Liam R . Howlett", Yun Zhou, Suren Baghdasaryan,
 Alex Sierra, Matthew Wilcox, Pasha Tatashin, Nadav Amit, Axel Rasmussen,
 "Gustavo A . R . Silva", Dan Williams, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org, Greg KH, kernel@collabora.com
Subject: Re: [PATCH v10 1/6] userfaultfd: Add UFFD WP Async support
References: <20230202112915.867409-1-usama.anjum@collabora.com>
 <20230202112915.867409-2-usama.anjum@collabora.com>

Hi Mike,

Thanks for reviewing.

On 2/17/23 2:37 PM, Mike Rapoport wrote:
> Hi Muhammad,
>
> On Thu, Feb 02, 2023 at 04:29:10PM +0500, Muhammad Usama Anjum wrote:
>> Add new WP Async mode (UFFD_FEATURE_WP_ASYNC) which resolves the page
>> faults on its own. It can be used to track that which pages have been
>> written-to from the time the pages were write-protected. It is very
>> efficient way to track the changes as uffd is by nature pte/pmd based.
>>
>> UFFD synchronous WP sends the page faults to the userspace where the
>> pages which have been written-to can be tracked. But it is not efficient.
>> This is why this asynchronous version is being added. After setting the
>> WP Async, the pages which have been written to can be found in the pagemap
>> file or information can be obtained from the PAGEMAP_IOCTL.
>>
>> Suggested-by: Peter Xu
>> Signed-off-by: Muhammad Usama Anjum
>> ---
>> Changes in v10:
>> - Build fix
>> - Update comments and add error condition to return error from uffd
>>   register if hugetlb pages are present when wp async flag is set
>>
>> Changes in v9:
>> - Correct the fault resolution with code contributed by Peter
>>
>> Changes in v7:
>> - Remove UFFDIO_WRITEPROTECT_MODE_ASYNC_WP and add UFFD_FEATURE_WP_ASYNC
>> - Handle automatic page fault resolution in better way (thanks to Peter)
>>
>> update to wp async
>>
>> uffd wp async
>> ---
>>  fs/userfaultfd.c                 | 20 ++++++++++++++++++--
>>  include/linux/userfaultfd_k.h    | 11 +++++++++++
>>  include/uapi/linux/userfaultfd.h | 10 +++++++++-
>>  mm/memory.c                      | 23 ++++++++++++++++++++---
>>  4 files changed, 58 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
>> index 15a5bf765d43..422f2530c63e 100644
>> --- a/fs/userfaultfd.c
>> +++ b/fs/userfaultfd.c
>> @@ -1422,10 +1422,15 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>>  			goto out_unlock;
>>
>>  		/*
>> -		 * Note vmas containing huge pages
>> +		 * Note vmas containing huge pages. Hugetlb isn't supported
>> +		 * with UFFD_FEATURE_WP_ASYNC.
>>  		 */
>> -		if (is_vm_hugetlb_page(cur))
>> +		if (is_vm_hugetlb_page(cur)) {
>> +			if (ctx->features & UFFD_FEATURE_WP_ASYNC)
>> +				goto out_unlock;
>> +
>>  			basic_ioctls = true;
>> +		}
>>
>>  		found = true;
>>  	}
>> @@ -1867,6 +1872,10 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
>>  	mode_wp = uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_WP;
>>  	mode_dontwake = uffdio_wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE;
>>
>> +	/* The unprotection is not supported if in async WP mode */
>> +	if (!mode_wp && (ctx->features & UFFD_FEATURE_WP_ASYNC))
>> +		return -EINVAL;
>> +
>>  	if (mode_wp && mode_dontwake)
>>  		return -EINVAL;
>>
>> @@ -1950,6 +1959,13 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
>>  	return ret;
>>  }
>>
>> +int userfaultfd_wp_async(struct vm_area_struct *vma)
>> +{
>> +	struct userfaultfd_ctx *ctx = vma->vm_userfaultfd_ctx.ctx;
>> +
>> +	return (ctx && (ctx->features & UFFD_FEATURE_WP_ASYNC));
>> +}
>> +
>>  static inline unsigned int uffd_ctx_features(__u64 user_features)
>>  {
>>  	/*
>> diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
>> index 9df0b9a762cc..38c92c2beb16 100644
>> --- a/include/linux/userfaultfd_k.h
>> +++ b/include/linux/userfaultfd_k.h
>> @@ -179,6 +179,7 @@ extern int userfaultfd_unmap_prep(struct mm_struct *mm, unsigned long start,
>>  				  unsigned long end, struct list_head *uf);
>>  extern void userfaultfd_unmap_complete(struct mm_struct *mm,
>>  				       struct list_head *uf);
>> +extern int userfaultfd_wp_async(struct vm_area_struct *vma);
>>
>>  #else /* CONFIG_USERFAULTFD */
>>
>> @@ -189,6 +190,11 @@ static inline vm_fault_t handle_userfault(struct vm_fault *vmf,
>>  	return VM_FAULT_SIGBUS;
>>  }
>>
>> +static inline void uffd_wp_range(struct mm_struct *dst_mm, struct vm_area_struct *vma,
>> +				 unsigned long start, unsigned long len, bool enable_wp)
>> +{
>> +}
>> +
>>  static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
>>  					struct vm_userfaultfd_ctx vm_ctx)
>>  {
>> @@ -274,6 +280,11 @@ static inline bool uffd_disable_fault_around(struct vm_area_struct *vma)
>>  	return false;
>>  }
>>
>> +static inline int userfaultfd_wp_async(struct vm_area_struct *vma)
>> +{
>> +	return false;
>> +}
>> +
>>  #endif /* CONFIG_USERFAULTFD */
>>
>>  static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry)
>> diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
>> index 005e5e306266..30a6f32cf564 100644
>> --- a/include/uapi/linux/userfaultfd.h
>> +++ b/include/uapi/linux/userfaultfd.h
>> @@ -38,7 +38,8 @@
>>  			   UFFD_FEATURE_MINOR_HUGETLBFS |	\
>>  			   UFFD_FEATURE_MINOR_SHMEM |		\
>>  			   UFFD_FEATURE_EXACT_ADDRESS |		\
>> -			   UFFD_FEATURE_WP_HUGETLBFS_SHMEM)
>> +			   UFFD_FEATURE_WP_HUGETLBFS_SHMEM |	\
>> +			   UFFD_FEATURE_WP_ASYNC)
>>  #define UFFD_API_IOCTLS				\
>>  	((__u64)1 << _UFFDIO_REGISTER |		\
>>  	 (__u64)1 << _UFFDIO_UNREGISTER |	\
>> @@ -203,6 +204,12 @@ struct uffdio_api {
>>  	 *
>>  	 * UFFD_FEATURE_WP_HUGETLBFS_SHMEM indicates that userfaultfd
>>  	 * write-protection mode is supported on both shmem and hugetlbfs.
>> +	 *
>> +	 * UFFD_FEATURE_WP_ASYNC indicates that userfaultfd write-protection
>> +	 * asynchronous mode is supported in which the write fault is automatically
>> +	 * resolved and write-protection is un-set. It only supports anon and shmem
>> +	 * (hugetlb isn't supported). It only takes effect when a vma is registered
>> +	 * with write-protection mode. Otherwise the flag is ignored.
>>  	 */
>
> Most of mm/ adheres the 80-character limits. Please make your changes to
> follow it as well.

Will update in the next version.
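
While rewrapping that comment, it may help to have the intended semantics of the flag written down in one place. Below is a small userspace model of the two checks the quoted fs/userfaultfd.c hunks add. It is only a sketch: the `toy_*` names are hypothetical and not kernel APIs, the feature bit value is copied from the patch, and the register check is modelled as a plain yes/no because the quoted hunk does not show which error code `goto out_unlock` ends up returning.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit position as UFFD_FEATURE_WP_ASYNC in the patch (1<<13). */
#define TOY_FEATURE_WP_ASYNC (UINT64_C(1) << 13)

/* Models the hugetlb check added to userfaultfd_register(). */
static bool toy_register_allowed(uint64_t features, bool vma_is_hugetlb)
{
	/* Hugetlb vmas are refused when async WP was requested. */
	return !(vma_is_hugetlb && (features & TOY_FEATURE_WP_ASYNC));
}

/* Models the mode validation at the top of userfaultfd_writeprotect(). */
static int toy_writeprotect_check(uint64_t features, bool mode_wp,
				  bool mode_dontwake)
{
	/* Un-protecting is not supported in async WP mode. */
	if (!mode_wp && (features & TOY_FEATURE_WP_ASYNC))
		return -EINVAL;

	/* Pre-existing rule: WP together with DONTWAKE is invalid. */
	if (mode_wp && mode_dontwake)
		return -EINVAL;

	return 0;
}

/* Hypothetical self-check exercising the rules above. */
static void toy_feature_demo(void)
{
	assert(!toy_register_allowed(TOY_FEATURE_WP_ASYNC, true));
	assert(toy_register_allowed(TOY_FEATURE_WP_ASYNC, false));
	assert(toy_writeprotect_check(TOY_FEATURE_WP_ASYNC, false, false) == -EINVAL);
	assert(toy_writeprotect_check(TOY_FEATURE_WP_ASYNC, true, false) == 0);
}
```

Calling `toy_feature_demo()` from a test `main()` exercises both rules. In this model, write-protecting a range stays valid in async mode and only the explicit un-protect request is refused, since the fault path is what clears the protection.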

>
>>  #define UFFD_FEATURE_PAGEFAULT_FLAG_WP		(1<<0)
>>  #define UFFD_FEATURE_EVENT_FORK			(1<<1)
>> @@ -217,6 +224,7 @@ struct uffdio_api {
>>  #define UFFD_FEATURE_MINOR_SHMEM		(1<<10)
>>  #define UFFD_FEATURE_EXACT_ADDRESS		(1<<11)
>>  #define UFFD_FEATURE_WP_HUGETLBFS_SHMEM		(1<<12)
>> +#define UFFD_FEATURE_WP_ASYNC			(1<<13)
>>  	__u64 features;
>>
>>  	__u64 ioctls;
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 4000e9f017e0..75331fbf7cb4 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -3351,8 +3351,21 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
>>
>>  	if (likely(!unshare)) {
>>  		if (userfaultfd_pte_wp(vma, *vmf->pte)) {
>> -			pte_unmap_unlock(vmf->pte, vmf->ptl);
>> -			return handle_userfault(vmf, VM_UFFD_WP);
>> +			if (userfaultfd_wp_async(vma)) {
>> +				/*
>> +				 * Nothing needed (cache flush, TLB invalidations,
>> +				 * etc.) because we're only removing the uffd-wp bit,
>> +				 * which is completely invisible to the user.
>> +				 */
>> +				pte_t pte = pte_clear_uffd_wp(*vmf->pte);
>> +
>> +				set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
>> +				/* Update this to be prepared for following up CoW handling */
>> +				vmf->orig_pte = pte;
>> +			} else {
>> +				pte_unmap_unlock(vmf->pte, vmf->ptl);
>> +				return handle_userfault(vmf, VM_UFFD_WP);
>> +			}
>
> You can revert the condition here and reduce the nesting:
>
> 	if (!userfaultfd_wp_async(vma)) {
> 		pte_unmap_unlock(vmf->pte, vmf->ptl);
> 		return handle_userfault(vmf, VM_UFFD_WP);
> 	}
>
> 	/* handle async WP */

I'll update this in the next version.
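
To check that the control flow stays the same after inverting the condition, here is a rough userspace model of the uffd-wp part of do_wp_page() with the early return placed first. This is a sketch under stated assumptions, not the code for the next version: `toy_vmf`, `toy_wp_fault`, and `TOY_PTE_UFFD_WP` are hypothetical stand-ins for the kernel types and helpers, and the pte is reduced to a plain integer with one flag bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit standing in for the uffd-wp bit of a pte. */
#define TOY_PTE_UFFD_WP (UINT64_C(1) << 0)

enum toy_result {
	TOY_CONTINUE_COW,	/* fall through to the normal CoW handling */
	TOY_SENT_TO_USERSPACE	/* models handle_userfault(vmf, VM_UFFD_WP) */
};

/* Hypothetical stand-in for the fields of struct vm_fault used here. */
struct toy_vmf {
	uint64_t pte;		/* models *vmf->pte */
	uint64_t orig_pte;	/* models vmf->orig_pte */
	bool wp_async;		/* models userfaultfd_wp_async(vma) */
};

/* Models the uffd-wp branch with the non-async case returning early. */
static enum toy_result toy_wp_fault(struct toy_vmf *vmf)
{
	assert(vmf != NULL);

	if (!(vmf->pte & TOY_PTE_UFFD_WP))
		return TOY_CONTINUE_COW;	/* not uffd write-protected */

	if (!vmf->wp_async)
		return TOY_SENT_TO_USERSPACE;	/* synchronous WP mode */

	/* Async WP: clear only the uffd-wp bit, then continue with CoW. */
	vmf->pte &= ~TOY_PTE_UFFD_WP;
	/* Keep orig_pte in sync for the CoW handling that follows. */
	vmf->orig_pte = vmf->pte;
	return TOY_CONTINUE_COW;
}

/* Hypothetical self-check covering the sync and async paths. */
static void toy_wp_fault_demo(void)
{
	struct toy_vmf async_fault = { .pte = 0x3, .orig_pte = 0x3, .wp_async = true };
	struct toy_vmf sync_fault = { .pte = 0x3, .orig_pte = 0x3, .wp_async = false };

	assert(toy_wp_fault(&async_fault) == TOY_CONTINUE_COW);
	assert(async_fault.pte == 0x2 && async_fault.orig_pte == 0x2);
	assert(toy_wp_fault(&sync_fault) == TOY_SENT_TO_USERSPACE);
	assert(sync_fault.pte == 0x3);
}
```

The async path ends up one nesting level shallower, and the synchronous path is untouched: it still returns before the pte is modified.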

>
>>  		}
>>
>>  		/*
>> @@ -4812,8 +4825,11 @@ static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf)
>>
>>  	if (vma_is_anonymous(vmf->vma)) {
>>  		if (likely(!unshare) &&
>> -		    userfaultfd_huge_pmd_wp(vmf->vma, vmf->orig_pmd))
>> +		    userfaultfd_huge_pmd_wp(vmf->vma, vmf->orig_pmd)) {
>> +			if (userfaultfd_wp_async(vmf->vma))
>> +				goto split;
>>  			return handle_userfault(vmf, VM_UFFD_WP);
>> +		}
>>  		return do_huge_pmd_wp_page(vmf);
>>  	}
>>
>> @@ -4825,6 +4841,7 @@ static inline vm_fault_t wp_huge_pmd(struct vm_fault *vmf)
>>  		}
>>  	}
>>
>> +split:
>>  	/* COW or write-notify handled on pte level: split pmd. */
>>  	__split_huge_pmd(vmf->vma, vmf->pmd, vmf->address, false, NULL);
>>
>> --
>> 2.30.2
>>
>

--
BR,
Muhammad Usama Anjum