From: Muhammad Usama Anjum
To: Peter Xu
Cc: Muhammad Usama Anjum, david@redhat.com, Andrew Morton, kernel@collabora.com, Paul Gofman, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/2] mm/userfaultfd: Support WP on multiple VMAs
Date: Mon, 13 Feb 2023 22:50:39 +0500
Message-ID: <9f0278d7-54f1-960e-ffdf-eeb2572ff6d1@collabora.com>
References: <20230213163124.2850816-1-usama.anjum@collabora.com>

On 2/13/23 9:54 PM, Peter Xu wrote:
> On Mon, Feb 13, 2023 at 09:31:23PM +0500, Muhammad Usama Anjum wrote:
>> mwriteprotect_range() errors out if [start, end) doesn't fall in one
>> VMA. We are facing a use case where multiple VMAs are present in one
>> range of interest. For example, the following pseudocode reproduces the
>> error which we are trying to fix:
>>
>> - Allocate memory of size 16 pages with PROT_NONE with mmap
>> - Register userfaultfd
>> - Change protection of the first half (1 to 8 pages) of memory to
>>   PROT_READ | PROT_WRITE. This breaks the memory area in two VMAs.
>> - Now UFFDIO_WRITEPROTECT_MODE_WP on the whole memory of 16 pages errors
>>   out.
>>
>> This is a simple use case where user may or may not know if the memory
>> area has been divided into multiple VMAs.
>>
>> Reported-by: Paul Gofman
>> Signed-off-by: Muhammad Usama Anjum
>> ---
>> Changes since v1:
>> - Correct the start and ending values passed to uffd_wp_range()
>> ---
>>  mm/userfaultfd.c | 38 ++++++++++++++++++++++----------------
>>  1 file changed, 22 insertions(+), 16 deletions(-)
>>
>> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
>> index 65ad172add27..bccea08005a8 100644
>> --- a/mm/userfaultfd.c
>> +++ b/mm/userfaultfd.c
>> @@ -738,9 +738,12 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
>>  			unsigned long len, bool enable_wp,
>>  			atomic_t *mmap_changing)
>>  {
>> +	unsigned long end = start + len;
>> +	unsigned long _start, _end;
>>  	struct vm_area_struct *dst_vma;
>>  	unsigned long page_mask;
>>  	int err;
>
> I think this needs to be initialized or it can return anything when range
> not mapped.
It is already initialized to -EAGAIN; that assignment just falls outside
the context shown in this patch.
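To make the return-value point concrete, here is a rough user-space model
of the new loop in this patch. It is only a sketch, not the kernel code:
struct model_vma, struct model_span and model_wp_range() are names made up
for illustration, and the real function additionally checks
vma_can_userfault() and hugetlb alignment.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: [start, end) plus a uffd-wp flag. */
struct model_vma {
	unsigned long start;
	unsigned long end;
	int uffd_wp;		/* non-zero if registered for uffd write-protect */
};

/* Hypothetical record of one clamped sub-range that would be protected. */
struct model_span {
	unsigned long start;
	unsigned long len;
};

/*
 * Model of the loop: visit every VMA overlapping [start, start + len),
 * clamp the request to that VMA and record the resulting span.
 * vmas[] is assumed sorted and non-overlapping; out[] needs room for nr
 * entries.
 *
 * Returns -EAGAIN if no VMA overlaps the range (the loop body never runs,
 * so err keeps its initial value), -ENOENT if an overlapping VMA is not
 * registered for uffd-wp, and 0 otherwise. Spans recorded before a failing
 * VMA stay recorded, mirroring the break in the patch.
 */
static int model_wp_range(const struct model_vma *vmas, size_t nr,
			  unsigned long start, unsigned long len,
			  struct model_span *out, size_t *nr_out)
{
	unsigned long end = start + len;
	int err = -EAGAIN;	/* initialized before the loop */
	size_t i;

	assert(out && nr_out);
	*nr_out = 0;

	for (i = 0; i < nr; i++) {
		const struct model_vma *vma = &vmas[i];
		unsigned long s, e;

		if (vma->end <= start || vma->start >= end)
			continue;	/* VMA is outside the requested range */

		err = -ENOENT;
		if (!vma->uffd_wp)
			break;

		/* Clamp the request to this VMA, like _start/_end in the patch. */
		s = vma->start > start ? vma->start : start;
		e = vma->end < end ? vma->end : end;

		out[(*nr_out)++] = (struct model_span){ s, e - s };
		err = 0;
	}

	return err;
}
```

With two adjacent 8-page VMAs and a request covering all 16 pages, the
model records one span per VMA and returns 0. A request that touches no
VMA at all returns the initial -EAGAIN.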
>
>> +	VMA_ITERATOR(vmi, dst_mm, start);
>>
>>  	/*
>>  	 * Sanitize the command parameters:
>> @@ -762,26 +765,29 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
>>  	if (mmap_changing && atomic_read(mmap_changing))
>>  		goto out_unlock;
>>
>> -	err = -ENOENT;
>> -	dst_vma = find_dst_vma(dst_mm, start, len);
>> +	for_each_vma_range(vmi, dst_vma, end) {
>> +		err = -ENOENT;
>>
>> -	if (!dst_vma)
>> -		goto out_unlock;
>> -	if (!userfaultfd_wp(dst_vma))
>> -		goto out_unlock;
>> -	if (!vma_can_userfault(dst_vma, dst_vma->vm_flags))
>> -		goto out_unlock;
>> +		if (!dst_vma->vm_userfaultfd_ctx.ctx)
>> +			break;
>> +		if (!userfaultfd_wp(dst_vma))
>> +			break;
>> +		if (!vma_can_userfault(dst_vma, dst_vma->vm_flags))
>> +			break;
>>
>> -	if (is_vm_hugetlb_page(dst_vma)) {
>> -		err = -EINVAL;
>> -		page_mask = vma_kernel_pagesize(dst_vma) - 1;
>> -		if ((start & page_mask) || (len & page_mask))
>> -			goto out_unlock;
>> -	}
>> +		if (is_vm_hugetlb_page(dst_vma)) {
>> +			err = -EINVAL;
>> +			page_mask = vma_kernel_pagesize(dst_vma) - 1;
>> +			if ((start & page_mask) || (len & page_mask))
>> +				break;
>> +		}
>>
>> -	uffd_wp_range(dst_mm, dst_vma, start, len, enable_wp);
>> +		_start = (dst_vma->vm_start > start) ? dst_vma->vm_start : start;
>> +		_end = (dst_vma->vm_end < end) ? dst_vma->vm_end : end;
>>
>> -	err = 0;
>> +		uffd_wp_range(dst_mm, dst_vma, _start, _end - _start, enable_wp);
>> +		err = 0;
>> +	}
>>  out_unlock:
>>  	mmap_read_unlock(dst_mm);
>>  	return err;
>
> This whole patch also changes the abi, so I'm worried whether there can be
> app that relies on the existing behavior.
Even if an app depends on the existing behavior, the only difference is
that this change no longer returns an error when the range spans multiple
VMAs under the hood; it handles each of them correctly instead. Most apps
don't care about VMAs anyway. I am not aware of any drastic behavior
change, other than the behavior becoming nicer.
>
> Is this for the new pagemap effort?
> Can this just be done in the new
> interface rather than changing the old?
We found this bug while working on the pagemap patches, and it is already
handled in the new interface. We just thought that this use case can
happen pretty easily and without the user knowing about it, so the support
should be added here as well. Also, mwriteprotect_range() gives a pretty
straightforward way to WP or un-WP a range. Async WP can be used in
coordination with the pagemap file (PM_UFFD_WP flag in PTE) as well, so
there may be use cases for it.

On another note, I don't see any use cases for async WP together with the
PM_UFFD_WP flag, because a clear flag (!PM_UFFD_WP) doesn't directly tell
whether a page has been written when the page is !present.
>
> Side note: in your other pagemap series, you can optimize "WP_ENGAGE &&
> !GET" to not do generic pgtable walk at all, but use what it does in this
> patch for the initial round or wr-protect.
Yeah, it is implemented with some optimizations.
>
> Thanks,
>

-- 
BR,
Muhammad Usama Anjum