Date: Thu, 4 Apr 2024 16:32:33 -0400
From: Peter Xu
To: Matthew Wilcox
Cc: Lokesh Gidra, akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, surenb@google.com, kernel-team@android.com,
	aarcange@redhat.com, david@redhat.com, zhengqi.arch@bytedance.com,
	kaleshsingh@google.com, ngeoffray@google.com
Subject: Re: [PATCH] userfaultfd: change src_folio after ensuring it's unpinned in UFFDIO_MOVE
References: <20240404171726.2302435-1-lokeshgidra@google.com>

On Thu, Apr 04, 2024 at 06:21:50PM +0100, Matthew Wilcox wrote:
> On Thu, Apr 04, 2024 at 10:17:26AM -0700, Lokesh Gidra wrote:
> > -	folio_move_anon_rmap(src_folio, dst_vma);
> > -	WRITE_ONCE(src_folio->index, linear_page_index(dst_vma, dst_addr));
> > -
> >  	src_pmdval = pmdp_huge_clear_flush(src_vma, src_addr, src_pmd);
> >  	/* Folio got pinned from under us. Put it back and fail the move. */
> >  	if (folio_maybe_dma_pinned(src_folio)) {
> > @@ -2270,6 +2267,9 @@ int move_pages_huge_pmd(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd, pm
> >  		goto unlock_ptls;
> >  	}
> >
> > +	folio_move_anon_rmap(src_folio, dst_vma);
> > +	WRITE_ONCE(src_folio->index, linear_page_index(dst_vma, dst_addr));
> > +
>
> This use of WRITE_ONCE scares me.  We hold the folio locked.  Why do
> we need to use WRITE_ONCE?  Who's looking at folio->index without
> holding the folio lock?

Seems true, but maybe that is better suited to a separate cleanup patch?
The pte-level path has the same WRITE_ONCE(), so if we want to drop it
we may want to drop both.

I just started reading some of the new move code (Lokesh, apologies for
not being able to provide feedback earlier..), and I found one thing
unclear: the special handling of private file mappings only in the
userfault context. I don't know why it is needed:

lock_vma():
	if (vma) {
		/*
		 * lock_vma_under_rcu() only checks anon_vma for private
		 * anonymous mappings. But we need to ensure it is assigned in
		 * private file-backed vmas as well.
		 */
		if (!(vma->vm_flags & VM_SHARED) && unlikely(!vma->anon_vma))
			vma_end_read(vma);
		else
			return vma;
	}

AFAIU even for generic users of lock_vma_under_rcu(), anon_vma must be
stable before it can be used.  It looks weird to me that this becomes a
userfault-specific operation.

I was surprised that this works for private file maps on faults, so I
checked, and it seems we postpone that check until vmf_anon_prepare(),
which is already the CoW path.  So we do the check as I expected, but
deferring it to that point seems unnecessary?

Would something like the below make it cleaner for us?  I just don't yet
see why userfault is special here.
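
To make the difference between the two conditions concrete, here is a
minimal userspace sketch (a model only, not kernel code). struct
vma_model, its fields and the two helper names are made up for
illustration; only the two predicates are meant to mirror the existing
vma_is_anonymous() test in lock_vma_under_rcu() and the
!(vm_flags & VM_SHARED) test used in lock_vma().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel flag; only the bit test matters in this model. */
#define VM_SHARED 0x00000008UL

/* Hypothetical, heavily simplified model of a VMA. */
struct vma_model {
	unsigned long vm_flags;
	bool file_backed;	/* models !vma_is_anonymous(vma) */
	bool has_anon_vma;	/* models vma->anon_vma != NULL */
};

/*
 * Models the existing check: only anonymous VMAs without an anon_vma
 * are sent back to the mmap_lock path.
 */
static bool old_check_rejects(const struct vma_model *vma)
{
	assert(vma != NULL);
	return !vma->file_backed && !vma->has_anon_vma;
}

/*
 * Models the check based on VM_SHARED: any private mapping (anonymous or
 * file-backed) without an anon_vma is sent back to the mmap_lock path.
 */
static bool new_check_rejects(const struct vma_model *vma)
{
	assert(vma != NULL);
	return !(vma->vm_flags & VM_SHARED) && !vma->has_anon_vma;
}
```

In this model the two predicates agree for private anonymous and for
shared file mappings; they differ only for a private file-backed mapping
with no anon_vma yet, which the old check lets through and the
VM_SHARED-based check rejects.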

Thanks,

===8<===
diff --git a/mm/memory.c b/mm/memory.c
index 984b138f85b4..d5cf1d31c671 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3213,10 +3213,8 @@ vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
 
 	if (likely(vma->anon_vma))
 		return 0;
-	if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
-		vma_end_read(vma);
-		return VM_FAULT_RETRY;
-	}
+	/* We shouldn't try a per-vma fault at all if anon_vma isn't solid */
+	WARN_ON_ONCE(vmf->flags & FAULT_FLAG_VMA_LOCK);
 	if (__anon_vma_prepare(vma))
 		return VM_FAULT_OOM;
 	return 0;
@@ -5817,9 +5815,9 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 	 * find_mergeable_anon_vma uses adjacent vmas which are not locked.
 	 * This check must happen after vma_start_read(); otherwise, a
 	 * concurrent mremap() with MREMAP_DONTUNMAP could dissociate the VMA
-	 * from its anon_vma.
+	 * from its anon_vma.  This applies to both anon or private file maps.
 	 */
-	if (unlikely(vma_is_anonymous(vma) && !vma->anon_vma))
+	if (unlikely(!(vma->vm_flags & VM_SHARED) && !vma->anon_vma))
 		goto inval_end_read;
 
 	/* Check since vm_start/vm_end might change before we lock the VMA */
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index f6267afe65d1..61f21da77dcd 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -72,17 +72,8 @@ static struct vm_area_struct *lock_vma(struct mm_struct *mm,
 	struct vm_area_struct *vma;
 
 	vma = lock_vma_under_rcu(mm, address);
-	if (vma) {
-		/*
-		 * lock_vma_under_rcu() only checks anon_vma for private
-		 * anonymous mappings. But we need to ensure it is assigned in
-		 * private file-backed vmas as well.
-		 */
-		if (!(vma->vm_flags & VM_SHARED) && unlikely(!vma->anon_vma))
-			vma_end_read(vma);
-		else
-			return vma;
-	}
+	if (vma)
+		return vma;
 
 	mmap_read_lock(mm);
 	vma = find_vma_and_prepare_anon(mm, address);
-- 
2.44.0

-- 
Peter Xu