From mboxrd@z Thu Jan 1 00:00:00 1970
From: Suren Baghdasaryan <surenb@google.com>
Date: Thu, 4 Apr 2024 13:55:07 -0700
Subject: Re: [PATCH] userfaultfd: change src_folio after ensuring it's unpinned in UFFDIO_MOVE
To: Peter Xu
Cc: Matthew Wilcox, Lokesh Gidra, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@android.com, aarcange@redhat.com, david@redhat.com,
	zhengqi.arch@bytedance.com, kaleshsingh@google.com,
	ngeoffray@google.com
References: <20240404171726.2302435-1-lokeshgidra@google.com>
Content-Type: text/plain; charset="UTF-8"
On Thu, Apr 4, 2024 at 1:32 PM Peter Xu wrote:
>
> On Thu, Apr 04, 2024 at 06:21:50PM +0100, Matthew Wilcox wrote:
> > On Thu, Apr 04, 2024 at 10:17:26AM -0700, Lokesh Gidra wrote:
> > > -	folio_move_anon_rmap(src_folio, dst_vma);
> > > -	WRITE_ONCE(src_folio->index, linear_page_index(dst_vma, dst_addr));
> > > -
> > > 	src_pmdval = pmdp_huge_clear_flush(src_vma, src_addr, src_pmd);
> > > 	/* Folio got pinned from under us. Put it back and fail the move. */
> > > 	if (folio_maybe_dma_pinned(src_folio)) {
> > > @@ -2270,6 +2267,9 @@ int move_pages_huge_pmd(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd, pm
> > > 		goto unlock_ptls;
> > > 	}
> > >
> > > +	folio_move_anon_rmap(src_folio, dst_vma);
> > > +	WRITE_ONCE(src_folio->index, linear_page_index(dst_vma, dst_addr));
> > > +
> >
> > This use of WRITE_ONCE scares me.  We hold the folio locked.  Why do
> > we need to use WRITE_ONCE?  Who's looking at folio->index without
> > holding the folio lock?
>
> Seems true, but maybe suitable for a separate patch to clean it even so?
> We also have the other pte level which has the same WRITE_ONCE(), so if
> we want to drop it we may want to drop both.

Yes, I'll do that separately and will remove WRITE_ONCE() in both places.

>
> I just got to start reading some of the new move code (Lokesh, apologies
> for not being able to provide feedback previously..), but then I found
> one thing unclear: the special handling of private file mappings only in
> userfault context, and I didn't know why:
>
> lock_vma():
>         if (vma) {
>                 /*
>                  * lock_vma_under_rcu() only checks anon_vma for private
>                  * anonymous mappings. But we need to ensure it is assigned in
>                  * private file-backed vmas as well.
>                  */
>                 if (!(vma->vm_flags & VM_SHARED) && unlikely(!vma->anon_vma))
>                         vma_end_read(vma);
>                 else
>                         return vma;
>         }
>
> AFAIU even for generic users of lock_vma_under_rcu(), anon_vma must be
> stable to be used.  It seems weird to me that this becomes a
> userfault-specific operation here.
>
> I was surprised how it worked for private file maps on faults, then I had
> a check and it seems we postponed such a check until vmf_anon_prepare(),
> which is the CoW path already, so we do as I expected, but it seems
> unnecessary to wait until that point?
>
> Would something like below make it much cleaner for us?  As I just don't
> yet see why userfault is special here.
>
> Thanks,
>
> ===8<===
> diff --git a/mm/memory.c b/mm/memory.c
> index 984b138f85b4..d5cf1d31c671 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3213,10 +3213,8 @@ vm_fault_t vmf_anon_prepare(struct vm_fault *vmf)
>
>         if (likely(vma->anon_vma))
>                 return 0;
> -       if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
> -               vma_end_read(vma);
> -               return VM_FAULT_RETRY;
> -       }
> +       /* We shouldn't try a per-vma fault at all if anon_vma isn't solid */
> +       WARN_ON_ONCE(vmf->flags & FAULT_FLAG_VMA_LOCK);
>         if (__anon_vma_prepare(vma))
>                 return VM_FAULT_OOM;
>         return 0;
> @@ -5817,9 +5815,9 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
>          * find_mergeable_anon_vma uses adjacent vmas which are not locked.
>          * This check must happen after vma_start_read(); otherwise, a
>          * concurrent mremap() with MREMAP_DONTUNMAP could dissociate the VMA
> -        * from its anon_vma.
> +        * from its anon_vma.  This applies to both anon or private file maps.
>          */
> -       if (unlikely(vma_is_anonymous(vma) && !vma->anon_vma))
> +       if (unlikely(!(vma->vm_flags & VM_SHARED) && !vma->anon_vma))
>                 goto inval_end_read;
>
>         /* Check since vm_start/vm_end might change before we lock the VMA */
> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index f6267afe65d1..61f21da77dcd 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -72,17 +72,8 @@ static struct vm_area_struct *lock_vma(struct mm_struct *mm,
>         struct vm_area_struct *vma;
>
>         vma = lock_vma_under_rcu(mm, address);
> -       if (vma) {
> -               /*
> -                * lock_vma_under_rcu() only checks anon_vma for private
> -                * anonymous mappings. But we need to ensure it is assigned in
> -                * private file-backed vmas as well.
> -                */
> -               if (!(vma->vm_flags & VM_SHARED) && unlikely(!vma->anon_vma))
> -                       vma_end_read(vma);
> -               else
> -                       return vma;
> -       }
> +       if (vma)
> +               return vma;
>
>         mmap_read_lock(mm);
>         vma = find_vma_and_prepare_anon(mm, address);
> --
> 2.44.0
>
>
> --
> Peter Xu
>