From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BF368C4829A for ; Tue, 13 Feb 2024 11:25:09 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4D32F6B0071; Tue, 13 Feb 2024 06:25:09 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4848C6B0075; Tue, 13 Feb 2024 06:25:09 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 323916B007B; Tue, 13 Feb 2024 06:25:09 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 1D4C86B0071 for ; Tue, 13 Feb 2024 06:25:09 -0500 (EST) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id D6A2C1A0B77 for ; Tue, 13 Feb 2024 11:25:08 +0000 (UTC) X-FDA: 81786549096.26.5665D26 Received: from mail-wr1-f54.google.com (mail-wr1-f54.google.com [209.85.221.54]) by imf07.hostedemail.com (Postfix) with ESMTP id 34E6140019 for ; Tue, 13 Feb 2024 11:25:06 +0000 (UTC) Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="gSsaRsn/"; spf=pass (imf07.hostedemail.com: domain of lokeshgidra@google.com designates 209.85.221.54 as permitted sender) smtp.mailfrom=lokeshgidra@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1707823507; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=c7B1VdNEjxdBHUG+t3lw56dnlzhGxxWmEE0AwtlQokM=; b=tZHeB6fgD/Bfq1p4pI402PvcdtzFMp91Wg2SIIxk6WZ8tF6LcTFMGe+IaFmyrWBEMZ7el9 6uasjuaFJQ+t2i7Lwf+Ud/BDmHteU8kJ/vnLRI0xK8C2XSwQkPIqYItHIIY5eYpVBEOkv1 uFagqqIeRHXL99tBPqpLUIwflzozqOg= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1707823507; a=rsa-sha256; cv=none; b=rWXeBk/CMNlEEQQ0YVNoJxNmB8Aj+fP5k4CCL8D9NEDqpK/OIG3IjsE4P7fG+tqWkKutqb S3hH6B1iAXhcm+fJRSDpvzZaObC3FyJlgYmq1dd+FwYzpRjZhzu24nQEqpjAD/cBYauKPe gtzGtMjI9lK9j6wBXBp3PVGM5Rrz3U0= ARC-Authentication-Results: i=1; imf07.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="gSsaRsn/"; spf=pass (imf07.hostedemail.com: domain of lokeshgidra@google.com designates 209.85.221.54 as permitted sender) smtp.mailfrom=lokeshgidra@google.com; dmarc=pass (policy=reject) header.from=google.com Received: by mail-wr1-f54.google.com with SMTP id ffacd0b85a97d-33aeb088324so2682799f8f.2 for ; Tue, 13 Feb 2024 03:25:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1707823505; x=1708428305; darn=kvack.org; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=c7B1VdNEjxdBHUG+t3lw56dnlzhGxxWmEE0AwtlQokM=; b=gSsaRsn/rlTy1w+D7LDbn5fTZ71CdXnzQI8LFJA1gbLO4Z3MW89zwh7ryWMkcBxVSn ow+FzlgqWdI2mQiXR4+LS82H0EZKldvH9bmqi62zcGMtNPz3kUtee3Qxrw+Lg5c/6+0c OOIE/hiLh+WBiajUA+qDJV40k/rvCjkqqOqkqEqGivkdA2CUMm7YLjA/441UFNy8d4xZ 5UGGKjZHpMGFDLRpCNFYb/dxYQR4btH2IbMoYBxpHqw96lXVrRW9gtsTwn6eR561Cy4Y UTHcbjHTti0bSsHfzrG5nYqG5KHav/Iz3LMYbLwpstKRvN8/JXW/Hdd+3FUAEjvdCI/l gxtg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707823505; x=1708428305; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=c7B1VdNEjxdBHUG+t3lw56dnlzhGxxWmEE0AwtlQokM=; b=ZAtGSpp+XvuCDx91AfmYQs3BwSWRbxGvunPjBe1Le/je0cIrW+VUT1eUdKQxYCTyUa 2oUVfXYZrf/9FTCo5lfUYzGbk+NlQiYdkrlvk9X7VUhJK3QdqTVpLmyRVHrxxM0ima36 y1f4B0MBTwNSj7VBZDzORh81BOlsNtq1m+80rSS2P6fOA8zuMQBRWibA22svJeyxHv9d IyPGX9VxBI3SHqgLBK9EbdA3GqskPysbMtyIi8eIpMtiPH3usOCuWoMBDTxvZbq3xJbh rjIxPchmOvy5WogyAinzwL4Fv9Fn1IqudlYGyejMx++V+1qIQR/XauQRuwmGrjwtBeaE MwfQ== X-Forwarded-Encrypted: i=1; AJvYcCWiyyT6BSfst0QO+wvHunQ9CyElBfzMnBwea+p6vbd416W3G/mmrhqa+3yQQiCSNeSK591OR/wZ19PjGNvlL/6gevw= X-Gm-Message-State: AOJu0YxyX9ocQhyGUFsLAh3GHqsCseB/kSIp9YINFzZfr/jzia7DeoGK JRjwwZx2J4ccYGYaU66vXeD9Ba0EC4b2GMZN8YLz2iv0mNk9hrAQPrY/T0F/mGvEHM4piGHtR/Y nsOBx8T3G2HNB2W34qI3YiEEsnju65xq1D6sb X-Google-Smtp-Source: AGHT+IFDvKZilqGPmprSihukL2cOV+DX1TDTDe1cz5RyBlvdIbIU9j0HXFdr6eKEBuH81WUcfNsiM9yJ0Ex2CNjDRAA= X-Received: by 2002:adf:eec9:0:b0:33b:2281:ef32 with SMTP id a9-20020adfeec9000000b0033b2281ef32mr7382022wrp.69.1707823505374; Tue, 13 Feb 2024 03:25:05 -0800 (PST) MIME-Version: 1.0 References: <20240213001920.3551772-1-lokeshgidra@google.com> <20240213001920.3551772-4-lokeshgidra@google.com> <20240213033307.zbhrpjigco7vl56z@revolver> In-Reply-To: <20240213033307.zbhrpjigco7vl56z@revolver> From: Lokesh Gidra Date: Tue, 13 Feb 2024 03:24:53 -0800 Message-ID: Subject: Re: [PATCH v5 3/3] userfaultfd: use per-vma locks in userfaultfd operations To: "Liam R. Howlett" , Lokesh Gidra , akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, selinux@vger.kernel.org, surenb@google.com, kernel-team@android.com, aarcange@redhat.com, peterx@redhat.com, david@redhat.com, axelrasmussen@google.com, bgeffon@google.com, willy@infradead.org, jannh@google.com, kaleshsingh@google.com, ngeoffray@google.com, timmurray@google.com, rppt@kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Stat-Signature: gfjos1zdg4yoxecu5j8y5x1zygnu47d4 X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 34E6140019 X-Rspam-User: X-HE-Tag: 1707823506-116660 X-HE-Meta: U2FsdGVkX18sN/ywaG4uFuXFsd2ZNez5LuWe07YzIOtiuFuM1AdNdhr2qOddoUiqPkVwl6hbG7ARlzDdsimfGa8x+WSr5FjCMcLKEsN+VPMvrUntMOS7Nnz+rRMtl1H+3aQ0joZkvGBafaXk40JLZquOdF6IKUK/utoYlTeyZsPrGR+mQkoKUolbQGUjbArLBV5MEXidVsXTUyVqrhi85XFz83F4cDiUNeSShwJ56QKwzjZCFQPrR7cyb4CZzEJ35yjdLPfaDgIp93MZ0UOi/JiE2J2DbnFQPPGIE8EPJBlL8eia/A7jXADTXkYwOlTxwPfgfmgMw61kSHYLfuLRb5XRhkhO7IU8dLUCOe/7/syv6tTerHVKesKBXj+8B/9QMuIzzfFOnq3Ef0mOofm8JioA9SxF0dl00gFf9Lm9zalUIDrH9HsSUJTps0iypo1O9n9aTm3TXTDR2/MMAnQ84LObmF+fIZjdB+6xYyS/IL4etS7o/PBQoJZlQKpSIPQZOoOi4JiJm+2DUyuBEv/F4sFbVVbH6Ur2/3yaYuBRrOSVXbohOJjtoFopITYD56+HQ7uacTn1vOM8Ae8RJuXNzYk3ojaT/NKG3HR23eFaoZN/tM4ALX+/IjPI84bRe4bWjD4EMcKErMOt/8/y1CtAsnV+ky9v9agejCHhVXatgf7YYlQPkITZ9CGzISBb8V0Q2k9tyoJqi/vnsyrhF4KJQZAjOfU+kbu2gWSP6tYFWY//6KrCMs1iqx2VEetqmOIf5wUYjFNmIaHi3SBljMV1ldg4jqa9dFirWY9+a7CuTaFutralLj0DH/0nLGoqusuFpuy07qEtP4+85LZ4ezTlxG2gZnv3IxOC71JEOd13qFHk+iQPo35VRij/9YLVWFdIZqojVyFcbZjmKa6YBvkW5meh3CVfMFL1xSMuDmmqNLScQz0GYP6dYtVLgt7tfSC66Qn0z5f79Ircg/xQNAS g9Q== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Feb 12, 2024 at 7:33=E2=80=AFPM Liam R. Howlett wrote: > > * Lokesh Gidra [240212 19:19]: > > All userfaultfd operations, except write-protect, opportunistically use > > per-vma locks to lock vmas. On failure, attempt again inside mmap_lock > > critical section. > > > > Write-protect operation requires mmap_lock as it iterates over multiple > > vmas. > > > > Signed-off-by: Lokesh Gidra > > --- > > fs/userfaultfd.c | 13 +- > > include/linux/userfaultfd_k.h | 5 +- > > mm/userfaultfd.c | 392 ++++++++++++++++++++++++++-------- > > 3 files changed, 312 insertions(+), 98 deletions(-) > > > ... > > > + > > +static __always_inline > > +struct vm_area_struct *find_vma_and_prepare_anon(struct mm_struct *mm, > > + unsigned long addr) > > +{ > > + struct vm_area_struct *vma; > > + > > + mmap_assert_locked(mm); > > + vma =3D vma_lookup(mm, addr); > > + if (!vma) > > + vma =3D ERR_PTR(-ENOENT); > > + else if (!(vma->vm_flags & VM_SHARED) && anon_vma_prepare(vma)) > > + vma =3D ERR_PTR(-ENOMEM); > > Nit: I just noticed that the code below says anon_vma_prepare() is unlike= ly. > Thanks for catching this. I'll add it in next version. > ... > > > +static struct vm_area_struct *lock_mm_and_find_dst_vma(struct mm_struc= t *dst_mm, > > + unsigned long dst_= start, > > + unsigned long len) > > +{ > > + struct vm_area_struct *dst_vma; > > + int err; > > + > > + mmap_read_lock(dst_mm); > > + dst_vma =3D find_vma_and_prepare_anon(dst_mm, dst_start); > > + if (IS_ERR(dst_vma)) { > > + err =3D PTR_ERR(dst_vma); > > It's sort of odd you decode then re-encode this error, but it's correct > the way you have it written. You could just encode ENOENT instead? Thanks. It was an oversight. I'll fix it. > > > + goto out_unlock; > > + } > > + > > + if (validate_dst_vma(dst_vma, dst_start + len)) > > + return dst_vma; > > + > > + err =3D -ENOENT; > > +out_unlock: > > + mmap_read_unlock(dst_mm); > > + return ERR_PTR(err); > > } > > +#endif > > > ... > > > +static __always_inline > > +long find_vmas_mm_locked(struct mm_struct *mm, > > int would probably do? > > + unsigned long dst_start, > > + unsigned long src_start, > > + struct vm_area_struct **dst_vmap, > > + struct vm_area_struct **src_vmap) > > +{ > > + struct vm_area_struct *vma; > > + > > + mmap_assert_locked(mm); > > + vma =3D find_vma_and_prepare_anon(mm, dst_start); > > + if (IS_ERR(vma)) > > + return PTR_ERR(vma); > > + > > + *dst_vmap =3D vma; > > + /* Skip finding src_vma if src_start is in dst_vma */ > > + if (src_start >=3D vma->vm_start && src_start < vma->vm_end) > > + goto out_success; > > + > > + vma =3D vma_lookup(mm, src_start); > > + if (!vma) > > + return -ENOENT; > > +out_success: > > + *src_vmap =3D vma; > > + return 0; > > +} > > + > > +#ifdef CONFIG_PER_VMA_LOCK > > +static long find_and_lock_vmas(struct mm_struct *mm, > > This could also be an int return type, I must be missing something? If you look at ERR_PTR() etc. macros, they all use 'long' for conversions. Also, this file uses long/ssize_t/int at different places. So I went in favor of long. I'm sure int would work just fine too. Let me know if you want me to change it to int. > > ... > > > + *src_vmap =3D lock_vma_under_rcu(mm, src_start); > > + if (likely(*src_vmap)) > > + return 0; > > + > > + /* Undo any locking and retry in mmap_lock critical section */ > > + vma_end_read(*dst_vmap); > > + > > + mmap_read_lock(mm); > > + err =3D find_vmas_mm_locked(mm, dst_start, src_start, dst_vmap, s= rc_vmap); > > + if (!err) { > > + /* > > + * See comment in lock_vma() as to why not using > > + * vma_start_read() here. > > + */ > > + down_read(&(*dst_vmap)->vm_lock->lock); > > + if (*dst_vmap !=3D *src_vmap) > > + down_read(&(*src_vmap)->vm_lock->lock); > > + } > > + mmap_read_unlock(mm); > > + return err; > > +} > > +#else > > +static long lock_mm_and_find_vmas(struct mm_struct *mm, > > + unsigned long dst_start, > > + unsigned long src_start, > > + struct vm_area_struct **dst_vmap, > > + struct vm_area_struct **src_vmap) > > +{ > > + long err; > > + > > + mmap_read_lock(mm); > > + err =3D find_vmas_mm_locked(mm, dst_start, src_start, dst_vmap, s= rc_vmap); > > + if (err) > > + mmap_read_unlock(mm); > > + return err; > > } > > +#endif > > This section is much easier to understand. Thanks. I'm glad finally the patch is easier to follow. Thanks so much for your prompt reviews. > > Thanks, > Liam