From mboxrd@z Thu Jan  1 00:00:00 1970
From: Suren Baghdasaryan <surenb@google.com>
Date: Mon, 13 Jan 2025 09:53:01 -0800
Subject: Re: [PATCH v9 07/17] mm: allow vma_start_read_locked/vma_start_read_locked_nested to fail
To: Lorenzo Stoakes
Cc: akpm@linux-foundation.org, peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com, david.laight.linux@gmail.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com, lokeshgidra@google.com, minchan@google.com, jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com, klarasmodin@gmail.com, richard.weiyang@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com
In-Reply-To: <038aaebd-264a-4e64-8777-4c4015401097@lucifer.local>
References: <20250111042604.3230628-1-surenb@google.com> <20250111042604.3230628-8-surenb@google.com> <038aaebd-264a-4e64-8777-4c4015401097@lucifer.local>
Content-Type: text/plain; charset="UTF-8"
On Mon, Jan 13, 2025 at 7:25 AM Lorenzo Stoakes wrote:
>
> On Fri, Jan 10, 2025 at 08:25:54PM -0800, Suren Baghdasaryan wrote:
> > With the upcoming replacement of vm_lock with vm_refcnt, we need to handle the
> > possibility of vma_start_read_locked/vma_start_read_locked_nested failing
> > due to refcount overflow. Prepare for this by changing these
> > APIs and adjusting their users.
> >
> > Signed-off-by: Suren Baghdasaryan
> > Acked-by: Vlastimil Babka
> > Cc: Lokesh Gidra
> > ---
> >  include/linux/mm.h |  6 ++++--
> >  mm/userfaultfd.c   | 18 +++++++++++++-----
> >  2 files changed, 17 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 2f805f1a0176..cbb4e3dbbaed 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -747,10 +747,11 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
> >   * not be used in such cases because it might fail due to mm_lock_seq overflow.
> >   * This functionality is used to obtain vma read lock and drop the mmap read lock.
> >   */
> > -static inline void vma_start_read_locked_nested(struct vm_area_struct *vma, int subclass)
> > +static inline bool vma_start_read_locked_nested(struct vm_area_struct *vma, int subclass)
> >  {
> >  	mmap_assert_locked(vma->vm_mm);
> >  	down_read_nested(&vma->vm_lock.lock, subclass);
> > +	return true;
> >  }
> >
> >  /*
> > @@ -759,10 +760,11 @@ static inline void vma_start_read_locked_nested(struct vm_area_struct *vma, int
> >   * not be used in such cases because it might fail due to mm_lock_seq overflow.
> >   * This functionality is used to obtain vma read lock and drop the mmap read lock.
> >   */
> > -static inline void vma_start_read_locked(struct vm_area_struct *vma)
> > +static inline bool vma_start_read_locked(struct vm_area_struct *vma)
> >  {
> >  	mmap_assert_locked(vma->vm_mm);
> >  	down_read(&vma->vm_lock.lock);
> > +	return true;
> >  }
> >
> >  static inline void vma_end_read(struct vm_area_struct *vma)
> > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> > index 4527c385935b..411a663932c4 100644
> > --- a/mm/userfaultfd.c
> > +++ b/mm/userfaultfd.c
> > @@ -85,7 +85,8 @@ static struct vm_area_struct *uffd_lock_vma(struct mm_struct *mm,
> >  	mmap_read_lock(mm);
> >  	vma = find_vma_and_prepare_anon(mm, address);
> >  	if (!IS_ERR(vma))
> > -		vma_start_read_locked(vma);
> > +		if (!vma_start_read_locked(vma))
> > +			vma = ERR_PTR(-EAGAIN);
>
> Nit, but this kind of reads a bit weirdly now:
>
> 	if (!IS_ERR(vma))
> 		if (!vma_start_read_locked(vma))
> 			vma = ERR_PTR(-EAGAIN);
>
> Wouldn't this be nicer as:
>
> 	if (!IS_ERR(vma) && !vma_start_read_locked(vma))
> 		vma = ERR_PTR(-EAGAIN);
>
> On the other hand, this embeds an action in an expression, but then it sort of
> still looks weird.
>
> 	if (!IS_ERR(vma)) {
> 		bool ok = vma_start_read_locked(vma);
>
> 		if (!ok)
> 			vma = ERR_PTR(-EAGAIN);
> 	}
>
> This makes me wonder, now yes, we are truly bikeshedding, sorry, but maybe we
> could just have vma_start_read_locked return a VMA pointer that could be an
> error?
>
> Then this becomes:
>
> 	if (!IS_ERR(vma))
> 		vma = vma_start_read_locked(vma);

No, I think it would be wrong for vma_start_read_locked() to always
return EAGAIN when it can't lock the vma. The error code here is
context-dependent, so while EAGAIN is the right thing here, it might
not work for other future users.
>
> >
> >  	mmap_read_unlock(mm);
> >  	return vma;
> > @@ -1483,10 +1484,17 @@ static int uffd_move_lock(struct mm_struct *mm,
> >  	mmap_read_lock(mm);
> >  	err = find_vmas_mm_locked(mm, dst_start, src_start, dst_vmap, src_vmap);
> >  	if (!err) {
> > -		vma_start_read_locked(*dst_vmap);
> > -		if (*dst_vmap != *src_vmap)
> > -			vma_start_read_locked_nested(*src_vmap,
> > -						SINGLE_DEPTH_NESTING);
> > +		if (vma_start_read_locked(*dst_vmap)) {
> > +			if (*dst_vmap != *src_vmap) {
> > +				if (!vma_start_read_locked_nested(*src_vmap,
> > +							SINGLE_DEPTH_NESTING)) {
> > +					vma_end_read(*dst_vmap);
>
> Hmm, why do we end read if the lock failed here but not above?

We have successfully done vma_start_read_locked(dst_vmap) (we locked
the dest vma) but we failed to do vma_start_read_locked_nested(src_vmap)
(we could not lock the src vma). So we should undo the dest vma locking.
Does that clarify the logic?

> > +					err = -EAGAIN;
> > +				}
> > +			}
> > +		} else {
> > +			err = -EAGAIN;
> > +		}
> >  	}
>
> This whole block is really ugly now, this really needs refactoring.
>
> How about (on the assumption the vma_end_read() is correct):
>
> 	err = find_vmas_mm_locked(mm, dst_start, src_start, dst_vmap, src_vmap);
> 	if (err)
> 		goto out;
>
> 	if (!vma_start_read_locked(*dst_vmap)) {
> 		err = -EAGAIN;
> 		goto out;
> 	}
>
> 	/* Nothing further to do. */
> 	if (*dst_vmap == *src_vmap)
> 		goto out;
>
> 	if (!vma_start_read_locked_nested(*src_vmap,
> 					SINGLE_DEPTH_NESTING)) {
> 		vma_end_read(*dst_vmap);
> 		err = -EAGAIN;
> 	}
>
> out:
> 	mmap_read_unlock(mm);
> 	return err;
> }

Ok, that looks good to me. Will change it this way. Thanks!

> >  	mmap_read_unlock(mm);
> >  	return err;
> > --
> > 2.47.1.613.gc27f4b7a9f-goog
> >