From: Suren Baghdasaryan
Date: Thu, 12 Sep 2024 14:04:00 -0700
Subject: Re: [PATCH v2 1/1] mm: introduce mmap_lock_speculation_{start|end}
To: linux-trace-kernel@vger.kernel.org, peterz@infradead.org, oleg@redhat.com
Cc: rostedt@goodmis.org, mhiramat@kernel.org, bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, jolsa@kernel.org, paulmck@kernel.org,
 willy@infradead.org, akpm@linux-foundation.org, linux-mm@kvack.org,
 mjguzik@gmail.com, brauner@kernel.org, jannh@google.com, andrii@kernel.org
References: <20240912210222.186542-1-surenb@google.com>
In-Reply-To: <20240912210222.186542-1-surenb@google.com>

On Thu, Sep 12, 2024 at 2:02 PM Suren Baghdasaryan wrote:
>
> Add helper functions to speculatively perform operations without
> read-locking mmap_lock, expecting that mmap_lock will not be
> write-locked and mm is not modified from under us.

Here you go. I hope I got the ordering right this time around, but I
would feel much better if Jann reviewed it before it's included in
your next patchset :)
Thanks,
Suren.

>
> Suggested-by: Peter Zijlstra
> Signed-off-by: Suren Baghdasaryan
> Signed-off-by: Andrii Nakryiko
> ---
> Changes since v1 [1]:
> - Made memory barriers in inc_mm_lock_seq and mmap_lock_speculation_end
>   more strict, per Jann Horn
>
> [1] https://lore.kernel.org/all/20240906051205.530219-2-andrii@kernel.org/
>
>  include/linux/mm_types.h  |  3 ++
>  include/linux/mmap_lock.h | 74 ++++++++++++++++++++++++++++++++-------
>  kernel/fork.c             |  3 --
>  3 files changed, 65 insertions(+), 15 deletions(-)
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 6e3bdf8e38bc..5d8cdebd42bc 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -887,6 +887,9 @@ struct mm_struct {
>  		 * Roughly speaking, incrementing the sequence number is
>  		 * equivalent to releasing locks on VMAs; reading the sequence
>  		 * number can be part of taking a read lock on a VMA.
> +		 * Incremented every time mmap_lock is write-locked/unlocked.
> +		 * Initialized to 0, therefore odd values indicate mmap_lock
> +		 * is write-locked and even values that it's released.
>  		 *
>  		 * Can be modified under write mmap_lock using RELEASE
>  		 * semantics.
> diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
> index de9dc20b01ba..a281519d0c12 100644
> --- a/include/linux/mmap_lock.h
> +++ b/include/linux/mmap_lock.h
> @@ -71,39 +71,86 @@ static inline void mmap_assert_write_locked(const struct mm_struct *mm)
>  }
>
>  #ifdef CONFIG_PER_VMA_LOCK
> +static inline void init_mm_lock_seq(struct mm_struct *mm)
> +{
> +	mm->mm_lock_seq = 0;
> +}
> +
>  /*
> - * Drop all currently-held per-VMA locks.
> - * This is called from the mmap_lock implementation directly before releasing
> - * a write-locked mmap_lock (or downgrading it to read-locked).
> - * This should normally NOT be called manually from other places.
> - * If you want to call this manually anyway, keep in mind that this will release
> - * *all* VMA write locks, including ones from further up the stack.
> + * Increment mm->mm_lock_seq when mmap_lock is write-locked (ACQUIRE semantics)
> + * or write-unlocked (RELEASE semantics).
>   */
> -static inline void vma_end_write_all(struct mm_struct *mm)
> +static inline void inc_mm_lock_seq(struct mm_struct *mm, bool acquire)
>  {
>  	mmap_assert_write_locked(mm);
>  	/*
>  	 * Nobody can concurrently modify mm->mm_lock_seq due to exclusive
>  	 * mmap_lock being held.
> -	 * We need RELEASE semantics here to ensure that preceding stores into
> -	 * the VMA take effect before we unlock it with this store.
> -	 * Pairs with ACQUIRE semantics in vma_start_read().
>  	 */
> -	smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
> +
> +	if (acquire) {
> +		WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1);
> +		/*
> +		 * For ACQUIRE semantics we should ensure no following stores are
> +		 * reordered to appear before the mm->mm_lock_seq modification.
> +		 */
> +		smp_wmb();
> +	} else {
> +		/*
> +		 * We need RELEASE semantics here to ensure that preceding stores
> +		 * into the VMA take effect before we unlock it with this store.
> +		 * Pairs with ACQUIRE semantics in vma_start_read().
> +		 */
> +		smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
> +	}
> +}
> +
> +static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq)
> +{
> +	/* Pairs with RELEASE semantics in inc_mm_lock_seq(). */
> +	*seq = smp_load_acquire(&mm->mm_lock_seq);
> +	/* Allow speculation if mmap_lock is not write-locked */
> +	return (*seq & 1) == 0;
> +}
> +
> +static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq)
> +{
> +	/* Pairs with ACQUIRE semantics in inc_mm_lock_seq(). */
> +	smp_rmb();
> +	return seq == READ_ONCE(mm->mm_lock_seq);
>  }
> +
>  #else
> -static inline void vma_end_write_all(struct mm_struct *mm) {}
> +static inline void init_mm_lock_seq(struct mm_struct *mm) {}
> +static inline void inc_mm_lock_seq(struct mm_struct *mm, bool acquire) {}
> +static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq) { return false; }
> +static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq) { return false; }
>  #endif
>
> +/*
> + * Drop all currently-held per-VMA locks.
> + * This is called from the mmap_lock implementation directly before releasing
> + * a write-locked mmap_lock (or downgrading it to read-locked).
> + * This should normally NOT be called manually from other places.
> + * If you want to call this manually anyway, keep in mind that this will release
> + * *all* VMA write locks, including ones from further up the stack.
> + */
> +static inline void vma_end_write_all(struct mm_struct *mm)
> +{
> +	inc_mm_lock_seq(mm, false);
> +}
> +
>  static inline void mmap_init_lock(struct mm_struct *mm)
>  {
>  	init_rwsem(&mm->mmap_lock);
> +	init_mm_lock_seq(mm);
>  }
>
>  static inline void mmap_write_lock(struct mm_struct *mm)
>  {
>  	__mmap_lock_trace_start_locking(mm, true);
>  	down_write(&mm->mmap_lock);
> +	inc_mm_lock_seq(mm, true);
>  	__mmap_lock_trace_acquire_returned(mm, true, true);
>  }
>
> @@ -111,6 +158,7 @@ static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
>  {
>  	__mmap_lock_trace_start_locking(mm, true);
>  	down_write_nested(&mm->mmap_lock, subclass);
> +	inc_mm_lock_seq(mm, true);
>  	__mmap_lock_trace_acquire_returned(mm, true, true);
>  }
>
> @@ -120,6 +168,8 @@ static inline int mmap_write_lock_killable(struct mm_struct *mm)
>
>  	__mmap_lock_trace_start_locking(mm, true);
>  	ret = down_write_killable(&mm->mmap_lock);
> +	if (!ret)
> +		inc_mm_lock_seq(mm, true);
>  	__mmap_lock_trace_acquire_returned(mm, true, ret == 0);
>  	return ret;
>  }
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 61070248a7d3..c86e87ed172b 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1259,9 +1259,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
>  	seqcount_init(&mm->write_protect_seq);
>  	mmap_init_lock(mm);
>  	INIT_LIST_HEAD(&mm->mmlist);
> -#ifdef CONFIG_PER_VMA_LOCK
> -	mm->mm_lock_seq = 0;
> -#endif
>  	mm_pgtables_bytes_init(mm);
>  	mm->map_count = 0;
>  	mm->locked_vm = 0;
>
> base-commit: 015bdfcb183759674ba1bd732c3393014e35708b
> --
> 2.46.0.662.g92d0881bb0-goog
>
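
For anyone following the ordering argument, here is a rough userspace sketch
of the even/odd sequence scheme from the quoted patch. It is not kernel code
and not the patch itself. Everything named fake_* is hypothetical, C11 atomics
and fences stand in for WRITE_ONCE/smp_wmb/smp_store_release/smp_rmb only as
an approximation, and the counter is unsigned here (the patch uses int) so
that wraparound stays well defined outside the kernel build flags.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct mm_struct; not kernel code. */
struct fake_mm {
	atomic_uint lock_seq;	/* even: not write-locked, odd: write-locked */
	atomic_int payload;	/* stands in for state guarded by mmap_lock */
};

static void fake_mm_init(struct fake_mm *mm, int value)
{
	atomic_init(&mm->lock_seq, 0);
	atomic_init(&mm->payload, value);
}

/* Rough analogue of inc_mm_lock_seq(mm, true); assumes a single writer. */
static void fake_write_lock(struct fake_mm *mm)
{
	unsigned int seq = atomic_load_explicit(&mm->lock_seq, memory_order_relaxed);

	assert((seq & 1) == 0);	/* must not already be write-locked */
	atomic_store_explicit(&mm->lock_seq, seq + 1, memory_order_relaxed);
	/* Stands in for smp_wmb(): later stores stay after the seq update. */
	atomic_thread_fence(memory_order_release);
}

/* Rough analogue of inc_mm_lock_seq(mm, false). */
static void fake_write_unlock(struct fake_mm *mm)
{
	unsigned int seq = atomic_load_explicit(&mm->lock_seq, memory_order_relaxed);

	assert((seq & 1) == 1);	/* must currently be write-locked */
	atomic_store_explicit(&mm->lock_seq, seq + 1, memory_order_release);
}

/* Sample the counter; speculation is allowed only on an even value. */
static bool fake_speculation_start(struct fake_mm *mm, unsigned int *seq)
{
	*seq = atomic_load_explicit(&mm->lock_seq, memory_order_acquire);
	return (*seq & 1) == 0;
}

/* Re-read the counter; any change means a writer got in between. */
static bool fake_speculation_end(struct fake_mm *mm, unsigned int seq)
{
	/* Stands in for smp_rmb(): earlier loads complete before the re-read. */
	atomic_thread_fence(memory_order_acquire);
	return seq == atomic_load_explicit(&mm->lock_seq, memory_order_relaxed);
}

/* Lockless read: returns true and fills *out only if no writer interfered. */
static bool fake_read_payload(struct fake_mm *mm, int *out)
{
	unsigned int seq;
	int value;

	if (!fake_speculation_start(mm, &seq))
		return false;
	value = atomic_load_explicit(&mm->payload, memory_order_relaxed);
	if (!fake_speculation_end(mm, seq))
		return false;
	*out = value;
	return true;
}
```

When fake_read_payload() returns false, a caller would be expected to fall
back to the locked path rather than use anything it read speculatively.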