From: Andrii Nakryiko
Date: Thu, 8 Aug 2024 13:18:46 -0700
Subject: Re: [RFC 1/1] mm: introduce mmap_lock_speculation_{start|end}
To: Suren Baghdasaryan
Cc: akpm@linux-foundation.org, peterz@infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20240807182325.2585582-1-surenb@google.com>
In-Reply-To: <20240807182325.2585582-1-surenb@google.com>

On Wed, Aug 7, 2024 at 11:23 AM Suren Baghdasaryan wrote:
>
> Add helper functions to speculatively perform operations without
> read-locking mmap_lock, expecting that mmap_lock will not be
> write-locked and mm is not modified from under us.
>
> Signed-off-by: Suren Baghdasaryan
> Suggested-by: Peter Zijlstra
> Cc: Andrii Nakryiko
> ---

This change makes sense and makes mm's seq a bit more useful and
meaningful. I've also tested it locally with uprobe stress-test, and
it seems to work great, I haven't run into any problems with a
multi-hour stress test run so far. Thanks!

Acked-by: Andrii Nakryiko

> Discussion [1] follow-up. If proves to be useful can be included in that
> patchset. Based on mm-unstable.
>
> [1] https://lore.kernel.org/all/20240730134605.GO33588@noisy.programming.kicks-ass.net/
>
>  include/linux/mm_types.h  |  3 +++
>  include/linux/mmap_lock.h | 53 +++++++++++++++++++++++++++++++--------
>  kernel/fork.c             |  3 ---
>  3 files changed, 46 insertions(+), 13 deletions(-)
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 003619fab20e..a426e6ced604 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -887,6 +887,9 @@ struct mm_struct {
>  		 * Roughly speaking, incrementing the sequence number is
>  		 * equivalent to releasing locks on VMAs; reading the sequence
>  		 * number can be part of taking a read lock on a VMA.
> +		 * Incremented every time mmap_lock is write-locked/unlocked.
> +		 * Initialized to 0, therefore odd values indicate mmap_lock
> +		 * is write-locked and even values that it's released.
>  		 *
>  		 * Can be modified under write mmap_lock using RELEASE
>  		 * semantics.
> diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
> index de9dc20b01ba..5410ce741d75 100644
> --- a/include/linux/mmap_lock.h
> +++ b/include/linux/mmap_lock.h
> @@ -71,15 +71,12 @@ static inline void mmap_assert_write_locked(const struct mm_struct *mm)
>  }
>
>  #ifdef CONFIG_PER_VMA_LOCK
> -/*
> - * Drop all currently-held per-VMA locks.
> - * This is called from the mmap_lock implementation directly before releasing
> - * a write-locked mmap_lock (or downgrading it to read-locked).
> - * This should normally NOT be called manually from other places.
> - * If you want to call this manually anyway, keep in mind that this will release
> - * *all* VMA write locks, including ones from further up the stack.
> - */
> -static inline void vma_end_write_all(struct mm_struct *mm)
> +static inline void init_mm_lock_seq(struct mm_struct *mm)
> +{
> +	mm->mm_lock_seq = 0;
> +}
> +
> +static inline void inc_mm_lock_seq(struct mm_struct *mm)
>  {
>  	mmap_assert_write_locked(mm);
>  	/*
> @@ -91,19 +88,52 @@ static inline void vma_end_write_all(struct mm_struct *mm)
>  	 */
>  	smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
>  }
> +
> +static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq)
> +{
> +	/* Pairs with RELEASE semantics in inc_mm_lock_seq(). */
> +	*seq = smp_load_acquire(&mm->mm_lock_seq);
> +	/* Allow speculation if mmap_lock is not write-locked */
> +	return (*seq & 1) == 0;
> +}
> +
> +static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq)
> +{
> +	/* Pairs with RELEASE semantics in inc_mm_lock_seq(). */
> +	return seq == smp_load_acquire(&mm->mm_lock_seq);
> +}
> +
>  #else
> -static inline void vma_end_write_all(struct mm_struct *mm) {}
> +static inline void init_mm_lock_seq(struct mm_struct *mm) {}
> +static inline void inc_mm_lock_seq(struct mm_struct *mm) {}
> +static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq) { return false; }
> +static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq) { return false; }
>  #endif
>
> +/*
> + * Drop all currently-held per-VMA locks.
> + * This is called from the mmap_lock implementation directly before releasing
> + * a write-locked mmap_lock (or downgrading it to read-locked).
> + * This should normally NOT be called manually from other places.
> + * If you want to call this manually anyway, keep in mind that this will release
> + * *all* VMA write locks, including ones from further up the stack.
> + */
> +static inline void vma_end_write_all(struct mm_struct *mm)
> +{
> +	inc_mm_lock_seq(mm);
> +}
> +
>  static inline void mmap_init_lock(struct mm_struct *mm)
>  {
>  	init_rwsem(&mm->mmap_lock);
> +	init_mm_lock_seq(mm);
>  }
>
>  static inline void mmap_write_lock(struct mm_struct *mm)
>  {
>  	__mmap_lock_trace_start_locking(mm, true);
>  	down_write(&mm->mmap_lock);
> +	inc_mm_lock_seq(mm);
>  	__mmap_lock_trace_acquire_returned(mm, true, true);
>  }
>
> @@ -111,6 +141,7 @@ static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
>  {
>  	__mmap_lock_trace_start_locking(mm, true);
>  	down_write_nested(&mm->mmap_lock, subclass);
> +	inc_mm_lock_seq(mm);
>  	__mmap_lock_trace_acquire_returned(mm, true, true);
>  }
>
> @@ -120,6 +151,8 @@ static inline int mmap_write_lock_killable(struct mm_struct *mm)
>
>  	__mmap_lock_trace_start_locking(mm, true);
>  	ret = down_write_killable(&mm->mmap_lock);
> +	if (!ret)
> +		inc_mm_lock_seq(mm);
>  	__mmap_lock_trace_acquire_returned(mm, true, ret == 0);
>  	return ret;
>  }
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 3d590e51ce84..73e37af8a24d 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1259,9 +1259,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
>  	seqcount_init(&mm->write_protect_seq);
>  	mmap_init_lock(mm);
>  	INIT_LIST_HEAD(&mm->mmlist);
> -#ifdef CONFIG_PER_VMA_LOCK
> -	mm->mm_lock_seq = 0;
> -#endif
>  	mm_pgtables_bytes_init(mm);
>  	mm->map_count = 0;
>  	mm->locked_vm = 0;
>
> base-commit: 98808d08fc0f78ee638e0c0a88020fbbaf581ec6
> --
> 2.46.0.rc2.264.g509ed76dc8-goog
>
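
For anyone following the thread, here is a minimal userspace sketch of the odd/even sequence protocol the quoted patch describes. It is not kernel code: struct mm_model, its payload field, and every model_* name are hypothetical stand-ins, and C11 atomics are used in place of smp_store_release()/smp_load_acquire(). It is meant to be read as a single-threaded illustration of the control flow only, and makes no claim about the ordering guarantees a real concurrent reader would need.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the mm_lock_seq field of mm_struct. */
struct mm_model {
	atomic_int mm_lock_seq;	/* even: not write-locked, odd: write-locked */
	int payload;		/* assumed example of state guarded by the lock */
};

static inline void model_init(struct mm_model *mm)
{
	atomic_init(&mm->mm_lock_seq, 0);
	mm->payload = 0;
}

/* Models inc_mm_lock_seq(): called once on write-lock, once on unlock. */
static inline void model_inc_seq(struct mm_model *mm)
{
	int cur = atomic_load_explicit(&mm->mm_lock_seq, memory_order_relaxed);

	atomic_store_explicit(&mm->mm_lock_seq, cur + 1, memory_order_release);
}

/* Models mmap_lock_speculation_start(): refuse if a writer holds the lock. */
static inline bool model_speculation_start(struct mm_model *mm, int *seq)
{
	*seq = atomic_load_explicit(&mm->mm_lock_seq, memory_order_acquire);
	return (*seq & 1) == 0;
}

/* Models mmap_lock_speculation_end(): succeed only if seq is unchanged. */
static inline bool model_speculation_end(struct mm_model *mm, int seq)
{
	return seq == atomic_load_explicit(&mm->mm_lock_seq,
					   memory_order_acquire);
}

/*
 * Hypothetical reader: copy the payload speculatively and report success
 * only if no write-lock/unlock was observed in between.
 */
static inline bool model_read_payload(struct mm_model *mm, int *out)
{
	int seq, val;

	if (!model_speculation_start(mm, &seq))
		return false;
	val = mm->payload;
	if (!model_speculation_end(mm, seq))
		return false;
	*out = val;
	return true;
}
```

When model_read_payload() returns false, a caller in this model would presumably fall back to the locked path, which is the same pattern the helpers in the patch are aimed at.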