From: Suren Baghdasaryan
Date: Mon, 31 Mar 2025 11:42:35 -0700
Subject: Re: not issuing vma_start_write() in dup_mmap() if the caller is single-threaded
To: Mateusz Guzik
Cc: "Liam R. Howlett", linux-mm, Lorenzo Stoakes, Matthew Wilcox

On Mon, Mar 31, 2025 at 10:51 AM Mateusz Guzik wrote:
>
> On Mon, Mar 31, 2025 at 6:43 PM Liam R. Howlett wrote:
> >
> > * Mateusz Guzik [250330 15:43]:
> > > On Sun, Mar 30, 2025 at 9:23 PM Suren Baghdasaryan wrote:
> > > > > However, the good news is that mm_count tends to be 1. If both
> > > > > mm_count and mm_users are 1, then there is no userfaultfd in use and
> > > > > nobody to add it either.
> > > >
> > > > I'm not sure...
> > > > IIUC new_userfaultfd() does not take mmap_lock while
> > > > calling mmgrab(), therefore I think it can race with the code checking
> > > > its value.
> > > >
> >
> > Considering the mm struct isn't the only way to find the vmas and there
> > are users who use other locks to ensure the mm and vma don't go away
> > (rmap, for example). It is reasonable to think that other users may use
> > the vma lock to avoid mm_struct accesses racing.
> >
> > Although I don't know of a way this is unsafe today, we are complicating
> > the locking story of the mm with this change and no data has been given
> > on the benefit. I don't recall any regression caused by the addition of
> > per-vma locking?
> >
>
> I was not involved in any of this, but I learned about the issue from lwn:
> https://lwn.net/Articles/937943/
>
> the new (at the time) per-vma locking was suffering weird crashes in
> multithreaded programs and Suren ultimately fixed it by locking parent
> vma at a 5% hit,
> see fb49c455323f ("fork: lock VMAs of the parent process when
> forking"). The patch merely adds vma_start_write(mpnt) in dup_mmap.
>
> What I'm proposing here remedies the problem for most commonly forking
> consumers (single-threaded), assuming it does work. ;)
>
> To that end see below.
>
> > >
> > > It issues:
> > >         ctx->mm = current->mm;
> > >         ...
> > >         mmgrab(ctx->mm);
> > >
> > > Thus I claim if mm_count is 1 *and* mm_users is 1 *and* we are in
> > > dup_mmap(), nobody has a userfaultfd for our mm and there is nobody to
> > > create it either and the optimization is saved.
> >
> > mm_count is lazy, so I am not entirely sure we can trust what it says.
> > But maybe that's only true of mmgrab_lazy_tlb() now?
> >
>
> warning: I don't know the Linux nomenclature here. I'm going to
> outline how I see it.
>
> There is an idiomatic way of splitting ref counts into two:
> - something to prevent the struct itself from getting freed ("hold
> count" where I'm from, in this case ->mm_count)
> - something to prevent data used by the structure from getting freed
> ("use count" where I'm from, in this case ->mm_users)
>
> mm_users > 0 keeps one ref on ->mm_count
>
> AFAICS the scheme employed for mm follows the mold.
>
> So with that mmgrab_lazy_tlb() example, the call bumps the count on first use.
>
> Suppose we are left with one thread in the process and a lazy tlb
> somewhere as the only consumers. mm_users is 1 because of the only
> thread and mm_count is 2 -- one ref for mm_users > 0 and one ref for
> lazy tlb.
>
> Then my proposed check: mm->mm_count == 1 && mm->mm_users == 1
>
> ... fails and the optimization is not used.
>
> Now, per the above, the lock was added to protect against faults
> happening in parallel. The only cases I found where this is of note
> are remote accesses (e.g., from /proc/pid/cmdline) and userfaultfd.
>
> I'm not an mm person and this is why I referred to Suren to sort this
> out, hoping he would both have interest and enough knowledge about mm
> to validate it.
>
> That is to say I don't *vouch* for the idea myself (otherwise I would sign
> off on a patch), I am merely bringing it up again long after the dust
> has settled. If the idea is a nogo, then so be it, but then it would
> be nice to document somewhere why it is so.

I think it would be worth optimizing if it's as straightforward as we
think so far. I would like to spend some more time checking uffd code
to see if anything else might be lurking there before posting the
patch. If I don't find anything new I'll post a patch sometime next
week.
Thanks,
Suren.

> --
> Mateusz Guzik
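
For reference, the two-counter scheme and the proposed check can be
sketched as a small userspace model. This is only an illustration under
stated assumptions, not kernel code and not the patch under discussion:
struct mm_model, mm_model_new(), model_mmgrab(), model_mmget() and
can_skip_vma_write_lock() are hypothetical names, plain ints stand in
for the kernel's atomic counters, and the model deliberately ignores
ordering, so it says nothing about the new_userfaultfd()/mmgrab() race
raised earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: plain ints stand in for the kernel's atomics. */
struct mm_model {
	int mm_users;	/* "use count": users of the address space */
	int mm_count;	/* "hold count": refs keeping the struct alive */
};

/* One thread in the process: mm_users == 1 pins exactly one mm_count ref. */
static struct mm_model mm_model_new(void)
{
	struct mm_model mm = { .mm_users = 1, .mm_count = 1 };

	return mm;
}

/* Models mmgrab(): an extra hold ref (userfaultfd ctx, lazy tlb, ...). */
static void model_mmgrab(struct mm_model *mm)
{
	assert(mm->mm_count > 0);
	mm->mm_count++;
}

/* Models mmget(): an extra user (second thread, remote access, ...). */
static void model_mmget(struct mm_model *mm)
{
	assert(mm->mm_users > 0);
	mm->mm_users++;
}

/* Hypothetical helper: the check proposed in the thread for dup_mmap(). */
static bool can_skip_vma_write_lock(const struct mm_model *mm)
{
	return mm->mm_count == 1 && mm->mm_users == 1;
}
```

In this model a freshly created mm passes the check, while a single
model_mmgrab() (the lazy tlb or userfaultfd case) or model_mmget() (the
remote access case) makes it fail, which matches the walk-through
quoted above.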