From: Mateusz Guzik
Date: Mon, 31 Mar 2025 19:50:52 +0200
Subject: Re: not issuing vma_start_write() in dup_mmap() if the caller is single-threaded
To: "Liam R. Howlett", Mateusz Guzik, Suren Baghdasaryan, linux-mm, Lorenzo Stoakes, Matthew Wilcox

On Mon, Mar 31, 2025 at 6:43 PM Liam R. Howlett wrote:
>
> * Mateusz Guzik [250330 15:43]:
> > On Sun, Mar 30, 2025 at 9:23 PM Suren Baghdasaryan wrote:
> > > > However, the good news is that mm_count tends to be 1. If both
> > > > mm_count and mm_users are 1, then there is no userfaultfd in use and
> > > > nobody to add it either.
> > >
> > > I'm not sure...
> > > IIUC new_userfaultfd() does not take mmap_lock while
> > > calling mmgrab(), therefore I think it can race with the code checking
> > > its value.
> > >
>
> Considering the mm struct isn't the only way to find the vmas and there
> are users who use other locks to ensure the mm and vma don't go away
> (rmap, for example). It is reasonable to think that other users may use
> the vma lock to avoid mm_struct accesses racing.
>
> Although I don't know of a way this is unsafe today, we are complicating
> the locking story of the mm with this change and no data has been given
> on the benefit. I don't recall any regression caused by the addition of
> per-vma locking?
>

I was not involved in any of this, but I learned about the issue from lwn:
https://lwn.net/Articles/937943/

The new (at the time) per-vma locking was suffering weird crashes in
multithreaded programs and Suren ultimately fixed it by locking the parent
vma at a 5% hit, see fb49c455323f ("fork: lock VMAs of the parent process
when forking"). The patch merely adds vma_start_write(mpnt) in dup_mmap().

What I'm proposing here remedies the problem for the most common forking
consumers (single-threaded), assuming it does work. ;)

To that end see below.

> >
> > It issues:
> > ctx->mm = current->mm;
> > ...
> > mmgrab(ctx->mm);
> >
> > Thus I claim if mm_count is 1 *and* mm_users is 1 *and* we are in
> > dup_mmap(), nobody has a userfaultfd for our mm and there is nobody to
> > create it either and the optimization is saved.
>
> mm_count is lazy, so I am not entirely sure we can trust what it says.
> But maybe that's only true of mmgrab_lazy_tlb() now?
>

warning: I don't know the Linux nomenclature here. I'm going to
outline how I see it.

There is an idiomatic way of splitting ref counts into two:
- something to prevent the struct itself from getting freed ("hold
count" where I'm from, in this case ->mm_count)
- something to prevent data used by the structure from getting freed
("use count" where I'm from, in this case ->mm_users)

mm_users > 0 keeps one ref on ->mm_count

AFAICS the scheme employed for mm follows the mold.

So with that mmgrab_lazy_tlb() example, the call bumps the count on first use.

Suppose we are left with one thread in the process and a lazy tlb
somewhere as the only consumers. mm_users is 1 because of the only
thread and mm_count is 2 -- one ref for mm_users > 0 and one ref for
lazy tlb.

Then my proposed check: mm->mm_count == 1 && mm->mm_users == 1

... fails and the optimization is not used.

Now, per the above, the lock was added to protect against faults
happening in parallel. The only cases I found where this is of note
are remote accesses (e.g., from /proc/pid/cmdline) and userfaultfd.

I'm not an mm person and this is why I referred to Suren to sort this
out, hoping he would both have interest and enough knowledge about mm
to validate it.

That is to say I don't *vouch* for the idea myself (otherwise I would
sign off on a patch), I am merely bringing it up again long after the
dust has settled. If the idea is a nogo, then so be it, but then it
would be nice to document somewhere why that is so.

-- 
Mateusz Guzik
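
For illustration only, the hold count / use count split and the proposed
check can be modeled in user space roughly as follows. This is a sketch
under stated assumptions, not kernel code: struct toy_mm and all toy_*
helpers are hypothetical names, plain ints stand in for the kernel's
atomic counters, and nothing here addresses the race with
new_userfaultfd() raised earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model of the two counters; NOT the kernel's mm_struct. */
struct toy_mm {
	int mm_users;	/* "use count": users of the address space */
	int mm_count;	/* "hold count": references to the struct itself */
};

/* One user (the only thread); mm_users > 0 holds one ref on mm_count. */
static inline void toy_mm_init(struct toy_mm *mm)
{
	mm->mm_users = 1;
	mm->mm_count = 1;
}

/* Models mmgrab(): pins the struct only (userfaultfd ctx, lazy tlb). */
static inline void toy_mmgrab(struct toy_mm *mm)
{
	mm->mm_count++;
}

/* Models mmdrop(): releases a reference taken by toy_mmgrab(). */
static inline void toy_mmdrop(struct toy_mm *mm)
{
	assert(mm->mm_count > 0);
	mm->mm_count--;
}

/* Models mmget(): one more user of the address space (e.g. a thread). */
static inline void toy_mmget(struct toy_mm *mm)
{
	mm->mm_users++;
}

/* The proposed check; the helper name is made up for this sketch. */
static inline bool toy_mm_sole_owner(const struct toy_mm *mm)
{
	return mm->mm_count == 1 && mm->mm_users == 1;
}
```

In this model a freshly initialized toy_mm passes the check, while a
single toy_mmgrab() (standing in for a lazy tlb reference or a
userfaultfd context) or a toy_mmget() makes it fail, which matches the
mm_count == 2 scenario described in the message.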