Date: Fri, 4 Aug 2023 23:46:20 +0200
From: Mateusz Guzik
To: Suren Baghdasaryan
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
	regressions@leemhuis.info, bagasdotme@gmail.com,
	jacobly.alt@gmail.com, willy@infradead.org, liam.howlett@oracle.com,
	david@redhat.com, peterx@redhat.com, ldufour@linux.ibm.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linuxppc-dev@lists.ozlabs.org, linux-arm-kernel@lists.infradead.org,
	gregkh@linuxfoundation.org, regressions@lists.linux.dev,
	Jiri Slaby, Holger Hoffstätte, stable@vger.kernel.org
Subject: Re: [PATCH v2 3/3] fork: lock VMAs of the parent process when forking
Message-ID: <20230804214620.btgwhsszsd7rh6nf@f>
References: <20230708191212.4147700-1-surenb@google.com>
	<20230708191212.4147700-3-surenb@google.com>
In-Reply-To: <20230708191212.4147700-3-surenb@google.com>

On Sat, Jul 08, 2023 at 12:12:12PM -0700, Suren Baghdasaryan wrote:
[..]
> Lock VMAs of the parent process when forking a child, which prevents
> concurrent page faults during fork operation and avoids this issue.
> This fix can potentially regress some fork-heavy workloads. Kernel build
> time did not show noticeable regression on a 56-core machine while a
> stress test mapping 10000 VMAs and forking 5000 times in a tight loop
> shows ~5% regression. If such fork time regression is unacceptable,
> disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
> optimizations are possible if this regression proves to be problematic.
>
> ---
>  kernel/fork.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index b85814e614a5..d2e12b6d2b18 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -686,6 +686,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
>  	for_each_vma(old_vmi, mpnt) {
>  		struct file *file;
>  
> +		vma_start_write(mpnt);
>  		if (mpnt->vm_flags & VM_DONTCOPY) {
>  			vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt));
>  			continue;
>

I don't see it mentioned in the discussion, so at the risk of ruffling
feathers or looking really bad I'm going to ask: is the locking of any
use if the forking process is single-threaded? The single thread in
this case is occupied executing this very code, so it can't do any op in
parallel. Is there anyone else who could trigger a page fault? Are these
VMAs shared with other processes? A cursory reading suggests a private
copy is made here, so my guess is no. But then again, I landed here
freshly from the interwebz.

Or in short: if nobody can mess up the state when the forking process is
single-threaded, why not check mm_users or some other indicator to elide
the slowdown for the (arguably) most common case?

If the state can be messed up anyway, that's a shame, but a short
explanation of how would be welcome.

to illustrate (totally untested):

diff --git a/kernel/fork.c b/kernel/fork.c
index d2e12b6d2b18..aac6b08a0b21 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -652,6 +652,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 	LIST_HEAD(uf);
 	VMA_ITERATOR(old_vmi, oldmm, 0);
 	VMA_ITERATOR(vmi, mm, 0);
+	bool singlethreaded = atomic_read(&oldmm->mm_users) == 1;
 
 	uprobe_start_dup_mmap();
 	if (mmap_write_lock_killable(oldmm)) {
@@ -686,7 +687,8 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 	for_each_vma(old_vmi, mpnt) {
 		struct file *file;
 
-		vma_start_write(mpnt);
+		if (!singlethreaded)
+			vma_start_write(mpnt);
 		if (mpnt->vm_flags & VM_DONTCOPY) {
 			vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt));
 			continue;
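
For anyone who wants to poke at the control flow outside the kernel, here
is a minimal userspace sketch of the same idea. Everything in it is a
hypothetical stand-in: struct model_mm, struct model_vma,
model_vma_start_write() and model_dup_mmap() are made-up names, not kernel
APIs, and the "lock" is just a flag. It only shows the shape of the check
(sample the user count once, then skip the per-VMA write lock when it is
1). It says nothing about whether that check is actually sufficient in the
kernel, which is the open question above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; the flag replaces the real per-VMA lock. */
struct model_vma {
	bool write_locked;
};

/* Hypothetical stand-in for mm_struct; mm_users plays the role of mm->mm_users. */
struct model_mm {
	atomic_int mm_users;
	struct model_vma *vmas;
	size_t nr_vmas;
};

/* Stand-in for vma_start_write(): only records that the lock was taken. */
static void model_vma_start_write(struct model_vma *vma)
{
	vma->write_locked = true;
}

/*
 * Walk the parent's VMAs the way the loop in dup_mmap() does, taking the
 * per-VMA write lock only when the mm has more than one user.
 * Returns the number of VMAs that were write-locked.
 */
static size_t model_dup_mmap(struct model_mm *oldmm)
{
	assert(oldmm != NULL);

	/* Sampled once before the walk, as in the diff above. */
	bool singlethreaded = atomic_load(&oldmm->mm_users) == 1;
	size_t locked = 0;

	for (size_t i = 0; i < oldmm->nr_vmas; i++) {
		if (!singlethreaded) {
			model_vma_start_write(&oldmm->vmas[i]);
			locked++;
		}
	}
	return locked;
}
```

With mm_users at 1 the walk takes no per-VMA locks at all; with any other
value it locks every VMA, matching the behaviour of the posted patch.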