From: Suren Baghdasaryan
Date: Sat, 8 Jul 2023 12:22:13 -0700
Subject: Re: [PATCH v2 3/3] fork: lock VMAs of the parent process when forking
To: torvalds@linux-foundation.org
Cc: akpm@linux-foundation.org, regressions@leemhuis.info,
    bagasdotme@gmail.com, jacobly.alt@gmail.com, willy@infradead.org,
    liam.howlett@oracle.com, david@redhat.com, peterx@redhat.com,
    ldufour@linux.ibm.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linuxppc-dev@lists.ozlabs.org, linux-arm-kernel@lists.infradead.org,
    gregkh@linuxfoundation.org, regressions@lists.linux.dev, Jiri Slaby,
    Holger Hoffstätte, stable@vger.kernel.org
References: <20230708191212.4147700-1-surenb@google.com>
    <20230708191212.4147700-3-surenb@google.com>
In-Reply-To: <20230708191212.4147700-3-surenb@google.com>

On Sat, Jul 8, 2023 at 12:12 PM Suren Baghdasaryan wrote:
>
> When forking a child process, parent write-protects an anonymous page
> and COW-shares it with the child being forked using copy_present_pte().
> Parent's TLB is flushed right before we drop the parent's mmap_lock in
> dup_mmap().
> If we get a write-fault before that TLB flush in the parent,
> and we end up replacing that anonymous page in the parent process in
> do_wp_page() (because it is COW-shared with the child), this might lead
> to some stale writable TLB entries targeting the wrong (old) page.
> A similar issue happened in the past with userfaultfd (see the
> flush_tlb_page() call inside do_wp_page()).
> Lock VMAs of the parent process when forking a child, which prevents
> concurrent page faults during the fork operation and avoids this issue.
> This fix can potentially regress some fork-heavy workloads. Kernel build
> time did not show a noticeable regression on a 56-core machine, while a
> stress test mapping 10000 VMAs and forking 5000 times in a tight loop
> shows a ~5% regression. If such a fork time regression is unacceptable,
> disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
> optimizations are possible if this regression proves to be problematic.

Sending this earlier version of the patch per Linus's request; his
explanation is here:
https://lore.kernel.org/all/CAHk-=wi-99-DyMOGywTbjRnRRC+XfpPm=r=pei4A=MEL0QDBXA@mail.gmail.com/

>
> Suggested-by: David Hildenbrand
> Reported-by: Jiri Slaby
> Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
> Reported-by: Holger Hoffstätte
> Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
> Reported-by: Jacob Young
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
> Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
> Cc: stable@vger.kernel.org
> Signed-off-by: Suren Baghdasaryan
> ---
>  kernel/fork.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index b85814e614a5..d2e12b6d2b18 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -686,6 +686,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
>  	for_each_vma(old_vmi, mpnt) {
>  		struct file *file;
>
> +		vma_start_write(mpnt);
>  		if (mpnt->vm_flags & VM_DONTCOPY) {
>  			vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt));
>  			continue;
> --
> 2.41.0.390.g38632f3daf-goog
>
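
For anyone who wants to get a feel for the fork-heavy case mentioned in the
commit message (many VMAs, many forks in a tight loop), a rough userspace
sketch of a test of that shape could look like the code below. This is not
the benchmark behind the ~5% figure. The function name run_fork_stress(),
the way the VMAs are created and the parameters are all assumptions made for
illustration.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): create nr_vmas one-page
 * anonymous mappings, then fork nr_forks times, reaping each child before
 * the next fork. Returns 0 on success, -1 on any failure.
 */
int run_fork_stress(size_t nr_vmas, size_t nr_forks)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t mapped = 0, i;
	char **maps;
	int ret = -1;

	if (page <= 0)
		return -1;

	maps = calloc(nr_vmas, sizeof(*maps));
	if (!maps)
		return -1;

	/*
	 * Alternate the protection bits so that neighbouring mappings are
	 * not merged into a single VMA by the kernel.
	 */
	for (i = 0; i < nr_vmas; i++) {
		int prot = (i & 1) ? PROT_READ : (PROT_READ | PROT_WRITE);
		void *p = mmap(NULL, (size_t)page, prot,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			goto out;
		maps[mapped++] = p;

		/* Touch writable pages so fork has a populated page to share. */
		if (prot & PROT_WRITE)
			maps[i][0] = 1;
	}

	/* Each fork walks every parent VMA in dup_mmap(). */
	for (i = 0; i < nr_forks; i++) {
		int status;
		pid_t pid = fork();

		if (pid < 0)
			goto out;
		if (pid == 0)
			_exit(0);	/* child: exit immediately */
		if (waitpid(pid, &status, 0) != pid ||
		    !WIFEXITED(status) || WEXITSTATUS(status) != 0)
			goto out;
	}
	ret = 0;
out:
	for (i = 0; i < mapped; i++)
		munmap(maps[i], (size_t)page);
	free(maps);
	return ret;
}
```

Calling run_fork_stress(10000, 5000) from a small main() and timing it on
kernels built with and without CONFIG_PER_VMA_LOCK would give a rough
comparison. The absolute numbers will depend on the machine and on how the
mappings are laid out.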