From: Suren Baghdasaryan
Date: Tue, 4 Jul 2023 05:39:31 +0000
Subject: Re: [PATCH 1/1] mm: disable CONFIG_PER_VMA_LOCK by default until its fixed
To: David Hildenbrand
Cc: akpm@linux-foundation.org, jirislaby@kernel.org, jacobly.alt@gmail.com,
 holger@applied-asynchrony.com, michel@lespinasse.org, jglisse@google.com,
 mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org,
 mgorman@techsingularity.net, dave@stgolabs.net, willy@infradead.org,
 liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com,
 paulmck@kernel.org, mingo@redhat.com, will@kernel.org, luto@kernel.org,
 songliubraving@fb.com, peterx@redhat.com, dhowells@redhat.com,
 hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev,
 punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com,
 rientjes@google.com, chriscli@google.com, axelrasmussen@google.com,
 joelaf@google.com, minchan@google.com, rppt@kernel.org, jannh@google.com,
 shakeelb@google.com, tatashin@google.com, edumazet@google.com,
 gthelen@google.com, linux-mm@kvack.org
References: <20230703182150.2193578-1-surenb@google.com>
 <7e3f35cc-59b9-bf12-b8b1-4ed78223844a@redhat.com>
In-Reply-To: <7e3f35cc-59b9-bf12-b8b1-4ed78223844a@redhat.com>

On Mon, Jul 3, 2023 at 8:30 PM David Hildenbrand wrote:
>
> On 03.07.23 20:21, Suren Baghdasaryan wrote:
> > A memory corruption was reported in [1] with bisection pointing to
> > the patch [2] enabling per-VMA locks for x86.
> > Disable per-VMA locks config to prevent this issue while the problem is
> > being investigated. This is expected to be a temporary measure.
> >
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
> > [2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
> >
> > Reported-by: Jiri Slaby
> > Reported-by: Jacob Young
> > Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
> > Signed-off-by: Suren Baghdasaryan
> > ---
> >   mm/Kconfig | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > index 09130434e30d..de94b2497600 100644
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -1224,7 +1224,7 @@ config ARCH_SUPPORTS_PER_VMA_LOCK
> >          def_bool n
> >
> >   config PER_VMA_LOCK
> > -       def_bool y
> > +       bool "Enable per-vma locking during page fault handling."
> >         depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP
> >         help
> >           Allow per-vma locking during page fault handling.
>
> As raised at LSF/MM, I was "surprised" that we can now handle page faults
> concurrent to fork() and was expecting something to be broken already.
>
> What probably happens is that we wr-protected the page in the parent process and
> COW-shared an anon page with the child using copy_present_pte().
>
> But we only flush the parent MM tlb before we drop the parent MM lock in
> dup_mmap().
>
>
> If we get a write-fault before that TLB flush in the parent, and we end up
> replacing that anon page in the parent process in do_wp_page() [because, COW-shared with the child],
> this might be problematic: some stale writable TLB entries can target the wrong (old) page.

Hi David,
Thanks for the detailed explanation. Let me check if this is indeed
what's happening here. If that's indeed the cause, I think we can
write-lock the VMAs being dup'ed until the TLB is flushed and
mmap_write_unlock(oldmm) unlocks them all and lets page faults
proceed. If that works, we will at least know the reason for the memory
corruption.
Thanks,
Suren.

>
>
> We had similar issues in the past with userfaultfd, see the comment at the beginning of do_wp_page():
>
>
>         if (likely(!unshare)) {
>                 if (userfaultfd_pte_wp(vma, *vmf->pte)) {
>                         pte_unmap_unlock(vmf->pte, vmf->ptl);
>                         return handle_userfault(vmf, VM_UFFD_WP);
>                 }
>
>                 /*
>                  * Userfaultfd write-protect can defer flushes. Ensure the TLB
>                  * is flushed in this case before copying.
>                  */
>                 if (unlikely(userfaultfd_wp(vmf->vma) &&
>                              mm_tlb_flush_pending(vmf->vma->vm_mm)))
>                         flush_tlb_page(vmf->vma, vmf->address);
>         }
>
>
> We really should not allow page faults concurrent to fork() without further investigation.
>
> --
> Cheers,
>
> David / dhildenb
>
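
The suspected race discussed above can be illustrated with a small
user-space toy model. This is only a sketch under stated assumptions, not
kernel code: every name in it (toy_mm, toy_cow_fault, toy_store_cpu_a,
toy_run) is hypothetical, the "TLB" is a single cached entry held by one
CPU, and "holding page faults off until the TLB is flushed" is modeled
simply as invalidating that entry before the fault runs. It shows a store
through a stale writable entry landing in the old page that the child now
shares, while the parent's page table already points at the new copy.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: one virtual address in the parent, two
 * physical pages, and one translation cached by "CPU A". */
enum { OLD_PAGE, NEW_PAGE, NPAGES };

struct toy_pte { int pfn; bool writable; };
struct toy_tlb { bool valid; int pfn; bool writable; };

struct toy_mm {
	int phys[NPAGES];      /* contents of each physical page */
	struct toy_pte pte;    /* parent's page table entry */
	struct toy_tlb tlb_a;  /* CPU A's cached copy of that entry */
};

struct toy_result { int parent_sees; int child_sees; };

/* Write fault on a write-protected, COW-shared page: copy, then remap. */
static void toy_cow_fault(struct toy_mm *mm)
{
	assert(!mm->pte.writable);
	mm->phys[NEW_PAGE] = mm->phys[OLD_PAGE];
	mm->pte = (struct toy_pte){ .pfn = NEW_PAGE, .writable = true };
}

/* Store from CPU A: a valid writable TLB entry bypasses the page table. */
static void toy_store_cpu_a(struct toy_mm *mm, int value)
{
	if (!(mm->tlb_a.valid && mm->tlb_a.writable)) {
		if (!mm->pte.writable)
			toy_cow_fault(mm);
		mm->tlb_a = (struct toy_tlb){ true, mm->pte.pfn, true };
	}
	mm->phys[mm->tlb_a.pfn] = value;
}

/* Run the fork/fault/store sequence; the flag selects the ordering. */
struct toy_result toy_run(bool flush_before_fault)
{
	struct toy_mm mm = {
		.phys = { [OLD_PAGE] = 1 },
		.pte = { OLD_PAGE, true },
		.tlb_a = { true, OLD_PAGE, true },
	};

	/* fork(): write-protect the parent PTE; the child shares OLD_PAGE. */
	mm.pte.writable = false;
	if (flush_before_fault)
		mm.tlb_a.valid = false;   /* assumption: faults wait for the flush */

	/* CPU B: write fault in the parent, followed by its store of 2. */
	toy_cow_fault(&mm);
	mm.phys[mm.pte.pfn] = 2;

	/* CPU A: store of 3, possibly through a stale writable entry. */
	toy_store_cpu_a(&mm, 3);

	mm.tlb_a.valid = false;           /* the late flush after copying */
	return (struct toy_result){ mm.phys[mm.pte.pfn], mm.phys[OLD_PAGE] };
}
```

In this model, toy_run(false) leaves the parent seeing 2 (CPU A's store of
3 is lost) and the child seeing 3 instead of the forked value 1, whereas
toy_run(true) leaves the parent seeing 3 and the child seeing 1. The model
says nothing about whether this is the actual cause of the reported
corruption; it only makes the hypothesized ordering concrete.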