Subject: Re: [PATCH v4 29/33] x86/mm: try VMA lock-based page fault handling first
To: Jiri Slaby, Suren Baghdasaryan
Cc: akpm@linux-foundation.org, michel@lespinasse.org, jglisse@google.com,
 mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org,
 mgorman@techsingularity.net, dave@stgolabs.net, willy@infradead.org,
 liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com,
 paulmck@kernel.org, mingo@redhat.com, will@kernel.org, luto@kernel.org,
 songliubraving@fb.com, peterx@redhat.com, david@redhat.com,
 dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de,
 kent.overstreet@linux.dev, punit.agrawal@bytedance.com, lstoakes@gmail.com,
 peterjung1337@gmail.com, rientjes@google.com, chriscli@google.com,
 axelrasmussen@google.com, joelaf@google.com, minchan@google.com,
 rppt@kernel.org, jannh@google.com, shakeelb@google.com, tatashin@google.com,
 edumazet@google.com, gthelen@google.com, linux-mm
References: <20230227173632.3292573-1-surenb@google.com>
 <20230227173632.3292573-30-surenb@google.com>
 <9a8d788c-b8ba-1b8a-fd79-0e25b1b60bed@kernel.org>
 <2f150512-e460-a9ae-65db-39dc54fe99d6@kernel.org>
From: Holger Hoffstätte
Organization: Applied Asynchrony, Inc.
Date: Mon, 3 Jul 2023 15:52:40 +0200
In-Reply-To: <2f150512-e460-a9ae-65db-39dc54fe99d6@kernel.org>

On 2023-07-03 12:47, Jiri Slaby wrote:
> Cc Jacob Young (from kernel bugzilla)
>
> On 30. 06. 23, 19:40, Suren Baghdasaryan wrote:
>> On Fri, Jun 30, 2023 at 1:43 AM Jiri Slaby wrote:
>>>
>>> On 30. 06. 23, 10:28, Jiri Slaby wrote:
>>>>   > 2348
>>>> clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fcaa5882990, parent_tid=0x7fcaa5882990, exit_signal=0, stack=0x7fcaa5082000, stack_size=0x7ffe00, tls=0x7fcaa58826c0} => {parent_tid=[2351]}, 88) = 2351
>>>>   > 2350  <... clone3 resumed> => {parent_tid=[2372]}, 88) = 2372
>>>>   > 2351  <... clone3 resumed> => {parent_tid=[2354]}, 88) = 2354
>>>>   > 2351  <... clone3 resumed> => {parent_tid=[2357]}, 88) = 2357
>>>>   > 2354  <... clone3 resumed> => {parent_tid=[2355]}, 88) = 2355
>>>>   > 2355  <... clone3 resumed> => {parent_tid=[2370]}, 88) = 2370
>>>>   > 2370  mmap(NULL, 262144, PROT_READ|PROT_WRITE,
>>>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
>>>>   > 2370  <... mmap resumed>)               = 0x7fca68249000
>>>>   > 2372  <... clone3 resumed> => {parent_tid=[2384]}, 88) = 2384
>>>>   > 2384  <... clone3 resumed> => {parent_tid=[2388]}, 88) = 2388
>>>>   > 2388  <... clone3 resumed> => {parent_tid=[2392]}, 88) = 2392
>>>>   > 2392  <... clone3 resumed> => {parent_tid=[2395]}, 88) = 2395
>>>>   > 2395  write(2, "runtime: marked free object in s"..., 36 <unfinished ...>
>>>>
>>>> I.e. IIUC, all are threads (CLONE_VM) and thread 2370 mapped ANON
>>>> 0x7fca68249000 - 0x7fca6827ffff and go in thread 2395 thinks for some
>>>> reason 0x7fca6824bec8 in that region is "bad".
>>
>> Thanks for the analysis Jiri.
>> Is it possible from these logs to identify whether 2370 finished the
>> mmap operation before 2395 tried to access 0x7fca6824bec8? That access
>> has to happen only after mmap finishes mapping the region.
>
> Hi,
>
> it's hard to tell, but I assume so.
>
> For now, forget about this go's overly complicated, hard to reproduce case and concentrate on the very nice reduced testcase in:
>  https://bugzilla.kernel.org/show_bug.cgi?id=217624
> ;)
>
> FWIW, I can reproduce using the test case too.
>
> thanks,

As another (admittedly correlation-only) data point, I noticed at least
hourly crashes of Firefox-114 after upgrading to 6.4.1, which had never
happened before with 6.3.x. After reverting 0bff0aaea03e2a3ed6 - with a
bit of context fixup due to follow-up commits in 6.4.1 - it has been
rock stable again, for several hours now.

cheers
Holger
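
As a quick sanity check of the addresses in the quoted strace output, the sketch below recomputes the extent of the anonymous mapping from the mmap() length and return value and tests whether the address reported as "bad" falls inside it. The constants are copied from the log; the helper name in_mapping is hypothetical and only for illustration. Note that a 262144-byte mapping at that base would end slightly above the upper bound given in the quoted analysis, but the address lies inside the region either way.

```python
# Values copied from the quoted strace output; names are illustrative only.
MMAP_BASE = 0x7fca68249000  # return value of mmap() in thread 2370
MMAP_LEN = 262144           # length argument passed to mmap()
BAD_ADDR = 0x7fca6824bec8   # address reported as "bad" in thread 2395


def in_mapping(addr: int, base: int, length: int) -> bool:
    """Return True if addr lies in the half-open range [base, base + length)."""
    return base <= addr < base + length


last_byte = MMAP_BASE + MMAP_LEN - 1
print(hex(last_byte))                               # 0x7fca68288fff
print(in_mapping(BAD_ADDR, MMAP_BASE, MMAP_LEN))    # True
print(hex(BAD_ADDR - MMAP_BASE))                    # 0x2ec8, offset into the mapping
```

This only confirms that the address belongs to the mapping; it says nothing about whether the access happened before or after mmap() returned, which is the ordering question raised in the thread.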