From: Yu Zhao
Date: Tue, 13 Aug 2024 11:43:52 -0600
Subject: Re: Hard and soft lockups with FIO and LTP runs on a large system
To: Usama Arif
Cc: Bharata B Rao, linux-mm@kvack.org, linux-kernel@vger.kernel.org, nikunj@amd.com, "Upadhyay, Neeraj", Andrew Morton, David Hildenbrand, willy@infradead.org, vbabka@suse.cz, kinseyho@google.com, Mel Gorman, leitao@debian.org
References: <1998d479-eb1a-4bc8-a11e-59f8dd71aadb@amd.com> <7a06a14e-44d5-450a-bd56-1c348c2951b6@amd.com> <0702c85c-abca-4a33-b91b-dadf68670364@gmail.com>
In-Reply-To: <0702c85c-abca-4a33-b91b-dadf68670364@gmail.com>

On Tue, Aug 13, 2024 at 5:04 AM Usama Arif wrote:
>
>
>
> On 09/07/2024 06:58, Yu Zhao wrote:
> > On Mon, Jul 8, 2024 at 10:31 PM Bharata B Rao wrote:
> >>
> >> On 08-Jul-24 9:47 PM, Yu Zhao wrote:
> >>> On Mon, Jul 8, 2024 at 8:34 AM Bharata B Rao wrote:
> >>>>
> >>>> Hi Yu Zhao,
> >>>>
> >>>> Thanks for your patches. See below...
> >>>>
> >>>> On 07-Jul-24 4:12 AM, Yu Zhao wrote:
> >>>>> Hi Bharata,
> >>>>>
> >>>>> On Wed, Jul 3, 2024 at 9:11 AM Bharata B Rao wrote:
> >>>>>>
> >>>>
> >>>>>>
> >>>>>> Some experiments tried
> >>>>>> ======================
> >>>>>> 1) When MGLRU was enabled many soft lockups were observed, no hard
> >>>>>> lockups were seen for 48 hours run. Below is once such soft lockup.
> >>>>>
> >>>>> This is not really an MGLRU issue -- can you please try one of the
> >>>>> attached patches? It (truncate.patch) should help with or without
> >>>>> MGLRU.
> >>>>
> >>>> With truncate.patch and default LRU scheme, a few hard lockups are seen.
> >>>
> >>> Thanks.
> >>>
> >>> In your original report, you said:
> >>>
> >>>   Most of the times the two contended locks are lruvec and
> >>>   inode->i_lock spinlocks.
> >>>   ...
> >>>   Often times, the perf output at the time of the problem shows
> >>>   heavy contention on lruvec spin lock. Similar contention is
> >>>   also observed with inode i_lock (in clear_shadow_entry path)
> >>>
> >>> Based on this new report, does it mean the i_lock is not as contended,
> >>> for the same path (truncation) you tested? If so, I'll post
> >>> truncate.patch and add reported-by and tested-by you, unless you have
> >>> objections.
> >>
> >> truncate.patch has been tested on two systems with default LRU scheme
> >> and the lockup due to inode->i_lock hasn't been seen yet after 24 hours run.
> >
> > Thanks.
> >
> >>>
> >>> The two paths below were contended on the LRU lock, but they already
> >>> batch their operations. So I don't know what else we can do surgically
> >>> to improve them.
> >>
> >> What has been seen with this workload is that the lruvec spinlock is
> >> held for a long time from shrink_[active/inactive]_list path. In this
> >> path, there is a case in isolate_lru_folios() where scanning of LRU
> >> lists can become unbounded.
> >> To isolate a page from ZONE_DMA, sometimes
> >> scanning/skipping of more than 150 million folios were seen. There is
> >> already a comment in there which explains why nr_skipped shouldn't be
> >> counted, but is there any possibility of re-looking at this condition?
> >
> > For this specific case, probably this can help:
> >
> > @@ -1659,8 +1659,15 @@ static unsigned long
> > isolate_lru_folios(unsigned long nr_to_scan,
> >                 if (folio_zonenum(folio) > sc->reclaim_idx ||
> >                                 skip_cma(folio, sc)) {
> >                         nr_skipped[folio_zonenum(folio)] += nr_pages;
> > -                       move_to = &folios_skipped;
> > -                       goto move;
> > +                       list_move(&folio->lru, &folios_skipped);
> > +                       if (spin_is_contended(&lruvec->lru_lock)) {
> > +                               if (!list_empty(dst))
> > +                                       break;
> > +                               spin_unlock_irq(&lruvec->lru_lock);
> > +                               cond_resched();
> > +                               spin_lock_irq(&lruvec->lru_lock);
> > +                       }
> > +                       continue;

Nitpick:

if () {
        ...
        if (!spin_is_contended(&lruvec->lru_lock))
                continue;

        if (!list_empty(dst))
                break;

        spin_unlock_irq(&lruvec->lru_lock);
        cond_resched();
        spin_lock_irq(&lruvec->lru_lock);
}

> Hi Yu,
>
> We are seeing lockups and high memory pressure in Meta production due to this lock contention as well. My colleague highlighted it in https://lore.kernel.org/all/ZrssOrcJIDy8hacI@gmail.com/ and was pointed to this fix.
>
> We removed skip_cma check as a temporary measure, but this is a proper fix. I might have missed it but didn't see this as a patch on the mailing list. Just wanted to check if you were planning to send it as a patch? Happy to send it on your behalf as well.

Please. Thank you.
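
For readers following the thread, the control flow of the nitpick above can be sketched in userspace C. This is an illustration only, not the kernel patch: scan_with_backoff(), lock_is_contended(), struct scan_result and the contend_every parameter are hypothetical names. Contention is simulated deterministically instead of calling spin_is_contended(), and the spin_unlock_irq()/cond_resched()/spin_lock_irq() sequence is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one scan; not a kernel structure. */
struct scan_result {
	size_t scanned;    /* entries looked at */
	size_t taken;      /* eligible entries moved to the batch ("dst") */
	size_t skipped;    /* ineligible entries set aside */
	size_t lock_drops; /* times the lock would have been released */
};

/*
 * Assumption: stand-in for spin_is_contended(). Reports contention on
 * every contend_every-th scanned entry; 0 means "never contended".
 */
static bool lock_is_contended(size_t scanned, size_t contend_every)
{
	return contend_every != 0 && scanned % contend_every == 0;
}

/*
 * Walk zones[0..n). Entries with zone > reclaim_idx are skipped. While
 * skipping, if the lock looks contended: stop early when the batch
 * already holds something, otherwise drop and retake the lock (counted
 * only) and keep scanning.
 */
struct scan_result scan_with_backoff(const int *zones, size_t n,
				     int reclaim_idx, size_t want,
				     size_t contend_every)
{
	struct scan_result r = {0};

	assert(zones != NULL || n == 0);

	for (size_t i = 0; i < n && r.taken < want; i++) {
		r.scanned++;

		if (zones[i] > reclaim_idx) {
			r.skipped++;

			if (!lock_is_contended(r.scanned, contend_every))
				continue;

			/* Someone is waiting and we have work to hand back. */
			if (r.taken > 0)
				break;

			/* Nothing isolated yet: let waiters in, then resume. */
			r.lock_drops++;
			continue;
		}

		r.taken++;
	}

	return r;
}
```

With zones {2, 2, 2, 2, 0, 2, 2, 2, 2, 0}, reclaim_idx 0, want 2 and contend_every 4, the scan records one lock drop at the fourth entry and stops at the eighth with one entry taken. With contend_every 0 it scans all ten entries and takes both eligible ones. In the real function the list can change while the lock is released, which this sketch does not model.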