From: Zhongkun He
To: akpm@linux-foundation.org, tytso@mit.edu, jack@suse.com, hannes@cmpxchg.org, mhocko@kernel.org
Cc: muchun.song@linux.dev, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Zhongkun He, Muchun Song
Subject: [PATCH 2/2] jbd2: mark the transaction context with the scope PF_MEMALLOC_ACFORCE context
Date: Wed, 18 Jun 2025 19:39:58 +0800
Message-Id: <81b1f3df0379b0e34bdf239d36d4d9aeb4bee9cf.1750234270.git.hezhongkun.hzk@bytedance.com>
X-Mailer: git-send-email 2.39.5
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

The jbd2 handle, associated with filesystem metadata, can be held during
direct reclaim when a memcg limit is hit. This prevents other tasks from
writing pages, resulting in shrink failures due to dirty pages that
cannot be written back. These shrink failures may leave many tasks stuck
in the uninterruptible (D) state.

The OOM killer may select a victim and return success, allowing the
current thread to retry the memory charge. However, the selected task
cannot respond to the SIGKILL because it is also stuck in the
uninterruptible state. As a result, the charging task resets nr_retries
and attempts reclaim again, but the victim never exits. This leads to a
prolonged retry loop in direct reclaim with the jbd2 handle held,
significantly extending its hold time and potentially causing a
system-wide block.

We found that a related issue has been reported and partially addressed
in previous fixes [1][2]. However, those fixes only skip direct reclaim
and return a failure in some cases, such as readahead requests. Since
sb_getblk() is called multiple times in __ext4_get_inode_loc() with the
NOFAIL flag, the problem persists.

So call memalloc_account_force_save() to charge the pages and delay
direct reclaim until the return to userland, so that the jbd2 handle, a
global resource, is released first.

[1]: https://lore.kernel.org/linux-fsdevel/20230811071519.1094-1-teawaterz@linux.alibaba.com/
[2]: https://lore.kernel.org/all/20230914150011.843330-1-willy@infradead.org/T/#u

Co-developed-by: Muchun Song
Signed-off-by: Muchun Song
Signed-off-by: Zhongkun He
---
 fs/jbd2/transaction.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index c7867139af69..d05847301a8f 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -448,6 +448,13 @@ static int start_this_handle(journal_t *journal, handle_t *handle,
 	 * going to recurse back to the fs layer.
 	 */
 	handle->saved_alloc_context = memalloc_nofs_save();
+
+	/*
+	 * Avoid blocking on jbd2 handler in memcg direct reclaim
+	 * which may otherwise lead to system-wide stalls.
+	 */
+	handle->saved_alloc_context |= memalloc_account_force_save();
+
 	return 0;
 }
 
@@ -733,10 +740,10 @@ static void stop_this_handle(handle_t *handle)
 
 	rwsem_release(&journal->j_trans_commit_map, _THIS_IP_);
 	/*
-	 * Scope of the GFP_NOFS context is over here and so we can restore the
-	 * original alloc context.
+	 * Scope of the GFP_NOFS and PF_MEMALLOC_ACCOUNTFORCE context
+	 * is over here and so we can restore the original alloc context.
 	 */
-	memalloc_nofs_restore(handle->saved_alloc_context);
+	memalloc_flags_restore(handle->saved_alloc_context);
 }
 
 /**
@@ -1838,7 +1845,7 @@ int jbd2_journal_stop(handle_t *handle)
 		 * Handle is already detached from the transaction so there is
 		 * nothing to do other than free the handle.
 		 */
-		memalloc_nofs_restore(handle->saved_alloc_context);
+		memalloc_flags_restore(handle->saved_alloc_context);
 		goto free_and_exit;
 	}
 	journal = transaction->t_journal;
-- 
2.39.5
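
For readers following the save/restore pairing in the hunks above, here is a small userspace model of why the two saved values can be OR-ed into one saved_alloc_context and undone with a single restore call. It is only a sketch under stated assumptions: the bit values, task_flags, and every model_* name are hypothetical, and it assumes memalloc_account_force_save() (introduced by patch 1/2, not shown in this message) follows the same convention as the model's save helper, returning only the bits it newly set.

```c
#include <assert.h>

/* Hypothetical bit values, chosen only for this model. */
#define MODEL_PF_NOFS          0x1u
#define MODEL_PF_ACCOUNT_FORCE 0x2u

/* Stand-in for the per-task flags word (current->flags in the kernel). */
static unsigned int task_flags;

/* Set the requested bits and return only the bits that were newly set. */
static unsigned int model_flags_save(unsigned int flags)
{
	unsigned int newly_set = ~task_flags & flags;

	task_flags |= flags;
	return newly_set;
}

/* Clear exactly the bits recorded by the matching save calls. */
static void model_flags_restore(unsigned int saved)
{
	task_flags &= ~saved;
}

/*
 * Model of one handle lifetime: enter both scopes, record the flags
 * seen while the handle is held, then leave both scopes at once.
 * Returns the flags observed inside the scope.
 */
static unsigned int model_handle_scope(void)
{
	unsigned int saved, inside;

	saved = model_flags_save(MODEL_PF_NOFS);
	saved |= model_flags_save(MODEL_PF_ACCOUNT_FORCE);

	inside = task_flags;

	model_flags_restore(saved);
	return inside;
}
```

In this model, a scope entered with no bits set leaves with no bits set, while a scope entered with the NOFS bit already set by an outer caller leaves that bit in place, because the nested save recorded nothing for it. Whether the real helpers behave this way depends on their definitions in the tree the series applies to.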