From: Gang Li
To: Hugh Dickins, Andrew Morton, "Kirill A. Shutemov"
Cc: Gang Li, stable@vger.kernel.org, Muchun Song, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5] shmem: fix a race between shmem_unused_huge_shrink and shmem_evict_inode
Date: Thu, 25 Nov 2021 14:45:00 +0800
Message-Id: <20211125064502.99983-1-ligang.bdlg@bytedance.com>

This patch fixes a data race in commit 779750d20b93 ("shmem: split huge pages
beyond i_size under memory pressure").

Here are the call traces causing the race:

   Call Trace 1:
     shmem_unused_huge_shrink+0x3ae/0x410
     ? __list_lru_walk_one.isra.5+0x33/0x160
     super_cache_scan+0x17c/0x190
     shrink_slab.part.55+0x1ef/0x3f0
     shrink_node+0x10e/0x330
     kswapd+0x380/0x740
     kthread+0xfc/0x130
     ? mem_cgroup_shrink_node+0x170/0x170
     ? kthread_create_on_node+0x70/0x70
     ret_from_fork+0x1f/0x30

   Call Trace 2:
     shmem_evict_inode+0xd8/0x190
     evict+0xbe/0x1c0
     do_unlinkat+0x137/0x330
     do_syscall_64+0x76/0x120
     entry_SYSCALL_64_after_hwframe+0x3d/0xa2

A simple explanation:

Imagine there are 3 items in the local list (@list).
In the first traversal, A is not deleted from @list.

  1)    A->B->C
        ^
        |
        pos (leave)

In the second traversal, B is deleted from @list. Concurrently, A is
deleted from @list through shmem_evict_inode() because the last reference
to the inode is dropped by another thread.
Then the @list is corrupted.

  2)    A->B->C
        ^  ^
        |  |
     evict pos (drop)

We should make sure the inode is either on the global list or deleted from
any local list before iput().

Fixed by moving inodes back to global list before we put them.

Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Gang Li
Reviewed-by: Muchun Song
Acked-by: Kirill A. Shutemov
---
Changes in v5:
- Fix a compile warning

Changes in v4:
- Rework the comments

Changes in v3:
- Add more comment.
- Use list_move(&info->shrinklist, &sbinfo->shrinklist) instead of
  list_move(pos, &sbinfo->shrinklist) for consistency.

Changes in v2: https://lore.kernel.org/all/20211124030840.88455-1-ligang.bdlg@bytedance.com/
- Move spinlock to the front of iput instead of changing lock type
  since iput will call evict which may cause deadlock by requesting
  shrinklist_lock.
- Add call trace in commit message.

v1: https://lore.kernel.org/lkml/20211122064126.76734-1-ligang.bdlg@bytedance.com/
---
 mm/shmem.c | 36 ++++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 9023103ee7d8..a6487fe0583f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -554,7 +554,7 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 	struct shmem_inode_info *info;
 	struct page *page;
 	unsigned long batch = sc ? sc->nr_to_scan : 128;
-	int removed = 0, split = 0;
+	int split = 0;
 
 	if (list_empty(&sbinfo->shrinklist))
 		return SHRINK_STOP;
@@ -569,7 +569,6 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 		/* inode is about to be evicted */
 		if (!inode) {
 			list_del_init(&info->shrinklist);
-			removed++;
 			goto next;
 		}
 
@@ -577,12 +576,12 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 		if (round_up(inode->i_size, PAGE_SIZE) ==
 				round_up(inode->i_size, HPAGE_PMD_SIZE)) {
 			list_move(&info->shrinklist, &to_remove);
-			removed++;
 			goto next;
 		}
 
 		list_move(&info->shrinklist, &list);
 next:
+		sbinfo->shrinklist_len--;
 		if (!--batch)
 			break;
 	}
@@ -602,7 +601,7 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 		inode = &info->vfs_inode;
 
 		if (nr_to_split && split >= nr_to_split)
-			goto leave;
+			goto move_back;
 
 		page = find_get_page(inode->i_mapping,
 				(inode->i_size & HPAGE_PMD_MASK) >> PAGE_SHIFT);
@@ -616,38 +615,43 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 		}
 
 		/*
-		 * Leave the inode on the list if we failed to lock
-		 * the page at this time.
+		 * Move the inode on the list back to shrinklist if we failed
+		 * to lock the page at this time.
 		 *
 		 * Waiting for the lock may lead to deadlock in the
 		 * reclaim path.
 		 */
 		if (!trylock_page(page)) {
 			put_page(page);
-			goto leave;
+			goto move_back;
 		}
 
 		ret = split_huge_page(page);
 		unlock_page(page);
 		put_page(page);
 
-		/* If split failed leave the inode on the list */
+		/* If split failed move the inode on the list back to shrinklist */
 		if (ret)
-			goto leave;
+			goto move_back;
 
 		split++;
 drop:
 		list_del_init(&info->shrinklist);
-		removed++;
-leave:
+		goto put;
+move_back:
+		/*
+		 * Make sure the inode is either on the global list or deleted from
+		 * any local list before iput() since it could be deleted in another
+		 * thread once we put the inode (then the local list is corrupted).
+		 */
+		spin_lock(&sbinfo->shrinklist_lock);
+		list_move(&info->shrinklist, &sbinfo->shrinklist);
+		sbinfo->shrinklist_len++;
+		spin_unlock(&sbinfo->shrinklist_lock);
+put:
 		iput(inode);
 	}
 
-	spin_lock(&sbinfo->shrinklist_lock);
-	list_splice_tail(&list, &sbinfo->shrinklist);
-	sbinfo->shrinklist_len -= removed;
-	spin_unlock(&sbinfo->shrinklist_lock);
-
 	return split;
 }
 
-- 
2.20.1
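
Illustration only, not part of the patch: the invariant stated in the commit
message (an inode is either on the global list or off every local list before
iput()) can be sketched in userspace C. Everything below is an assumption made
for the sketch: struct sb, struct entry, split_ok, process_local() and evict()
are hypothetical stand-ins, the list helpers are simplified re-implementations
of the kernel's, and the shrinklist_lock is only marked in comments because
the sketch is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's circular list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add(struct list_head *item, struct list_head *head)
{
	item->next = head->next;
	item->prev = head;
	head->next->prev = item;
	head->next = item;
}

static void list_del_init(struct list_head *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
	list_init(item);
}

static void list_move(struct list_head *item, struct list_head *head)
{
	list_del_init(item);
	list_add(item, head);
}

static size_t list_count(const struct list_head *h)
{
	size_t n = 0;

	for (const struct list_head *p = h->next; p != h; p = p->next)
		n++;
	return n;
}

/* Hypothetical stand-in for shmem_sb_info (lock omitted). */
struct sb {
	struct list_head shrinklist;
	size_t shrinklist_len;
};

/* Hypothetical stand-in for shmem_inode_info; node must stay first. */
struct entry {
	struct list_head node;
	bool split_ok;	/* pretend result of the huge page split */
};

/* Stand-in for eviction: unlink from whichever list holds the entry. */
static void evict(struct sb *sb, struct entry *e)
{
	/* the kernel would hold shrinklist_lock here */
	if (!list_empty(&e->node)) {
		list_del_init(&e->node);
		sb->shrinklist_len--;
	}
}

/* Second loop of the shrinker: drop or move back, then "put". */
static size_t process_local(struct sb *sb, struct list_head *local)
{
	struct list_head *pos, *next;
	size_t split = 0;

	assert(sb && local);
	for (pos = local->next, next = pos->next; pos != local;
	     pos = next, next = pos->next) {
		struct entry *e = (struct entry *)pos; /* node is first member */

		if (e->split_ok) {
			list_del_init(pos);		/* drop */
			split++;
		} else {
			/* move_back: done under shrinklist_lock in the kernel */
			list_move(pos, &sb->shrinklist);
			sb->shrinklist_len++;
		}
		/* put: from here on the entry may be evicted by another thread */
	}
	return split;
}
```

Because every entry has left the local list by the time it is "put", a later
evict() can only touch the global list, which is the list the lock protects.
A caller would build a local list with list_add(), run process_local(), and
then call evict() on any entry that was moved back.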