From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2F234C433EF for ; Thu, 25 Nov 2021 04:08:39 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 80D256B0074; Wed, 24 Nov 2021 23:08:23 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 7BCBD6B0075; Wed, 24 Nov 2021 23:08:23 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 636E46B007B; Wed, 24 Nov 2021 23:08:23 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0100.hostedemail.com [216.40.44.100]) by kanga.kvack.org (Postfix) with ESMTP id 50FFC6B0074 for ; Wed, 24 Nov 2021 23:08:23 -0500 (EST) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id EF9578BEDB for ; Thu, 25 Nov 2021 04:08:12 +0000 (UTC) X-FDA: 78846119982.27.9397472 Received: from mail-qk1-f179.google.com (mail-qk1-f179.google.com [209.85.222.179]) by imf14.hostedemail.com (Postfix) with ESMTP id F3F8E60019B2 for ; Thu, 25 Nov 2021 04:08:10 +0000 (UTC) Received: by mail-qk1-f179.google.com with SMTP id 193so8191966qkh.10 for ; Wed, 24 Nov 2021 20:08:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20210112.gappssmtp.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=dP3yZrGzb9R1iOzqxrTkQjJTMoOKsAT1OEs2kYMMg74=; b=udnW4TACgVBEHiQOw5KT9xMkvLQmjdfGlEEGtF2Hz8uc2nycBd4B6q54WvA7dT3suq mWpf2x+I5QexmCZ18kCZB4U5h3dyuIHPSmDujKxP4HXA9l3IuN13k4lYeCWZgkJ2LcYp SxAybxt8OXRgkCsW5zfR/Z0MQXOquRWWQxJwkOg9hXEuUXJeAggz/sOB/mVhmxoexPsP GfVNQ8lDR+EoStUauvz3Jk0PlmHra4DKBYB0AzktF+bAmKTyfSDK2bxHGfNLza+dRgKo LhREi4xonYQyRnh89cE0wPp9vpToNaD84B8sOw7iFC87a5UnEtSFDPwxq+Ydc/nUHwEr Mq/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=dP3yZrGzb9R1iOzqxrTkQjJTMoOKsAT1OEs2kYMMg74=; b=TcH+Da1/wm1u1KY4OOydkNA5JVtfJ9gGt7Sjlpf+3e437keC4mcRy7gugmj+/+Ou7z SsaS81m+yvzoGVevF8tEVVffjapPhMoXpmbSVOKD9q2FquIbI6FoV76+3fOt/XAL86sV tqYpChOL85EsyC3pAX+pFV/PTw3BZy7cpNzLkUFTyB8CTGh6QVWLzlYbJmMonWwCuBZT uZ9oo+CXN4JBYHW1qudxjtSIeX5Qd5cg6y7Ece9awIEJprGH5rETHChZ+rqDfuJHcozf mKpzgFt76t8oyYzvOPVW4DGf1uqdjMOBBh0T7wxGxIwf8nrjsqH8Gc/ZU5+e2GhvMRtR kpYg== X-Gm-Message-State: AOAM532RFG+PPOmHpXPFfIWtvZQIz2eC8yBq6Gp45zmsFKXQPk/1xL4k qVkm9V+qyFaCkQLMgqMmM9l6/uXdmTs/jZo9Cwt9EQ== X-Google-Smtp-Source: ABdhPJy66/tQ4CNBBLx69eIsGB+A6cvgvKIrd45tNyK1KRALISbSOii9fareZYyDNTWIUa76/pKptXIYbjQ2uBu+Syo= X-Received: by 2002:a25:d2ca:: with SMTP id j193mr2903638ybg.419.1637813290882; Wed, 24 Nov 2021 20:08:10 -0800 (PST) MIME-Version: 1.0 References: <20211125031244.89848-1-ligang.bdlg@bytedance.com> In-Reply-To: <20211125031244.89848-1-ligang.bdlg@bytedance.com> From: Muchun Song Date: Thu, 25 Nov 2021 12:07:31 +0800 Message-ID: Subject: Re: [PATCH v4] shmem: fix a race between shmem_unused_huge_shrink and shmem_evict_inode To: Gang Li Cc: Hugh Dickins , Andrew Morton , "Kirill A. Shutemov" , linux- stable , Linux Memory Management List , LKML Content-Type: text/plain; charset="UTF-8" X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: F3F8E60019B2 X-Stat-Signature: sfksh1g4tfoxypi9n3whggauohk93xws Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=bytedance-com.20210112.gappssmtp.com header.s=20210112 header.b=udnW4TAC; dmarc=pass (policy=none) header.from=bytedance.com; spf=pass (imf14.hostedemail.com: domain of songmuchun@bytedance.com designates 209.85.222.179 as permitted sender) smtp.mailfrom=songmuchun@bytedance.com X-HE-Tag: 1637813290-823247 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu, Nov 25, 2021 at 11:12 AM Gang Li wrote: > > This patch fixes a data race in commit 779750d20b93 ("shmem: split huge pages > beyond i_size under memory pressure"). > > Here are call traces causing race: > > Call Trace 1: > shmem_unused_huge_shrink+0x3ae/0x410 > ? __list_lru_walk_one.isra.5+0x33/0x160 > super_cache_scan+0x17c/0x190 > shrink_slab.part.55+0x1ef/0x3f0 > shrink_node+0x10e/0x330 > kswapd+0x380/0x740 > kthread+0xfc/0x130 > ? mem_cgroup_shrink_node+0x170/0x170 > ? kthread_create_on_node+0x70/0x70 > ret_from_fork+0x1f/0x30 > > Call Trace 2: > shmem_evict_inode+0xd8/0x190 > evict+0xbe/0x1c0 > do_unlinkat+0x137/0x330 > do_syscall_64+0x76/0x120 > entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > > A simple explanation: > > Image there are 3 items in the local list (@list). > In the first traversal, A is not deleted from @list. > > 1) A->B->C > ^ > | > pos (leave) > > In the second traversal, B is deleted from @list. Concurrently, A is > deleted from @list through shmem_evict_inode() since last reference counter of > inode is dropped by other thread. Then the @list is corrupted. > > 2) A->B->C > ^ ^ > | | > evict pos (drop) > > We should make sure the inode is either on the global list or deleted from > any local list before iput(). > > Fixed by moving inodes back to global list before we put them. > > Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure") > Signed-off-by: Gang Li You have forgotten my Reviewed-by and Kirill A. Shutemov's Acked-by as well as Cc: stable@vger.kernel.org. > --- > mm/shmem.c | 34 +++++++++++++++++++--------------- > 1 file changed, 19 insertions(+), 15 deletions(-) > > diff --git a/mm/shmem.c b/mm/shmem.c > index 9023103ee7d8..e6ccb2a076ff 100644 > --- a/mm/shmem.c > +++ b/mm/shmem.c > @@ -569,7 +569,6 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo, > /* inode is about to be evicted */ > if (!inode) { > list_del_init(&info->shrinklist); > - removed++; I believe there is a warning about @removed since it's unused. > goto next; > } > > @@ -577,12 +576,12 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo, > if (round_up(inode->i_size, PAGE_SIZE) == > round_up(inode->i_size, HPAGE_PMD_SIZE)) { > list_move(&info->shrinklist, &to_remove); > - removed++; > goto next; > } > > list_move(&info->shrinklist, &list); > next: > + sbinfo->shrinklist_len--; > if (!--batch) > break; > } > @@ -602,7 +601,7 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo, > inode = &info->vfs_inode; > > if (nr_to_split && split >= nr_to_split) > - goto leave; > + goto move_back; > > page = find_get_page(inode->i_mapping, > (inode->i_size & HPAGE_PMD_MASK) >> PAGE_SHIFT); > @@ -616,38 +615,43 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo, > } > > /* > - * Leave the inode on the list if we failed to lock > - * the page at this time. > + * Move the inode on the list back to shrinklist if we failed > + * to lock the page at this time. > * > * Waiting for the lock may lead to deadlock in the > * reclaim path. > */ > if (!trylock_page(page)) { > put_page(page); > - goto leave; > + goto move_back; > } > > ret = split_huge_page(page); > unlock_page(page); > put_page(page); > > - /* If split failed leave the inode on the list */ > + /* If split failed move the inode on the list back to shrinklist */ > if (ret) > - goto leave; > + goto move_back; > > split++; > drop: > list_del_init(&info->shrinklist); > - removed++; > -leave: > + goto put; > +move_back: > + /* > + * Make sure the inode is either on the global list or deleted from > + * any local list before iput() since it could be deleted in another > + * thread once we put the inode (then the local list is corrupted). > + */ > + spin_lock(&sbinfo->shrinklist_lock); > + list_move(&info->shrinklist, &sbinfo->shrinklist); > + sbinfo->shrinklist_len++; > + spin_unlock(&sbinfo->shrinklist_lock); > +put: > iput(inode); > } > > - spin_lock(&sbinfo->shrinklist_lock); > - list_splice_tail(&list, &sbinfo->shrinklist); > - sbinfo->shrinklist_len -= removed; > - spin_unlock(&sbinfo->shrinklist_lock); > - > return split; > } > > -- > 2.20.1 >