From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 22 Mar 2024 03:20:28 +0000
From: Matthew Wilcox <willy@infradead.org>
To: Zhaoyang Huang
Cc: Zhaoyang Huang, Andrew Morton, "linux-mm@kvack.org",
	"linux-kernel@vger.kernel.org", Steve Kang
Subject: Re: summarize all information again at bottom//reply: reply: [PATCH]
 mm: fix a race scenario in folio_isolate_lru
On Fri, Mar 22, 2024 at 09:52:36AM +0800, Zhaoyang Huang wrote:
> Thanks for the comments. I have fixed the typo and updated the timing
> sequence, amending the possible preemption points so that the refcnt
> makes sense.
>
> 0.
> Thread_bad gets the folio via find_get_entry and is preempted before
> taking the refcnt (this could be the second-round scan of
> truncate_inode_pages_range)
> refcnt == 1 (page_cache), PG_lru == true, PG_lock == false
>     find_get_entry
>         folio = xas_find
>         <preemption point>
>         folio_try_get_rcu
>
> 1. Thread_filemap gets the folio via
> filemap_map_pages->next_uptodate_folio->xas_next_entry and is preempted
> refcnt == 1 (page_cache), PG_lru == true, PG_lock == false
>     filemap_map_pages
>         next_uptodate_folio
>             xas_next_entry
>             <preemption point>
>             folio_try_get_rcu
>
> 2. Thread_truncate gets the folio via
> truncate_inode_pages_range->find_lock_entries
> refcnt == 2 (page_cache, fbatch_truncate), PG_lru == true, PG_lock == true
>
> 3. Thread_truncate proceeds to truncate_cleanup_folio
> refcnt == 2 (page_cache, fbatch_truncate), PG_lru == true, PG_lock == true
>
> 4. Thread_truncate proceeds to delete_from_page_cache_batch
> refcnt == 1 (fbatch_truncate), PG_lru == true, PG_lock == true
>
> 4.1 folio_unlock
> refcnt == 1 (fbatch_truncate), PG_lru == true, PG_lock == false

OK, so by the time we get to folio_unlock(), the folio has been removed
from the i_pages xarray.

> 5. Thread_filemap schedules back from '1' and proceeds to set up a pte,
> with folio->_mapcnt = 0 & folio->refcnt += 1
> refcnt == 1->2(+fbatch_filemap)->3->2 (pte, fbatch_truncate),
> PG_lru == true, PG_lock == true->false

This line succeeds (in next_uptodate_folio):

		if (!folio_try_get_rcu(folio))
			continue;

but then this fails:

		if (unlikely(folio != xas_reload(xas)))
			goto skip;
skip:
		folio_put(folio);

because xas_reload() will return NULL due to the folio being deleted in
step 4.  So we never get to the point where we set up a PTE.  There
should be no way to create a new PTE for a folio which has been removed
from the page cache.  Bugs happen, of course, but I don't see one yet.

> 6. Thread_madv clears the folio's PG_lru via
> madvise_xxx_pte_range->folio_isolate_lru->folio_test_clear_lru
> refcnt == 2 (pte, fbatch_truncate), PG_lru == false, PG_lock == false
>
> 7.
> Thread_truncate calls folio_fbatch_release and fails to free the
> folio as the refcnt has not reached 0
> refcnt == 1 (pte), PG_lru == false, PG_lock == false
> ******** the folio becomes an orphan here: it is no longer in the
> page cache but is still mapped in the task's VM **********
>
> 8. Thread_bad is scheduled back from '0' and the folio is collected
> into fbatch_bad
> refcnt == 2 (pte, fbatch_bad), PG_lru == false, PG_lock == true
>
> 9. Thread_bad wrongly drops one refcnt in filemap_remove_folio,
> taking that refcnt to be the page cache's
> refcnt == 1 (fbatch_bad), PG_lru == false, PG_lock == true->false
>     truncate_inode_folio
>         filemap_remove_folio
>             filemap_free_folio
> ****** refcnt wrongly decreased here by being taken as the page
> cache's ******
>
> 10. Thread_bad calls release_pages(fbatch_bad) and has the folio
> introduce the bug.
>     release_pages
>         folio_put_testzero == true
>         folio_test_lru == false
>         list_add(folio->lru, pages_to_free)
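To make the xas_reload() point above concrete: the page cache lookup is a
speculative get followed by a re-check, so even if folio_try_get_rcu()
succeeds after truncation has deleted the folio, the reload notices the
slot changed and the reference is dropped again.  Here is a minimal
userspace model of that pattern, using C11 atomics in place of the real
xarray and folio refcount; every *_sim name is hypothetical, not a real
kernel API:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Illustrative stand-in for a folio: just a refcount. */
struct folio_sim {
	atomic_int refcount;
};

/* The page-cache slot, modeled as a single atomic pointer
 * (standing in for one xarray entry). */
static _Atomic(struct folio_sim *) slot;

/* Like folio_try_get_rcu(): take a reference only if the
 * refcount is still non-zero. */
static int try_get_sim(struct folio_sim *f)
{
	int ref = atomic_load(&f->refcount);

	while (ref > 0) {
		if (atomic_compare_exchange_weak(&f->refcount, &ref, ref + 1))
			return 1;	/* got a reference */
	}
	return 0;			/* folio already dying */
}

static void put_sim(struct folio_sim *f)
{
	atomic_fetch_sub(&f->refcount, 1);
}

/* Speculative lookup: find, get a ref, then re-check the slot
 * (the xas_reload() step).  If truncation emptied the slot in
 * between, drop the ref and report failure. */
static struct folio_sim *lookup_sim(void)
{
	struct folio_sim *f = atomic_load(&slot);

	if (!f || !try_get_sim(f))
		return NULL;
	if (f != atomic_load(&slot)) {	/* xas_reload() disagrees */
		put_sim(f);		/* folio was deleted: back off */
		return NULL;
	}
	return f;
}
```

The re-check is what closes the window described in step 5: the deleting
thread clears the slot before the mapping thread's re-check can pass, so
no new reference (and hence no new PTE) can be built on a folio that has
left the page cache.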