From: "Huang, Ying"
To: Yu Zhao
Cc: Mauricio Faria de Oliveira, Minchan Kim, Andrew Morton, Yang Shi,
    Miaohe Lin, linux-mm@kvack.org, linux-block@vger.kernel.org
Subject: Re: [PATCH v3] mm: fix race between MADV_FREE reclaim and blkdev direct IO read
References: <20220131230255.789059-1-mfo@canonical.com>
Date: Wed, 16 Feb 2022 14:48:19 +0800
In-Reply-To: (Yu Zhao's message of "Wed, 2 Feb 2022 14:53:22 -0700")
Message-ID: <87o837cnnw.fsf@yhuang6-desk2.ccr.corp.intel.com>

Yu Zhao writes:

> On Wed, Feb 02, 2022 at 06:27:47PM -0300, Mauricio Faria de Oliveira wrote:
>> On Wed, Feb 2, 2022 at 4:56 PM Yu Zhao wrote:
>> >
>> > On Mon, Jan 31, 2022 at 08:02:55PM -0300, Mauricio Faria de Oliveira wrote:
>> > > Problem:
>> > > =======
>> >
>> > Thanks for the update.
>> > A couple of quick questions:
>> >
>> > > Userspace might read the zero-page instead of actual data from a
>> > > direct IO read on a block device if the buffers have been called
>> > > madvise(MADV_FREE) on earlier (this is discussed below) due to a
>> > > race between page reclaim on MADV_FREE and blkdev direct IO read.
>> >
>> > 1) would page migration be affected as well?
>>
>> Could you please elaborate on the potential problem you considered?
>>
>> I checked migrate_pages() -> try_to_migrate() holds the page lock,
>> thus shouldn't race with shrink_page_list() -> with try_to_unmap()
>> (where the issue with MADV_FREE is), but maybe I didn't get you
>> correctly.
>
> Could the race exist between DIO and migration? While DIO is writing
> to a page, could migration unmap it and copy the data from this page
> to a new page?

Check the migrate_pages() code,

  migrate_pages
    unmap_and_move
      __unmap_and_move
        try_to_migrate // set PTE to swap entry with PTL
        move_to_new_page
          migrate_page
            folio_migrate_mapping
              folio_ref_count(folio) != expected_count // check page ref count
            folio_migrate_copy

The page ref count is checked after unmapping and before copying. This
is good, but it appears that we need a memory barrier between checking
page ref count and copying page.

Best Regards,
Huang, Ying
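
To make the ordering above concrete, here is a rough userspace sketch in C11 of the three steps: unmap, check the ref count, then copy, with a fence between the check and the copy. This is only a model, not kernel code. struct model_page, model_migrate() and MODEL_PAGE_SIZE are hypothetical names invented for illustration, and the acquire fence is an assumption standing in for whatever barrier the kernel would actually need.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64   /* hypothetical; small on purpose */

/* Hypothetical userspace stand-in for a page. Not struct page or folio. */
struct model_page {
    atomic_int  refcount;            /* references held, e.g. a DIO pin */
    atomic_bool mapped;              /* does a "PTE" still point here?  */
    char        data[MODEL_PAGE_SIZE];
};

/*
 * Model of the migration ordering from the call chain above.
 * Returns true if the data was copied to dst, false if migration
 * backed off because somebody else still holds a reference.
 */
bool model_migrate(struct model_page *src, struct model_page *dst,
                   int expected_count)
{
    /* Step 1: unmap (the role try_to_migrate plays in the call chain). */
    atomic_store_explicit(&src->mapped, false, memory_order_relaxed);

    /* Step 2: check the ref count (as in folio_migrate_mapping). */
    if (atomic_load_explicit(&src->refcount, memory_order_relaxed)
            != expected_count) {
        /* Unexpected reference: restore the mapping and give up. */
        atomic_store_explicit(&src->mapped, true, memory_order_relaxed);
        return false;
    }

    /*
     * Assumed barrier: keep the loads done by the copy below from being
     * ordered before the ref count load above.
     */
    atomic_thread_fence(memory_order_acquire);

    /* Step 3: copy the page contents (the role of folio_migrate_copy). */
    memcpy(dst->data, src->data, MODEL_PAGE_SIZE);
    return true;
}
```

With refcount equal to expected_count the copy goes ahead; with one extra reference (standing in for a DIO pin) the function restores the mapping and copies nothing. The sketch only shows where a barrier would sit, it does not show which barrier is sufficient.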