From: "Huang, Ying"
To: Yu Zhao
Cc: Mauricio Faria de Oliveira, Andrew Morton, Minchan Kim, linux-mm@kvack.org, linux-block@vger.kernel.org, Miaohe Lin, Yang Shi
Subject: Re: [PATCH v2] mm: fix race between MADV_FREE reclaim and blkdev direct IO read
References: <20220105233440.63361-1-mfo@canonical.com>
Date: Wed, 12 Jan 2022 09:46:23 +0800
In-Reply-To: (Yu Zhao's message of "Mon, 10 Jan 2022 23:48:13 -0700")
Message-ID: <87v8ypybdc.fsf@yhuang6-desk2.ccr.corp.intel.com>

Yu Zhao writes:

> On Wed, Jan 05, 2022 at 08:34:40PM -0300, Mauricio Faria de Oliveira wrote:
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index 163ac4e6bcee..8671de473c25 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -1570,7 +1570,20 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>>
>>  			/* MADV_FREE page check */
>>  			if (!PageSwapBacked(page)) {
>> -				if (!PageDirty(page)) {
>> +				int ref_count = page_ref_count(page);
>> +				int map_count = page_mapcount(page);
>> +
>> +				/*
>> +				 * The only page refs must be from the isolation
>> +				 * (checked by the caller shrink_page_list() too)
>> +				 * and one or more rmap's (dropped by discard:).
>> +				 *
>> +				 * Check the reference count before dirty flag
>> +				 * with memory barrier; see __remove_mapping().
>> +				 */
>> +				smp_rmb();
>> +				if ((ref_count - 1 == map_count) &&
>> +				    !PageDirty(page)) {
>>  					/* Invalidate as we cleared the pte */
>>  					mmu_notifier_invalidate_range(mm,
>>  						address, address + PAGE_SIZE);
>
> Out of curiosity, how does it work with COW in terms of reordering?
> Specifically, it seems to me get_page() and page_dup_rmap() in
> copy_present_pte() can happen in any order, and if page_dup_rmap()
> is seen first, and direct io is holding a refcnt, this check can still
> pass?

I think that you are correct.

After more thought, it appears very tricky to compare the page count
and the map count.  Even if we add smp_rmb() between page_ref_count()
and page_mapcount(), an interrupt may happen between them.  During the
interrupt, the page count and the map count may change, for example,
by an unmap or by do_swap_page().

Best Regards,
Huang, Ying
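
To make the window concrete, here is a minimal userspace sketch of the problem, not kernel code.  The names struct fake_page, dup_mapping() and looks_unreferenced() are hypothetical and exist only for this illustration; the hook argument stands in for whatever runs between the two reads (an interrupt, or another CPU running fork).  It assumes a page with one mapping, one isolation reference and one extra reference such as direct IO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the two per-page counters. */
struct fake_page {
	int ref_count;	/* models page_ref_count() */
	int map_count;	/* models page_mapcount() */
};

/*
 * Models a change that lands between the two reads, e.g. a fork
 * duplicating the mapping: one more reference and one more mapping.
 */
static void dup_mapping(struct fake_page *p)
{
	p->ref_count++;
	p->map_count++;
}

/*
 * Reads the two counters one after the other, in the same order as
 * the quoted hunk.  The optional hook runs in the window between the
 * reads.  Returns true when the sampled values satisfy
 * ref_count - 1 == map_count, i.e. the page looks free of extra refs.
 */
static bool looks_unreferenced(struct fake_page *p,
			       void (*between)(struct fake_page *))
{
	int ref_count, map_count;

	assert(p != NULL);

	ref_count = p->ref_count;
	if (between)
		between(p);
	map_count = p->map_count;

	return ref_count - 1 == map_count;
}
```

Starting from ref_count 3 and map_count 1, looks_unreferenced(&page, NULL) returns false, because 3 - 1 != 1 and the extra reference is detected.  With dup_mapping as the hook, the stale ref_count of 3 is compared with the new map_count of 2 and the check passes, although the real counters (4 and 2) still include the extra reference.  A read barrier orders the two loads but does not make them a single snapshot.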