From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: 
Date: Tue, 25 Jun 2024 17:01:16 +0800
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH mm-unstable] mm: migrate: folio_ref_freeze() under xas_lock_irq()
Content-Language: en-US
To: Hugh Dickins , Andrew Morton 
CC: , Tony Luck , Miaohe Lin , , Matthew Wilcox , David
 Hildenbrand , Muchun Song , Benjamin LaHaise , , Zi Yan , Jiaqi Yan , Vishal Moola , Alistair Popple , Jane Chu , Oscar Salvador 
References: <07edaae7-ea5d-b6ae-3a10-f611946f9688@google.com>
From: Kefeng Wang 
In-Reply-To: <07edaae7-ea5d-b6ae-3a10-f611946f9688@google.com>
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 7bit

On 2024/6/25 13:04, Hugh Dickins wrote:
> Commit "mm: migrate: split folio_migrate_mapping()" drew attention to
> "Note, the folio_ref_freeze() is moved out of xas_lock_irq(), Since the
> folio is already isolated and locked during migration, so suppose that
> there is no functional change."
>
> That was a mistake. Freezing a folio's refcount to 0 is much like taking
> a spinlock: see how filemap_get_entry() takes rcu_read_lock() then spins
> around until the folio is unfrozen. If the task freezing is preempted (or
> calls cond_resched(), as folio_mc_copy() may do), then it risks deadlock:
> in my case, one CPU in zap_pte_range() with free_swap_and_cache_nr()
> trying to reclaim swap while PTL is held, all the other CPUs in reclaim
> spinning for that PTL.

Oh, thanks for pointing this out, I didn't take that into account.

>
> I'm uncertain whether it's necessary for interrupts to be disabled as
> well as preemption, but since they have to be disabled for the page
> cache migration, it's much the best to do it all together as before.
> So revert to folio_ref_freeze() under xas_lock_irq(): but keep the
> preliminary folio_ref_count() check, which does make sense before
> trying to copy the folio's data.
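To make the hazard concrete, here is a minimal sketch (abridged, not the
exact mm/filemap.c or mm/migrate.c source; expected_count and nr as in the
patch below) of the two sides involved:

        /* Reader side, cf. filemap_get_entry(): folio_try_get() fails
         * while the refcount is frozen at 0, so the lookup retries. */
        rcu_read_lock();
repeat:
        folio = xas_load(&xas);
        if (folio && !folio_try_get(folio))
                goto repeat;            /* spins until the folio is unfrozen */
        rcu_read_unlock();

        /* Freezer side, migration: between freeze and unfreeze the folio
         * is effectively "locked"; being preempted, or sleeping in
         * cond_resched(), here leaves every such reader spinning. */
        if (folio_ref_freeze(folio, expected_count)) {
                /* ... move the folio's entries over to newfolio ... */
                folio_ref_unfreeze(folio, expected_count - nr);
        }

So the freeze window has to stay short and non-preemptible, which is what
taking it under xas_lock_irq() provides.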
This is what my RFC version[1] does, which adds the same reference check
to avoid an unnecessary folio_mc_copy().

Should I resend all the patches, or should Andrew pick this one up
directly? Thanks.

[1] https://lore.kernel.org/linux-mm/20240129070934.3717659-7-wangkefeng.wang@huawei.com/

>
> Use "expected_count" for the expected count throughout.
>
> Signed-off-by: Hugh Dickins 
> ---
>  mm/migrate.c | 59 +++++++++++++++++++++++++---------------------------
>  1 file changed, 28 insertions(+), 31 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 27f070f64f27..8beedbb42a93 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -400,8 +400,8 @@ static int folio_expected_refs(struct address_space *mapping,
>   *    2 for folios with a mapping
>   *    3 for folios with a mapping and PagePrivate/PagePrivate2 set.
>   */
> -static void __folio_migrate_mapping(struct address_space *mapping,
> -                struct folio *newfolio, struct folio *folio, int expected_cnt)
> +static int __folio_migrate_mapping(struct address_space *mapping,
> +                struct folio *newfolio, struct folio *folio, int expected_count)

We can rename it back to folio_migrate_mapping().

>  {
>          XA_STATE(xas, &mapping->i_pages, folio_index(folio));
>          struct zone *oldzone, *newzone;
> @@ -415,13 +415,18 @@ static void __folio_migrate_mapping(struct address_space *mapping,
>                  newfolio->mapping = folio->mapping;
>                  if (folio_test_swapbacked(folio))
>                          __folio_set_swapbacked(newfolio);
> -                return;
> +                return MIGRATEPAGE_SUCCESS;
>          }
>
>          oldzone = folio_zone(folio);
>          newzone = folio_zone(newfolio);
>
>          xas_lock_irq(&xas);
> +        if (!folio_ref_freeze(folio, expected_count)) {
> +                xas_unlock_irq(&xas);
> +                return -EAGAIN;
> +        }
> +
>          /* Now we know that no one else is looking at the folio */
>          newfolio->index = folio->index;
>          newfolio->mapping = folio->mapping;
> @@ -456,7 +461,7 @@ static void __folio_migrate_mapping(struct address_space *mapping,
>           * old folio by unfreezing to one less reference.
>           * We know this isn't the last reference.
>           */
> -        folio_ref_unfreeze(folio, expected_cnt - nr);
> +        folio_ref_unfreeze(folio, expected_count - nr);
>
>          xas_unlock(&xas);
>          /* Leave irq disabled to prevent preemption while updating stats */
> @@ -504,23 +509,19 @@ static void __folio_migrate_mapping(struct address_space *mapping,
>                  }
>          }
>          local_irq_enable();
> +
> +        return MIGRATEPAGE_SUCCESS;
>  }
>
>  int folio_migrate_mapping(struct address_space *mapping, struct folio *newfolio,
>                  struct folio *folio, int extra_count)
>  {
> -        int expected_cnt = folio_expected_refs(mapping, folio) + extra_count;
> +        int expected_count = folio_expected_refs(mapping, folio) + extra_count;
>
> -        if (!mapping) {
> -                if (folio_ref_count(folio) != expected_cnt)
> -                        return -EAGAIN;
> -        } else {
> -                if (!folio_ref_freeze(folio, expected_cnt))
> -                        return -EAGAIN;
> -        }
> +        if (folio_ref_count(folio) != expected_count)
> +                return -EAGAIN;
>
> -        __folio_migrate_mapping(mapping, newfolio, folio, expected_cnt);
> -        return MIGRATEPAGE_SUCCESS;
> +        return __folio_migrate_mapping(mapping, newfolio, folio, expected_count);
>  }
>  EXPORT_SYMBOL(folio_migrate_mapping);
>
> @@ -534,16 +535,18 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
>          XA_STATE(xas, &mapping->i_pages, folio_index(src));
>          int ret, expected_count = folio_expected_refs(mapping, src);
>
> -        if (!folio_ref_freeze(src, expected_count))
> +        if (folio_ref_count(src) != expected_count)
>                  return -EAGAIN;
>
>          ret = folio_mc_copy(dst, src);
> -        if (unlikely(ret)) {
> -                folio_ref_unfreeze(src, expected_count);
> +        if (unlikely(ret))
>                  return ret;
> -        }
>
>          xas_lock_irq(&xas);
> +        if (!folio_ref_freeze(src, expected_count)) {
> +                xas_unlock_irq(&xas);
> +                return -EAGAIN;
> +        }
>
>          dst->index = src->index;
>          dst->mapping = src->mapping;
> @@ -660,24 +663,18 @@ static int __migrate_folio(struct address_space *mapping, struct folio *dst,
>                  struct folio *src, void *src_private,
>                  enum migrate_mode mode)
>  {
> -        int ret, expected_cnt = folio_expected_refs(mapping, src);
> +        int ret, expected_count = folio_expected_refs(mapping, src);
>
> -        if (!mapping) {
> -                if (folio_ref_count(src) != expected_cnt)
> -                        return -EAGAIN;
> -        } else {
> -                if (!folio_ref_freeze(src, expected_cnt))
> -                        return -EAGAIN;
> -        }
> +        if (folio_ref_count(src) != expected_count)
> +                return -EAGAIN;
>
>          ret = folio_mc_copy(dst, src);
> -        if (unlikely(ret)) {
> -                if (mapping)
> -                        folio_ref_unfreeze(src, expected_cnt);
> +        if (unlikely(ret))
>                  return ret;
> -        }
>
> -        __folio_migrate_mapping(mapping, dst, src, expected_cnt);
> +        ret = __folio_migrate_mapping(mapping, dst, src, expected_count);
> +        if (ret != MIGRATEPAGE_SUCCESS)
> +                return ret;
>
>          if (src_private)
>                  folio_attach_private(dst, folio_detach_private(src));