From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Huang, Ying" <ying.huang@intel.com>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,  linux-mm@kvack.org,
  linux-kernel@vger.kernel.org,  "Xu, Pengfei" <pengfei.xu@intel.com>,
  Christoph Hellwig <hch@lst.de>,  Stefan Roesch <shr@devkernel.io>,  Tejun
 Heo <tj@kernel.org>,  Xin Hao <xhao@linux.alibaba.com>,  Zi Yan
 <ziy@nvidia.com>,  Yang Shi <shy828301@gmail.com>,  Baolin Wang
 <baolin.wang@linux.alibaba.com>,  Matthew Wilcox <willy@infradead.org>,
  Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH 3/3] migrate_pages: try migrate in batch asynchronously
 firstly
References: <20230224141145.96814-1-ying.huang@intel.com>
	<20230224141145.96814-4-ying.huang@intel.com>
	<bdc873-3367-9aa7-79c6-91c68fecac41@google.com>
	<87cz5ub5dr.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<070f71-9af-c29a-30b9-758b5cdf6766@google.com>
	<874jr5atqf.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<c9de353-2420-d076-9fff-d6011611c2b@google.com>
Date: Wed, 01 Mar 2023 15:10:53 +0800
In-Reply-To: <c9de353-2420-d076-9fff-d6011611c2b@google.com> (Hugh Dickins's
	message of "Tue, 28 Feb 2023 22:46:47 -0800 (PST)")
Message-ID: <87356p9caq.fsf@yhuang6-desk2.ccr.corp.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>

Hugh Dickins <hughd@google.com> writes:

> On Wed, 1 Mar 2023, Huang, Ying wrote:
>> Hugh Dickins <hughd@google.com> writes:
>> > On Tue, 28 Feb 2023, Huang, Ying wrote:
>> >> Hugh Dickins <hughd@google.com> writes:
>> >> > On Fri, 24 Feb 2023, Huang Ying wrote:
>> >> >> 
>> >> >> diff --git a/mm/migrate.c b/mm/migrate.c
>> >> >> index 91198b487e49..c17ce5ee8d92 100644
>> >> >> --- a/mm/migrate.c
>> >> >> +++ b/mm/migrate.c
>> >> >> @@ -1843,6 +1843,51 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>> >> >>  	return rc;
>> >> >>  }
>> >> >>  
>> >> >> +static int migrate_pages_sync(struct list_head *from, new_page_t get_new_page,
>> >> >> +		free_page_t put_new_page, unsigned long private,
>> >> >> +		enum migrate_mode mode, int reason, struct list_head *ret_folios,
>> >> >> +		struct list_head *split_folios, struct migrate_pages_stats *stats)
>> >> >> +{
>> >> >> +	int rc, nr_failed = 0;
>> >> >> +	LIST_HEAD(folios);
>> >> >> +	struct migrate_pages_stats astats;
>> >> >> +
>> >> >> +	memset(&astats, 0, sizeof(astats));
>> >> >> +	/* Try to migrate in batch with MIGRATE_ASYNC mode firstly */
>> >> >> +	rc = migrate_pages_batch(from, get_new_page, put_new_page, private, MIGRATE_ASYNC,
>> >> >> +				 reason, &folios, split_folios, &astats,
>> >> >> +				 NR_MAX_MIGRATE_PAGES_RETRY);
>> >> >
>> >> > I wonder if that and below would better be NR_MAX_MIGRATE_PAGES_RETRY / 2.
>> >> >
>> >> > Though I've never got down to adjusting that number (and it's not a job
>> >> > to be done in this set of patches), those 10 retries sometimes terrify
>> >> > me, from a latency point of view.  They can have such different weights:
>> >> > in the unmapped case, 10 retries is okay; but when a pinned page is mapped
>> >> > into 1000 processes, the thought of all that unmapping and TLB flushing
>> >> > and remapping is terrifying.
>> >> >
>> >> > Since you're retrying below, halve both numbers of retries for now?
>> >> 
>> >> Yes.  These are reasonable concerns.
>> >> 
>> >> And in the original implementation, we only wait for the page lock and
>> >> for writeback to complete if pass > 2.  This effectively tries to
>> >> migrate asynchronously 3 times before the real synchronous
>> >> migration.  So, should we delete the "force" logic (in
>> >> migrate_folio_unmap()), and try to migrate asynchronously 3 times in
>> >> batch before migrating synchronously 7 times one by one?
>> >
>> > Oh, that's a good idea (but please don't imagine I've thought it through):
>> > I hadn't realized the way in which your migrate_pages_sync() addition is
>> > kind of duplicating the way that the "force" argument conditions behaviour.
>> > It would be very appealing to delete the "force" argument now if you can.
>> 
>> Sure.  Will do that in the next version.
>> 
>> > But aside from that, you've also made me wonder (again, please remember I
>> > don't have a good picture of the new migrate_pages() sequence in my head)
>> > whether you have already made a *great* strike against my 10 retries
>> > terror.  Am I reading it right, that the unmapping is now done on the
>> > first try, and the remove_migration_ptes after the last try (all the
>> > pages involved having remained locked throughout)?
>> 
>> Yes.  You are right.  Now, unmapping and moving are two separate steps,
>> and they are retried separately.  After a folio has been unmapped
>> successfully, we will not remap/unmap it 10 times if the folio is pinned
>> and therefore fails to move (migrate_folio_move()).  So the latency caused
>> by retrying is much lower now.  But I still tend to keep the total retry
>> number as before.  Do you agree?
>
> Yes, I agree, keep the total retry number 10 as before: maybe someone in
> future will show that more than 5 is a waste of time, but there's little
> need to get into that now: if you've put an end to that 10 times unmapping
> and remapping, that's a great step forward, quite apart from the TLB flush
> batching itself.
>
> (I did change "no need" to "little need" above: I do have some
> anxiety about the increased latencies from keeping folios locked and
> migration entries in place for significantly longer than before your
> batching: I won't be surprised if the maximum batch size has to be
> lowered, if reports of latency spikes come in; and that might extend
> to the retry count too.)

Yes.  Latency is always a concern for batching.  We may revisit this
when needed.  One good thing now is that we never wait for the lock or
the writeback bit in batched mode.  Latency tolerance depends on the
caller too: for example, when we migrate cold pages from DRAM to CXL
memory, we can tolerate relatively long latency.  So, if necessary, we
can also add a parameter to migrate_pages() to restrict the batch size
and the retry count.
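
To make the idea of a caller-supplied limit concrete, here is a rough
user-space sketch.  It is not kernel code: struct migrate_limits,
struct retry_plan, plan_migration() and the default batch size of 512
are all hypothetical names and values chosen for illustration.  Only
the total of 10 retries and the proposed 3 asynchronous + 7 synchronous
split come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-caller limits; 0 means "use the default". */
struct migrate_limits {
	unsigned int max_batch;	/* upper bound on folios per batch */
	unsigned int max_retry;	/* upper bound on total passes */
};

/* Illustrative defaults (assumptions, not taken from mm/migrate.c). */
#define DEFAULT_MAX_BATCH	512u
#define DEFAULT_MAX_RETRY	10u	/* the 10 retries discussed above */
#define ASYNC_PASSES		3u	/* batched async passes tried first */

struct retry_plan {
	unsigned int batch;		/* folios per batch */
	unsigned int async_passes;	/* batched MIGRATE_ASYNC passes */
	unsigned int sync_passes;	/* one-by-one synchronous passes */
};

/* Clamp the defaults by the caller's limits and split the retry budget. */
static struct retry_plan plan_migration(const struct migrate_limits *lim)
{
	struct retry_plan p;
	unsigned int retry = DEFAULT_MAX_RETRY;

	p.batch = DEFAULT_MAX_BATCH;
	if (lim) {
		/* A limit can only lower a default, never raise it. */
		if (lim->max_batch && lim->max_batch < p.batch)
			p.batch = lim->max_batch;
		if (lim->max_retry && lim->max_retry < retry)
			retry = lim->max_retry;
	}

	/* Spend the budget on async batch passes first, the rest on sync. */
	p.async_passes = retry < ASYNC_PASSES ? retry : ASYNC_PASSES;
	p.sync_passes = retry - p.async_passes;
	assert(p.async_passes + p.sync_passes == retry);
	return p;
}
```

With this sketch, a caller that tolerates long latency (such as the
cold page demotion case) would pass NULL and get the defaults, while a
latency-sensitive caller could pass smaller limits.  Whether such
limits belong in a struct or in plain arguments is left open here.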

Best Regards,
Huang, Ying