From: "Huang, Ying"
To: Dev Jain
Subject: Re: [PATCH 1/2] mm: Retry migration earlier upon refcount mismatch
In-Reply-To: (Dev Jain's message of "Fri, 16 Aug 2024 17:01:58 +0530")
References: <20240809103129.365029-1-dev.jain@arm.com>
	<20240809103129.365029-2-dev.jain@arm.com>
	<87frrauwwv.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<15dbe4ac-a036-4029-ba08-e12a236f448a@arm.com>
	<87bk1yuuzu.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<95b72817-5444-4ced-998a-1cb90f42bf49@arm.com>
	<8734naurhm.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<9d84e4e8-ac54-4eb1-a113-3f32aea864c9@arm.com>
	<391d4f4f-e642-4c11-a36b-190874963f8a@arm.com>
Date: Mon, 19 Aug 2024 14:58:12 +0800
Message-ID: <87a5h9hud7.fsf@yhuang6-desk2.ccr.corp.intel.com>

Dev Jain writes:

> On 8/13/24 12:52, Dev Jain wrote:
>>
>> On 8/13/24 10:30, Dev Jain wrote:
>>>
>>> On 8/12/24 17:38, Dev Jain wrote:
>>>>
>>>> On 8/12/24 13:01, Huang, Ying wrote:
>>>>> Dev Jain writes:
>>>>>
>>>>>> On 8/12/24 11:45, Huang, Ying wrote:
>>>>>>> Dev Jain writes:
>>>>>>>
>>>>>>>> On 8/12/24 11:04, Huang, Ying wrote:
>>>>>>>>> Hi, Dev,
>>>>>>>>>
>>>>>>>>> Dev Jain writes:
>>>>>>>>>
>>>>>>>>>> As is already done in __migrate_folio(), where we back off if the
>>>>>>>>>> folio refcount is wrong, make this check during the unmapping
>>>>>>>>>> phase. Upon its failure, the original state of the PTEs will be
>>>>>>>>>> restored and the folio lock will be dropped via
>>>>>>>>>> migrate_folio_undo_src(), any racing thread will make progress,
>>>>>>>>>> and migration will be retried.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Dev Jain
>>>>>>>>>> ---
>>>>>>>>>>  mm/migrate.c | 9 +++++++++
>>>>>>>>>>  1 file changed, 9 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>>>>>>> index e7296c0fb5d5..477acf996951 100644
>>>>>>>>>> --- a/mm/migrate.c
>>>>>>>>>> +++ b/mm/migrate.c
>>>>>>>>>> @@ -1250,6 +1250,15 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
>>>>>>>>>>  	}
>>>>>>>>>>
>>>>>>>>>>  	if (!folio_mapped(src)) {
>>>>>>>>>> +		/*
>>>>>>>>>> +		 * Someone may have changed the refcount and may be
>>>>>>>>>> +		 * sleeping on the folio lock. In case of refcount
>>>>>>>>>> +		 * mismatch, bail out, let the system make progress
>>>>>>>>>> +		 * and retry.
>>>>>>>>>> +		 */
>>>>>>>>>> +		struct address_space *mapping = folio_mapping(src);
>>>>>>>>>> +
>>>>>>>>>> +		if (folio_ref_count(src) != folio_expected_refs(mapping, src))
>>>>>>>>>> +			goto out;
>>>>>>>>>>  		__migrate_folio_record(dst, old_page_state, anon_vma);
>>>>>>>>>>  		return MIGRATEPAGE_UNMAP;
>>>>>>>>>>  	}
>>>>>>>>> Do you have some test results for this?  For example, after
>>>>>>>>> applying the patch, the migration success rate increased XX%, etc.
>>>>>>>> I'll get back to you on this.
>>>>>>>>
>>>>>>>>> My understanding of this issue is that the migration success rate
>>>>>>>>> can increase if we undo all changes before retrying.  This is the
>>>>>>>>> current behavior for sync migration, but not for async migration.
>>>>>>>>> If so, can we use migrate_pages_sync() for async migration too to
>>>>>>>>> increase the success rate?  Of course, we would need to change the
>>>>>>>>> function name and comments.
>>>>>>>> As per my understanding, this is not the current behaviour for sync
>>>>>>>> migration. After successful unmapping, we fail in
>>>>>>>> migrate_folio_move() with -EAGAIN, we do not call undo src+dst
>>>>>>>> (rendering the loop around migrate_folio_move() futile), and we do
>>>>>>>> not push the failed folio onto the ret_folios list; therefore, in
>>>>>>>> _sync(), _batch() is never tried again.
>>>>>>> In migrate_pages_sync(), migrate_pages_batch(,MIGRATE_ASYNC) will
>>>>>>> be called first; if that fails, the folio will be restored to its
>>>>>>> original state (unlocked).  Then migrate_pages_batch(,_SYNC*) is
>>>>>>> called again.  So, we unlock once.  If necessary, we can unlock
>>>>>>> more times via another level of loop.
>>>>>> Yes, that's my point. We need to undo src+dst and retry.
>>>>> For sync migration, we undo src+dst and retry now, but only once.
>>>>> You have shown that more retrying increases the success rate.
>>>>>
>>>>>> We will have to decide where we want this retrying to be; do we
>>>>>> want to change the return value, end up in the while loop wrapped
>>>>>> around _sync(), and retry there by adding another level of loop, or
>>>>>> do we want to make use of the existing retry loops, one of which is
>>>>>> wrapped around _unmap(); the latter is my approach. The utility I
>>>>>> see in the former approach is that, in case of a large number of
>>>>>> page migrations (which should usually be the case), we are giving
>>>>>> more time for the folio to get retried. The latter does not give
>>>>>> much time and discards the folio if it does not succeed within 7
>>>>>> attempts.
>>>>> Because it's a race, I guess that most folios will be migrated
>>>>> successfully in the first pass.
>>>>>
>>>>> My concern with your method is that it deals with just one case
>>>>> specially, while retrying after undoing all appears more general.
>>>>
>>>>
>>>> Makes sense. Also, please ignore my "change the return value"
>>>> thing; I got confused between unmap_folios, ret_folios, etc.
>>>> Now I think I understand what the lists are doing :)
>>>>
>>>>>
>>>>> If it's really important to retry after undoing all, we can either
>>>>> convert the two retry loops of migrate_pages_batch() into one loop,
>>>>> or remove the retry loop in migrate_pages_batch() and retry in its
>>>>> caller instead.
>>>>
>>>> And if I implemented this correctly, the following makes the test
>>>> pass always:
>>>> https://www.codedump.xyz/diff/Zrn7EdxzNXmXyNXe
>>>
>>>
>>> Okay, I messed up the implementation, leading to a false
>>> positive. Let me try again :)
>>
>>
>> Hopefully this should do the job:
>> https://www.codedump.xyz/diff/ZrsIV8JSOPYx5V_u
>>
>> But the result is worse than with the proposed patch; I rarely hit a
>> three-digit number of move_pages() successes. But, on a base kernel
>> without any changes, when I apply David's suggestion to change the
>> test, if I choose 7 as the number of retries
>> (= NR_MAX_MIGRATE_SYNC_RETRY) in the test, I can even reach four
>> digits. I am puzzled.
>> We can also try merging the for loops of unmap and move...
>
>
> If people are okay with this change, I guess I can send it as
> a v2? I concur with your assessment that my initial approach
> solves a specific case; the above approach does give me
> a slight improvement on arm64 and should be an improvement
> in general, since it makes sense to defer retrying the failed folio
> as much as we can.

We need to deal with some other things before a formal v2:

- Stats need to be fixed; please check the result processing for the
  first loop of migrate_pages_sync().

- Do we need something similar for async migration?

- Can we add another level of explicit loop around the second loop of
  migrate_pages_sync()?  That is to improve code readability.  Or add
  a function to do that?  (A rough sketch of this shape follows
  below.)

- Is it good to remove the retry loop in migrate_pages_batch()?  And
  do the retry in the caller instead?
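
To make the third point concrete, the extra pass could take roughly
the following shape.  This is a sketch only, not a patch:
NR_EXTRA_SYNC_PASSES is a hypothetical bound, result/stats processing
(the first point above) is elided, and the parameters just mirror
migrate_pages_batch() as it stands in mm/migrate.c:

#define NR_EXTRA_SYNC_PASSES	3	/* hypothetical bound */

static int migrate_pages_sync_passes(struct list_head *from,
		new_folio_t get_new_folio, free_folio_t put_new_folio,
		unsigned long private, enum migrate_mode mode, int reason,
		struct list_head *ret_folios, struct list_head *split_folios,
		struct migrate_pages_stats *stats)
{
	LIST_HEAD(retry_folios);
	int pass, rc = 0;

	for (pass = 0; pass < NR_EXTRA_SYNC_PASSES; pass++) {
		/* sync migration; failed folios land on retry_folios */
		rc = migrate_pages_batch(from, get_new_folio, put_new_folio,
					 private, mode, reason, &retry_folios,
					 split_folios, stats,
					 NR_MAX_MIGRATE_SYNC_RETRY);
		if (list_empty(&retry_folios))
			return rc;
		if (pass == NR_EXTRA_SYNC_PASSES - 1)
			break;
		/* failed folios are unlocked with PTEs restored; retry */
		list_splice_tail_init(&retry_folios, from);
		cond_resched();
	}
	/* give up on whatever still failed after every pass */
	list_splice_tail_init(&retry_folios, ret_folios);
	return rc;
}

Each iteration is then a full undo/redo cycle: every failed folio is
unlocked with its PTEs restored before the next pass, so a racing
thread holding a transient reference can run in between, which is the
property the in-place retries inside migrate_pages_batch() lack.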
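
For reference, the check in the original patch compares against
folio_expected_refs(), which counts only the references migration
itself expects, so any surplus means another thread holds a transient
reference.  As of the kernels discussed in this thread it is roughly
(annotations added):

static int folio_expected_refs(struct address_space *mapping,
		struct folio *folio)
{
	int refs = 1;			/* the isolation reference */

	if (!mapping)
		return refs;		/* anon folio outside swap cache */

	refs += folio_nr_pages(folio);	/* one per page cache entry */
	if (folio_test_private(folio))
		refs++;			/* e.g. buffer heads */

	return refs;
}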
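
The success counts above come from a move_pages()-based test that is
not reproduced in this thread.  Purely as an illustration of how such
counting works, a minimal stand-alone harness might look like the
following; it is hypothetical, assumes libnuma and that node 0 is a
valid target, and builds with "gcc mp_test.c -lnuma":

#define _GNU_SOURCE
#include <numaif.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	enum { NR = 64 };
	void *pages[NR];
	int nodes[NR], status[NR];
	char *buf;
	int i, ok = 0;

	if (posix_memalign((void **)&buf, psz, NR * psz))
		return 1;
	memset(buf, 1, NR * psz);	/* fault the pages in */

	for (i = 0; i < NR; i++) {
		pages[i] = buf + (long)i * psz;
		nodes[i] = 0;		/* target node; assumed to exist */
	}

	/* pid 0 = current process; MPOL_MF_MOVE moves unshared pages */
	if (move_pages(0, NR, pages, nodes, status, MPOL_MF_MOVE) < 0) {
		perror("move_pages");
		return 1;
	}

	for (i = 0; i < NR; i++)
		if (status[i] >= 0)	/* node on success, -errno on failure */
			ok++;

	printf("migrated %d/%d pages\n", ok, NR);
	return 0;
}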
--
Best Regards,
Huang, Ying