From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Huang, Ying" <ying.huang@intel.com>
To: Dev Jain <dev.jain@arm.com>
Subject: Re: [PATCH 1/2] mm: Retry migration earlier upon refcount mismatch
In-Reply-To: <15dbe4ac-a036-4029-ba08-e12a236f448a@arm.com> (Dev Jain's
	message of "Mon, 12 Aug 2024 11:31:29 +0530")
References: <20240809103129.365029-1-dev.jain@arm.com>
	<20240809103129.365029-2-dev.jain@arm.com>
	<87frrauwwv.fsf@yhuang6-desk2.ccr.corp.intel.com>
	<15dbe4ac-a036-4029-ba08-e12a236f448a@arm.com>
Date: Mon, 12 Aug 2024 14:15:49 +0800
Message-ID: <87bk1yuuzu.fsf@yhuang6-desk2.ccr.corp.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii
Dev Jain writes:

> On 8/12/24 11:04, Huang, Ying wrote:
>> Hi, Dev,
>>
>> Dev Jain writes:
>>
>>> As already done in __migrate_folio(), where we back off if the folio
>>> refcount is wrong, make this check during the unmapping phase too.
>>> On failure, the original state of the PTEs will be restored and the
>>> folio lock will be dropped via migrate_folio_undo_src(), any racing
>>> thread will make progress, and migration will be retried.
>>>
>>> Signed-off-by: Dev Jain
>>> ---
>>>  mm/migrate.c | 9 +++++++++
>>>  1 file changed, 9 insertions(+)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index e7296c0fb5d5..477acf996951 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1250,6 +1250,15 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
>>>  	}
>>>
>>>  	if (!folio_mapped(src)) {
>>> +		/*
>>> +		 * Someone may have changed the refcount and may be sleeping
>>> +		 * on the folio lock. In case of refcount mismatch, bail out,
>>> +		 * let the system make progress, and retry.
>>> +		 */
>>> +		struct address_space *mapping = folio_mapping(src);
>>> +
>>> +		if (folio_ref_count(src) != folio_expected_refs(mapping, src))
>>> +			goto out;
>>>  		__migrate_folio_record(dst, old_page_state, anon_vma);
>>>  		return MIGRATEPAGE_UNMAP;
>>>  	}
>>
>> Do you have some test results for this? For example, after applying the
>> patch, the migration success rate increased XX%, etc.
>
> I'll get back to you on this.
>
>> My understanding of this issue is that the migration success rate can
>> increase if we undo all changes before retrying.
>> This is the current behavior for sync migration, but not for async
>> migration. If so, could we use migrate_pages_sync() for async migration
>> too, to increase the success rate? Of course, we would need to change
>> the function name and comments.
>
> As per my understanding, this is not the current behaviour for sync
> migration. After successful unmapping, we fail in migrate_folio_move()
> with -EAGAIN; we do not undo src+dst (rendering the loop around
> migrate_folio_move() futile), and we do not push the failed folio onto
> the ret_folios list. Therefore, in _sync(), _batch() is never tried
> again.

In migrate_pages_sync(), migrate_pages_batch(,MIGRATE_ASYNC) is called
first. If that fails, the folio is restored to its original state
(unlocked). Then migrate_pages_batch(,_SYNC*) is called again. So, we
unlock once. If necessary, we can unlock more times via another level
of loop.

--
Best Regards,
Huang, Ying