From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0D067C52D7C for ; Mon, 12 Aug 2024 05:35:21 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 223266B0083; Mon, 12 Aug 2024 01:35:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1AA666B008A; Mon, 12 Aug 2024 01:35:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 04B906B008C; Mon, 12 Aug 2024 01:35:20 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id DB2546B0083 for ; Mon, 12 Aug 2024 01:35:20 -0400 (EDT) Received: from smtpin23.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 47BDB1606B2 for ; Mon, 12 Aug 2024 05:35:20 +0000 (UTC) X-FDA: 82442480400.23.1EC4230 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf26.hostedemail.com (Postfix) with ESMTP id 23D08140005 for ; Mon, 12 Aug 2024 05:35:17 +0000 (UTC) Authentication-Results: imf26.hostedemail.com; dkim=none; spf=pass (imf26.hostedemail.com: domain of dev.jain@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=dev.jain@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1723440864; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Z+LFX5/j5Ru37Yq8p04VrA1RYyqfUkRPWec/T/aKN/k=; b=Ns88iheNcdq12/dc1spTGEjnjjzZuiTuwmgOKJ9jzMeAmOhUmFTb92oBtd/8vwrYzyAMGt xfHYG8vcyokXd5um9ktzU+MvDNBnNVTsALuNWiMirjesnpuAqA7ER1hwE09vO2cDUMd+Tv RU2I84l0lHMYereSLzQHpTBXDkiJREU= ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=none; spf=pass (imf26.hostedemail.com: domain of dev.jain@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=dev.jain@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1723440864; a=rsa-sha256; cv=none; b=ZARprhG6O09Tw1YUm2A8cwlRgfom/xBtMr9uxYRiQaH/ec2oUGAHfu1DrUXLWblFWt6IDN cxCHEn4vAgdt1BN+HVrRB/xEtqu3aRabCjHpsD3ebEm3FVVbdBhyYs4BHjcn4ObyJzA0JO TRs57CXcfs8li0fMJ7UG7qEXXpqIcPo= Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id F03CAFEC; Sun, 11 Aug 2024 22:35:42 -0700 (PDT) Received: from [10.162.43.141] (e116581.arm.com [10.162.43.141]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C43693F6A8; Sun, 11 Aug 2024 22:35:07 -0700 (PDT) Message-ID: <0d049ec4-ab39-441b-8560-5613f3527473@arm.com> Date: Mon, 12 Aug 2024 11:05:04 +0530 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH 1/2] mm: Retry migration earlier upon refcount mismatch To: David Hildenbrand , akpm@linux-foundation.org, shuah@kernel.org, willy@infradead.org Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com, osalvador@suse.de, baolin.wang@linux.alibaba.com, dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org, ioworker0@gmail.com, gshan@redhat.com, mark.rutland@arm.com, kirill.shutemov@linux.intel.com, hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com, broonie@kernel.org, mgorman@techsingularity.net, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, ying.huang@intel.com References: <20240809103129.365029-1-dev.jain@arm.com> <20240809103129.365029-2-dev.jain@arm.com> <761ba58e-9d6f-4a14-a513-dcc098c2aa94@redhat.com> <5a4ae1d3-d753-4261-97a8-926e44d4217a@arm.com> <367b0403-7477-4857-9e7c-5a749c723432@redhat.com> <04e12698-8f83-4033-91b2-3a402c59c17a@redhat.com> Content-Language: en-US From: Dev Jain In-Reply-To: <04e12698-8f83-4033-91b2-3a402c59c17a@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Stat-Signature: fx8w4paub5rut83i4ceg7xskkr7j7iqa X-Rspam-User: X-Rspamd-Queue-Id: 23D08140005 X-Rspamd-Server: rspam02 X-HE-Tag: 1723440917-524776 X-HE-Meta: U2FsdGVkX19KleYqzbFhpCbnmoW+t4/lT8Yt1VmMVHCP31GDKrXPcUMDKTDfIduSgwZci7JzFoWTQXa+7Z6Fpmm3C0azSrsHRmKPww9ps+KywZ9s6dEP7y/YTNskls7+HP1Ff75G17fQaWZ113PgbAZwbujCyLyJ9cHvdJ6g3UBx0u2YvYIQAeZiLoLTQZGM9vaPio8PHFDyO827hp9Qj+t7ALiAnLluwk0VQyK78T8gSQZg7skQ5gizt21AAN5wjQ03jhAgjXHj0LYYiQ6xGrlgfjWr/J2+ms2H9NM7xZrE92ijS2tgiUQ/A6Biyy/6PW1DYXdqiI2NI9TOgfcuNqpdhV+duTr0MAhTRKjcPa/8buddLCXEWKi+ORWc+xppi8uBi2ysJOyjt5IL38cQUNyv/4tGYEnmMuqcRtUDvR9HbvmD0EDfnLgC6ZM4D3CmFit6Imi+Ttmq8b6/Q7xy5jpm4UdAuustGzXxGccDR+54o5YWAq7rDFMNoxI7Z+akrlrxebvnQV6bLfq27jep2q9fXCX+ruccW5Rnwxb1TerwuS/p/tdK1FMFGDC1ciDgh95o0rO3keWuCwsRByKz9pEbxCxumxcjTYmi4R2p9Rg0duyYYaOSxY1yfASV7/JNuUc/C90OZd3ug94HwPa7BtuBVOKeVoKwYUErVuP1grBxGdEradpN/fZ9v8uuvH61urdk9fCh17GceoD11U0PLZv7LhDJrnoyUdbZcrUqH7+UB7kf76lN3HxqkhM2CGZjzz7i69vCG9XyjyGd/9MEYiCaVvbfFHIwTanCC9ed8qS4t1lB84MRTaAYSFha8qDsH1Iup39hgETUyf1pZnfRbVWn9FUGjWa4dpIQyBkzNmoELLaD6453xwT1Fk0hlM7PXRoTGxIY1VpEH4/XFl+0SgrdAA9OX6Qw+BUs9ngXZGb6TYV7Yh0ePiJEpWUQ617vtAFtH5Xv8vRkzI5lPtg Bcu3uJtU oKOtcEJllVSiLyDcb3Fn2JblSVg60oLQ4Bugi52z38akfepx8iWEBMCcIIUB+NMzgpNkfLSTOCQBma4OKjpBRF2ePD3m5TzQUfp75WaSoNyvLjOjNTuETdbIBB21Y2hsNQq46Xcy+85NH4z8muf7yWGCPf7SQk3lQXM9tnQTrr67KzyQQcvOx719AAcTyhuNXPQ44yZhXI8/3xt7dTUSXGn9Q/Z3fDFTuxbfS35BFVnAgDkU= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 8/11/24 14:38, David Hildenbrand wrote: > On 11.08.24 08:06, Dev Jain wrote: >> >> On 8/11/24 00:22, David Hildenbrand wrote: >>> On 10.08.24 20:42, Dev Jain wrote: >>>> >>>> On 8/9/24 19:17, David Hildenbrand wrote: >>>>> On 09.08.24 12:31, Dev Jain wrote: >>>>>> As already being done in __migrate_folio(), wherein we backoff if >>>>>> the >>>>>> folio refcount is wrong, make this check during the unmapping phase, >>>>>> upon >>>>>> the failure of which, the original state of the PTEs will be >>>>>> restored >>>>>> and >>>>>> the folio lock will be dropped via migrate_folio_undo_src(), any >>>>>> racing >>>>>> thread will make progress and migration will be retried. >>>>>> >>>>>> Signed-off-by: Dev Jain >>>>>> --- >>>>>>     mm/migrate.c | 9 +++++++++ >>>>>>     1 file changed, 9 insertions(+) >>>>>> >>>>>> diff --git a/mm/migrate.c b/mm/migrate.c >>>>>> index e7296c0fb5d5..477acf996951 100644 >>>>>> --- a/mm/migrate.c >>>>>> +++ b/mm/migrate.c >>>>>> @@ -1250,6 +1250,15 @@ static int migrate_folio_unmap(new_folio_t >>>>>> get_new_folio, >>>>>>         } >>>>>>           if (!folio_mapped(src)) { >>>>>> +        /* >>>>>> +         * Someone may have changed the refcount and maybe sleeping >>>>>> +         * on the folio lock. In case of refcount mismatch, bail >>>>>> out, >>>>>> +         * let the system make progress and retry. >>>>>> +         */ >>>>>> +        struct address_space *mapping = folio_mapping(src); >>>>>> + >>>>>> +        if (folio_ref_count(src) != folio_expected_refs(mapping, >>>>>> src)) >>>>>> +            goto out; >>>>> >>>>> This really seems to be the latest point where we can "easily" back >>>>> off and unlock the source folio -- in this function :) >>>>> >>>>> I wonder if we should be smarter in the migrate_pages_batch() loop >>>>> when we start the actual migrations via migrate_folio_move(): if we >>>>> detect that a folio has unexpected references *and* it has waiters >>>>> (PG_waiters), back off then and retry the folio later. If it only has >>>>> unexpected references, just keep retrying: no waiters -> nobody is >>>>> waiting for the lock to make progress. >>>> >>>> >>>> The patch currently retries migration irrespective of the reason of >>>> refcount change. >>>> >>>> If you are suggesting that, break the retrying according to two >>>> conditions: >>> >>> That's not what I am suggesting ... >>> >>>> >>>> >>>>> This really seems to be the latest point where we can "easily" back >>>>> off and unlock the source folio -- in this function :) >>>>> For example, when migrate_folio_move() fails with -EAGAIN, check if >>>>> there are waiters (PG_waiter?) and undo+unlock to try again later. >>>> >>>> >>>> Currently, on -EAGAIN, migrate_folio_move() returns without undoing >>>> src >>>> and dst; even if we were to fall >>> >>> ... >>> >>> I am wondering if we should detect here if there are waiters and undo >>> src+dst. >> >> After undoing src+dst, which restores the PTEs, how are you going to >> set the >> >> PTEs to migration again? That is being done through >> migrate_folio_unmap(), >> >> and the loops of _unmap() and _move() are different. Or am I missing >> something... > > Again, no expert on the code, but it would mean that if we detect that > there are waiters, we would undo src+dst and add them to ret_folios, > similar to what we do in "Cleanup remaining folios" at the end of > migrate_pages_batch()? > > So instead of retrying migration of that folio, just give it up > immediately and retry again later. > > Of course, this means that (without further modifications to that > function), we would leave retrying these folios to the caller, such as > in migrate_pages_sync(), where we move ret_folios to the tail of > "folios" and retry migration. So IIUC, you are saying to change the return value in __folio_migrate_mapping(), so that when move_to_new_folio() fails in migrate_folio_move(), we end up in the retrying loop of _sync() which calls _batch() in synchronous mode. Here, we will have to make a change to decide how much we want to retry? > > > Maybe one would want to optimize that retry logic with such > "temporarily failed because someone else has to make progress for us > to make progress and free up a page reference" case. These are > different to the typical "speculative" references that we try to > handle via the existing retry magic. > > Please let me know if I am missing something fundamental. > >