From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5502AC77B60 for ; Wed, 26 Apr 2023 08:17:36 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DBB546B00A0; Wed, 26 Apr 2023 04:17:35 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D6A216B00A2; Wed, 26 Apr 2023 04:17:35 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C59A86B00A3; Wed, 26 Apr 2023 04:17:35 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id B3C6C6B00A0 for ; Wed, 26 Apr 2023 04:17:35 -0400 (EDT) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 86CC7A01A6 for ; Wed, 26 Apr 2023 08:17:35 +0000 (UTC) X-FDA: 80722838070.09.E7E5003 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf30.hostedemail.com (Postfix) with ESMTP id 85B1180004 for ; Wed, 26 Apr 2023 08:17:33 +0000 (UTC) Authentication-Results: imf30.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf30.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1682497053; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=jjzViqxCaeqhhIYWQgy4ikHBS6ujpeX8CDX9/fAnhBc=; b=m4BPmzSrk59VCAjhw3UN3yCnjQFYfbaRzKYzDbvQPNlqvIW/31ccubEcfdX416fJvzKhPh hTdHGKZ727BimS3KW4EmEVKc0Q96SWzmo9uEd7N/rU2Ik9nDGjFHkX4ooPIkI9uPYIpr62 Ps45EZH/s7qmuy9kSynVNwdtC8UGr60= ARC-Authentication-Results: i=1; imf30.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf30.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1682497053; a=rsa-sha256; cv=none; b=4LmKJxK0Ss2YktfSkNDOurxfMp3RhQaKAein5Sd27Ev+AeyQj6zBEpPs+pG/FunpIAabpV n/YgcKeu3w2d+9kzmpJcsvODkN3gp/I3vXC7tjoFaTz29f/T68JeBUzCbtjgIlAkzIa624 GXjaY95HRHYLqzL/VrCxXiocjocIBHg= Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 686ED4B3; Wed, 26 Apr 2023 01:18:16 -0700 (PDT) Received: from [10.57.69.78] (unknown [10.57.69.78]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 390CC3F587; Wed, 26 Apr 2023 01:17:31 -0700 (PDT) Message-ID: <1b9c8578-c411-d2f0-f711-91582b2047bc@arm.com> Date: Wed, 26 Apr 2023 09:17:29 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.10.0 Subject: Re: [PATCH v2 1/2] THP: avoid lock when check whether THP is in deferred list Content-Language: en-US To: Yin Fengwei , "Kirill A. Shutemov" Cc: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org, yuzhao@google.com, ying.huang@intel.com References: <20230425084627.3573866-1-fengwei.yin@intel.com> <20230425084627.3573866-2-fengwei.yin@intel.com> <20230425123857.k5mp2cdp5c6ldbrk@box.shutemov.name> <06f01cd3-9e9d-8748-4fc6-c36b706369cb@intel.com> From: Ryan Roberts In-Reply-To: <06f01cd3-9e9d-8748-4fc6-c36b706369cb@intel.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 85B1180004 X-Stat-Signature: 5scjo6nuq9akh8s7kwybr1wq4u8myryy X-HE-Tag: 1682497053-678925 X-HE-Meta: U2FsdGVkX19jZITnvBC+tSX1VB2dPH8AxHtyTWb9b+znDlZsnHXQAMOQETf5ac//VCK8vKWDAln6jhXoLjuwBZc3+QHb7rS9mBDF4/r0ajkdMVzR6MlWjr5ONoDUUMel4SrD+XHgVRF1xwP9vBm9QEpHMo6wzeDM2V0vrKVG4RL1n5xUoVyttZlfxqSxPn4BQFHCmJBnaDZNiyvXaUk5ic8Kf+3SM6sNOCON046mVxwhJr5zhzCKsBG53purFoE8c2Vbm5yfUK0QlHl3tmZEHP25cmDealQ3N40D4wP/OKh5Yz2BxhRQ4zdggBbQaoezwNTG1mIyAel1vkpd1El5Nk2l/ZQDwSUbYYoxjcuKKJgR7i0sIhUZKYmgsYgFxoBccj0M0ViD1FgwR1dCQJOcntjulx6vn8EEdcuv9sq8uUi6N5fsALtX9mzaBKWXAKyQsE3i6TukTtY0QmIfeSRdydS1iR5OLexJyzyC8JAjL83Fj/VtmpSbOVIa9Z92OBm6nI9HDf65eeadL1McYPQWQiemS0cb8jqVuPvlt0jcfWl/jdQ0qNmKwXsp2MhSEF2GrB3BJn99YjYlx5jeH2QNpgAWldVDQchz6NU1J6dAx/AW2z5brVTALMbW+XvKRylax1R8Wstj63363x6Y4FlccsBvGIJZcm/Er3P1kwg+4Kwc9Y5hlnpJ2TRzjrUXQxBsVH/VNSGQkGmMNSHaAKcSdddQi9e9w35/GdJeywKNu/OgADeZtvITICsTmPpUoxQ/LKU+AsWL6QrNdVT32FRYb0KDdmgDsHcZf744QMJfkn0wfUITXvnd+o2uNW5hjO5j+3Lk/JcvuDFuH+TMnUEbAU0qOJBupwwN78t8jOo5tXwz6brp/A+dTJx5m390QWXjoendH88Vc/3wboHW6GSSCvmC5dTIl6iT3Th1Y4crXmWin8+dRedyRmNR30KpYvMGZ5Ym2rft0WUrEdNSxjf dUdH+Abl 4BRYwE5cmRrfmDaQ38d0lSzHZ0wqNbNe9k6lUtF4FeXCyiju/bubWxWJbZQusV/7rouaJqaRuuI89wtyOwVCiSsooKjud/whV2+RxKv/mP6wUvwGIusv2SgIzEZ4qxGVAH1GIVdzxTHo1TACHu8IzwDuX3Hf6GDWI+xNybAEWMJjk2d5bqQrcVM3NKQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 26/04/2023 03:08, Yin Fengwei wrote: > > > On 4/25/23 20:38, Kirill A. Shutemov wrote: >> On Tue, Apr 25, 2023 at 04:46:26PM +0800, Yin Fengwei wrote: >>> free_transhuge_page() acquires split queue lock then check >>> whether the THP was added to deferred list or not. >>> >>> It's safe to check whether the THP is in deferred list or not. >>> When code hit free_transhuge_page(), there is no one tries >>> to update the folio's _deferred_list. >>> >>> If folio is not in deferred_list, it's safe to check without >>> acquiring lock. >>> >>> If folio is in deferred_list, the other node in deferred_list >>> adding/deleteing doesn't impact the return value of >>> list_epmty(@folio->_deferred_list). >> >> Typo. >> >>> >>> Running page_fault1 of will-it-scale + order 2 folio for anonymous >>> mapping with 96 processes on an Ice Lake 48C/96T test box, we could >>> see the 61% split_queue_lock contention: >>> - 71.28% 0.35% page_fault1_pro [kernel.kallsyms] [k] >>> release_pages >>> - 70.93% release_pages >>> - 61.42% free_transhuge_page >>> + 60.77% _raw_spin_lock_irqsave >>> >>> With this patch applied, the split_queue_lock contention is less >>> than 1%. >>> >>> Signed-off-by: Yin Fengwei >>> Tested-by: Ryan Roberts >>> --- >>> mm/huge_memory.c | 19 ++++++++++++++++--- >>> 1 file changed, 16 insertions(+), 3 deletions(-) >>> >>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >>> index 032fb0ef9cd1..c620f1f12247 100644 >>> --- a/mm/huge_memory.c >>> +++ b/mm/huge_memory.c >>> @@ -2799,12 +2799,25 @@ void free_transhuge_page(struct page *page) >>> struct deferred_split *ds_queue = get_deferred_split_queue(folio); >>> unsigned long flags; >>> >>> - spin_lock_irqsave(&ds_queue->split_queue_lock, flags); >>> - if (!list_empty(&folio->_deferred_list)) { >>> + /* >>> + * At this point, there is no one trying to queue the folio >>> + * to deferred_list. folio->_deferred_list is not possible >>> + * being updated. >>> + * >>> + * If folio is already added to deferred_list, add/delete to/from >>> + * deferred_list will not impact list_empty(&folio->_deferred_list). >>> + * It's safe to check list_empty(&folio->_deferred_list) without >>> + * acquiring the lock. >>> + * >>> + * If folio is not in deferred_list, it's safe to check without >>> + * acquiring the lock. >>> + */ >>> + if (data_race(!list_empty(&folio->_deferred_list))) { >>> + spin_lock_irqsave(&ds_queue->split_queue_lock, flags); >> >> Recheck under lock? > Huang Ying pointed out the race with deferred_split_scan(). And Yes. Need > recheck under lock. Will update in next version. Oops sorry - I see this was already pointed out. Disregard my previous mail. Thanks, Ryan > > > Regards > Yin, Fengwei > >> >>> ds_queue->split_queue_len--; >>> list_del(&folio->_deferred_list); >>> + spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); >>> } >>> - spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); >>> free_compound_page(page); >>> } >>> >>> -- >>> 2.30.2 >>> >>> >>