From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id DB836C77B7E for ; Sat, 29 Apr 2023 08:47:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 563996B007B; Sat, 29 Apr 2023 04:47:02 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4F68C900002; Sat, 29 Apr 2023 04:47:02 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3650C6B007E; Sat, 29 Apr 2023 04:47:02 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 23F9A6B007B for ; Sat, 29 Apr 2023 04:47:02 -0400 (EDT) Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id D6FD4803CD for ; Sat, 29 Apr 2023 08:47:01 +0000 (UTC) X-FDA: 80733798642.01.92CD1B1 Received: from wout2-smtp.messagingengine.com (wout2-smtp.messagingengine.com [64.147.123.25]) by imf09.hostedemail.com (Postfix) with ESMTP id A187A140011 for ; Sat, 29 Apr 2023 08:46:59 +0000 (UTC) Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=shutemov.name header.s=fm1 header.b="d fMPfTg"; dkim=pass header.d=messagingengine.com header.s=fm3 header.b=gWzr978F; spf=pass (imf09.hostedemail.com: domain of kirill@shutemov.name designates 64.147.123.25 as permitted sender) smtp.mailfrom=kirill@shutemov.name; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1682758020; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=xfpMboocttYB/j9eCRHyK6eFyVFLil5OCBWQ5tKEs2M=; b=W84LVGh9YCW9Ikym2mFZp7OqfBUGX5F/dAMQMJCI/lK5QwHll/CLeH6Y+AnyawWMGhPEN1 tuSLtmiZnADAnDkIxcHLdur1o7Aba0YDDzJ96nW773XVp+oOZ/1X3zi1W9N4u+CDSsjZKR U/ajzqimTklOLhC+wDGqx1FQoP6P2fI= ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=pass header.d=shutemov.name header.s=fm1 header.b="d fMPfTg"; dkim=pass header.d=messagingengine.com header.s=fm3 header.b=gWzr978F; spf=pass (imf09.hostedemail.com: domain of kirill@shutemov.name designates 64.147.123.25 as permitted sender) smtp.mailfrom=kirill@shutemov.name; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1682758020; a=rsa-sha256; cv=none; b=KGZ4aTQaSBkyJVeb0UWv04USr5EVMcbxunhp4QCOgi7cZ+JXeykqcKXfWUZjE47uzGsQIG YqTXwOwt15G2h+FzRXw7eEYi9UIA4OodZ+QgZzchGPVRcPFiU0VHXuTb8m5xEjoeSw0nwu bGJt5ucrINoL1PQ3RRIvFH9UYH2FcA4= Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 8F4B232003AC; Sat, 29 Apr 2023 04:46:57 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Sat, 29 Apr 2023 04:46:58 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=shutemov.name; h=cc:cc:content-type:content-type:date:date:from:from :in-reply-to:in-reply-to:message-id:mime-version:references :reply-to:sender:subject:subject:to:to; s=fm1; t=1682758017; x= 1682844417; bh=xfpMboocttYB/j9eCRHyK6eFyVFLil5OCBWQ5tKEs2M=; b=d fMPfTg9BC15F7my70OHcQ6iSTHXRRxqzm9NODcvOdyoWiWAYKT4EJxum4RZL/knV LbRyNwbQObdflHP6IOYSMDNDsh1zlAgNYibOnnZcyCCFy/YbKZjasH+OSUt1AtDf 6oHcPR4Y8LwYVl01w9enTmJj7uVQIXoiSuZCUw2fWPUwjepq34KKaMAs0hTNjNJx tcHhmh2qHHwX21beGTS4tCDzP69fOeTe2fzofMqQG8fltFtr2573jah0XlvgiE6/ L/24DqZTe7TaM3P7giAmEKZzKa7fRUsAE+WMN52XFfDWEGcyvGvacJ8Zu7BjOl/j ojTiDnbP1h8REPIYkO3pg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm3; t=1682758017; x=1682844417; bh=xfpMboocttYB/ j9eCRHyK6eFyVFLil5OCBWQ5tKEs2M=; b=gWzr978FfKsfXE6/Mxcu4G8wrKP45 6OtGpwgLZzuyKPm/EhB6pd5DbOMwNFuuZSkVME2bxNE8yvMi3FGxRsy185SHF4xb Bn3nSupU+f1/vkfkr0i53DdXmxgWSoeCO/3hZW+/0aTLPjNdxbeIwJRxeKno5zXu GYEco4eT1iXeVbmGrwP0yxk1Zp2iz+g0e5rY/dyIN8gNK26PUOUaYv++hs4B79ax Cx4MsAZ2EmkQWDbbnwj/2Ibrtv5IiO9Sbp5xppaqVTosHr4/KmZD7v4PHhXL7Aj4 WMtlqI89nsbXb6OSJnDYkzubJJzvypGf/fbPwqGxIIUfjHAQZcAwjgduQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrfeduledgudefiecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpeffhffvvefukfhfgggtuggjsehttddttddttddvnecuhfhrohhmpedfmfhi rhhilhhlucetrdcuufhhuhhtvghmohhvfdcuoehkihhrihhllhesshhhuhhtvghmohhvrd hnrghmvgeqnecuggftrfgrthhtvghrnhephfeigefhtdefhedtfedthefghedutddvueeh tedttdehjeeukeejgeeuiedvkedtnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrg hmpehmrghilhhfrhhomhepkhhirhhilhhlsehshhhuthgvmhhovhdrnhgrmhgv X-ME-Proxy: Feedback-ID: ie3994620:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Sat, 29 Apr 2023 04:46:56 -0400 (EDT) Received: by box.shutemov.name (Postfix, from userid 1000) id F3ABC1041AE; Sat, 29 Apr 2023 11:46:53 +0300 (+03) Date: Sat, 29 Apr 2023 11:46:53 +0300 From: "Kirill A. Shutemov" To: "Yin, Fengwei" Cc: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org, yuzhao@google.com, ryan.roberts@arm.com, ying.huang@intel.com Subject: Re: [PATCH v2 1/2] THP: avoid lock when check whether THP is in deferred list Message-ID: <20230429084653.vnmionbhnodbbd2w@box.shutemov.name> References: <20230425084627.3573866-1-fengwei.yin@intel.com> <20230425084627.3573866-2-fengwei.yin@intel.com> <20230425123857.k5mp2cdp5c6ldbrk@box.shutemov.name> <20230428140236.czx5eii34z373jqq@box.shutemov.name> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: A187A140011 X-Stat-Signature: 6wcie79aqmqe7oa4d85uyqd8o6531jgf X-HE-Tag: 1682758019-446005 X-HE-Meta: U2FsdGVkX1+f/cJx2gT2nuxLRaFmMXc5GEUv3d5fCirQVUV1kLmSgOFmmtHrEfkTQdn/ftwLjo5kOLDNsTaG8cMm/+UVRTYa9FmWZRQmYU87k3TqR6nHE/2Xjcmms/LRvq1JPfZIqyqxcALs39XKoa82pfj4dLUgxEJ1WWrIkDF+ur5pxti3PzqWw3t4vhGgiefOzDSDC3BVvfiLcVcOEvTFifcFzf3SEPaz/AXLzCda893uVqKAmbqtLE4ZviArb51nr2R1RgM7BSoXkebOEmbnlEcZLbnUbBNYuotfCvRuhWKNQJbdYzlUgxTM/jilf7nANw9WPIkjQjRd8+0Oc5xnNt3yt4oJ1h3u3+i+vFOr4PNpeqgswrABsferXtAwIi5ONWnwt/wHX9Uh2WFZw4e0KM9W38suNr+7450YsMFhgnCS9gHfa/YrtLv1eAxFaQvzU3HkEmNHzlpHBeXJ/O/jEG7lhiRkn9nflbXNXV1gQCqUbgTEjZd3NWDSvv2Y/f9PNaO6eWkQNP5CTlFzSAiDl093eUJ1VsRyEeVq8bW2W1YNlnnnyhaJzkGI2xu4l6Nw8F0L5u+tIjQVzolAf4MCYREZQJETtU+dZdISRQ+HsDsIaeNm8tgIl4yVj7UQ1et3RpuwOptxmgVP4FgReLyVh4yL82pwMPqAZcCnHgRR+770SuwjHk0M9n+tLgUDz0ZGdOcwprLQyeN4FMDlmZGBP5I+9y1dlJJRXZquzvPdIRUAEK7Wpqlf4Ww5PgS7a08bTmOPqQ0caIqnAcVYpaq5NcO3wSLcY+CnXFRmOfLIKiVJar3CVAMQIRPIQKdK0OPPfdzj+qiodn4aPK7JWtbkwzZr/uCUig6Tsna/VNCn9hHb7zdKu8OOZQtnp/S0f1x6RpuL1c9bykf13LgsDgrBk4X2L6TBaJdHJPYiDn2Dv7ZRXTJ4MZ6q0qJkO6og6CraGhno7PNxlcowTgw j/V1gdrh 5DrondOpRqKO87MOHN8fs+olpBtbhuiU8qW6gDqnOsUKQhH2g3PTzjs4w8GA/xCoWpj5NE3F5yyROCxRcX8KjQS1KR79Hp2EQ5b9ZCpiYzMaVi9x5uE0TbXDqtb8rPWR6UR44045U3JM3enJWUbDXf2CgelMU2dDFI71cHVPFHmKPPoAfZg+ynYrsGl2hVEznW+Eq7e7er/V1byT6+I8wy63zs6Wb400fCH9/tORVtrOKVLi4z9lImfWHW6j2GpMeXyH+NQlv5wgHdyP8uL0Q/XTt5kKrmb87359PXG/6RTxRJy2uPVbLm8linkTUr1nYnKuVKn1t1xeq29sKXRBMc1/kwnl6PNrMzTCrCv8eVJsl5WRZHheVI5ZDnEFDFPoFg4XT X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Sat, Apr 29, 2023 at 04:32:34PM +0800, Yin, Fengwei wrote: > Hi Kirill, > > On 4/28/2023 10:02 PM, Kirill A. Shutemov wrote: > > On Fri, Apr 28, 2023 at 02:28:07PM +0800, Yin, Fengwei wrote: > >> Hi Kirill, > >> > >> On 4/25/2023 8:38 PM, Kirill A. Shutemov wrote: > >>> On Tue, Apr 25, 2023 at 04:46:26PM +0800, Yin Fengwei wrote: > >>>> free_transhuge_page() acquires split queue lock then check > >>>> whether the THP was added to deferred list or not. > >>>> > >>>> It's safe to check whether the THP is in deferred list or not. > >>>> When code hit free_transhuge_page(), there is no one tries > >>>> to update the folio's _deferred_list. > >>>> > >>>> If folio is not in deferred_list, it's safe to check without > >>>> acquiring lock. > >>>> > >>>> If folio is in deferred_list, the other node in deferred_list > >>>> adding/deleteing doesn't impact the return value of > >>>> list_epmty(@folio->_deferred_list). > >>> > >>> Typo. > >>> > >>>> > >>>> Running page_fault1 of will-it-scale + order 2 folio for anonymous > >>>> mapping with 96 processes on an Ice Lake 48C/96T test box, we could > >>>> see the 61% split_queue_lock contention: > >>>> - 71.28% 0.35% page_fault1_pro [kernel.kallsyms] [k] > >>>> release_pages > >>>> - 70.93% release_pages > >>>> - 61.42% free_transhuge_page > >>>> + 60.77% _raw_spin_lock_irqsave > >>>> > >>>> With this patch applied, the split_queue_lock contention is less > >>>> than 1%. > >>>> > >>>> Signed-off-by: Yin Fengwei > >>>> Tested-by: Ryan Roberts > >>>> --- > >>>> mm/huge_memory.c | 19 ++++++++++++++++--- > >>>> 1 file changed, 16 insertions(+), 3 deletions(-) > >>>> > >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c > >>>> index 032fb0ef9cd1..c620f1f12247 100644 > >>>> --- a/mm/huge_memory.c > >>>> +++ b/mm/huge_memory.c > >>>> @@ -2799,12 +2799,25 @@ void free_transhuge_page(struct page *page) > >>>> struct deferred_split *ds_queue = get_deferred_split_queue(folio); > >>>> unsigned long flags; > >>>> > >>>> - spin_lock_irqsave(&ds_queue->split_queue_lock, flags); > >>>> - if (!list_empty(&folio->_deferred_list)) { > >>>> + /* > >>>> + * At this point, there is no one trying to queue the folio > >>>> + * to deferred_list. folio->_deferred_list is not possible > >>>> + * being updated. > >>>> + * > >>>> + * If folio is already added to deferred_list, add/delete to/from > >>>> + * deferred_list will not impact list_empty(&folio->_deferred_list). > >>>> + * It's safe to check list_empty(&folio->_deferred_list) without > >>>> + * acquiring the lock. > >>>> + * > >>>> + * If folio is not in deferred_list, it's safe to check without > >>>> + * acquiring the lock. > >>>> + */ > >>>> + if (data_race(!list_empty(&folio->_deferred_list))) { > >>>> + spin_lock_irqsave(&ds_queue->split_queue_lock, flags); > >>> > >>> Recheck under lock? > >> In function deferred_split_scan(), there is following code block: > >> if (folio_try_get(folio)) { > >> list_move(&folio->_deferred_list, &list); > >> } else { > >> /* We lost race with folio_put() */ > >> list_del_init(&folio->_deferred_list); > >> ds_queue->split_queue_len--; > >> } > >> > >> I am wondering what kind of "lost race with folio_put()" can be. > >> > >> My understanding is that it's not necessary to handle this case here > >> because free_transhuge_page() will handle it once folio get zero ref. > >> But I must miss something here. Thanks. > > > > free_transhuge_page() got when refcount is already zero. Both > > deferred_split_scan() and free_transhuge_page() can see the page with zero > > refcount. The check makes deferred_split_scan() to leave the page to the > > free_transhuge_page(). > > > If deferred_split_scan() leaves the page to free_transhuge_page(), is it > necessary to do > list_del_init(&folio->_deferred_list); > ds_queue->split_queue_len--; > > Can these two line be left to free_transhuge_page() either? Thanks. I *think* (my cache is cold on deferred split) we can. But since we already hold the lock, why not take care of it? It makes your change more efficient. -- Kiryl Shutsemau / Kirill A. Shutemov