From: "Huang, Ying"
To: Yin Fengwei
Subject: Re: [PATCH v2 1/2] THP: avoid lock when check whether THP is in deferred list
Date: Wed, 26 Apr 2023 09:13:48 +0800
Message-ID: <87wn1ze983.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <20230425084627.3573866-2-fengwei.yin@intel.com> (Yin Fengwei's message of "Tue, 25 Apr 2023 16:46:26 +0800")
References: <20230425084627.3573866-1-fengwei.yin@intel.com>
 <20230425084627.3573866-2-fengwei.yin@intel.com>

Yin Fengwei writes:

> free_transhuge_page() acquires split queue lock then check
> whether the THP was added to deferred list or not.
>
> It's safe to check whether the THP is in deferred list or not.
> When code hit free_transhuge_page(), there is no one tries
> to update the folio's _deferred_list.

I think it would be clearer to enumerate all the places where pages are
added to and removed from the deferred list.  Then we can find out
whether there is any code path that may race with this.

Take a glance at the output of `grep split_queue_lock -r mm`.  It seems
that deferred_split_scan() may race with free_transhuge_page(), so we
need to recheck with the lock held, as Kirill pointed out.

Best Regards,
Huang, Ying

> If folio is not in deferred_list, it's safe to check without
> acquiring lock.
>
> If folio is in deferred_list, the other node in deferred_list
> adding/deleting doesn't impact the return value of
> list_empty(@folio->_deferred_list).
>
> Running page_fault1 of will-it-scale + order 2 folio for anonymous
> mapping with 96 processes on an Ice Lake 48C/96T test box, we could
> see the 61% split_queue_lock contention:
> -   71.28%     0.35%  page_fault1_pro  [kernel.kallsyms]  [k]
>     release_pages
>    - 70.93% release_pages
>       - 61.42% free_transhuge_page
>          + 60.77% _raw_spin_lock_irqsave
>
> With this patch applied, the split_queue_lock contention is less
> than 1%.
>
> Signed-off-by: Yin Fengwei
> Tested-by: Ryan Roberts
> ---
>  mm/huge_memory.c | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 032fb0ef9cd1..c620f1f12247 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2799,12 +2799,25 @@ void free_transhuge_page(struct page *page)
>  	struct deferred_split *ds_queue = get_deferred_split_queue(folio);
>  	unsigned long flags;
>
> -	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> -	if (!list_empty(&folio->_deferred_list)) {
> +	/*
> +	 * At this point, there is no one trying to queue the folio
> +	 * to deferred_list. folio->_deferred_list is not possible
> +	 * being updated.
> +	 *
> +	 * If folio is already added to deferred_list, add/delete to/from
> +	 * deferred_list will not impact list_empty(&folio->_deferred_list).
> +	 * It's safe to check list_empty(&folio->_deferred_list) without
> +	 * acquiring the lock.
> +	 *
> +	 * If folio is not in deferred_list, it's safe to check without
> +	 * acquiring the lock.
> +	 */
> +	if (data_race(!list_empty(&folio->_deferred_list))) {
> +		spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>  		ds_queue->split_queue_len--;
>  		list_del(&folio->_deferred_list);
> +		spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>  	}
> -	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>  	free_compound_page(page);
>  }
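
To illustrate the "recheck with the lock held" suggestion in the reply,
here is a rough userspace sketch, not kernel code.  It assumes that some
other path (scanner_detach() below, standing in for whatever
deferred_split_scan() might do) can take the folio off the queue under
the lock.  All names here (struct fake_folio, struct split_queue,
queue_folio(), scanner_detach(), free_path_unqueue()) are made up for
illustration, and a pthread mutex stands in for split_queue_lock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, loosely modeled on list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_node_init(struct list_node *n)
{
	n->prev = n;
	n->next = n;
}

static bool list_node_empty(const struct list_node *n)
{
	return n->next == n;
}

static void list_node_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_node_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_node_init(n);
}

/* Hypothetical stand-ins for the deferred split queue and a folio. */
struct split_queue {
	pthread_mutex_t lock;
	struct list_node head;
	unsigned long len;
};

struct fake_folio {
	struct list_node deferred;
};

static void split_queue_init(struct split_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	list_node_init(&q->head);
	q->len = 0;
}

static void fake_folio_init(struct fake_folio *f)
{
	list_node_init(&f->deferred);
}

/* Queue the folio if it is not queued yet. */
static void queue_folio(struct split_queue *q, struct fake_folio *f)
{
	pthread_mutex_lock(&q->lock);
	if (list_node_empty(&f->deferred)) {
		list_node_add_tail(&f->deferred, &q->head);
		q->len++;
	}
	pthread_mutex_unlock(&q->lock);
}

/* Assumed concurrent path: detaches the folio while holding the lock. */
static void scanner_detach(struct split_queue *q, struct fake_folio *f)
{
	pthread_mutex_lock(&q->lock);
	if (!list_node_empty(&f->deferred)) {
		list_node_del_init(&f->deferred);
		q->len--;
	}
	pthread_mutex_unlock(&q->lock);
}

/*
 * Free path: a lockless check is used only as a hint to skip the lock.
 * If the folio looks queued, take the lock and check again before
 * touching the list or the counter.  Returns true if this call removed
 * the folio.  In real concurrent code the unlocked read would need a
 * suitable annotation or atomic access; that is omitted here.
 */
static bool free_path_unqueue(struct split_queue *q, struct fake_folio *f)
{
	bool removed = false;

	if (list_node_empty(&f->deferred))
		return false;

	pthread_mutex_lock(&q->lock);
	if (!list_node_empty(&f->deferred)) {	/* recheck under the lock */
		list_node_del_init(&f->deferred);
		q->len--;
		removed = true;
	}
	pthread_mutex_unlock(&q->lock);
	return removed;
}
```

The point of the second check is that the counter is only decremented
when the folio is still on the list at the time the lock is held, so a
removal by the other path between the two checks cannot lead to a double
list_del or a wrong split_queue_len.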