From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9945BC636CC for ; Wed, 8 Feb 2023 15:11:33 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 13E386B0072; Wed, 8 Feb 2023 10:11:33 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0EE476B0073; Wed, 8 Feb 2023 10:11:33 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EF8076B0074; Wed, 8 Feb 2023 10:11:32 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id DEE8A6B0072 for ; Wed, 8 Feb 2023 10:11:32 -0500 (EST) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 9CCFE160E55 for ; Wed, 8 Feb 2023 15:11:32 +0000 (UTC) X-FDA: 80444463624.07.99D8D90 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by imf27.hostedemail.com (Postfix) with ESMTP id 2B2CC40013 for ; Wed, 8 Feb 2023 15:11:29 +0000 (UTC) Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=r8xAaBOt; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=iCqdmg4i; spf=pass (imf27.hostedemail.com: domain of vbabka@suse.cz designates 195.135.220.28 as permitted sender) smtp.mailfrom=vbabka@suse.cz; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1675869090; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=M5cRITLGeASqUAI9ZFUll2pAGkvBt2hc6N7uCIZYyyI=; b=LrDU/pa++4rWxSxAb9TJBjWqW1rnY59RefuXiDnsfB0hcdbi+1T9AOBzbJx+CXttjRfxiM +Yb/x/GIHfTCpDnERi19QKfdHRWbHRWPbtEo4ETllBxLTCzi0JlMHmbzpsKOHg22u6cl4k +rw2S2nNxwIs2IAoJYqlzOaZIDAaRyw= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=r8xAaBOt; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=iCqdmg4i; spf=pass (imf27.hostedemail.com: domain of vbabka@suse.cz designates 195.135.220.28 as permitted sender) smtp.mailfrom=vbabka@suse.cz; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1675869090; a=rsa-sha256; cv=none; b=2sQDfz9dEqiepRpO/BiJckwdSqfcUS+yeVdEJzj+V0Jp7PQVLBwQlug2XkVD/nPrL5T+tU w+UKa6MVx85Gq5+I+TKPS9NFsSdsnzWBq/1A/NJLl6IOfdliZdaxrHSPXL5xFGrdO3kcrH CAAFQqfLEBCcdfHjEl5d2ghNJWjpWtw= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 69C3533787; Wed, 8 Feb 2023 15:11:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1675869088; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=M5cRITLGeASqUAI9ZFUll2pAGkvBt2hc6N7uCIZYyyI=; b=r8xAaBOtxQBuSJzqWWa1FE2fVQD6Oa5x3IDZH3FufYRiurTnrctUWupSe5Fwmg0Q+IHMbi Pe+yVnuFE+9Ichnrx5s8Cz4vCSwr+EidYw1v3T6prEXVXFJ0q+uq5/TAYMfZzcCVyH/xs+ i46n/Dka6MNevBokqUyqntQnGy4IFmQ= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1675869088; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=M5cRITLGeASqUAI9ZFUll2pAGkvBt2hc6N7uCIZYyyI=; b=iCqdmg4iJAes8u7G3HPzJpt1Hvip1etNSX29bKQB1nf4+0pGM+kYjcl9+GKu7LjsDhGLoq GhC471Ja4PYwVfBw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 50FCE13425; Wed, 8 Feb 2023 15:11:28 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id o6DSEqC742NbOQAAMHmgww (envelope-from ); Wed, 08 Feb 2023 15:11:28 +0000 Message-ID: <8dc0a48f-f29a-27de-0ced-d39a920b2afd@suse.cz> Date: Wed, 8 Feb 2023 16:11:28 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1 Subject: Re: [PATCH] mm: reduce lock contention of pcp buffer refill Content-Language: en-US To: Alexander Halbuer , Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Mel Gorman References: <20230201162549.68384-1-halbuer@sra.uni-hannover.de> <20230202152501.297639031e96baad35cdab17@linux-foundation.org> <68ba44d8-6899-c018-dcb3-36f3a96e6bea@sra.uni-hannover.de> From: Vlastimil Babka In-Reply-To: <68ba44d8-6899-c018-dcb3-36f3a96e6bea@sra.uni-hannover.de> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 2B2CC40013 X-Rspam-User: X-Stat-Signature: ghzkfa89k1it943kddj69qat8gzmjmkm X-HE-Tag: 1675869089-12182 X-HE-Meta: U2FsdGVkX19GpeDDSz06c4EitpPCyTm+3bIIGnJibbOK3k42tsEuSpCdtk8duREG4MpX4QP1AoH18mnhn2UOKgDPAzy6iOc0tlZycG13KlNXpPxj3sRXtOn+uuDnOC19S9/SdOacssDpwJ2T7S5I3gLJ5CsAklLEV4HWQJ48/Wmlmdm43W7N01dQIQpDAQtoQbfA+5yC8OEqChMhExq9yphZ7qU9DEMvbMOcSOX6/0XyH6cA335PlBwZ9tbmrJs1DrI8A9TIdR6AW00Sxuj9Frjq09MiDI+/LWfU05W5mRGGAK+C22qV3nXW+bJEv63FGnRa0n2JxYG1mFFJZCnb5ww6V/PWetpkH0CO7ugcCkaxzos9hmk1HLVhtpB5UnsPeEVhY2DU+KEkRncyB76TFJNKMMslXKQK2QGuRTmTigYrx/R2IB9FGn4iV5hoWHiCewjzvOSjF+yadFR/G4ZqnlCL0poUNBOMbLxrfBBn2YadidFUZoUH5RlS653JWP4IV13+mvZGa4oEfKYYdvzxWHgEpTFUx8mbJcxalGA2GshX6bulNQz5TYyrLkpTFthstzXlMAbhw7IzWjlaYq7UGLrZ+toX3fE22HBK48oOI8T8vIiBauliTF8i+jHXe6WC43lzy4cENV3ILEE5tFbpTKyX7lt4UVdd6ta/jxQsXdkkxkIMcV+vthIWD7MAceCuI2G6QdIGVq9U37CIAaasonK//sqBPvt9Zu2NKDSE4lEwG8qVSWuEfbn9Zwyu9AnKtdtpABi+2onyRczX5e3Ar4rxY9zkt/3gicHDYB/OCyhQYy+7Y+PUBikF76Xj0W7+1BJPviGCGIAhFo5OfZsSQb6SQMsbzqGIP5jkKVtIMssUjTStnU485/8pzEeB8NoF9bkrjngi/Wq0LRCJQA0yLQo57WXXKE9dqaV0cYIbWXB06Cbft1AjNl6JN0sG691JhEG9HkhNOmfJuInYTDl jgH3NNqv SUHb0/EmsZcHQqfMV1HKtmym51CsnA+lkrFGTi8+h3/+vCd83PcKmqSWbpdWDy2IggFdpaUwVn7hJ3xKnuzifgPnUHYlQx1oEM1BIwlt9LtGIp6GEgZkNGvZML/+3vfVUGiNE1mLptxP9UBtBVO7R94DABoo23GHf7AGy1VWo67xvXwAX3fTBj3Bc8uKtryFahoEwCDwjAIprAhP6lcSYGxp3qQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 2/7/23 17:11, Alexander Halbuer wrote: > On 2/3/23 00:25, Andrew Morton wrote: >> On Wed, 1 Feb 2023 17:25:49 +0100 Alexander Halbuer wrote: >> >>> The `rmqueue_bulk` function batches the allocation of multiple elements to >>> refill the per-CPU buffers into a single hold of the zone lock. Each >>> element is allocated and checked using the `check_pcp_refill` function. >>> The check touches every related struct page which is especially expensive >>> for higher order allocations (huge pages). This patch reduces the time >>> holding the lock by moving the check out of the critical section similar >>> to the `rmqueue_buddy` function which allocates a single element. >>> Measurements of parallel allocation-heavy workloads show a reduction of >>> the average huge page allocation latency of 50 percent for two cores and >>> nearly 90 percent for 24 cores. >> Sounds nice. >> >> Were you able to test how much benefit we get by simply removing the >> check_new_pages() call from rmqueue_bulk()? > I did some further investigations and measurements to quantify potential > performance gains. Benchmarks ran on a machine with 24 physical cores > fixed at 2.1 GHz. The results show significant performance gains with the > patch applied in parallel scenarios. Eliminating the check reduces > allocation latency further, especially for low core counts. Are the benchmarks done via kernel module calling bulk allocation, or from userspace? > The following tables show the average allocation latencies for huge pages > and normal pages for three different configurations: > The unpatched kernel, the patched kernel, and an additional version > without the check. > > Huge pages > +-------+---------+-------+----------+---------+----------+ > | Cores | Default | Patch |    Patch | NoCheck |  NoCheck | > |       |    (ns) |  (ns) |     Diff |    (ns) |     Diff | > +-------+---------+-------+----------+---------+----------+ > |     1 |     127 |   124 |  (-2.4%) |     118 |  (-7.1%) | > |     2 |     140 |   140 |  (-0.0%) |     134 |  (-4.3%) | > |     3 |     143 |   142 |  (-0.7%) |     134 |  (-6.3%) | > |     4 |     178 |   159 | (-10.7%) |     156 | (-12.4%) | > |     6 |     269 |   239 | (-11.2%) |     237 | (-11.9%) | > |     8 |     363 |   321 | (-11.6%) |     319 | (-12.1%) | > |    10 |     454 |   409 |  (-9.9%) |     409 |  (-9.9%) | > |    12 |     545 |   494 |  (-9.4%) |     488 | (-10.5%) | > |    14 |     639 |   578 |  (-9.5%) |     574 | (-10.2%) | > |    16 |     735 |   660 | (-10.2%) |     653 | (-11.2%) | > |    20 |     915 |   826 |  (-9.7%) |     815 | (-10.9%) | > |    24 |    1105 |   992 | (-10.2%) |     982 | (-11.1%) | > +-------+---------+-------+----------+---------+----------+ > > Normal pages > +-------+---------+-------+----------+---------+----------+ > | Cores | Default | Patch |    Patch | NoCheck |  NoCheck | > |       |    (ns) |  (ns) |     Diff |    (ns) |     Diff | > +-------+---------+-------+----------+---------+----------+ > |     1 |    2790 |  2767 |  (-0.8%) |     171 | (-93.9%) | > |     2 |    6685 |  3484 | (-47.9%) |     519 | (-92.2%) | > |     3 |   10501 |  3599 | (-65.7%) |     855 | (-91.9%) | > |     4 |   14264 |  3635 | (-74.5%) |    1139 | (-92.0%) | > |     6 |   21800 |  3551 | (-83.7%) |    1713 | (-92.1%) | > |     8 |   29563 |  3570 | (-87.9%) |    2268 | (-92.3%) | > |    10 |   37210 |  3845 | (-89.7%) |    2872 | (-92.3%) | > |    12 |   44780 |  4452 | (-90.1%) |    3417 | (-92.4%) | > |    14 |   52576 |  5100 | (-90.3%) |    4020 | (-92.4%) | > |    16 |   60118 |  5785 | (-90.4%) |    4604 | (-92.3%) | > |    20 |   75037 |  7270 | (-90.3%) |    6486 | (-91.4%) | > |    24 |   90226 |  8712 | (-90.3%) |    7061 | (-92.2%) | > +-------+---------+-------+----------+---------+----------+ Nice improvements. I assume the table headings are switched? Why would Normal be more epensive than Huge? >> >> Vlastimil, I find this quite confusing: >> >> #ifdef CONFIG_DEBUG_VM >> /* >> * With DEBUG_VM enabled, order-0 pages are checked for expected state when >> * being allocated from pcp lists. With debug_pagealloc also enabled, they are >> * also checked when pcp lists are refilled from the free lists. >> */ >> static inline bool check_pcp_refill(struct page *page, unsigned int order) >> { >> if (debug_pagealloc_enabled_static()) >> return check_new_pages(page, order); >> else >> return false; >> } >> >> static inline bool check_new_pcp(struct page *page, unsigned int order) >> { >> return check_new_pages(page, order); >> } >> #else >> /* >> * With DEBUG_VM disabled, free order-0 pages are checked for expected state >> * when pcp lists are being refilled from the free lists. With debug_pagealloc >> * enabled, they are also checked when being allocated from the pcp lists. >> */ >> static inline bool check_pcp_refill(struct page *page, unsigned int order) >> { >> return check_new_pages(page, order); >> } >> static inline bool check_new_pcp(struct page *page, unsigned int order) >> { >> if (debug_pagealloc_enabled_static()) >> return check_new_pages(page, order); >> else >> return false; >> } >> #endif /* CONFIG_DEBUG_VM */ >> >> and the 4462b32c9285b5 changelog is a struggle to follow. >> >> Why are we performing *any* checks when CONFIG_DEBUG_VM=n and when >> debug_pagealloc_enabled is false? >> >> Anyway, these checks sounds quite costly so let's revisit their >> desirability? >>