Date: Thu, 1 Aug 2024 15:30:49 +0200
From: Mateusz Guzik
To: David Hildenbrand
Cc: "Yin, Fengwei", kernel test robot, Peter Xu, oe-lkp@lists.linux.dev,
    lkp@intel.com, linux-kernel@vger.kernel.org, Andrew Morton, Huacai Chen,
    Jason Gunthorpe, Matthew Wilcox, Nathan Chancellor, Ryan Roberts,
    WANG Xuerui, linux-mm@kvack.org, ying.huang@intel.com, feng.tang@intel.com
Subject: Re: [linus:master] [mm] c0bff412e6: stress-ng.clone.ops_per_sec -2.9% regression
Message-ID: <6uxnuf2gysgabyai2r77xrqegb7t7cc2dlzjz6upwsgwrnfk3x@cjj6on3wqm4x>
References: <202407301049.5051dc19-oliver.sang@intel.com>
    <193e302c-4401-4756-a552-9f1e07ecedcf@redhat.com>
    <439265d8-e71e-41db-8a46-55366fdd334e@intel.com>
    <90477952-fde2-41d7-8ff4-2102c45e341d@redhat.com>
In-Reply-To: <90477952-fde2-41d7-8ff4-2102c45e341d@redhat.com>

On Thu, Aug 01, 2024 at 08:49:27AM +0200, David Hildenbrand wrote:
> Yes indeed. fork() can be extremely sensitive to each added instruction.
>
> I even pointed out to Peter why I didn't add the PageHuge check in there
> originally [1].
>
> "Well, and I didn't want to have runtime-hugetlb checks in
> PageAnonExclusive code called on certainly-not-hugetlb code paths."
>
>
> We now have to do a page_folio(page) and then test for hugetlb.
>
> 	return folio_test_hugetlb(page_folio(page));
>
> Nowadays, folio_test_hugetlb() will be faster than at c0bff412e6 times, so
> maybe at least part of the overhead is gone.
>

I'll note that page_folio expands to a call to _compound_head.

While _compound_head is declared inline, it ends up being big enough
that the compiler decides to emit a real function instead, and real
function calls are not particularly cheap.

I had a brief look with a profiler myself: for single-threaded usage the
function is quite high up there, even though it manages to get out at
the first branch. That is to say, there is definitely performance lost
to having a function call instead of an inlined branch.

The routine is deinlined because of a call to page_fixed_fake_head,
which itself is annotated with always_inline.

This is of course patchable with minor shoveling. I did not go for it
because stress-ng results were too unstable for me to confidently state
a win or a loss.

But should you want to whack the regression, this is what I would look
into.
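
To illustrate the general shape of the "minor shoveling" mentioned above,
here is a minimal userspace sketch, not kernel code and not a proposed
patch. It assumes GCC or Clang attributes. Every demo_* name is
hypothetical, and the one-word struct is a stand-in rather than the real
struct page layout. The idea is that the common-case branch stays small
enough to inline into every caller, while the rarer work is pushed into a
function that is explicitly kept out of line.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct page: bit 0 of the word marks a tail
 * page, and the remaining bits hold the address of the head page. */
struct demo_page {
	uintptr_t compound_head;
};

/* Slow path, deliberately kept out of line so that it cannot bloat the
 * callers of the fast path. */
__attribute__((noinline))
static const struct demo_page *demo_head_slow(uintptr_t word)
{
	/* Clear the tail marker to recover the head pointer. */
	return (const struct demo_page *)(word - 1);
}

/* Fast path: one load and one predictable branch, forced inline. */
static inline __attribute__((always_inline))
const struct demo_page *demo_compound_head(const struct demo_page *page)
{
	uintptr_t word = page->compound_head;

	/* Assumption for this sketch: most pages are not tail pages. */
	if (__builtin_expect(!(word & 1), 1))
		return page;

	return demo_head_slow(word);
}
```

With this split, a caller that hits the common case pays for a load and a
branch rather than a full call and return. Whether the same split is
measurable in the stress-ng clone test is a separate question that this
sketch does not answer.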