From: "Huang, Ying"
To: Linus Torvalds
Cc: kernel test robot, oe-lkp@lists.linux.dev, lkp@intel.com, Andrew Morton, Johannes Weiner, Hugh Dickins, Nadav Amit, Linux Memory Management List, linux-arch@vger.kernel.org, feng.tang@intel.com, zhengjun.xing@linux.intel.com, fengwei.yin@intel.com
Subject: Re: [linux-next:master] [mm] 5df397dec7: will-it-scale.per_thread_ops -53.3% regression
In-Reply-To: (Linus Torvalds's message of "Tue, 6 Dec 2022 11:15:09 -0800")
References: <202212051534.852804af-yujie.liu@intel.com> <87ilipffws.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Wed, 07 Dec 2022 13:39:48 +0800
Message-ID: <878rjj22mz.fsf@yhuang6-desk2.ccr.corp.intel.com>

Linus Torvalds writes:

> On Tue, Dec 6, 2022 at 10:41 AM Linus Torvalds wrote:
>>
>> Let me think about this a while, but I think I'll have a patch for you
>> to test once I've dealt with a
>> couple more pull requests.
>
> So here's a trial balloon for you to try if you can see if this mostly
> fixes the regression..
>
> It still limits batching (because unlike the full "gather pages until
> you have to flush", this is all batched under the page table lock. But
> it limits it a bit less, in that it will use a second active batch if
> it only used the initial on-stack one (which is called "local", which
> is not a great name in this context, but whatever).
>
> This _should_ mean that that benchmark will now batch ~512 pages
> instead of just 8.
>
> Which should be pretty much what it effectively used to do before too,
> because the dirty shared page case has always caused that
> "force_flush" thing, so it will have always stopped to flush every
> page directory.
>
> (But we still have that extra rmap flushing limit because there could
> have been _previous_ buffered page pointers that weren't dirty shared
> pages, and we don't want to have to deal with that pain, and might
> have to exit early in order to avoid it)
>
> I can imagine cleaner ways to do this, but they would involve having
> to remember which batch we started having dirty pages in, which is
> more bookkeeping pain than I really think it's worth.
>
> Does this fix the regression?

I have tested the patch, and it does fix the regression. The test
results are as follows:

5df397dec7c4c08c 7cc8f9c7146a5c2dad6e71653c4 7763ba2bb16804313aa52bc78ae
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
   2256919 ±  5%    +114.2%    4833919 ±  2%    +116.6%    4889199        will-it-scale.16.threads
      8.17 ±  6%       -8.2       0.00              -8.2       0.00        perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function

Where 5df397dec7c4c08c is the first bad commit,
7cc8f9c7146a5c2dad6e71653c4 is its parent commit, and
7763ba2bb16804313aa52bc78ae is the fix commit.
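As a sanity check on the %change column, the figures can be recomputed
from the raw will-it-scale.16.threads scores. The following is only a
minimal sketch: the helper name percent_change is hypothetical, and the
scores are copied from the table. The last line reproduces the -53.3%
from the subject line, taking the parent commit as the baseline.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# will-it-scale.16.threads scores copied from the table
bad = 2256919     # 5df397dec7c4c08c (first bad commit)
parent = 4833919  # 7cc8f9c7146a5c2dad6e71653c4 (parent commit)
fix = 4889199     # 7763ba2bb16804313aa52bc78ae (fix commit)

print(f"parent vs bad: {percent_change(bad, parent):+.1f}%")  # +114.2%
print(f"fix vs bad:    {percent_change(bad, fix):+.1f}%")     # +116.6%
print(f"bad vs parent: {percent_change(parent, bad):+.1f}%")  # -53.3%
```

Note that the perf-profile row is different: the -8.2 there is an
absolute difference in percentage points of cycles (8.17 down to 0.00),
not a relative change.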
The benchmark score recovered and CPU cycles for TLB flushing recovered
too.

Best Regards,
Huang, Ying