From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 5 Apr 2023 22:27:04 +0200
From: Peter Zijlstra
To: Ankur Arora
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	luto@kernel.org, bp@alien8.de, dave.hansen@linux.intel.com,
	hpa@zytor.com, mingo@redhat.com, juri.lelli@redhat.com,
	willy@infradead.org, mgorman@suse.de, rostedt@goodmis.org,
	tglx@linutronix.de, vincent.guittot@linaro.org, jon.grimm@amd.com,
	bharata@amd.com, boris.ostrovsky@oracle.com, konrad.wilk@oracle.com
Subject: Re: [PATCH 9/9] x86/clear_huge_page: make clear_contig_region() preemptible
Message-ID: <20230405202704.GF365912@hirez.programming.kicks-ass.net>
References: <20230403052233.1880567-1-ankur.a.arora@oracle.com>
 <20230403052233.1880567-10-ankur.a.arora@oracle.com>
In-Reply-To: <20230403052233.1880567-10-ankur.a.arora@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sun, Apr 02, 2023 at 10:22:33PM -0700, Ankur Arora wrote:
> clear_contig_region() can be used to clear up to a huge-page (2MB/1GB)
> chunk. Allow preemption in the irqentry_exit path to make sure we don't
> hold on to the CPU for an arbitrarily long period.
>
> Performance: vm-scalability/case-anon-w-seq-hugetlb mmaps an anonymous
> hugetlb-2mb region, and then writes sequentially to the region, demand
> faulting pages on the way.
>
> This test, with a CONFIG_VOLUNTARY config shows the effects of this
> change: stime drops (~18% on Icelakex, ~5% on Milan), while the utime
> goes up (~15% on Icelakex, ~13% on Milan.)
>
>  *Icelakex*                 mm/clear_huge_page   x86/clear_huge_page   change
>  (mem=4GB/task, tasks=128)
>
>  stime                      293.02 +-  .49%      239.39 +-  .83%       -18.30%
>  utime                      440.11 +-  .28%      508.74 +-  .60%       +15.59%
>  wall-clock                   5.96 +-  .33%        6.27 +- 2.23%       + 5.20%
>
>  *Milan*                    mm/clear_huge_page   x86/clear_huge_page   change
>  (mem=1GB/task, tasks=512)
>
>  stime                      490.95 +- 3.55%      466.90 +- 4.79%       - 4.89%
>  utime                      276.43 +- 2.85%      311.97 +- 5.15%       +12.85%
>  wall-clock                   3.74 +- 6.41%        3.58 +- 7.82%       - 4.27%
>
> The drop in stime is due to REP; STOS being more efficient for bigger
> extents. The increase in utime is due to cache effects of that change:
> mm/clear_huge_page() clears page-at-a-time, while narrowing towards the
> faulting page; while x86/clear_huge_page only optimizes for cache
> locality in the local neighbourhood of the faulting address.
>
> This effect on utime is visible via the increased L1-dcache-load-misses
> and LLC-load* and an increased backend boundedness for perf user-stat
> --all-user on Icelakex.
> The effect is slight but given the heavy cache
> pressure generated by the test, shows up in the drop in user IPC:
>
>  -  9,455,243,414,829  instructions            #    2.75  insn per cycle              ( +- 14.14% )  (46.17%)
>  -  2,367,920,864,112  L1-dcache-loads         #   1.054 G/sec                        ( +- 14.14% )  (69.24%)
>  -     42,075,182,813  L1-dcache-load-misses   #    2.96% of all L1-dcache accesses   ( +- 14.14% )  (69.24%)
>  -         20,365,688  LLC-loads               #   9.064 K/sec                        ( +- 13.98% )  (69.24%)
>  -            890,382  LLC-load-misses         #    7.18% of all LL-cache accesses    ( +- 14.91% )  (69.24%)
>
>  +  9,467,796,660,698  instructions            #    2.37  insn per cycle              ( +- 14.14% )  (46.16%)
>  +  2,369,973,307,561  L1-dcache-loads         #   1.027 G/sec                        ( +- 14.14% )  (69.24%)
>  +     42,155,621,201  L1-dcache-load-misses   #    2.96% of all L1-dcache accesses   ( +- 14.14% )  (69.24%)
>  +         22,116,300  LLC-loads               #   9.588 K/sec                        ( +- 14.20% )  (69.24%)
>  +          1,355,607  LLC-load-misses         #   10.29% of all LL-cache accesses    ( +- 15.49% )  (69.25%)
>
> Given the fact that the stime improves for all loads using this path,
> while the utime drop is load dependent add this change.

Either I really need sleep, or *NONE* of the above is actually relevant
to what the patch below actually does!

The above talks about the glories of using large clears, while the patch
allows reschedules which are about latency.

> Signed-off-by: Ankur Arora
> ---
>  arch/x86/mm/hugetlbpage.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
> index 4294b77c4f18..c8564b0552e5 100644
> --- a/arch/x86/mm/hugetlbpage.c
> +++ b/arch/x86/mm/hugetlbpage.c
> @@ -158,7 +158,17 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>  static void clear_contig_region(struct page *page, unsigned long vaddr,
>  				unsigned int npages)
>  {
> +	might_sleep();
> +
> +	/*
> +	 * We might be clearing a large region.
> +	 * Allow rescheduling.
> +	 */
> +	allow_resched();
>  	clear_user_pages(page_address(page), vaddr, page, npages);
> +	disallow_resched();
> +
> +	cond_resched();
>  }
>
>  void clear_huge_page(struct page *page,
> --
> 2.31.1
>
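
For readers following the thread, the throughput-versus-latency
distinction raised in the reply can be illustrated with a rough
userspace sketch. This is not kernel code and not the patch's
mechanism: clear_whole() and clear_chunked() are hypothetical names,
memset() merely stands in for the clearing primitive, and sched_yield()
is only a loose analogue of a scheduling point such as cond_resched().
The first helper models one large clear with no scheduling point until
it finishes; the second models a chunk-at-a-time loop that offers a
scheduling point after every chunk.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for one large clear over the whole extent:
 * good for throughput, but nothing else can be scheduled in until the
 * call returns (unless the caller is preemptible by other means).
 */
static void clear_whole(unsigned char *buf, size_t len)
{
	memset(buf, 0, len);
}

/*
 * Hypothetical stand-in for a chunk-at-a-time loop with a voluntary
 * scheduling point after each chunk. The worst-case delay before
 * another task can run is bounded by the time to clear one chunk.
 * Returns the number of scheduling points offered.
 */
static size_t clear_chunked(unsigned char *buf, size_t len, size_t chunk)
{
	size_t off, yields = 0;

	assert(chunk > 0);
	for (off = 0; off < len; off += chunk) {
		size_t n = len - off < chunk ? len - off : chunk;

		memset(buf + off, 0, n);
		sched_yield();	/* userspace analogue only, not cond_resched() */
		yields++;
	}
	return yields;
}
```

Under these assumptions, clearing a 2MB buffer with a 4KB chunk offers
512 scheduling points, while clear_whole() offers none. That gap is a
latency property and is separate from how fast either variant clears.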