From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pb0-f49.google.com (mail-pb0-f49.google.com [209.85.160.49]) by kanga.kvack.org (Postfix) with ESMTP id C84FA6B0035 for ; Thu, 9 Jan 2014 16:39:59 -0500 (EST) Received: by mail-pb0-f49.google.com with SMTP id jt11so3541928pbb.22 for ; Thu, 09 Jan 2014 13:39:59 -0800 (PST) Received: from g1t0027.austin.hp.com (g1t0027.austin.hp.com. [15.216.28.34]) by mx.google.com with ESMTPS id wm3si4981876pab.20.2014.01.09.13.39.57 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 09 Jan 2014 13:39:58 -0800 (PST) Message-ID: <1389303595.19886.1.camel@buesod1.americas.hpqcorp.net> Subject: Re: [PATCH 0/5] Fix ebizzy performance regression due to X86 TLB range flush v3 From: Davidlohr Bueso Date: Thu, 09 Jan 2014 13:39:55 -0800 In-Reply-To: <1389278098-27154-1-git-send-email-mgorman@suse.de> References: <1389278098-27154-1-git-send-email-mgorman@suse.de> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Mel Gorman Cc: Alex Shi , Ingo Molnar , Linus Torvalds , Thomas Gleixner , Andrew Morton , Fengguang Wu , H Peter Anvin , Linux-X86 , Linux-MM , LKML On Thu, 2014-01-09 at 14:34 +0000, Mel Gorman wrote: > Changelog since v2 > o Rebase to v3.13-rc7 to pick up scheduler-related fixes > o Describe methodology in changelog > o Reset tlb flush shift for all models except Ivybridge > > Changelog since v1 > o Drop a pagetable walk that seems redundant > o Account for TLB flushes only when debugging > o Drop the patch that took number of CPUs to flush into account > > ebizzy regressed between 3.4 and 3.10 while testing on a new > machine. Bisection initially found at least three problems of which the > first was commit 611ae8e3 (x86/tlb: enable tlb flush range support for > x86). Second was related to TLB flush accounting. The third was related > to ACPI cpufreq and so it was disabled for the purposes of this series. > > The intent of the TLB range flush series was to preserve existing TLB > entries by flushing a range one page at a time instead of flushing the > address space. This makes a certain amount of sense if the address space > being flushed was known to have existing hot entries. The decision on > whether to do a full mm flush or a number of single page flushes depends > on the size of the relevant TLB and how many of these hot entries would > be preserved by a targeted flush. This implicitly assumes a lot including > the following examples > > o That the full TLB is in use by the task being flushed > o The TLB has hot entries that are going to be used in the near future > o The TLB has entries for the range being cached > o The cost of the per-page flushes is similar to a single mm flush > o Large pages are unimportant and can always be globally flushed > o Small flushes from workloads are very common > > The first three are completely unknowable but unfortunately it is something > that is probably true of micro benchmarks designed to exercise these > paths. The fourth one depends completely on the hardware. The large page > check used to make sense but now the number of entries required to do > a range flush is so small that it is a redundant check. The last one is > the strangest because generally only a process that was mapping/unmapping > very small regions would hit this. It's possible it is the common case > for virtualised workloads that is managing the address space of its > guests. Maybe this was the real original motivation of the TLB range flush > support for x86. If this is the case then the patches need to be revisited > and clearly flagged as being of benefit to virtualisation. > > As things currently stand, Ebizzy sees very little benefit as it discards > newly allocated memory very quickly and regressed badly on Ivybridge where > it constantly flushes ranges of 128 pages one page at a time. Earlier > machines may not have seen this problem as the balance point was at a > different location. While I'm wary of optimising for such a benchmark, > it's commonly tested and it's apparent that the worst case defaults for > Ivybridge need to be re-examined. > > The following small series brings ebizzy closer to 3.4-era performance > for the very limited set of machines tested. It does not bring > performance fully back in line but the recent idle power regression > fix has already been identified as regressing ebizzy performance > (http://www.spinics.net/lists/stable/msg31352.html) and would need to be > addressed first. Benchmark results are included in the relevant patch's > changelog. > > arch/x86/include/asm/tlbflush.h | 6 ++--- > arch/x86/kernel/cpu/amd.c | 5 +--- > arch/x86/kernel/cpu/intel.c | 10 +++----- > arch/x86/kernel/cpu/mtrr/generic.c | 4 +-- > arch/x86/mm/tlb.c | 52 ++++++++++---------------------------- > include/linux/vm_event_item.h | 4 +-- > include/linux/vmstat.h | 8 ++++++ > 7 files changed, 32 insertions(+), 57 deletions(-) I Tried this set on a couple of workloads, no performance regressions. So, fwiw: Tested-by: Davidlohr Bueso -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org