Date: Mon, 23 Jul 2018 13:28:49 -0700 (PDT)
From: David Rientjes
Subject: Re: [PATCH] mm: thp: remove use_zero_page sysfs knob
To: Matthew Wilcox
Cc: Yang Shi, Andrew Morton, kirill@shutemov.name, hughd@google.com, aaron.lu@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <20180722035156.GA12125@bombadil.infradead.org>
References: <1532110430-115278-1-git-send-email-yang.shi@linux.alibaba.com> <20180720123243.6dfc95ba061cd06e05c0262e@linux-foundation.org> <3238b5d2-fd89-a6be-0382-027a24a4d3ad@linux.alibaba.com> <20180722035156.GA12125@bombadil.infradead.org>
List-ID: linux-mm@kvack.org

On Sat, 21 Jul 2018, Matthew Wilcox wrote:

> > The huge zero page can be reclaimed under memory pressure and, if it
> > is, it is reallocated with gfp flags that attempt memory compaction,
> > which can become expensive. If we are constantly under memory
> > pressure, it gets freed and reallocated millions of times, always
> > trying to compact memory both directly and by kicking kcompactd in
> > the background.
> >
> > It likely should also be per node.
>
> Have you benchmarked making the non-huge zero page per-node?
>

Not since we disabled it :)  I will, though.
The more concerning issue for us, modulo CVE-2017-1000405, is the CPU cost of constantly direct-compacting memory to reallocate the huge zero page in real time after it has been reclaimed. We've observed this happening tens or hundreds of thousands of times on some systems. If the data suggests we should make it NUMA aware, the cost is 2MB per node on x86; I don't think that is too high a price to keep it persistently available, even under memory pressure, when use_zero_page is enabled.
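For anyone wanting to reproduce the observation, a rough sketch of how to watch this from userspace: the knob under discussion lives in the transparent_hugepage sysfs directory, and the direct-compaction churn shows up in the `compact_*` counters in /proc/vmstat (the exact sysfs path is the one documented in the kernel admin guide; availability depends on CONFIG_TRANSPARENT_HUGEPAGE and CONFIG_COMPACTION being enabled).

```shell
# Whether the huge zero page is in use (1 = enabled, 0 = disabled)
cat /sys/kernel/mm/transparent_hugepage/use_zero_page

# Direct-compaction activity: compact_stall counts direct compaction
# attempts, compact_success the ones that yielded a free huge page.
# Sampling these before and after a memory-pressure workload shows how
# often hzp reallocation (among other THP faults) is stalling on
# compaction.
grep -E '^compact_(stall|success) ' /proc/vmstat
```

A steadily climbing compact_stall with use_zero_page enabled under sustained pressure is the symptom described above: the zero page is reclaimed by its shrinker and then reallocated with compaction-eager gfp flags on the next read fault.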