From: Alexander Duyck
Date: Fri, 10 Apr 2020 11:54:53 -0700
Subject: Re: [RFC] mm/memory.c: Optimizing THP zeroing routine for !HIGHMEM cases
To: Prathu Baronia
Cc: Andrew Morton, linux-mm, Greg KH, gthelen@google.com, jack@suse.cz, Michal Hocko, ken.lin@oneplus.com, gasine.xu@oneplus.com, chintan.pandya@oneplus.com

On Fri, Apr 3, 2020 at 1:18 AM Prathu Baronia wrote:
>
> THP allocation for anon memory requires zeroing of the huge page. To do so,
> we iterate over 2MB of memory in 4KB chunks. Each iteration calls
> kmap_atomic() and kunmap_atomic(). This routine makes sense where we need a
> temporary mapping of the user page. In !HIGHMEM cases, especially on 64-bit
> architectures, we don't need a temporary mapping. Hence, kmap_atomic() acts
> as nothing more than multiple barrier() calls.
>
> This calls for optimization. Simply getting the VADDR from the page does the
> job for us, so implement another (optimized) routine for clear_huge_page()
> which doesn't need a temporary mapping of the userspace page.
>
> While testing this patch on a Qualcomm SM8150 SoC (kernel v4.14.117), we see
> a 64% improvement in clear_huge_page().
>
> Ftrace results:
>
> Default profile:
> ------------------------------------------
>  6) ! 473.802 us  |  clear_huge_page();
> ------------------------------------------
>
> With this patch applied:
> ------------------------------------------
>  5) ! 170.156 us  |  clear_huge_page();
> ------------------------------------------

I suspect that if anything this is really pointing out how much overhead is
being added through process_huge_page. I know that on x86 most modern
processors initialize memory at somewhere between 16B/cycle and 32B/cycle,
with some fixed amount of overhead for making the rep movsb/stosb call.
One thing that might make sense to look at would be whether we could reduce
the number of calls we have to make within process_huge_page by taking the
caches into account. For example, I know that on x86 the L1 cache is 32K for
most processors, so we could look at bumping things up so that we process 8
pages at a time and then make one call to cond_resched(), instead of doing it
per 4K page.

> Signed-off-by: Prathu Baronia
> Reported-by: Chintan Pandya
> ---
>  mm/memory.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 3ee073d..3e120e8 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5119,6 +5119,7 @@ EXPORT_SYMBOL(__might_fault);
>  #endif
>
>  #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
> +#ifdef CONFIG_HIGHMEM
>  static void clear_gigantic_page(struct page *page,
>                                 unsigned long addr,
>                                 unsigned int pages_per_huge_page)
> @@ -5183,6 +5184,16 @@ void clear_huge_page(struct page *page,
>                      addr + right_idx * PAGE_SIZE);
>         }
>  }
> +#else
> +void clear_huge_page(struct page *page,
> +                    unsigned long addr_hint, unsigned int pages_per_huge_page)
> +{
> +       void *addr;
> +
> +       addr = page_address(page);
> +       memset(addr, 0, pages_per_huge_page * PAGE_SIZE);
> +}
> +#endif

This seems like a very simplistic solution to the problem, and I am worried
something like this would introduce latency issues when pages_per_huge_page
gets to be large. It might make more sense to wrap the process_huge_page call
in the original clear_huge_page and then add this code block as an #else case.
That way you avoid potentially stalling the system for extended periods of
time if you start trying to clear 1G pages with this function.

One interesting data point would be the cost of breaking this up into a loop
where you only process some fixed number of pages at a time, calling
cond_resched() between batches so you can avoid introducing latency spikes.