From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7A5FCC433EF for ; Mon, 6 Jun 2022 21:53:30 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0BF816B0072; Mon, 6 Jun 2022 17:53:30 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 06DC66B0073; Mon, 6 Jun 2022 17:53:30 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E4FFD6B0074; Mon, 6 Jun 2022 17:53:29 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id D0CD66B0072 for ; Mon, 6 Jun 2022 17:53:29 -0400 (EDT) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 9ACFAD0A for ; Mon, 6 Jun 2022 21:53:29 +0000 (UTC) X-FDA: 79549162938.18.38D4469 Received: from mail-ej1-f42.google.com (mail-ej1-f42.google.com [209.85.218.42]) by imf22.hostedemail.com (Postfix) with ESMTP id 50F20C0053 for ; Mon, 6 Jun 2022 21:53:25 +0000 (UTC) Received: by mail-ej1-f42.google.com with SMTP id u12so31536914eja.8 for ; Mon, 06 Jun 2022 14:53:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=eMj6o1dpOH0W9Y5HTwykUKgzdtPrlRWnrwX9rpljZKc=; b=b8/7tGHsUvQuYEuE6TdH0YhTdZGjEQ7/PD1G684q5IgDFtECbox6b5ndqcP97oaCbM Y0l6Oyia8uQVhys0BPY3yzBe0RhQPwP+5ngtzKQL1sR6XgRSPB5KL3p+JjGCWVhSIsj4 I7WERSQh3C32HeCnKQiQXKWNocrpxH5DPorFI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=eMj6o1dpOH0W9Y5HTwykUKgzdtPrlRWnrwX9rpljZKc=; b=ITwjG5yK6VwAKDfQx8B3LyQlM88DRLcpZYr0fCW4R5BPxp7+STqaaZVWrg6zo/wPZ7 sRWkSf1mHlGumyU0O6hcRqmisS1R60AAKa299ISYTwLHdMWaoWgtYVhHRZVkrgnFjLab uVbelMMPQx1pAE+y9fJhLRT13GwhPDK5OlhAyxKQ+8+5PYr1Suiy26zPV0HERsB7KhMP a9UZlh33T5ZQwxl81Ceyxzjeu+RW760CskSoBzq90OawKCuOFwgg96k82mHNwGMERNZY 8SNOnVX0it4xMj29QlpWLO/KmUjqZonP0vb0xg2Xa5zlbOEYw0NwR9z7Lm2hwThUkpyP ekKg== X-Gm-Message-State: AOAM532dx+RPX6HpQ++RcMlgdrUyMgcXbhRvrOk+OExI7x3y/GSuW9QI JSW7LCAUx7y4FLzSn78vzcgMnLXwGb4Z8fwuTtQ= X-Google-Smtp-Source: ABdhPJz4Mhy1zjPC+e7hI48SdgG5b7Kw5WM/pG8ix50Aidt5jinVrFJzGtxtEoBZFP/hq7sXC9VVug== X-Received: by 2002:a17:906:685:b0:6fa:8e17:e9b5 with SMTP id u5-20020a170906068500b006fa8e17e9b5mr24774356ejb.522.1654552407257; Mon, 06 Jun 2022 14:53:27 -0700 (PDT) Received: from mail-wm1-f50.google.com (mail-wm1-f50.google.com. [209.85.128.50]) by smtp.gmail.com with ESMTPSA id bs13-20020a056402304d00b0042bd6f745fasm9168594edb.92.2022.06.06.14.53.26 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 06 Jun 2022 14:53:26 -0700 (PDT) Received: by mail-wm1-f50.google.com with SMTP id 67-20020a1c1946000000b00397382b44f4so8560913wmz.2 for ; Mon, 06 Jun 2022 14:53:26 -0700 (PDT) X-Received: by 2002:a05:600c:4982:b0:39c:3c0d:437c with SMTP id h2-20020a05600c498200b0039c3c0d437cmr20758360wmp.38.1654552405818; Mon, 06 Jun 2022 14:53:25 -0700 (PDT) MIME-Version: 1.0 References: <20220606202109.1306034-1-ankur.a.arora@oracle.com> In-Reply-To: <20220606202109.1306034-1-ankur.a.arora@oracle.com> From: Linus Torvalds Date: Mon, 6 Jun 2022 14:53:09 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v3 00/21] huge page clearing optimizations To: Ankur Arora Cc: Linux Kernel Mailing List , Linux-MM , "the arch/x86 maintainers" , Andrew Morton , Mike Kravetz , Ingo Molnar , Andrew Lutomirski , Thomas Gleixner , Borislav Petkov , Peter Zijlstra , Andi Kleen , Arnd Bergmann , Jason Gunthorpe , jon.grimm@amd.com, Boris Ostrovsky , Konrad Rzeszutek Wilk , joao.martins@oracle.com Content-Type: text/plain; charset="UTF-8" X-Rspam-User: Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=google header.b="b8/7tGHs"; dmarc=none; spf=pass (imf22.hostedemail.com: domain of torvalds@linuxfoundation.org designates 209.85.218.42 as permitted sender) smtp.mailfrom=torvalds@linuxfoundation.org X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 50F20C0053 X-Stat-Signature: nijy611ddgds7ubsp6rtuycwe5t7yitm X-HE-Tag: 1654552405-675630 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon, Jun 6, 2022 at 1:22 PM Ankur Arora wrote: > > This series introduces two optimizations in the huge page clearing path: > > 1. extends the clear_page() machinery to also handle extents larger > than a single page. > 2. support non-cached page clearing for huge and gigantic pages. > > The first optimization is useful for hugepage fault handling, the > second for prefaulting, or for gigantic pages. Please just split these two issues up into entirely different patch series. That said, I have a few complaints about the individual patches even in this form, to the point where I think the whole series is nasty: - get rid of 3/21 entirely. It's wrong in every possible way: (a) That shouldn't be an inline function in a header file at all. If you're clearing several pages of data, that just shouldn't be an inline function. (b) Get rid of __HAVE_ARCH_CLEAR_USER_PAGES. I hate how people make up those idiotic pointless names. If you have to use a #ifdef, just use the name of the function that the architecture overrides, not some other new name. But you don't need it at all, because (c) Just make a __weak function called clear_user_highpages() in mm/highmem.c, and allow architectures to just create their own non-weak ones. - patch 4/21 and 5/32: can we instead just get rid of that silly "process_huge_page()" thing entirely. It's disgusting, and it's a big part of why 'rep movs/stos' cannot work efficiently. It also makes NO SENSE if you then use non-temporal accesses. So instead of doubling down on the craziness of that function, just get rid of it entirely. There are two users, and they want to clear a hugepage and copy it respectively. Don't make it harder than it is. *Maybe* the code wants to do a "prefetch" afterwards. Who knows. But I really think you sh ould do the crapectomy first, make the code simpler and more straightforward, and just allow architectures to override the *simple* "copy or clear a lage page" rather than keep feeding this butt-ugly monstrosity. - 13/21: see 3/21. - 14-17/21: see 4/21 and 5/21. Once you do the crapectomy and get rid of the crazy process_huge_page() abstraction, and just let architectures do their own clear/copy huge pages, *all* this craziness goes away. Those "when to use which type of clear/copy" becomes a *local* question, no silly arch_clear_page_non_caching_threshold() garbage. So I really don't like this series. A *lot* of it comes from that horrible process_huge_page() model, and the whole model is just wrong and pointless. You're literally trying to fix the mess that that function is, but you're keeping the fundamental problem around. The whole *point* of your patch-set is to use non-temporal stores, which makes all the process_huge_page() things entirely pointless, and only complicates things. And even if we don't use non-temporal stores, that process_huge_page() thing makes for trouble for any "rep stos/movs" implementation that might actualyl do a better job if it was just chunked up in bigger chunks. Yes, yes, you probably still want to chunk that up somewhat due to latency reasons, but even then architectures might as well just make their own decisions, rather than have the core mm code make one clearly bad decision for them. Maybe chunking it up in bigger chunks than one page. Maybe an architecture could do even more radical things like "let's just 'rep stos' for the whole area, but set a special thread flag that causes the interrupt return to break it up on return to kernel space". IOW, the "latency fix" might not even be about chunking it up, it might look more like our exception handling thing. So I really think that crapectomy should be the first thing you do, and that should be that first part of "extends the clear_page() machinery to also handle extents larger than a single page" Linus