From: Linus Torvalds
Date: Thu, 21 Apr 2022 19:31:05 -0700
Subject: Re: [PATCH v4 bpf 0/4] vmalloc: bpf: introduce VM_ALLOW_HUGE_VMAP
To: Nicholas Piggin
References: <20220415164413.2727220-1-song@kernel.org>
 <4AD023F9-FBCE-4C7C-A049-9292491408AA@fb.com>
 <88eafc9220d134d72db9eb381114432e71903022.camel@intel.com>
 <1650511496.iys9nxdueb.astroid@bobo.none>
 <1650530694.evuxjgtju7.astroid@bobo.none>
 <1650582120.hf4z0mkw8v.astroid@bobo.none>
 <1650590628.043zdepwk1.astroid@bobo.none>
In-Reply-To: <1650590628.043zdepwk1.astroid@bobo.none>
Cc: "akpm@linux-foundation.org", "ast@kernel.org", "bp@alien8.de",
 "bpf@vger.kernel.org", "daniel@iogearbox.net", "dborkman@redhat.com",
 "edumazet@google.com", "hch@infradead.org", "hpa@zytor.com",
 "imbrenda@linux.ibm.com", Kernel Team, "linux-kernel@vger.kernel.org",
 "linux-mm@kvack.org", "mbenes@suse.cz", "mcgrof@kernel.org",
 "pmladek@suse.com", "Edgecombe, Rick P", Mike Rapoport,
 "song@kernel.org", Song Liu

On Thu, Apr 21, 2022 at 6:51 PM Nicholas Piggin wrote:
>
> > See
> >
> > https://lore.kernel.org/all/20220415164413.2727220-3-song@kernel.org/
> >
> > for [PATCH 2/4] in the series for this particular issue.
>
> I was being facetious. The problem is you can't do ^ because x86 is
> buggy.

No, we probably *can* do that PATCH 2/4. I suspect x86 really isn't
that buggy. The bugs are elsewhere (including other vmalloc_huge()
uses).

Really. Why can't you just admit that the major bug was in the
hugepage code itself?

You claim:

> Because it can be transparent. The bug was (stupidly) using compound
> pages when it should have just used split higher order pages.

but we're in -rc3 for 5.18, and you seem to be entirely ignoring the
fact that that stupid bug has been around for a *YEAR* now.

Guess what? It was reported within *days* of the code having been
enabled on x86.

But for about a year, you've been convinced that powerpc is fine,
because nobody ever reported it. And you *still* try to make this
about how it's some "x86 bug", despite that bug not having been
x86-specific AT ALL.

Nick, please take a long look at yourself in the mirror.

And stop this whole mindless "it's x86". The *ONLY* thing x86-64 did
was to show that the code that had been enabled on powerpc for a year
had gotten almost no testing there.

And don't bother mentioning s390. It got even less coverage there.

So exactly *because* bugs were uncovered in days by x86 enabling this,
I'm not rushing to re-enable it until I think it's gone through more
thinking and testing.

And in particular, I really *really* want to limit the fallout.

For example, your "two-liner fix" is not at all obvious. That broken
code case used to have a comment that remap_vmalloc_page() required
compound pages, and you just removed that whole thing as if it didn't
matter, and split the page.

(I also think the comment meant 'vmap_pages_range()', but whatever).

And the thing is, I'm not entirely convinced that comment was wrong
and could just be ignored. The freeing code in __vunmap() will do

        int i, step = 1U << page_order;

        for (i = 0; i < area->nr_pages; i += step) {
                struct page *page = area->pages[i];

                BUG_ON(!page);
                mod_memcg_page_state(page, MEMCG_VMALLOC, -step);
                __free_pages(page, page_order);

which now looks VERY VERY wrong.

You've split the pages, they may be used as individual pages (possibly
by other things), and then you now at freeing time treat them as a
single compound page after all..

So your "trivial two-liner" that tried to fix a bug that has been
there for a year now, itself seems quite questionable.

Maybe it works, maybe it doesn't. My bet is "it doesn't".

And guess what? I bet it worked just fine in your testing on powerpc,
because you probably didn't actually have any real huge-page vmalloc
cases except for those filesystem big-hash cases that never get
free'd.

So that "this code was completely buggy for a year on powerpc" never
seemed to teach you anything about the code.

And again - none of this is at all x86-specific. NOT AT ALL.

So how about you admit you were wrong to begin with.

That hugepage code needs more care before we re-enable it. Your
two-liner wasn't so obvious after all, was it?

I really think we're much safer saying "hugepage mappings only matter
for a couple of things, and those things will *not* do sub-page games,
so it's simple and safe".

.. and that requires that opt-in model. Because your "it's
transparent" argument has never ever actually been true, now has it?

                 Linus