Date: Fri, 22 Apr 2022 09:30:39 +1000
From: Nicholas Piggin
Subject: Re: [PATCH v4 bpf 0/4] vmalloc: bpf: introduce VM_ALLOW_HUGE_VMAP
To: Linus Torvalds
Cc: "akpm@linux-foundation.org", "ast@kernel.org", "bp@alien8.de",
    "bpf@vger.kernel.org", "daniel@iogearbox.net", "dborkman@redhat.com",
    "edumazet@google.com", "hch@infradead.org", "hpa@zytor.com",
    "imbrenda@linux.ibm.com", Kernel Team, "linux-kernel@vger.kernel.org",
    "linux-mm@kvack.org", "mbenes@suse.cz", "mcgrof@kernel.org",
    "pmladek@suse.com", "Edgecombe, Rick P", Mike Rapoport,
    "song@kernel.org", Song Liu
References: <20220415164413.2727220-1-song@kernel.org>
    <4AD023F9-FBCE-4C7C-A049-9292491408AA@fb.com>
    <88eafc9220d134d72db9eb381114432e71903022.camel@intel.com>
    <1650511496.iys9nxdueb.astroid@bobo.none>
    <1650530694.evuxjgtju7.astroid@bobo.none>
Message-Id: <1650582120.hf4z0mkw8v.astroid@bobo.none>

Excerpts from Linus Torvalds's message of April 22, 2022 1:44 am:
> On Thu, Apr 21, 2022 at 1:57 AM Nicholas Piggin wrote:
>>
>> Those were (AFAIKS) all in arch code though.
>
> No Nick, they really weren't.
>
> The bpf issue with VM_FLUSH_RESET_PERMS means that all your arguments
> are invalid, because this affected non-architecture code.

VM_FLUSH_RESET_PERMS was because bpf uses the arch module allocation
code, and the arch specific direct map manipulation stuff was not
capable of dealing with huge pages. An x86 bug.

> So the bpf case had two independent issues: one was just bpf doing a
> really bad job at making sure the executable mapping was sanely
> initialized.
>
> But the other was an actual bug in that hugepage case for vmalloc.
>
> And that bug was an issue on power too.

I missed it, which bug was that?

>
> So your "this is purely an x86 issue" argument is simply wrong.
> Because I'm very much looking at that power code that says "oh,
> __module_alloc() needs more work".
>
> Notice?

No I don't notice. More work to support huge allocations for
executable mappings, sure. But the arch's implementation explicitly
does not support that yet. That doesn't make huge vmalloc broken!
Ridiculous. It works fine.

>
> Can these be fixed? Yes. But they can't be fixed by saying "oh, let's
> disable it on x86".

You did just effectively disable it on x86 though.

And why can't it be reverted on x86 until it's fixed on x86??

> Although it's probably true that at that point, some of the issues
> would no longer be nearly as noticeable.

There really aren't all these "issues" you're imagining. They
aren't noticeable now, on power or s390, because they have
non-buggy HAVE_ARCH_HUGE_VMALLOC implementations.

If you're really going to insist on this, will you apply this to fix
(some of) the performance regressions it introduced?

Thanks,
Nick

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6e5b4488a0c5..b555f17e84d5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8919,7 +8919,10 @@ void *__init alloc_large_system_hash(const char *tablename,
 				table = memblock_alloc_raw(size,
 							   SMP_CACHE_BYTES);
 		} else if (get_order(size) >= MAX_ORDER || hashdist) {
-			table = __vmalloc(size, gfp_flags);
+			if (IS_ENABLED(CONFIG_PPC) || IS_ENABLED(CONFIG_S390))
+				table = vmalloc_huge(size, gfp_flags);
+			else
+				table = __vmalloc(size, gfp_flags);
 			virt = true;
 			if (table)
 				huge = is_vm_area_hugepages(table);