Date: Thu, 21 Apr 2022 19:07:24 +1000
From: Nicholas Piggin
Subject: Re: [PATCH v4 bpf 0/4] vmalloc: bpf: introduce VM_ALLOW_HUGE_VMAP
To: Linus Torvalds
Cc: "akpm@linux-foundation.org", "ast@kernel.org", "bp@alien8.de",
    "bpf@vger.kernel.org", "daniel@iogearbox.net", "dborkman@redhat.com",
    "edumazet@google.com", "hch@infradead.org", "hpa@zytor.com",
    "imbrenda@linux.ibm.com", Kernel Team, "linux-kernel@vger.kernel.org",
    "linux-mm@kvack.org", "mbenes@suse.cz", "mcgrof@kernel.org",
    "pmladek@suse.com", "Edgecombe, Rick P", Mike Rapoport,
    "song@kernel.org", Song Liu
References: <20220415164413.2727220-1-song@kernel.org>
    <4AD023F9-FBCE-4C7C-A049-9292491408AA@fb.com>
    <88eafc9220d134d72db9eb381114432e71903022.camel@intel.com>
    <1650511496.iys9nxdueb.astroid@bobo.none>
Message-Id: <1650531495.h5u7ntu1jb.astroid@bobo.none>

Excerpts from Linus Torvalds's message of April 21, 2022 4:02 pm:
> On Wed, Apr 20, 2022 at 10:48 PM Linus Torvalds wrote:
>>
>> The largepage thing needs to be opt-in, and needs a lot more care.
>
> Side note: part of the opt-in really should be about the performance impact.
>
> It clearly can be quite noticeable, as outlined by that powerpc case
> in commit 8abddd968a30 ("powerpc/64s/radix: Enable huge vmalloc
> mappings"), but it presumably is some _particular_ case that actually
> matters.
>
> But it's equally clearly not the module code/data case, since
> __module_alloc() explicitly disables largepages on powerpc.
>
> At a guess, it's one or more of the large hash-table allocations.

The changelog is explicit that it is the vfs hashes.

> And it would actually be interesting to hear *which*one*. From the
> 'git diff' workload, I'd expect it to be the dentry lookup hash table
> - I can't think of anything else that would be vmalloc'ed that would
> be remotely interesting - but who knows.

I didn't measure dentry/inode separately but it should mostly
(~entirely?) be the dentry hash, yes.

> So I think the whole "opt in" isn't _purely_ about the "oh, random
> cases are broken for odd reasons, so let's not enable it by default".

The whole concept is totally broken upstream now though. Core code
absolutely cannot mark any allocation as able to use huge pages
because x86 is in some crazy half-working state. Can we use a hugepage
dentry cache on x86 with hibernation? With BPF? Who knows.

> I think it would actually be good to literally mark the cases that
> matter (and have the performance numbers for those cases).

As per my previous comment, not for correctness but possibly to help
guide some heuristic. I don't see it being too big a deal though: a
multi-MB vmalloc that can use hugepages probably wants to, and the
downside is quite small (fragmentation being about the only one, but
there aren't a vast number of such allocations in the kernel for it to
have been noticed as yet).

Thanks,
Nick