Date: Thu, 09 Dec 2021 20:59:12 +1000
From: Nicholas Piggin
Subject: Re: [PATCH] mm/vmalloc: allocate small pages for area->pages
To: linux-mm@kvack.org, Yu Xu
Cc: akpm@linux-foundation.org
References: <1639037882.ddpnbp5ftw.astroid@bobo.none> <50ea6251-fbde-10d9-c37c-3198aa9e2d82@linux.alibaba.com>
In-Reply-To: <50ea6251-fbde-10d9-c37c-3198aa9e2d82@linux.alibaba.com>
Message-Id: <1639046632.ijbttgtfmt.astroid@bobo.none>

Excerpts from Yu Xu's message of December 9, 2021 7:27 pm:
> On 12/9/21 4:23 PM, Nicholas Piggin wrote:
>> Excerpts from Xu Yu's message of December 7, 2021 7:46 pm:
>>> The area->pages stores the struct pages allocated for vmalloc mappings.
>>> The allocated memory can be hugepage if arch has HAVE_ARCH_HUGE_VMALLOC
>>> set, while area->pages itself does not have to be hugepage backed.
>>>
>>> Suppose that we want to vmalloc 1026M of memory, then area->pages is
>>> 2052K in size, which is larger than PMD_SIZE when the pagesize is 4K.
>>> Currently, 4096K will be allocated for area->pages, wherein 2044K is
>>> wasted.
>>>
>>> This introduces __vmalloc_node_no_huge, and makes area->pages backed by
>>> small pages, because I think to allocate hugepage for area->pages is
>>> unnecessary and vulnerable to abuse.
>>
>> Any vmalloc allocation will be subject to internal fragmentation like
>> this. What makes this one special? Is there a way to improve it for
>> all with some heuristic?
>
> As described in the commit log, I think vmalloc memory (*data*) can be
> hugepage, while the area->pages (*meta*) is unnecessary.

Right. I accept some vmalloc allocations aren't performance critical for
accesses, and some allocations will have more fragmentation waste from
huge vmalloc, and that area->pages fits both under some circumstances.

But 1) other vmalloc allocations might too. And 2) sure you can waste 50%
of the 0.2% space overhead that area->pages requires, so it's another 0.2%
in the absolute worst case that you're vmallocing just over a gig of
memory.

The question is does this case actually matter that much. And would we be
better served looking at actual wastage numbers across all allocs and
whether it's worth worrying about or if a general heuristic could be used.

> There should be heuristic ways, just like THP settings (always, madvise,
> never). But such heuristic ways are mainly for data allocation, and I'm
> not sure it's worth it to bring such logic in.
>
>>
>> There would be an argument for a size-optimised vmalloc vs a space
>> optimised one. An accounting structure like this doesn't matter
>> much for speed. A vfs hash table does. Is it worth doing though? How
>
> To be honest, I wrote the patch when studying your patchset. No real
> issue.
>
>> much do you gain in practice?
>
> Therefore, no actual gain in practice.
>
>
> Perhaps I should add an RFC tag in the patch. However, I saw that
> Andrew Morton has added this patch to the -mm tree.
>
> I wonder if we need to reconsider this patch.

I like your thinking and that you're looking to optimise memory usage.
I'm mainly concerned that the more API options we provide, the harder it
makes things for callers and the more chance there is that they will get
it wrong. Heuristics aren't a magic fix either (they can go wrong too), so
getting an idea of how much waste we could avoid would be a good start.

Thanks,
Nick
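
For reference, the arithmetic discussed in this thread can be checked with a small sketch. This is not kernel code: the helper name `pages_array_waste` is hypothetical, and it assumes 4K base pages, a 2M PMD_SIZE, 8-byte `struct page *` pointers, and a simplified rule that the area->pages array is rounded up to PMD_SIZE once it reaches PMD_SIZE and to PAGE_SIZE otherwise.

```python
PAGE_SIZE = 4 * 1024          # assumed 4K base page size
PMD_SIZE = 2 * 1024 * 1024    # assumed 2M huge page size
PTR_SIZE = 8                  # assumed sizeof(struct page *) on 64-bit


def round_up(n, align):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def pages_array_waste(alloc_bytes):
    """Return (array_bytes, backing_bytes, wasted_bytes) for area->pages.

    Simplified model (an assumption, not the kernel's exact logic): the
    array of page pointers is backed by huge pages once it is at least
    PMD_SIZE, and by small pages otherwise.
    """
    nr_pages = round_up(alloc_bytes, PAGE_SIZE) // PAGE_SIZE
    array_bytes = nr_pages * PTR_SIZE
    if array_bytes >= PMD_SIZE:
        backing_bytes = round_up(array_bytes, PMD_SIZE)
    else:
        backing_bytes = round_up(array_bytes, PAGE_SIZE)
    return array_bytes, backing_bytes, backing_bytes - array_bytes


if __name__ == "__main__":
    alloc = 1026 * 1024 * 1024  # the 1026M example from the commit log
    array_bytes, backing_bytes, wasted = pages_array_waste(alloc)
    print(f"area->pages: {array_bytes // 1024}K, "
          f"backed by {backing_bytes // 1024}K, "
          f"wasted {wasted // 1024}K "
          f"({100 * wasted / alloc:.2f}% of the allocation)")
```

Under these assumptions the 1026M case reproduces the 2052K / 4096K / 2044K figures from the commit log, and the waste comes out at roughly 0.2% of the allocation. Running the same function over a range of real allocation sizes would be one way to get the wastage numbers asked for above.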