Message-ID: <97b6c8e7-36f3-1181-7ffb-d94e8a8d293e@suse.cz>
Date: Fri, 18 Nov 2022 10:49:43 +0100
Subject: Re: [linux-next:master 5002/7443] include/linux/compiler_types.h:357:45: error: call to '__compiletime_assert_474' declared with attribute error: BUILD_BUG_ON failed: PERCPU_DYNAMIC_EARLY_SIZE < NR_KMALLOC_TYPES * KMALLOC_SHIFT_HIGH * sizeof(struct kmem_cache_cpu)
To: Dennis Zhou, Baoquan He
Cc: kernel test robot, oe-kbuild-all@lists.linux.dev, Linux Memory Management List, 42.hyeyoo@gmail.com
References: <202211120436.HzD1i2yQ-lkp@intel.com>
From: Vlastimil Babka

On 11/17/22 20:23, Dennis Zhou wrote:
> On Wed, Nov 16, 2022 at 07:32:03PM +0800, Baoquan He wrote:
>> On 11/15/22 at 12:00pm, Dennis Zhou wrote:
>> > On Tue, Nov 15, 2022 at 05:08:52PM +0800,
>> > Baoquan He wrote:
>> > > Hi Dennis,
>> > >
>> > > On 11/14/22 at 08:13pm, Dennis Zhou wrote:
>> > > > Hi Vlastimil & Baoquan,
>> > > >
>> > > > On Mon, Nov 14, 2022 at 06:58:13PM +0100, Vlastimil Babka wrote:
>> > > > > On 11/14/22 08:44, Baoquan He wrote:
>> > > > > > Hi,
>> > > > > >
>> > > > > > I reproduced the build failure according to lkp report and made a patch
>> > > > > > as below to fix it.
>> > > > > >
>> > > > > > From dae7dd9705015ce36db757e88c78802584f949b1 Mon Sep 17 00:00:00 2001
>> > > > > > From: Baoquan He
>> > > > > > Date: Sun, 13 Nov 2022 18:08:27 +0800
>> > > > > > Subject: [PATCH] percpu: adjust the value of PERCPU_DYNAMIC_EARLY_SIZE
>> > > > > > Content-type: text/plain
>> > > > > >
>> > > > > > LKP reported a build failure as below on the patch "mm/slub, percpu:
>> > > > > > correct the calculation of early percpu allocation size"
>> > > > >
>> > > > > Since I have that patch in slab.git exposed to -next, should I take this fix
>> > > > > too, to make things simpler? Dennis?
>> > > > >
>> > > >
>> > > > I don't have any problems with you running a fix, but I'm not quite sure
>> > > > this is the right fix. Though this might cause a trivial merge conflict
>> > > > with: d667c94962c1 ("mm/percpu: remove unused PERCPU_DYNAMIC_EARLY_SLOTS")
>> > > > in my percpu#for-6.2 branch.
>> > > >
>> > > > If I'm understanding this correctly, slub requires additional percpu
>> > > > memory due to the use of 64k pages. By increasing
>> > > > PERCPU_DYNAMIC_EARLY_SIZE, we solve the problem for 64k page users, but
>> > > > require a few unnecessary pages that can bloat the size of subsequent
>> > > > percpu chunks. Though, I'm not sure if that's an issue today for
>> > > > embedded devices.
>> > >
>> > > Thanks for looking into this.
>> > >
>> > > I guess you mean that PERCPU_DYNAMIC_EARLY_SIZE will impact the
>> > > first dynamic chunk size of the page first chunk, because the embed
>> > > first chunk will take PERCPU_DYNAMIC_RESERVE.
>> > > And the impact comes from the max() invocation below.
>> > >
>> > > static struct pcpu_alloc_info * __init __flatten pcpu_build_alloc_info(
>> > >                                 size_t reserved_size, size_t dyn_size,
>> > >                                 size_t atom_size,
>> > >                                 pcpu_fc_cpu_distance_fn_t cpu_distance_fn)
>> > > {
>> > >         ......
>> > >         /* calculate size_sum and ensure dyn_size is enough for early alloc */
>> > >         size_sum = PFN_ALIGN(static_size + reserved_size +
>> > >                             max_t(size_t, dyn_size, PERCPU_DYNAMIC_EARLY_SIZE));
>> > >         ......
>> > > }
>> > >
>> > > >
>> > > > I think adding parity to PERCPU_DYNAMIC_EARLY_SIZE with how
>> > > > PERCPU_DYNAMIC_RESERVE is defined by BITS_PER_LONG is a safer option
>> > > > here. A small TODO item would be to make PERCPU_DYNAMIC_RESERVE be a +
>> > > > value instead of a max() with PERCPU_DYNAMIC_EARLY_SIZE.
>> > >
>> > > Hmm, the change below may not take the powerpc arch into account. Please
>> > > check arch/powerpc/include/asm/page.h, it seems 32bit ppc could have
>> > > 256K pages too. Raising PERCPU_DYNAMIC_EARLY_SIZE to 20K may cost extra
>> > > memory during boot. But the leftover space of the 1st dynamic chunk will
>> > > serve later percpu dynamic allocations, so it's not wasted, right?
>> > >
>> > > Not sure if I got your point.
>> > >
>> > >
>> >
>> > Ah, I'm not familiar with all the PAGE_SIZE and word length
>> > combinations.
>> >
>> > The first chunk is smaller in the embedded case with the assumption that
>> > static percpu variables are highly accessed along with the limited
>> > initial allocations. While adding an additional 8KB is not the biggest
>> > deal to the first chunk, this can cause the unit_size for subsequent
>> > chunks to be larger. For example, x86 unit size jumps in powers of 2 due
>> > to alignment and packing against an allocation size of 2MB. So if we're
>> > at say 60KB for the first chunk, subsequent chunks could be 64KB. But
>> > adding 8KB, we'd go from 60KB -> 68KB and a chunk size of 64KB -> 128KB.
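
The 60KB -> 68KB example above can be checked with a small standalone
sketch. It assumes the unit size is picked roughly the way
pcpu_build_alloc_info() does it as far as I recall: round the allocation
up to atom_size, then search downward for a units-per-allocation count
that splits it evenly into page-aligned units. sketch_unit_size() and the
4KB page size used with it are made up for illustration, and any minimum
unit size is ignored, so this is not the kernel code itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel function: returns the per-CPU unit
 * size obtained when units of at least size_sum bytes are packed into an
 * allocation that is a multiple of atom_size.
 * Assumes size_sum > 0 and that atom_size is a multiple of page_size.
 */
size_t sketch_unit_size(size_t size_sum, size_t atom_size, size_t page_size)
{
	/* Round the allocation up to a multiple of atom_size. */
	size_t alloc_size = (size_sum + atom_size - 1) / atom_size * atom_size;
	/* Start from the largest number of units that could fit ... */
	size_t upa = alloc_size / size_sum;

	/* ... and back off until units divide evenly and stay page aligned. */
	while (alloc_size % upa || (alloc_size / upa) % page_size)
		upa--;

	return alloc_size / upa;
}
```

With a 2MB atom and 4KB pages, a 60KB size_sum ends up with 64KB units
and a 68KB size_sum with 128KB units, which matches the numbers quoted
above.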
>>
>> I may have misunderstood the first chunk usage and the percpu
>> code. Below is my personal understanding of the 1st chunk size and
>> how PERCPU_DYNAMIC_EARLY_SIZE could impact it; please point out
>> if I am wrong.
>>
>> ~~~
>> Their definitions are excerpted here for reference.
>> /*
>>  * Percpu allocator can serve percpu allocations before slab is
>>  * initialized which allows slab to depend on the percpu allocator.
>>  * The following parameter decide how much resource to preallocate
>>  * for this. Keep PERCPU_DYNAMIC_RESERVE equal to or larger than
>>               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>  * PERCPU_DYNAMIC_EARLY_SIZE.
>>    ~~~~~~~~~~~~~~~~~~~~~
>>  */
>> #define PERCPU_DYNAMIC_EARLY_SIZE       (12 << 10)
>> ......
>> #if BITS_PER_LONG > 32
>> #define PERCPU_DYNAMIC_RESERVE          (28 << 10)
>> #else
>> #define PERCPU_DYNAMIC_RESERVE          (20 << 10)
>> #endif
>>
>> From the above definitions, we can see that no matter how big
>> PERCPU_DYNAMIC_RESERVE is, it's >= PERCPU_DYNAMIC_EARLY_SIZE as the
>> code comment says. So the max() in pcpu_build_alloc_info() won't impact
>> the embedded 1st chunk at all.
>>
>> So, PERCPU_DYNAMIC_EARLY_SIZE can only impact the page 1st chunk case,
>> namely when calling pcpu_page_first_chunk() to do that. In
>> pcpu_page_first_chunk(), we don't provide dyn_size, so with the help of
>> max(), it will get final dyn_size as PERCPU_DYNAMIC_EARLY_SIZE. This is
>> the only place where PERCPU_DYNAMIC_EARLY_SIZE takes effect on percpu.
>> However, the atom size of the page 1st chunk is PAGE_SIZE, so it doesn't
>> have the issue of the atom size possibly bloating unit_size, e.g. 2M on
>> x86_64. Since pcpu_page_first_chunk() is the fallback of
>> pcpu_embed_first_chunk(), if we decide to provide PERCPU_DYNAMIC_RESERVE
>> as the current value, why would we grudge setting it to the smaller
>> value, 20K, whether it's 32bit or 64bit?
>>
>
> I think I might be overindexing on the out of tree modifications here.
> Currently, I think it's clear how modifying PERCPU_DYNAMIC_RESERVE
> affects the system with the lower bound being dictated by
> PERCPU_DYNAMIC_EARLY_SIZE. If we bump PERCPU_DYNAMIC_EARLY_SIZE, it's
> not inherently obvious you can drop that value lower depending on your
> system config.
>
> Ultimately, it is only a few pages, so is saving it that big of a deal
> today? Likely not, just a bit wasteful to potentially orphan a few extra
> pages unnecessarily.
>
> Let's just fix this now and I can massage this in the future if anything
> comes up. I appreciate you taking the time to have this discussion with
> me.
>
> Vlastimil, can you please pick up this fix.

Sorry, got a bit lost, so do you mean the original unconditional bump, or
the modification with BITS_PER_LONG > 32 (or PAGE_SHIFT > 12)?

> Acked-by: Dennis Zhou
>
> Thanks,
> Dennis
>
>>
>> >
>> > If not `BITS_PER_LONG >32`, we could do `PAGE_SHIFT > 12`.
>> >
>> > Thanks,
>> > Dennis
>> >
>> > > >
>> > > > ---
>> > > > diff --git a/include/linux/percpu.h b/include/linux/percpu.h
>> > > > index f1ec5ad1351c..22ce3271eed2 100644
>> > > > --- a/include/linux/percpu.h
>> > > > +++ b/include/linux/percpu.h
>> > > > @@ -42,7 +42,11 @@
>> > > >   * larger than PERCPU_DYNAMIC_EARLY_SIZE.
>> > > >   */
>> > > >  #define PERCPU_DYNAMIC_EARLY_SLOTS     128
>> > > > +#if BITS_PER_LONG > 32
>> > > > +#define PERCPU_DYNAMIC_EARLY_SIZE      (20 << 10)
>> > > > +#else
>> > > >  #define PERCPU_DYNAMIC_EARLY_SIZE      (12 << 10)
>> > > > +#endif
>> > > >
>> > > >  /*
>> > > >   * PERCPU_DYNAMIC_RESERVE indicates the amount of free area to piggy
>> > > >
>> > >
>> > >
>> >
>>
>>
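
For reference, a minimal sketch of what an unconditional bump to 20K
would mean for the max() discussed in this thread, going only by the
values quoted here. The SKETCH_* names and sketch_dyn_size() are made up
for illustration, and passing 0 for the page first chunk is an assumption
based on Baoquan's description that no dyn_size is provided there.

```c
#include <assert.h>
#include <stddef.h>

/* Values as quoted in this thread; the names are hypothetical. */
#define SKETCH_EARLY_SIZE_OLD	((size_t)12 << 10)	/* current value */
#define SKETCH_EARLY_SIZE_NEW	((size_t)20 << 10)	/* unconditional bump */
#define SKETCH_RESERVE_64BIT	((size_t)28 << 10)	/* BITS_PER_LONG > 32 */
#define SKETCH_RESERVE_32BIT	((size_t)20 << 10)	/* otherwise */

/* Same effect as the max_t() in the quoted pcpu_build_alloc_info() code. */
size_t sketch_dyn_size(size_t dyn_size, size_t early_size)
{
	return dyn_size > early_size ? dyn_size : early_size;
}

/*
 * The rule from the quoted comment: PERCPU_DYNAMIC_RESERVE stays equal to
 * or larger than PERCPU_DYNAMIC_EARLY_SIZE. It still holds after the bump.
 */
_Static_assert(SKETCH_RESERVE_64BIT >= SKETCH_EARLY_SIZE_NEW,
	       "64-bit reserve smaller than early size");
_Static_assert(SKETCH_RESERVE_32BIT >= SKETCH_EARLY_SIZE_NEW,
	       "32-bit reserve smaller than early size");
```

Under these assumptions the embed first chunk keeps its 28K or 20K
dynamic area either way, and only the page first chunk (dyn_size of 0)
grows from 12K to 20K.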