From: Peng Zhang
Date: Wed, 12 Jul 2023 16:28:32 +0800
Subject: Re: [PATCH] mm: kfence: allocate kfence_metadata at runtime
To: Marco Elver
Cc: glider@google.com, dvyukov@google.com, akpm@linux-foundation.org, kasan-dev@googlegroups.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, muchun.song@linux.dev, Peng Zhang
Message-ID: <2a16a76c-506c-f325-6792-4fb58e8da531@bytedance.com>
References: <20230710032714.26200-1-zhangpeng.00@bytedance.com>

On 2023/7/10 18:19, Marco Elver wrote:
> On Mon, 10 Jul 2023 at 05:27, 'Peng Zhang' via kasan-dev wrote:
>>
>> kfence_metadata is currently a static array.
>> For the purpose of
>> allocating scalable __kfence_pool, we first change it to runtime
>> allocation of metadata. Since the size of an object of kfence_metadata
>> is 1160 bytes, we can save at least 72 pages (with default 256 objects)
>> without enabling kfence.
>>
>> Below is the numbers obtained in qemu (with default 256 objects).
>> before: Memory: 8134692K/8388080K available (3668K bss)
>> after: Memory: 8136740K/8388080K available (1620K bss)
>> More than expected, it saves 2MB memory.
>>
>> Signed-off-by: Peng Zhang
>
> Seems like a reasonable optimization, but see comments below.
>
> Also with this patch applied on top of v6.5-rc1, KFENCE just doesn't
> init at all anymore (early init). Please fix.

Sorry about that. I made a small change just before sending the patch and
did not retest it, which is why it no longer worked. v2 [1] fixes some of
the issues you mentioned.

[1] https://lore.kernel.org/lkml/20230712081616.45177-1-zhangpeng.00@bytedance.com/

>
>> ---
>>  mm/kfence/core.c   | 102 ++++++++++++++++++++++++++++++++-------------
>>  mm/kfence/kfence.h |   5 ++-
>>  2 files changed, 78 insertions(+), 29 deletions(-)
>>
>> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
>> index dad3c0eb70a0..b9fec1c46e3d 100644
>> --- a/mm/kfence/core.c
>> +++ b/mm/kfence/core.c
>> @@ -116,7 +116,7 @@ EXPORT_SYMBOL(__kfence_pool); /* Export for test modules. */
>>   * backing pages (in __kfence_pool).
>>   */
>>  static_assert(CONFIG_KFENCE_NUM_OBJECTS > 0);
>> -struct kfence_metadata kfence_metadata[CONFIG_KFENCE_NUM_OBJECTS];
>> +struct kfence_metadata *kfence_metadata;
>>
>>  /* Freelist with available objects. */
>>  static struct list_head kfence_freelist = LIST_HEAD_INIT(kfence_freelist);
>> @@ -643,13 +643,56 @@ static unsigned long kfence_init_pool(void)
>>  	return addr;
>>  }
>>
>> +static int kfence_alloc_metadata(void)
>> +{
>> +	unsigned long nr_pages = KFENCE_METADATA_SIZE / PAGE_SIZE;
>> +
>> +#ifdef CONFIG_CONTIG_ALLOC
>> +	struct page *pages;
>> +
>> +	pages = alloc_contig_pages(nr_pages, GFP_KERNEL, first_online_node,
>> +				   NULL);
>> +	if (pages)
>> +		kfence_metadata = page_to_virt(pages);
>> +#else
>> +	if (nr_pages > MAX_ORDER_NR_PAGES) {
>> +		pr_warn("KFENCE_NUM_OBJECTS too large for buddy allocator\n");
>
> Does this mean that KFENCE won't work at all if we can't allocate the
> metadata? I.e. it won't work either in early nor late init modes?
>
> I know we already have this limitation for _late init_ of the KFENCE pool.
>
> So I have one major question: when doing _early init_, what is the
> maximum size of the KFENCE pool (#objects) with this change?

With this patch the buddy system limits the metadata to 2^10 pages, i.e.
(2^10 * PAGE_SIZE) / sizeof(struct kfence_metadata) objects, so v2
allocates kfence_metadata with memblock instead.

>
>> +		return -EINVAL;
>> +	}
>> +	kfence_metadata = alloc_pages_exact(KFENCE_METADATA_SIZE,
>> +					    GFP_KERNEL);
>> +#endif
>> +
>> +	if (!kfence_metadata)
>> +		return -ENOMEM;
>> +
>> +	memset(kfence_metadata, 0, KFENCE_METADATA_SIZE);
>
> memzero_explicit, or pass __GFP_ZERO to alloc_pages?

Unfortunately, __GFP_ZERO does not work with alloc_contig_pages(), so I
used memzero_explicit() in v2, although I am not sure memzero_explicit()
is necessary (it only adds a barrier on top of memset()).
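
For reference, a minimal userspace sketch of the difference raised above.
It assumes GCC or Clang (the inline asm barrier is a compiler extension),
and the sketch_* names are hypothetical stand-ins, not kernel APIs. The
barrier only matters when the compiler could prove the zeroing store is
dead; a buffer that is read again later through a global pointer, as
kfence_metadata is, is zeroed equally well by plain memset().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-in for the kernel's memzero_explicit(): memset() followed
 * by a compiler barrier that makes the buffer look "used", so the store
 * cannot be dropped as dead. The asm statement is a GCC/Clang extension.
 */
static void sketch_memzero_explicit(void *buf, size_t len)
{
	memset(buf, 0, len);
	__asm__ __volatile__("" : : "r"(buf) : "memory");
}

/* Hypothetical helper: returns 1 if every byte in buf is zero, else 0. */
static int sketch_all_zero(const unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (buf[i] != 0)
			return 0;
	return 1;
}
```

Both variants leave the buffer zeroed when it is inspected afterwards; the
explicit form is mainly useful for wiping secrets right before a buffer
goes out of scope.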
>
>> +	return 0;
>> +}
>> +
>> +static void kfence_free_metadata(void)
>> +{
>> +	if (WARN_ON(!kfence_metadata))
>> +		return;
>> +#ifdef CONFIG_CONTIG_ALLOC
>> +	free_contig_range(page_to_pfn(virt_to_page((void *)kfence_metadata)),
>> +			  KFENCE_METADATA_SIZE / PAGE_SIZE);
>> +#else
>> +	free_pages_exact((void *)kfence_metadata, KFENCE_METADATA_SIZE);
>> +#endif
>> +	kfence_metadata = NULL;
>> +}
>> +
>>  static bool __init kfence_init_pool_early(void)
>>  {
>> -	unsigned long addr;
>> +	unsigned long addr = (unsigned long)__kfence_pool;
>>
>>  	if (!__kfence_pool)
>>  		return false;
>>
>> +	if (!kfence_alloc_metadata())
>> +		goto free_pool;
>> +
>>  	addr = kfence_init_pool();
>>
>>  	if (!addr) {
>> @@ -663,6 +706,7 @@ static bool __init kfence_init_pool_early(void)
>>  		return true;
>>  	}
>>
>> +	kfence_free_metadata();
>>  	/*
>>  	 * Only release unprotected pages, and do not try to go back and change
>>  	 * page attributes due to risk of failing to do so as well. If changing
>> @@ -670,31 +714,12 @@ static bool __init kfence_init_pool_early(void)
>>  	 * fails for the first page, and therefore expect addr==__kfence_pool in
>>  	 * most failure cases.
>>  	 */
>> +free_pool:
>>  	memblock_free_late(__pa(addr), KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool));
>>  	__kfence_pool = NULL;
>>  	return false;
>>  }
>>
>> -static bool kfence_init_pool_late(void)
>> -{
>> -	unsigned long addr, free_size;
>> -
>> -	addr = kfence_init_pool();
>> -
>> -	if (!addr)
>> -		return true;
>> -
>> -	/* Same as above. */
>> -	free_size = KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool);
>> -#ifdef CONFIG_CONTIG_ALLOC
>> -	free_contig_range(page_to_pfn(virt_to_page((void *)addr)), free_size / PAGE_SIZE);
>> -#else
>> -	free_pages_exact((void *)addr, free_size);
>> -#endif
>> -	__kfence_pool = NULL;
>> -	return false;
>> -}
>> -
>>  /* === DebugFS Interface ==================================================== */
>>
>>  static int stats_show(struct seq_file *seq, void *v)
>> @@ -896,6 +921,10 @@ void __init kfence_init(void)
>>  static int kfence_init_late(void)
>>  {
>>  	const unsigned long nr_pages = KFENCE_POOL_SIZE / PAGE_SIZE;
>> +	unsigned long addr = (unsigned long)__kfence_pool;
>> +	unsigned long free_size = KFENCE_POOL_SIZE;
>> +	int ret;
>> +
>>  #ifdef CONFIG_CONTIG_ALLOC
>>  	struct page *pages;
>>
>> @@ -913,15 +942,29 @@ static int kfence_init_late(void)
>>  		return -ENOMEM;
>>  #endif
>>
>> -	if (!kfence_init_pool_late()) {
>> -		pr_err("%s failed\n", __func__);
>> -		return -EBUSY;
>> +	ret = kfence_alloc_metadata();
>> +	if (!ret)
>> +		goto free_pool;
>> +
>> +	addr = kfence_init_pool();
>> +	if (!addr) {
>> +		kfence_init_enable();
>> +		kfence_debugfs_init();
>> +		return 0;
>>  	}
>>
>> -	kfence_init_enable();
>> -	kfence_debugfs_init();
>> +	pr_err("%s failed\n", __func__);
>> +	kfence_free_metadata();
>> +	free_size = KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool);
>> +	ret = -EBUSY;
>>
>> -	return 0;
>> +free_pool:
>> +#ifdef CONFIG_CONTIG_ALLOC
>> +	free_contig_range(page_to_pfn(virt_to_page((void *)addr)), free_size / PAGE_SIZE);
>> +#else
>> +	free_pages_exact((void *)addr, free_size);
>> +#endif
>
> You moved this from kfence_init_pool_late - that did "__kfence_pool =
> NULL" which is missing now.

Thanks for spotting this, I added it in v2.
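
The cleanup path discussed above can be illustrated with a small userspace
sketch. It is only an approximation under stated assumptions: malloc(),
calloc() and free() stand in for the kernel allocators, and every sketch_*
name and the fail_meta knob are hypothetical. It shows the 0 / -errno
return convention, where a failure is detected with `if (ret)`, and the
reset of the global pointer after freeing. The quoted v1 hunks branch on
`!ret` and `!kfence_alloc_metadata()`, which under this convention is the
success case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for __kfence_pool and kfence_metadata. */
static void *sketch_pool;
static void *sketch_metadata;

/* Returns 0 on success and -ENOMEM on failure (kernel errno convention). */
static int sketch_alloc_metadata(size_t size)
{
	sketch_metadata = calloc(1, size);
	return sketch_metadata ? 0 : -ENOMEM;
}

/*
 * Late-init style setup: allocate the pool, then the metadata. fail_meta is
 * a test knob that forces the metadata step to fail without allocating.
 */
static int sketch_init_late(size_t pool_size, size_t meta_size, int fail_meta)
{
	int ret;

	sketch_pool = malloc(pool_size);
	if (!sketch_pool)
		return -ENOMEM;

	ret = fail_meta ? -ENOMEM : sketch_alloc_metadata(meta_size);
	if (ret)	/* non-zero means failure */
		goto free_pool;

	return 0;

free_pool:
	free(sketch_pool);
	sketch_pool = NULL;	/* do not leave a dangling pointer behind */
	return ret;
}
```

After a failed call the pool pointer is NULL again, so a later retry or a
`if (!sketch_pool)` check behaves as if initialization had never started.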