From: Ard Biesheuvel <ardb@kernel.org>
Date: Fri, 13 Sep 2024 17:00:42 +0200
Subject: Re: [PATCH v3 7/8] execmem: add support for cache of large ROX pages
To: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton, Andreas Larsson, Andy Lutomirski, Arnd Bergmann,
	Borislav Petkov, Brian Cain, Catalin Marinas, Christoph Hellwig,
	Christophe Leroy, Dave Hansen, Dinh Nguyen, Geert Uytterhoeven,
	Guo Ren, Helge Deller, Huacai Chen, Ingo Molnar, Johannes Berg,
	John Paul Adrian Glaubitz, Kent Overstreet, "Liam R. Howlett",
	Luis Chamberlain, Mark Rutland, Masami Hiramatsu, Matt Turner,
	Max Filippov, Michael Ellerman, Michal Simek, Oleg Nesterov,
	Palmer Dabbelt, Peter Zijlstra, Richard Weinberger, Russell King,
	Song Liu, Stafford Horne, Steven Rostedt, Thomas Bogendoerfer,
	Thomas Gleixner, Uladzislau Rezki, Vineet Gupta, Will Deacon,
	bpf@vger.kernel.org, linux-alpha@vger.kernel.org,
	linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-csky@vger.kernel.org, linux-hexagon@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
	linux-mips@vger.kernel.org, linux-mm@kvack.org,
	linux-modules@vger.kernel.org, linux-openrisc@vger.kernel.org,
	linux-parisc@vger.kernel.org, linux-riscv@lists.infradead.org,
	linux-sh@vger.kernel.org, linux-snps-arc@lists.infradead.org,
	linux-trace-kernel@vger.kernel.org, linux-um@lists.infradead.org,
	linuxppc-dev@lists.ozlabs.org, loongarch@lists.linux.dev,
	sparclinux@vger.kernel.org, x86@kernel.org
References: <20240909064730.3290724-1-rppt@kernel.org> <20240909064730.3290724-8-rppt@kernel.org>
In-Reply-To: <20240909064730.3290724-8-rppt@kernel.org>

Hi Mike,

On Mon, 9 Sept 2024 at 08:51, Mike Rapoport <rppt@kernel.org> wrote:
>
> From: "Mike Rapoport (Microsoft)"
>
> Using large pages to map text areas reduces iTLB pressure and improves
> performance.
>
> Extend execmem_alloc() with an ability to use huge pages with ROX
> permissions as a cache for smaller allocations.
>
> To populate the cache, a writable large page is allocated from vmalloc with
> VM_ALLOW_HUGE_VMAP, filled with invalid instructions and then remapped as
> ROX.
>
> Portions of that large page are handed out to execmem_alloc() callers
> without any changes to the permissions.
>
> When the memory is freed with execmem_free() it is invalidated again so
> that it won't contain stale instructions.
>
> The cache is enabled when an architecture sets EXECMEM_ROX_CACHE flag in
> definition of an execmem_range.
>
> Signed-off-by: Mike Rapoport (Microsoft)
> ---
>  include/linux/execmem.h |   2 +
>  mm/execmem.c            | 289 +++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 286 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/execmem.h b/include/linux/execmem.h
> index dfdf19f8a5e8..7436aa547818 100644
> --- a/include/linux/execmem.h
> +++ b/include/linux/execmem.h
> @@ -77,12 +77,14 @@ struct execmem_range {
>
>  /**
>   * struct execmem_info - architecture parameters for code allocations
> + * @fill_trapping_insns: set memory to contain instructions that will trap
>   * @ranges: array of parameter sets defining architecture specific
>   * parameters for executable memory allocations. The ranges that are not
>   * explicitly initialized by an architecture use parameters defined for
>   * @EXECMEM_DEFAULT.
>   */
>  struct execmem_info {
> +	void (*fill_trapping_insns)(void *ptr, size_t size, bool writable);
>  	struct execmem_range ranges[EXECMEM_TYPE_MAX];
>  };
>
> diff --git a/mm/execmem.c b/mm/execmem.c
> index 0f6691e9ffe6..f547c1f3c93d 100644
> --- a/mm/execmem.c
> +++ b/mm/execmem.c
> @@ -7,28 +7,88 @@
>   */
>
>  #include <linux/mm.h>
> +#include <linux/mutex.h>
>  #include <linux/vmalloc.h>
>  #include <linux/execmem.h>
> +#include <linux/maple_tree.h>
>  #include <linux/set_memory.h>
>  #include <linux/moduleloader.h>
>
> +#include <asm/tlbflush.h>
> +
> +#include "internal.h"
> +
>  static struct execmem_info *execmem_info __ro_after_init;
>  static struct execmem_info default_execmem_info __ro_after_init;
>
> -static void *__execmem_alloc(struct execmem_range *range, size_t size)
> +#ifdef CONFIG_MMU
> +struct execmem_cache {
> +	struct mutex mutex;
> +	struct maple_tree busy_areas;
> +	struct maple_tree free_areas;
> +};
> +
> +static struct execmem_cache execmem_cache = {
> +	.mutex = __MUTEX_INITIALIZER(execmem_cache.mutex),
> +	.busy_areas = MTREE_INIT_EXT(busy_areas, MT_FLAGS_LOCK_EXTERN,
> +				     execmem_cache.mutex),
> +	.free_areas = MTREE_INIT_EXT(free_areas, MT_FLAGS_LOCK_EXTERN,
> +				     execmem_cache.mutex),
> +};
> +
> +static void execmem_cache_clean(struct work_struct *work)
> +{
> +	struct maple_tree *free_areas = &execmem_cache.free_areas;
> +	struct mutex *mutex = &execmem_cache.mutex;
> +	MA_STATE(mas, free_areas, 0, ULONG_MAX);
> +	void *area;
> +
> +	mutex_lock(mutex);
> +	mas_for_each(&mas, area, ULONG_MAX) {
> +		size_t size;
> +
> +		if (!xa_is_value(area))
> +			continue;
> +
> +		size = xa_to_value(area);
> +
> +		if (IS_ALIGNED(size, PMD_SIZE) &&
> +		    IS_ALIGNED(mas.index, PMD_SIZE)) {
> +			void *ptr = (void *)mas.index;
> +
> +			mas_erase(&mas);
> +			vfree(ptr);
> +		}
> +	}
> +	mutex_unlock(mutex);
> +}
> +
> +static DECLARE_WORK(execmem_cache_clean_work, execmem_cache_clean);
> +
> +static void execmem_fill_trapping_insns(void *ptr, size_t size, bool writable)
> +{
> +	if (execmem_info->fill_trapping_insns)
> +		execmem_info->fill_trapping_insns(ptr, size, writable);
> +	else
> +		memset(ptr, 0, size);

Does this really have to be a function pointer with a runtime check?

This could just be a __weak definition, with the arch providing an
override if the memset() is not appropriate.
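
Something like the below (an untested sketch, only to illustrate the
shape of it; the x86 override shown is just an example reusing the
existing text_poke_set() and INT3_INSN_OPCODE helpers):

/* generic fallback in mm/execmem.c; __weak symbols cannot be static */
void __weak execmem_fill_trapping_insns(void *ptr, size_t size, bool writable)
{
	/* zero fill is only a placeholder default, as in the patch */
	memset(ptr, 0, size);
}

/* arch override, e.g. on x86 (needs <asm/text-patching.h>) */
void execmem_fill_trapping_insns(void *ptr, size_t size, bool writable)
{
	/* fill with INT3 so stale mappings trap instead of executing */
	if (writable)
		memset(ptr, INT3_INSN_OPCODE, size);
	else
		text_poke_set(ptr, INT3_INSN_OPCODE, size);
}

The override is resolved at link time, so that would drop both the
function pointer and the NULL check from execmem entirely.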