From: Uladzislau Rezki
Date: Mon, 30 Jun 2025 18:39:59 +0200
To: Vitaly Wool
Cc: Uladzislau Rezki, linux-mm@kvack.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, Danilo Krummrich, Alice Ryhl, rust-for-linux@vger.kernel.org
Subject: Re: [PATCH v8 1/4] mm/vmalloc: allow to set node and align in vrealloc
References: <20250628102315.2542656-1-vitaly.wool@konsulko.se> <20250628102537.2542789-1-vitaly.wool@konsulko.se> <9E9F1FCB-E4BF-463B-B2C3-833572B3918A@konsulko.se>
In-Reply-To: <9E9F1FCB-E4BF-463B-B2C3-833572B3918A@konsulko.se>

>
> On Jun 30, 2025, at 12:30 PM, Uladzislau Rezki wrote:
>
> On Sat, Jun 28, 2025 at 12:25:37PM +0200, Vitaly Wool wrote:
>
> Reimplement vrealloc() to be able to set node and alignment should
> a user need to do so.
> Rename the function to vrealloc_node_align()
> to better match what it actually does now and introduce macros for
> vrealloc() and friends for backward compatibility.
>
> With that change we also provide the ability for the Rust part of
> the kernel to set node and alignment in its allocations.
>
> Signed-off-by: Vitaly Wool
> ---
> include/linux/vmalloc.h | 12 +++++++++---
> mm/vmalloc.c | 20 ++++++++++++++++----
> 2 files changed, 25 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index fdc9aeb74a44..68791f7cb3ba 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -197,9 +197,15 @@ extern void *__vcalloc_noprof(size_t n, size_t
> size, gfp_t flags) __alloc_size(1
> extern void *vcalloc_noprof(size_t n, size_t size) __alloc_size(1, 2);
> #define vcalloc(...) alloc_hooks(vcalloc_noprof(__VA_ARGS__))
>
> -void * __must_check vrealloc_noprof(const void *p, size_t size, gfp_t
> flags)
> - __realloc_size(2);
> -#define vrealloc(...) alloc_hooks(vrealloc_noprof(__VA_ARGS__))
> +void *__must_check vrealloc_node_align_noprof(const void *p, size_t
> size,
> + unsigned long align, gfp_t flags, int nid) __realloc_size(2);
> +#define vrealloc_node_noprof(_p, _s, _f, _nid) \
> + vrealloc_node_align_noprof(_p, _s, 1, _f, _nid)
> +#define vrealloc_noprof(_p, _s, _f) \
> + vrealloc_node_align_noprof(_p, _s, 1, _f, NUMA_NO_NODE)
> +#define vrealloc_node_align(...) alloc_hooks
> (vrealloc_node_align_noprof(__VA_ARGS__))
> +#define vrealloc_node(...) alloc_hooks(vrealloc_node_noprof
> (__VA_ARGS__))
> +#define vrealloc(...) alloc_hooks(vrealloc_noprof(__VA_ARGS__))
>
> extern void vfree(const void *addr);
> extern void vfree_atomic(const void *addr);
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 6dbcdceecae1..d633ac0ff977 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -4089,12 +4089,15 @@ void *vzalloc_node_noprof(unsigned long size,
> int node)
> EXPORT_SYMBOL(vzalloc_node_noprof);
>
> /**
> - * vrealloc - reallocate virtually contiguous memory; contents remain
> unchanged
> + * vrealloc_node_align_noprof - reallocate virtually contiguous
> memory; contents
> + * remain unchanged
> * @p: object to reallocate memory for
> * @size: the size to reallocate
> + * @align: requested alignment
> * @flags: the flags for the page level allocator
> + * @nid: node id
> *
> - * If @p is %NULL, vrealloc() behaves exactly like vmalloc(). If @size
> is 0 and
> + * If @p is %NULL, vrealloc_XXX() behaves exactly like vmalloc(). If
> @size is 0 and
> * @p is not a %NULL pointer, the object pointed to is freed.
> *
> * If __GFP_ZERO logic is requested, callers must ensure that, starting
> with the
> @@ -4111,7 +4114,8 @@ EXPORT_SYMBOL(vzalloc_node_noprof);
> * Return: pointer to the allocated memory; %NULL if @size is zero or
> in case of
> * failure
> */
> -void *vrealloc_noprof(const void *p, size_t size, gfp_t flags)
> +void *vrealloc_node_align_noprof(const void *p, size_t size, unsigned
> long align,
> + gfp_t flags, int nid)
> {
> struct vm_struct *vm = NULL;
> size_t alloced_size = 0;
> @@ -4135,6 +4139,13 @@ void *vrealloc_noprof(const void *p, size_t
> size, gfp_t flags)
> if (WARN(alloced_size < old_size,
> "vrealloc() has mismatched area vs requested sizes (%p)\n", p))
> return NULL;
> + if (WARN(nid != NUMA_NO_NODE && nid != page_to_nid(vmalloc_to_page
> (p)),
> + "vrealloc() has mismatched nids\n"))
> + return NULL;
> + if (WARN((uintptr_t)p & (align - 1),
> + "will not reallocate with a bigger alignment (0x%lx)\n",
> + align))
> + return NULL;
>
>
> IMO, IS_ALIGNED() should be
> used instead. We have already a macro for this
> purpose, i.e. the idea is just to check that "p" is aligned with "align"
> request.
>
> Can you replace the (uintptr_t) casting to (ulong) or (unsigned long)
> this is how we mostly cast in vmalloc code?
>
>
> Thanks, noted.
>
>
> WARN() probably is worth to replace. Use WARN_ON_ONCE() to prevent
> flooding.
>
>
> I am not sure i totally agree, because:
> a) there’s already one WARN() in that block and I’m just following the pattern
> b) I don’t think this will be a frequent error.
>
Could we just drop assumption (b)? Instead, we simply eliminate the
possibility of flooding, and thus we do not spam the kernel log buffer :)

Also, there is another one:

>
> + if (WARN(nid != NUMA_NO_NODE && nid != page_to_nid(vmalloc_to_page(p)),
> + "vrealloc() has mismatched nids\n"))
> + return NULL;
>
I can easily trigger this, with continuous kernel splats, after adding
vrealloc_alloc_test to the vmalloc test suite:

[   53.517781] ------------[ cut here ]------------
[   53.517787] vrealloc() has mismatched nids
[   53.517817] WARNING: CPU: 46 PID: 2213 at mm/vmalloc.c:4198 vrealloc_node_align_noprof+0x11b/0x230
[   53.517829] Modules linked in: test_vmalloc(E+) binfmt_misc(E) ppdev(E) parport_pc(E) parport(E) bochs(E) snd_pcm(E) sg(E) drm_client_lib(E) snd_timer(E) drm_shmem_helper(E) evdev(E) joydev(E) snd(E) drm_kms_helper(E) vga16fb(E) soundcore(E) serio_raw(E) button(E) pcspkr(E) vgastate(E) drm(E) dm_mod(E) fuse(E) loop(E) configfs(E) efi_pstore(E) qemu_fw_cfg(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) sr_mod(E) cdrom(E) sd_mod(E) ata_generic(E) ata_piix(E) libata(E) i2c_piix4(E) scsi_mod(E) psmouse(E) floppy(E) e1000(E) i2c_smbus(E) scsi_common(E)
[   53.517879] CPU: 46 UID: 0 PID: 2213 Comm: vmalloc_test/10 Kdump: loaded Tainted: G W E 6.16.0-rc1+ #263 PREEMPT(undef)
[   53.517886] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
[   53.517887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[   53.517889] RIP: 0010:vrealloc_node_align_noprof+0x11b/0x230
[   53.517894] Code: 89 4c 24 08 e8 76 b0 ff ff 4c 8b 4c 24 08 48 8b 00 48 c1 e8 36 41 39 c4 0f 84 64 ff ff ff 48 c7 c7 90 c4 28 a2 e8 25 a8 d3 ff <0f> 0b 31 ed eb 95 65 8b 05 f8 cf 90 01 a9 00 ff ff 00 0f 85 dd 00
[   53.517897] RSP: 0018:ffffa6db87f27e08 EFLAGS: 00010282
[   53.517900] RAX: 0000000000000000 RBX: ffffa6db9a315000 RCX: 0000000000000000
[   53.517902] RDX: 0000000000000002 RSI: 0000000000000001 RDI: 00000000ffffffff
[   53.517904] RBP: 000000000000a000 R08: 0000000000000000 R09: 0000000000000003
[   53.517905] R10: ffffa6db87f27ca0 R11: ffff98c5fff0a368 R12: 0000000000000002
[   53.517908] R13: ffff98c201d06a80 R14: 0000000000009000 R15: 0000000000000001
[   53.517912] FS: 0000000000000000(0000) GS:ffff98c24cf17000(0000) knlGS:0000000000000000
[   53.517914] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   53.517916] CR2: 00007fe515c11390 CR3: 000000084bf03000 CR4: 00000000000006f0
[   53.517920] Call Trace:
[   53.517923]  <TASK>
[   53.517928]  ? __pfx_vrealloc_alloc_test+0x10/0x10 [test_vmalloc]
[   53.517937]  vrealloc_alloc_test+0x22/0x60 [test_vmalloc]
[   53.517941]  test_func+0xd5/0x1d0 [test_vmalloc]
[   53.517946]  ? __pfx_test_func+0x10/0x10 [test_vmalloc]
[   53.517949]  kthread+0x109/0x240
[   53.517955]  ? finish_task_switch.isra.0+0x85/0x2a0
[   53.517960]  ? __pfx_kthread+0x10/0x10
[   53.517963]  ? __pfx_kthread+0x10/0x10
[   53.517966]  ret_from_fork+0x87/0xf0
[   53.517971]  ? __pfx_kthread+0x10/0x10
[   53.517974]  ret_from_fork_asm+0x1a/0x30
[   53.517980]  </TASK>
[   53.517981] ---[ end trace 0000000000000000 ]---

Please drop that WARN(). The motivation is that we should serve the
memory request: processes can migrate between NUMA nodes, and they
still have to be able to allocate memory. Moreover, in the current
vrealloc() implementation the memory is fully reallocated on a new NUMA
node in any case, and the old allocation is released after the data is
copied. So it does not matter if the NUMA node has changed.

--
Uladzislau Rezki
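
For illustration only, here is a minimal userspace sketch of the two review suggestions above: an IS_ALIGNED()-style check with an (unsigned long) cast, and a warning that fires only once. It is not kernel code. SKETCH_IS_ALIGNED(), sketch_warn_once(), sketch_align_ok() and sketch_demo() are hypothetical names invented for this sketch; in the kernel the real IS_ALIGNED() and WARN_ON_ONCE() macros would be used instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the kernel's IS_ALIGNED():
 * true when x is a multiple of a, where a must be a power of two.
 */
#define SKETCH_IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* Counts emitted warnings so the demo can check the "once" behaviour. */
static int warnings_printed;

/*
 * Hypothetical warn-once helper: reports the first time cond is true
 * for a given flag, stays silent afterwards, and always returns cond.
 */
static bool sketch_warn_once(bool cond, bool *warned, const char *msg)
{
	if (cond && !*warned) {
		*warned = true;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/* Returns true when p already satisfies the requested alignment. */
static bool sketch_align_ok(const void *p, unsigned long align)
{
	static bool warned;

	/*
	 * (unsigned long) cast as requested in the review; assumes that
	 * unsigned long is as wide as a pointer, as it is on Linux.
	 */
	return !sketch_warn_once(!SKETCH_IS_ALIGNED((unsigned long)p, align),
				 &warned,
				 "will not reallocate with a bigger alignment");
}

/* Three checks, two of them misaligned, but only one warning is printed. */
void sketch_demo(void)
{
	assert(sketch_align_ok((const void *)0x1000UL, 0x1000UL));
	assert(!sketch_align_ok((const void *)0x1008UL, 0x1000UL));
	assert(!sketch_align_ok((const void *)0x2010UL, 0x1000UL));
	assert(warnings_printed == 1);
}
```

The point of the once-only flag is that a caller hitting the misaligned case in a loop still gets NULL back every time, while the log receives a single report rather than one splat per call.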