From: Alexandre Ghiti
Date: Mon, 5 Feb 2024 10:32:28 +0100
Subject: Re: Fwd: [PATCH v8 0/4] riscv: Use PUD/P4D/PGD pages for the linear mapping
To: Nylon Chen
Cc: alex@ghiti.fr, apatel@ventanamicro.com, catalin.marinas@arm.com,
    will@kernel.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
    aou@eecs.berkeley.edu, robh+dt@kernel.org, frowand.list@gmail.com,
    rppt@kernel.org, akpm@linux-foundation.org, anup@brainfault.org,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-riscv@lists.infradead.org, devicetree@vger.kernel.org,
    linux-mm@kvack.org, zong.li@sifive.com, nylon7717@gmail.com
References: <20230316131711.1284451-1-alexghiti@rivosinc.com>
    <20240118082346.GB31078@hsinchu15>

Hi Nylon,

On Fri, Jan 19, 2024 at 10:27 AM Nylon Chen wrote:
>
> Alexandre Ghiti wrote on Thu, Jan 18, 2024 at 9:01 PM:
> >
> > Hi Nylon,
> Hi Alexandre, thanks for your feedback,
> >
> > On Thu, Jan 18, 2024 at 9:23 AM Nylon Chen wrote:
> > >
> > > > On 3/23/23 15:55, Anup Patel wrote:
> > > > > On Thu, Mar 23, 2023 at 6:24 PM Alexandre Ghiti wrote:
> > > > >> Hi Anup,
> > > > >>
> > > > >> On Thu, Mar 23, 2023 at 1:18 PM Anup Patel wrote:
> > > > >>> Hi Alex,
> > > > >>>
> > > > >>> On Thu, Mar 16, 2023 at 6:48 PM Alexandre Ghiti wrote:
> > > > >>>> This patchset intends to improve tlb utilization by using hugepages for
> > > > >>>> the linear mapping.
> > > > >>>>
> > > > >>>> As reported by Anup in v6, when STRICT_KERNEL_RWX is enabled, we must
> > > > >>>> take care of isolating the kernel text and rodata so that they are not
> > > > >>>> mapped with a PUD mapping which would then assign wrong permissions to
> > > > >>>> the whole region: it is achieved by introducing a new memblock API.
> > > > >>>>
> > > > >>>> Another patch makes use of this new API in arm64 which used some sort of
> > > > >>>> hack to solve this issue: it was built/boot tested successfully.
> > > > >>>>
> > > > >>>> base-commit-tag: v6.3-rc1
> > > > >>>>
> > > > >>>> v8:
> > > > >>>> - Fix rv32, as reported by Anup
> > > > >>>> - Do not modify memblock_isolate_range and fixes comment, as suggested by Mike
> > > > >>>> - Use the new memblock API for crash kernel too in arm64, as suggested by Andrew
> > > > >>>> - Fix arm64 double mapping (which to me did not work in v7), but ends up not
> > > > >>>>   being pretty at all, will wait for comments from arm64 reviewers, but
> > > > >>>>   this patch can easily be dropped if they do not want it.
> > > > >>>>
> > > > >>>> v7:
> > > > >>>> - Fix Anup bug report by introducing memblock_isolate_memory which
> > > > >>>>   allows us to split the memblock mappings and then avoid to map
> > > > >>>>   the PUD which contains the kernel as read only
> > > > >>>> - Add a patch to arm64 to use this newly introduced API
> > > > >>>>
> > > > >>>> v6:
> > > > >>>> - quiet LLVM warning by casting phys_ram_base into an unsigned long
> > > > >>>>
> > > > >>>> v5:
> > > > >>>> - Fix nommu builds by getting rid of riscv_pfn_base in patch 1, thanks
> > > > >>>>   Conor
> > > > >>>> - Add RB from Andrew
> > > > >>>>
> > > > >>>> v4:
> > > > >>>> - Rebase on top of v6.2-rc3, as noted by Conor
> > > > >>>> - Add Acked-by Rob
> > > > >>>>
> > > > >>>> v3:
> > > > >>>> - Change the comment about initrd_start VA conversion so that it fits
> > > > >>>>   ARM64 and RISCV64 (and others in the future if needed), as suggested
> > > > >>>>   by Rob
> > > > >>>>
> > > > >>>> v2:
> > > > >>>> - Add a comment on why RISCV64 does not need to set initrd_start/end that
> > > > >>>>   early in the boot process, as asked by Rob
> > > > >>>>
> > > > >>>> Alexandre Ghiti (4):
> > > > >>>>   riscv: Get rid of riscv_pfn_base variable
> > > > >>>>   mm: Introduce memblock_isolate_memory
> > > > >>>>   arm64: Make use of memblock_isolate_memory for the linear mapping
> > > > >>>>   riscv: Use PUD/P4D/PGD pages for the linear mapping
> > > > >>> Kernel boot fine on RV64 but there is a failure which is still not
> > > > >>> addressed. You can see this failure as following message in
> > > > >>> kernel boot log:
> > > > >>> [ 0.000000] Failed to add a System RAM resource at 80200000
> > > > >> Hmmm I don't get that in any of my test configs, would you mind
> > > > >> sharing yours and your qemu command line?
> > > > > Try alexghiti_test branch at
> > > > > https://github.com/avpatel/linux.git
> > > > >
> > > > > I am building the kernel using defconfig and my rootfs is
> > > > > based on busybox.
> > > > >
> > > > > My QEMU command is:
> > > > > qemu-system-riscv64 -M virt -m 512M -nographic -bios
> > > > > opensbi/build/platform/generic/firmware/fw_dynamic.bin -kernel
> > > > > ./build-riscv64/arch/riscv/boot/Image -append "root=/dev/ram rw
> > > > > console=ttyS0 earlycon" -initrd ./rootfs_riscv64.img -smp 4
> > > >
> > > >
> > > > So splitting memblock.memory is the culprit, it "confuses" the resources
> > > > addition and I can only find hacky ways to fix that...
> > > Hi Alexandre,
> > >
> > > We encountered the same error as Anup. After adding your patch
> > > (3335068f87217ea59d08f462187dc856652eea15), we will not encounter the
> > > error again.
> > >
> > > What I have observed so far is
> > >
> > > - before your patch
> > > When merging consecutive memblocks, if the memblock types are different,
> > > they will be merged into reserved
> > > - after your patch
> > > When consecutive memblocks are merged, if the memblock types are
> > > different, they will be merged into memory.
> > >
> > > Such a result will cause the memory location of OpenSBI to be changed
> > > from reserved to memory. Will this have any side effects?
> >
> > I guess it will end up in the memory pool and pages from openSBI
> > region will be allocated, so we should see very quickly bad stuff
> > happening (either PMP violation or M-mode ecall never
> > returning/trapping/etc).
> >
> > But I don't observe the same thing, I always see the openSBI region
> > being reserved:
> >
> > reserved[0x0] [0x0000000080000000-0x000000008007ffff],
> > 0x0000000000080000 bytes flags: 0x0
> >
> > Can you elaborate a bit more about "When consecutive memblocks are
> > merged, if the memblock types are different, they will be merged into
> > memory"? Where/when does this merge happen? Can you give me a config
> > file and a kernel revision so that I can take a look?
> Ok, If you want to reproduce the same results you just need to modify OpenSBI
>
> [ lib/sbi/sbi_domain.c ]
> +#define TEST_SIZE 0x200000
>
> -                                 (scratch->fw_size - scratch->fw_rw_offset),
> +                                 (TEST_SIZE - scratch->fw_rw_offset),
>
> In addition, you can insert checks in the kernel merged function
> [ mm/memblock.c ]
> static void __init_memblock memblock_merge_regions(struct memblock_type *type)
>         while (i < type->cnt - 1) {
>                 ...
>                 /* move forward from next + 1, index of which is i + 2 */
>                 memmove(next, next + 1, (type->cnt - (i + 2)) * sizeof(*next));
>                 type->cnt--;
>         }
> +       pr_info("Merged memblock_type: cnt = %lu, max = %lu, total_size = 0x%llx\n",type->cnt, type->max, type->total_size);
> +       for (i = 0; i < type->cnt; i++) {
> +               const char *region_type = memblock_is_memory(type->regions[i].base) ? "memory" : "reserve";
> +               pr_info("Region %d: base = 0x%llx, size = 0x%llx, type = %s\n", i, type->regions[i].base, type->regions[i].size, region_type);
> +       }
> }
> This is kernel boot log
> - before your patch
> ...
> [ 0.000000] OF: fdt: Reserving memory: base = 0x80000000, size = 0x200000
> [ 0.000000] Merged memblock_type: cnt = 4, max = 128, total_size = 0x1628501
> [ 0.000000] Region 0: base = 0x80000000, size = 0x1600000, type = reserve
> ...
>
> - after your patch
> ...
> [ 0.000000] OF: fdt: Reserving memory: base = 0x80000000, size = 0x200000
> [ 0.000000] Merged memblock_type: cnt = 4, max = 128, total_size = 0x180c42e
> [ 0.000000] Region 0: base = 0x80000000, size = 0x1800000, type = memory

So the openSBI region is reported as memory rather than reserved
because it is now described as nomap: memblock_mark_nomap() does not
move the region into the reserved memblock list, but keeps it in the
memory list with the nomap flag
(https://elixir.bootlin.com/linux/latest/source/drivers/of/fdt.c#L479).
But as stated in the description of memblock_mark_nomap()
(https://elixir.bootlin.com/linux/latest/source/mm/memblock.c#L969),
the pages associated with the region will be marked as PageReserved
and the region will not be covered in the linear mapping.

So to me, this is normal and we are safe. Let me know if I made a mistake.

And sorry for the long delay, that slipped my mind!

Thanks,

Alex

> ...
> [ 0.000000] Failed to add a system RAM resource at 80200000
> ...
> >
> > Thanks,
> >
> > Alex
> >
> > > >
> > > > So given that the arm64 patch with the new API is not pretty and that
> > > > the simplest solution is to re-merge the memblock regions afterwards
> > > > (which is done by memblock_clear_nomap), I'll drop the new API and the
> > > > arm64 patch to use the nomap API like arm64: I'll take advantage of that
> > > > to clean setup_vm_final which I have wanted to do for a long time.
> > > >
> > > > @Mike Thanks for your reviews!
> > > >
> > > > @Anup Thanks for all your bug reports on this patchset, I have to
> > > > improve my test flow (it is in the work :)).
> > > >
> > > >
> > > > > Regards,
> > > > > Anup
> > > > >
> > > > >> Thanks
> > > > >>
> > > > >>> Regards,
> > > > >>> Anup
> > > > >>>
> > > > >>>>  arch/arm64/mm/mmu.c           | 25 +++++++++++------
> > > > >>>>  arch/riscv/include/asm/page.h | 19 +++++++++++--
> > > > >>>>  arch/riscv/mm/init.c          | 53 ++++++++++++++++++++++++++++-------
> > > > >>>>  arch/riscv/mm/physaddr.c      | 16 +++++++++++
> > > > >>>>  drivers/of/fdt.c              | 11 ++++----
> > > > >>>>  include/linux/memblock.h      |  1 +
> > > > >>>>  mm/memblock.c                 | 20 +++++++++++++
> > > > >>>>  7 files changed, 119 insertions(+), 26 deletions(-)
> > > > >>>>
> > > > >>>> --
> > > > >>>> 2.37.2
> > > > >>>>
> > > > > _______________________________________________
> > > > > linux-riscv mailing list
> > > > > linux-riscv@lists.infradead.org
> > > > > http://lists.infradead.org/mailman/listinfo/linux-riscv
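
For readers following the nomap discussion in the reply above, here is a
minimal user-space sketch in C of why a nomap region can show up as
"memory" in the quoted debug output while still being kept out of the
linear mapping. Everything prefixed with toy_ is hypothetical and only
models the behaviour described in this thread; the kernel helpers it
imitates are memblock_is_memory() and memblock_is_map_memory() in
mm/memblock.c. The RAM size is an assumption based on the 512M QEMU
command quoted earlier, and the firmware range is taken from the quoted
boot log.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for memblock types; all toy_ names are hypothetical. */
enum toy_flags {
	TOY_NONE  = 0x0,
	TOY_NOMAP = 0x4,	/* same value as the kernel's MEMBLOCK_NOMAP */
};

struct toy_region {
	uint64_t base;
	uint64_t size;
	unsigned int flags;
};

struct toy_type {
	const struct toy_region *regions;
	size_t cnt;
};

/* Return the region that contains addr, or NULL if there is none. */
const struct toy_region *toy_find(const struct toy_type *type, uint64_t addr)
{
	for (size_t i = 0; i < type->cnt; i++) {
		const struct toy_region *r = &type->regions[i];

		if (addr >= r->base && addr - r->base < r->size)
			return r;
	}
	return NULL;
}

/* Modelled on memblock_is_memory(): any hit in the memory list counts. */
bool toy_is_memory(const struct toy_type *memory, uint64_t addr)
{
	return toy_find(memory, addr) != NULL;
}

/* Modelled on memblock_is_map_memory(): the nomap flag must be clear. */
bool toy_is_map_memory(const struct toy_type *memory, uint64_t addr)
{
	const struct toy_region *r = toy_find(memory, addr);

	return r != NULL && !(r->flags & TOY_NOMAP);
}

/*
 * Assumed layout: the firmware range from the boot log stays in the
 * memory list but carries the nomap flag; the rest of an assumed 512M
 * of RAM is ordinary memory.
 */
const struct toy_region toy_memory_regions[] = {
	{ 0x80000000ULL, 0x200000ULL,   TOY_NOMAP },
	{ 0x80200000ULL, 0x1fe00000ULL, TOY_NONE  },
};

const struct toy_type toy_memory = { toy_memory_regions, 2 };
```

In this model, toy_is_memory(&toy_memory, 0x80000000) is true, which
matches the "type = memory" line printed by the quoted debug check, while
toy_is_map_memory() for the same address is false. A check that wants to
tell firmware memory apart from usable RAM would therefore need to look
at the nomap flag rather than at list membership alone.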