From mboxrd@z Thu Jan  1 00:00:00 1970
From: Huacai Chen <chenhuacai@kernel.org>
Date: Wed, 16 Oct 2024 15:30:34 +0800
Subject: Re: [PATCH v2 2/3] LoongArch: Add barrier between set_pte and memory access
To: maobibo
Cc: Andrey Ryabinin, Andrew Morton, David Hildenbrand, Barry Song, loongarch@lists.linux.dev, linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com, linux-mm@kvack.org
References: <20241014035855.1119220-1-maobibo@loongson.cn> <20241014035855.1119220-3-maobibo@loongson.cn> <1b4070c9-921e-65e3-c2a7-dab486d4f17f@loongson.cn>
In-Reply-To: <1b4070c9-921e-65e3-c2a7-dab486d4f17f@loongson.cn>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

On Wed, Oct 16, 2024 at 2:09 PM maobibo wrote:
>
>
>
> On 2024/10/15 8:27 PM, Huacai Chen wrote:
> > On Tue, Oct 15, 2024 at 10:54 AM maobibo wrote:
> >>
> >>
> >>
> >> On 2024/10/14 2:31 PM, Huacai Chen wrote:
> >>> Hi, Bibo,
> >>>
> >>> On Mon, Oct 14, 2024 at 11:59 AM Bibo Mao wrote:
> >>>>
> >>>> It is possible to return a spurious fault if memory is accessed
> >>>> right after the pte is set. For user address space, the pte is set
> >>>> in kernel space and the memory is accessed in user space, so there
> >>>> is a long time for synchronization and no barrier is needed.
> >>>> However, for kernel address space, it is possible that memory is
> >>>> accessed right after the pte is set.
> >>>>
> >>>> Here flush_cache_vmap/flush_cache_vmap_early is used for
> >>>> synchronization.
> >>>>
> >>>> Signed-off-by: Bibo Mao
> >>>> ---
> >>>>  arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
> >>>>  1 file changed, 13 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
> >>>> index f8754d08a31a..53be231319ef 100644
> >>>> --- a/arch/loongarch/include/asm/cacheflush.h
> >>>> +++ b/arch/loongarch/include/asm/cacheflush.h
> >>>> @@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
> >>>>  #define flush_cache_dup_mm(mm)                          do { } while (0)
> >>>>  #define flush_cache_range(vma, start, end)              do { } while (0)
> >>>>  #define flush_cache_page(vma, vmaddr, pfn)              do { } while (0)
> >>>> -#define flush_cache_vmap(start, end)                    do { } while (0)
> >>>>  #define flush_cache_vunmap(start, end)                  do { } while (0)
> >>>>  #define flush_icache_user_page(vma, page, addr, len)    do { } while (0)
> >>>>  #define flush_dcache_mmap_lock(mapping)                 do { } while (0)
> >>>>  #define flush_dcache_mmap_unlock(mapping)               do { } while (0)
> >>>>
> >>>> +/*
> >>>> + * It is possible for a kernel virtual mapping access to return a spurious
> >>>> + * fault if it's accessed right after the pte is set. The page fault handler
> >>>> + * does not expect this type of fault. flush_cache_vmap is not exactly the
> >>>> + * right place to put this, but it seems to work well enough.
> >>>> + */
> >>>> +static inline void flush_cache_vmap(unsigned long start, unsigned long end)
> >>>> +{
> >>>> +       smp_mb();
> >>>> +}
> >>>> +#define flush_cache_vmap flush_cache_vmap
> >>>> +#define flush_cache_vmap_early flush_cache_vmap
> >>> From the history of flush_cache_vmap_early(), it seems only archs with
> >>> "virtual cache" (VIVT or VIPT) need this API, so LoongArch can be a
> >>> no-op here.
> > OK, flush_cache_vmap_early() also needs smp_mb().
> >
> >> Here is the usage of flush_cache_vmap_early in linux/mm/percpu.c:
> >> map the page and access it immediately. Do you think it should be a
> >> no-op on LoongArch?
> >>
> >>         rc = __pcpu_map_pages(unit_addr, &pages[unit * unit_pages],
> >>                               unit_pages);
> >>         if (rc < 0)
> >>                 panic("failed to map percpu area, err=%d\n", rc);
> >>         flush_cache_vmap_early(unit_addr, unit_addr + ai->unit_size);
> >>         /* copy static data */
> >>         memcpy((void *)unit_addr, __per_cpu_load, ai->static_size);
> >> }
> >>
> >>
> >>> And I still think flush_cache_vunmap() should be a smp_mb(). A
> >>> smp_mb() in flush_cache_vmap() prevents subsequent accesses from being
> >>> reordered before set_pte(), and a smp_mb() in flush_cache_vunmap()
> >> smp_mb() in flush_cache_vmap() does not prevent reordering. It is to
> >> flush the pipeline and let the page table walker HW sync with the data
> >> cache.
> >>
> >> For the following example:
> >>         rb = vmap(pages, nr_meta_pages + 2 * nr_data_pages,
> >>                   VM_MAP | VM_USERMAP, PAGE_KERNEL);
> >>         if (rb) {
> >> <<<<<<<<<<< * The statement "if (rb)" can prevent reordering. Otherwise,
> >> with any API kmalloc/vmap/vmalloc and a subsequent memory access, there
> >> would be a reordering issue. *
> >>                 kmemleak_not_leak(pages);
> >>                 rb->pages = pages;
> >>                 rb->nr_pages = nr_pages;
> >>                 return rb;
> >>         }
> >>
> >>> prevents preceding accesses from being reordered after pte_clear(). This
> >> Can you give an example of such usage of flush_cache_vunmap()? And
> >> then we can continue to talk about it, else it is just guessing.
> > Since we cannot reach a consensus, and the flush_cache_* API looks very
> > strange for this purpose (yes, I know PowerPC does it like this, but
> > ARM64 doesn't), I prefer to still use the ARM64 method, which means
> > adding a dbar in set_pte(). Of course the performance will be a little
> > worse, but still better than the old version, and it is more robust.
> >
> > I know you are very busy, so if you have no time you don't need to
> > send V3, I can just do a small modification on the 3rd patch.
> No, I will send V3 by myself. And I will drop this patch from the
> patchset, since by actual test vmalloc_test works well even without this
> patch on a dual-way 3C5000; also the weak function kernel_pte_init will
> be replaced with an inline function rebased on
>
> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-define-general-function-pxd_init.patch
This patch is in Andrew's mm-unstable branch. As far as I know,
mm-unstable is for next (6.13) and mm-stable is for current (6.12). But
this series is a bugfix, so it is for current (6.12).

>
> I dislike the copy-paste method without further understanding :(.
> Although I also copy and paste code, at least I try my best to
> understand it.
I dislike it too. But in order to get this series into 6.12, it is better
to keep the copy-paste, and then update the refactoring patch to V2 for
Andrew (rebase and drop is normal for mm-unstable).

Huacai

>
> Regards
> Bibo Mao
>
>
> > Huacai
> >
> >>
> >> Regards
> >> Bibo Mao
> >>> potential problem may not be seen from experiment, but it is needed in
> >>> theory.
> >>>
> >>> Huacai
> >>>
> >>>> +
> >>>>  #define cache_op(op, addr)                                      \
> >>>>         __asm__ __volatile__(                                   \
> >>>>         "       cacop   %0, %1                          \n"     \
> >>>> --
> >>>> 2.39.3
> >>>>
> >>>>
> >>
> >>
>