From: Rob Herring
Date: Thu, 14 Dec 2023 12:55:14 -0600
Subject: Re: [PATCH RFC v2 11/27] arm64: mte: Reserve tag storage memory
To: Alexandru Elisei
Cc: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev,
 maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com,
 yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org,
 mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
 vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org,
 bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com,
 mhiramat@kernel.org, rppt@kernel.org, hughd@google.com, pcc@google.com,
 steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com,
 david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org,
 linux-arch@vger.kernel.org, linux-mm@kvack.org,
 linux-trace-kernel@vger.kernel.org
References: <20231119165721.9849-12-alexandru.elisei@arm.com>

On Thu, Dec 14, 2023 at 9:45 AM Alexandru Elisei wrote:
>
> Hi,
>
> On Wed, Dec 13, 2023 at 02:30:42PM -0600, Rob Herring wrote:
> > On Wed, Dec 13, 2023 at 11:44 AM Alexandru Elisei wrote:
> > >
> > > On Wed, Dec 13, 2023 at 11:22:17AM -0600, Rob Herring wrote:
> > > > On Wed, Dec 13, 2023 at 8:51 AM Alexandru Elisei wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > On Wed, Dec 13, 2023 at 08:06:44AM -0600, Rob Herring wrote:
> > > > > > On Wed, Dec 13, 2023 at 7:05 AM
> > > > > > Alexandru Elisei wrote:
> > > > > > >
> > > > > > > Hi Rob,
> > > > > > >
> > > > > > > On Tue, Dec 12, 2023 at 12:44:06PM -0600, Rob Herring wrote:
> > > > > > > > On Tue, Dec 12, 2023 at 10:38 AM Alexandru Elisei wrote:
> > > > > > > > >
> > > > > > > > > Hi Rob,
> > > > > > > > >
> > > > > > > > > Thank you so much for the feedback, I'm not very familiar with device tree,
> > > > > > > > > and any comments are very useful.
> > > > > > > > >
> > > > > > > > > On Mon, Dec 11, 2023 at 11:29:40AM -0600, Rob Herring wrote:
> > > > > > > > > > On Sun, Nov 19, 2023 at 10:59 AM Alexandru Elisei wrote:
> > > > > > > > > > >
> > > > > > > > > > > Allow the kernel to get the size and location of the MTE tag storage
> > > > > > > > > > > regions from the DTB. This memory is marked as reserved for now.
> > > > > > > > > > >
> > > > > > > > > > > The DTB node for the tag storage region is defined as:
> > > > > > > > > > >
> > > > > > > > > > >         tags0: tag-storage@8f8000000 {
> > > > > > > > > > >                 compatible = "arm,mte-tag-storage";
> > > > > > > > > > >                 reg = <0x08 0xf8000000 0x00 0x4000000>;
> > > > > > > > > > >                 block-size = <0x1000>;
> > > > > > > > > > >                 memory = <&memory0>;    // Associated tagged memory node
> > > > > > > > > > >         };
> > > > > > > > > >
> > > > > > > > > > I skimmed thru the discussion some. If this memory range is within
> > > > > > > > > > main RAM, then it definitely belongs in /reserved-memory.
> > > > > > > > >
> > > > > > > > > Ok, will do that.
> > > > > > > > >
> > > > > > > > > If you don't mind, why do you say that it definitely belongs in
> > > > > > > > > reserved-memory? I'm not trying to argue otherwise, I'm curious about the
> > > > > > > > > motivation.
> > > > > > > >
> > > > > > > > Simply so that /memory nodes describe all possible memory and
> > > > > > > > /reserved-memory is just adding restrictions.
> > > > > > > > It's also because
> > > > > > > > /reserved-memory is what gets handled early, and we don't need
> > > > > > > > multiple things to handle early.
> > > > > > > >
> > > > > > > > > Tag storage is not DMA and can live anywhere in memory.
> > > > > > > >
> > > > > > > > Then why put it in DT at all? The only reason CMA is there is to set
> > > > > > > > the size. It's not even clear to me we need CMA in DT either. The
> > > > > > > > reasoning long ago was the kernel didn't do a good job of moving and
> > > > > > > > reclaiming contiguous space, but that's supposed to be better now (and
> > > > > > > > most h/w figured out they need IOMMUs).
> > > > > > > >
> > > > > > > > But for tag storage you know the size as it is a function of the
> > > > > > > > memory size, right? After all, you are validating the size is correct.
> > > > > > > > I guess there is still the aspect of whether you want to enable MTE or
> > > > > > > > not which could be done in a variety of ways.
> > > > > > >
> > > > > > > Oh, sorry, my bad, I should have been clearer about this. I don't want to
> > > > > > > put it in the DT as a "linux,cma" node. But I want it to be managed by CMA.
> > > > > >
> > > > > > Yes, I understand, but my point remains. Why do you need this in DT?
> > > > > > If the location doesn't matter and you can calculate the size from the
> > > > > > memory size, what else is there to add to the DT?
> > > > >
> > > > > I am afraid there has been a misunderstanding. What do you mean by
> > > > > "location doesn't matter"?
> > > >
> > > > You said:
> > > > > Tag storage is not DMA and can live anywhere in memory.
> > > >
> > > > Which I took as the kernel can figure out where to put it. But maybe
> > > > you meant the h/w platform can hard code it to be anywhere in memory?
> > > > If so, then yes, DT is needed.
> > >
> > > Ah, I see, sorry for not being clear enough, you are correct: tag storage
> > > is a hardware property, and software needs a mechanism (in this case, the
> > > dt) to discover its properties.
> > >
> > > >
> > > > > At the very least, Linux needs to know the address and size of a memory
> > > > > region to use it. The series is about using the tag storage memory for
> > > > > data. Tag storage cannot be described as a regular memory node because it
> > > > > cannot be tagged (and normal memory can).
> > > >
> > > > If the tag storage lives in the middle of memory, then it would be
> > > > described in the memory node, but removed by being in reserved-memory
> > > > node.
> > >
> > > I don't follow. Would you mind going into more details?
> >
> > It goes back to what I said earlier about /memory nodes describing all
> > the memory. There's no reason to reserve memory if you haven't
> > described that range as memory to begin with. One could presumably
> > just have a memory node for each contiguous chunk and not need
> > /reserved-memory (ignoring the need to say what things are reserved
> > for). That would become very difficult to adjust. Note that the kernel
> > has a hardcoded limit of 64 reserved regions currently and that is not
> > enough for some people. Seems like a lot, but I have no idea how they
> > are (ab)using /reserved-memory.
>
> Ah, I see what you mean, reserved memory is about marking existing memory
> (from a /memory node) as special, not about adding new memory.
>
> After the memblock allocator is initialized, the kernel can use it for its
> own allocations. Kernel allocations are not movable.
>
> When a page is allocated as tagged, the associated tag storage cannot be
> used for data, otherwise the tags would corrupt that data. To avoid this,
> the requirement is that tag storage pages are only used for movable
> allocations.
> When a page is allocated as tagged, the data in the associated
> tag storage is migrated and the tag storage is taken from the page
> allocator (via alloc_contig_range()).
>
> My understanding is that the memblock allocator can use all the memory from
> a /memory node. If the tag storage memory is declared in a /memory node,
> there exists the possibility that Linux will use tag storage memory for its
> own allocation, which would make that tag storage memory unmovable, and
> thus unusable for storing tags.

No, because the tag storage would be reserved in /reserved-memory. Of
course, the arch code could do something between scanning /memory
nodes and /reserved-memory, but that would be broken arch code.
Ideally, there wouldn't be any arch code in between those 2 points,
but it's complicated. It used to mainly be powerpc, but we keep adding
to the complexity on arm64.

> Looking at early_init_dt_scan_memory(), even if a /memory node is marked as
> hotpluggable, memblock will still use it, unless "movable_node" is set on
> the kernel command line.
>
> That's the reason why I'm not describing tag storage in a /memory node. Is
> there a way to tell the memblock allocator not to use memory from a /memory
> node?
> >
> > Let me give an example. Presumably using MTE at all is configurable.
> > If you boot a kernel with MTE disabled (or older and not supporting
> > it), then I'd assume you'd want to use the tag storage for regular
> > memory. Well, if tag storage is already part of /memory, then all you
> > have to do is ignore the tag reserved-memory region. Tweaking the
> > memory nodes would be more work.
>
> Right now, memory is added via memblock_reserve(), and if MTE is disabled
> (for example, via the kernel command line), the code calls
> free_reserved_page() for each tag storage page. I find that straightforward
> to implement.

But better to just not reserve the region in the first place. Also, it
needs to be simple enough to back port.
Also, does free_reserved_page() work on ranges outside of memblock
range (e.g. beyond end_of_DRAM())? If the tag storage happened to live
at the end of DRAM and you shorten the /memory node size to remove tag
storage, is it still going to work?

Rob