Date: Tue, 6 Aug 2024 10:19:56 +0200
From: Maxime Ripard
To: Maarten Lankhorst
Cc: intel-xe@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, Tejun Heo, Zefan Li, Johannes Weiner,
	Andrew Morton, Jonathan Corbet, David Airlie, Daniel Vetter,
	Thomas Zimmermann, Friedrich Vock, cgroups@vger.kernel.org,
	linux-mm@kvack.org, linux-doc@vger.kernel.org
Subject: Re: [RFC PATCH 2/6] drm/cgroup: Add memory accounting DRM cgroup
Message-ID: <20240806-poetic-awesome-impala-fb6c2f@houat>
References: <20240627154754.74828-1-maarten.lankhorst@linux.intel.com>
	<20240627154754.74828-3-maarten.lankhorst@linux.intel.com>
	<20240627-paper-vicugna-of-fantasy-c549ed@houat>
	<6cb7c074-55cb-4825-9f80-5cf07bbd6745@linux.intel.com>
	<20240628-romantic-emerald-snake-7b26ca@houat>
	<70289c58-7947-4347-8600-658821a730b0@linux.intel.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512;
	protocol="application/pgp-signature"; boundary="bqepzngcvumclfgj"
Content-Disposition: inline
In-Reply-To: <70289c58-7947-4347-8600-658821a730b0@linux.intel.com>

--bqepzngcvumclfgj
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi Maarten,

On Mon, Jul 01, 2024 at 11:25:12AM GMT, Maarten Lankhorst wrote:
> On 2024-06-28 at 16:04, Maxime Ripard wrote:
> > On Thu, Jun 27, 2024 at 09:22:56PM GMT, Maarten Lankhorst wrote:
> >> On 2024-06-27 at 19:16, Maxime Ripard wrote:
> >>> Hi,
> >>>
> >>> Thanks for working on this!
> >>>
> >>> On Thu, Jun 27, 2024 at 05:47:21PM GMT, Maarten Lankhorst wrote:
> >>>> The initial version was based roughly on the rdma and misc cgroup
> >>>> controllers, with a lot of the accounting code borrowed from rdma.
> >>>>
> >>>> The current version is a complete rewrite with page counter; it uses
> >>>> the same min/low/max semantics as the memory cgroup as a result.
> >>>>
> >>>> There's a small mismatch as TTM uses u64, and page_counter long pages.
> >>>> In practice it's not a problem. 32-bit systems don't really come with
> >>>> >=4GB cards and as long as we're consistently wrong with units, it's
> >>>> fine. The device page size may not be in the same units as kernel page
> >>>> size, and each region might also have a different page size (VRAM vs GART
> >>>> for example).
> >>>>
> >>>> The interface is simple:
> >>>> - populate drmcgroup_device->regions[..] name and size for each active
> >>>>   region, set num_regions accordingly.
> >>>> - Call drm(m)cg_register_device()
> >>>> - Use drmcg_try_charge to check if you can allocate a chunk of memory,
> >>>>   use drmcg_uncharge when freeing it. This may return an error code,
> >>>>   or -EAGAIN when the cgroup limit is reached. In that case a reference
> >>>>   to the limiting pool is returned.
> >>>> - The limiting cs can be used as compare function for
> >>>>   drmcs_evict_valuable.
> >>>> - After having evicted enough, drop reference to limiting cs with
> >>>>   drmcs_pool_put.
> >>>>
> >>>> This API allows you to limit device resources with cgroups.
> >>>> You can see the supported cards in /sys/fs/cgroup/drm.capacity
> >>>> You need to echo +drm to cgroup.subtree_control, and then you can
> >>>> partition memory.
> >>>>
> >>>> Signed-off-by: Maarten Lankhorst
> >>>> Co-developed-by: Friedrich Vock
> >>> I'm sorry, I should have written minutes on the discussion we had with TJ
> >>> and Tvrtko the other day.
> >>>
> >>> We're all very interested in making this happen, but doing a "DRM"
> >>> cgroup doesn't look like the right path to us.
> >>>
> >>> Indeed, we have a significant number of drivers that won't have a
> >>> dedicated memory but will depend on DMA allocations one way or the
> >>> other, and those pools are shared between multiple frameworks (DRM,
> >>> V4L2, DMA-Buf Heaps, at least).
> >>>
> >>> This was also pointed out by Sima some time ago here:
> >>> https://lore.kernel.org/amd-gfx/YCVOl8%2F87bqRSQei@phenom.ffwll.local/
> >>>
> >>> So we'll want that cgroup subsystem to be cross-framework. We settled on
> >>> a "device" cgroup during the discussion, but I'm sure we'll have plenty
> >>> of bikeshedding.
> >>>
> >>> The other thing we agreed on, based on the feedback TJ got on the last
> >>> iterations of his series was to go for memcg for drivers not using DMA
> >>> allocations.
> >>>
> >>> It's the part where I expect some discussion there too :)
> >>>
> >>> So we went back to a previous version of TJ's work, and I've started to
> >>> work on:
> >>>
> >>>   - Integration of the cgroup in the GEM DMA and GEM VRAM helpers (this
> >>>     works on tidss right now)
> >>>
> >>>   - Integration of all heaps into that cgroup but the system one
> >>>     (working on this at the moment)
> >>
> >> Should be similar to what I have then. I think you could use my work to
> >> continue it.
> >>
> >> I made nothing DRM specific except the name, if you renamed it the device
> >> resource management cgroup and changed the init function signature to take a
> >> name instead of a drm pointer, nothing would change. This is exactly what
> >> I'm hoping to accomplish, including reserving memory.
> >
> > I've started to work on rebasing my current work onto your series today,
> > and I'm not entirely sure how what I described would best fit. Let's
> > assume we have two KMS devices, one using shmem, one using DMA
> > allocations, two heaps, one using the page allocator, the other using
> > CMA, and one v4l2 device using DMA allocations.
> >
> > So we would have one KMS device and one heap using the page allocator,
> > and one KMS device, one heap, and one v4l2 driver using the DMA
> > allocator.
> >
> > Would these make different cgroup devices, or different cgroup regions?
>
> Each driver would register a device, whatever feels most logical for
> that device I suppose.
>
> My guess is that a prefix would also be nice here, so register a
> device with name of drm/$name or v4l2/$name, heap/$name. I didn't give
> it much thought and we're still experimenting, so just try something.
> :)
>
> There's no limit to the number of devices, I only fixed the number of pools to
> match TTM, but even that could be increased arbitrarily. I just don't
> think there is a point in doing so.

Sorry, it took a while, but I implemented what (I think) we all had in
mind here:
https://github.com/mripard/linux/tree/device-cgroups-maarten

It's rebased on top of 6.11, and with plenty of fixups to (hopefully :D)
make your life easier.

Let me know what you think,
Maxime

--bqepzngcvumclfgj
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZrHcqwAKCRDj7w1vZxhR
xXnkAQCmJVnMNyKrRw+63aZYJyNabEjZdrabgiVeTtigB3aDVAEAm0JgXIztS1rY
4d633eaGD7BBVCPnEawqpsLem2SalQc=
=PvlN
-----END PGP SIGNATURE-----

--bqepzngcvumclfgj--