Message-ID: <70289c58-7947-4347-8600-658821a730b0@linux.intel.com>
Date: Mon, 1 Jul 2024 11:25:12 +0200
Subject: Re: [RFC PATCH 2/6] drm/cgroup: Add memory accounting DRM cgroup
To: Maxime Ripard
Cc: intel-xe@lists.freedesktop.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, Tejun Heo, Zefan Li, Johannes Weiner,
 Andrew Morton, Jonathan Corbet, David Airlie, Daniel Vetter,
 Thomas Zimmermann, Friedrich Vock, cgroups@vger.kernel.org,
 linux-mm@kvack.org, linux-doc@vger.kernel.org
References: <20240627154754.74828-1-maarten.lankhorst@linux.intel.com>
 <20240627154754.74828-3-maarten.lankhorst@linux.intel.com>
 <20240627-paper-vicugna-of-fantasy-c549ed@houat>
 <6cb7c074-55cb-4825-9f80-5cf07bbd6745@linux.intel.com>
 <20240628-romantic-emerald-snake-7b26ca@houat>
From: Maarten Lankhorst
In-Reply-To: <20240628-romantic-emerald-snake-7b26ca@houat>

On 2024-06-28 at 16:04, Maxime Ripard wrote:
> Hi,
>
> On Thu, Jun 27, 2024 at 09:22:56PM GMT, Maarten Lankhorst wrote:
>> On 2024-06-27 at 19:16, Maxime Ripard wrote:
>>> Hi,
>>>
>>> Thanks for working on this!
>>>
>>> On Thu, Jun 27, 2024 at 05:47:21PM GMT, Maarten Lankhorst wrote:
>>>> The initial version was based roughly on the rdma and misc cgroup
>>>> controllers, with a lot of the accounting code borrowed from rdma.
>>>>
>>>> The current version is a complete rewrite with page counter; it uses
>>>> the same min/low/max semantics as the memory cgroup as a result.
>>>>
>>>> There's a small mismatch as TTM uses u64, and page_counter long pages.
>>>> In practice it's not a problem. 32-bits systems don't really come with
>>>> >=4GB cards and as long as we're consistently wrong with units, it's
>>>> fine. The device page size may not be in the same units as kernel page
>>>> size, and each region might also have a different page size (VRAM vs GART
>>>> for example).
>>>>
>>>> The interface is simple:
>>>> - populate drmcgroup_device->regions[..] name and size for each active
>>>>   region, set num_regions accordingly.
>>>> - Call drm(m)cg_register_device()
>>>> - Use drmcg_try_charge to check if you can allocate a chunk of memory,
>>>>   use drmcg_uncharge when freeing it. This may return an error code,
>>>>   or -EAGAIN when the cgroup limit is reached. In that case a reference
>>>>   to the limiting pool is returned.
>>>> - The limiting cs can be used as compare function for
>>>>   drmcs_evict_valuable.
>>>> - After having evicted enough, drop reference to limiting cs with
>>>>   drmcs_pool_put.
>>>>
>>>> This API allows you to limit device resources with cgroups.
>>>> You can see the supported cards in /sys/fs/cgroup/drm.capacity
>>>> You need to echo +drm to cgroup.subtree_control, and then you can
>>>> partition memory.
>>>>
>>>> Signed-off-by: Maarten Lankhorst
>>>> Co-developed-by: Friedrich Vock

>>> I'm sorry, I should have wrote minutes on the discussion we had with TJ
>>> and Tvrtko the other day.
>>>
>>> We're all very interested in making this happen, but doing a "DRM"
>>> cgroup doesn't look like the right path to us.
>>>
>>> Indeed, we have a significant number of drivers that won't have a
>>> dedicated memory but will depend on DMA allocations one way or the
>>> other, and those pools are shared between multiple frameworks (DRM,
>>> V4L2, DMA-Buf Heaps, at least).
>>>
>>> This was also pointed out by Sima some time ago here:
>>> https://lore.kernel.org/amd-gfx/YCVOl8%2F87bqRSQei@phenom.ffwll.local/
>>>
>>> So we'll want that cgroup subsystem to be cross-framework. We settled on
>>> a "device" cgroup during the discussion, but I'm sure we'll have plenty
>>> of bikeshedding.
>>>
>>> The other thing we agreed on, based on the feedback TJ got on the last
>>> iterations of his series was to go for memcg for drivers not using DMA
>>> allocations.
>>>
>>> It's the part where I expect some discussion there too :)
>>>
>>> So we went back to a previous version of TJ's work, and I've started to
>>> work on:
>>>
>>> - Integration of the cgroup in the GEM DMA and GEM VRAM helpers (this
>>>   works on tidss right now)
>>>
>>> - Integration of all heaps into that cgroup but the system one
>>>   (working on this at the moment)
>>
>> Should be similar to what I have then. I think you could use my work to
>> continue it.
>>
>> I made nothing DRM specific except the name, if you renamed it the device
>> resource management cgroup and changed the init function signature to take a
>> name instead of a drm pointer, nothing would change. This is exactly what
>> I'm hoping to accomplish, including reserving memory.
>
> I've started to work on rebasing my current work onto your series today,
> and I'm not entirely sure how what I described would best fit. Let's
> assume we have two KMS device, one using shmem, one using DMA
> allocations, two heaps, one using the page allocator, the other using
> CMA, and one v4l2 device using dma allocations.
>
> So we would have one KMS device and one heap using the page allocator,
> and one KMS device, one heap, and one v4l2 driver using the DMA
> allocator.
>
> Would these make different cgroup devices, or different cgroup regions?

Each driver would register its own device, in whatever way feels most
logical for that device, I suppose.
My guess is that a prefix would also be nice here, so register a device
with a name like drm/$name, v4l2/$name or heap/$name. I haven't given it
much thought and we're still experimenting, so just try something. :)

There's no limit on the number of devices. I only fixed the number of
pools to match TTM, but even that could be increased arbitrarily; I just
don't think there is a point in doing so.

>> The nice thing is that it should be similar to the memory cgroup controller
>> in semantics, so you would have the same memory behavior whether you use the
>> device cgroup or memory cgroup.
>>
>> I'm sad I missed the discussion, but hopefully we can coordinate more now
>> that we know we're both working on it. :)
>
> Yeah, definitely :)
>
> Maxime

Cheers,
~Maarten
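
To make the prefix idea above (drm/$name, v4l2/$name, heap/$name) a bit
more concrete, here is a minimal userspace sketch in C. It is only an
illustration under assumptions: devcg_format_name() is a hypothetical
helper that does not exist in the posted series, and the device names
used in the note below ("card0", "cma") are made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the posted series: build a
 * "<subsystem>/<name>" string such as "drm/card0" or "heap/cma".
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int devcg_format_name(char *buf, size_t size,
			     const char *subsystem, const char *name)
{
	int len;

	assert(buf && subsystem && name);

	/* snprintf returns the length it would have written, sans NUL. */
	len = snprintf(buf, size, "%s/%s", subsystem, name);
	if (len < 0 || (size_t)len >= size)
		return -1;	/* encoding error or truncated name */
	return 0;
}
```

Calling devcg_format_name(buf, sizeof(buf), "drm", "card0") with a large
enough buffer leaves "drm/card0" in buf. A buffer that is too small gives
-1 rather than a silently truncated name, which seems the safer choice
if the name ends up identifying a device in a cgroup file.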