From: Maarten Lankhorst
Date: Mon, 11 Nov 2024 10:28:11 +0100
Subject: Re: [PATCH 1/7] kernel/cgroup: Add "dev" memory accounting cgroup
To: Waiman Long, intel-xe@lists.freedesktop.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Tejun Heo, Zefan Li, Johannes Weiner, Andrew Morton
Cc: Friedrich Vock, cgroups@vger.kernel.org, linux-mm@kvack.org, Maxime Ripard
Message-ID: <2ddaf010-ce31-4957-bc4f-7f4c1bfe0826@linux.intel.com>
In-Reply-To: <813cc1d5-1648-4900-ae56-5405e52926df@redhat.com>
References: <20241023075302.27194-1-maarten.lankhorst@linux.intel.com> <20241023075302.27194-2-maarten.lankhorst@linux.intel.com> <813cc1d5-1648-4900-ae56-5405e52926df@redhat.com>

On 2024-10-23 at 17:26, Waiman Long wrote:
> On 10/23/24 3:52 AM, Maarten Lankhorst wrote:
>> The initial version was based roughly on the rdma and misc cgroup
>> controllers, with a lot of the accounting code borrowed from rdma.
>>
>> The current version is a complete rewrite with page counter; it uses
>> the same min/low/max semantics as the memory cgroup as a result.
>>
>> There's a small mismatch as TTM uses u64, and page_counter long pages.
>> In practice it's not a problem.
>> 32-bits systems don't really come with
>> >=4GB cards and as long as we're consistently wrong with units, it's
>> fine. The device page size may not be in the same units as kernel page
>> size, and each region might also have a different page size (VRAM vs GART
>> for example).
>>
>> The interface is simple:
>> - populate dev_cgroup_try_charge->regions[..] name and size for each active
>>   region, set num_regions accordingly.
>> - Call (dev,drmm)_cgroup_register_device()
>> - Use dev_cgroup_try_charge to check if you can allocate a chunk of memory,
>>   use dev_cgroup__uncharge when freeing it. This may return an error code,
>>   or -EAGAIN when the cgroup limit is reached. In that case a reference
>>   to the limiting pool is returned.
>> - The limiting cs can be used as compare function for
>>   dev_cgroup_state_evict_valuable.
>> - After having evicted enough, drop reference to limiting cs with
>>   dev_cgroup_pool_state_put.
>>
>> This API allows you to limit device resources with cgroups.
>> You can see the supported cards in /sys/fs/cgroup/dev.region.capacity
>> You need to echo +dev to cgroup.subtree_control, and then you can
>> partition memory.
>>
>> Co-developed-by: Friedrich Vock
>> Signed-off-by: Friedrich Vock
>> Co-developed-by: Maxime Ripard
>> Signed-off-by: Maxime Ripard
>> Signed-off-by: Maarten Lankhorst
>> ---
>>   Documentation/admin-guide/cgroup-v2.rst |  51 ++
>>   Documentation/core-api/cgroup.rst       |   9 +
>>   Documentation/core-api/index.rst        |   1 +
>>   Documentation/gpu/drm-compute.rst       |  54 ++
>>   include/linux/cgroup_dev.h              |  91 +++
>>   include/linux/cgroup_subsys.h           |   4 +
>>   include/linux/page_counter.h            |   2 +-
>>   init/Kconfig                            |   7 +
>>   kernel/cgroup/Makefile                  |   1 +
>>   kernel/cgroup/dev.c                     | 893 ++++++++++++++++++++++++
>>   mm/page_counter.c                       |   4 +-
>>   11 files changed, 1114 insertions(+), 3 deletions(-)
>>   create mode 100644 Documentation/core-api/cgroup.rst
>>   create mode 100644 Documentation/gpu/drm-compute.rst
>>   create mode 100644 include/linux/cgroup_dev.h
>>   create mode 100644 kernel/cgroup/dev.c
>
> Just a general comment.
>
> Cgroup v1 has a legacy device controller in security/device_cgroup.c
> which is no longer available in cgroup v2. So if you use the name device
> controller, the documentation must be clear that it is completely
> different and have no relationship from the device controller in cgroup v1.

Hey,

Thanks for noticing. I didn't know there was such a controller. It seems
odd to manage access to opening devnodes through a cgroup controller
rather than a security module.

I'll update the documentation in the next version to make the distinction
clear.

Cheers,
~Maarten