From: Yosry Ahmed
Date: Tue, 28 Nov 2023 15:03:30 -0800
Subject: Re: [PATCH 00/16] IOMMU memory observability
To: Pasha Tatashin
Cc: akpm@linux-foundation.org, alex.williamson@redhat.com,
 alim.akhtar@samsung.com, alyssa@rosenzweig.io, asahi@lists.linux.dev,
 baolu.lu@linux.intel.com, bhelgaas@google.com, cgroups@vger.kernel.org,
 corbet@lwn.net, david@redhat.com, dwmw2@infradead.org, hannes@cmpxchg.org,
 heiko@sntech.de, iommu@lists.linux.dev, jasowang@redhat.com,
 jernej.skrabec@gmail.com, jgg@ziepe.ca, jonathanh@nvidia.com,
 joro@8bytes.org, kevin.tian@intel.com, krzysztof.kozlowski@linaro.org,
 kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-rockchip@lists.infradead.org, linux-samsung-soc@vger.kernel.org,
 linux-sunxi@lists.linux.dev, linux-tegra@vger.kernel.org,
 lizefan.x@bytedance.com, marcan@marcan.st, mhiramat@kernel.org,
 mst@redhat.com, m.szyprowski@samsung.com, netdev@vger.kernel.org,
 paulmck@kernel.org, rdunlap@infradead.org, robin.murphy@arm.com,
 samuel@sholland.org, suravee.suthikulpanit@amd.com, sven@svenpeter.dev,
 thierry.reding@gmail.com, tj@kernel.org, tomas.mudrunka@gmail.com,
 vdumpa@nvidia.com, virtualization@lists.linux.dev, wens@csie.org,
 will@kernel.org, yu-cheng.yu@intel.com
References: <20231128204938.1453583-1-pasha.tatashin@soleen.com>

On Tue, Nov 28, 2023 at 2:32 PM Pasha Tatashin wrote:
>
> On Tue, Nov 28, 2023 at 4:34 PM Yosry Ahmed wrote:
> >
> > On Tue, Nov 28, 2023 at 12:49 PM Pasha Tatashin
> > wrote:
> > >
> > > From: Pasha Tatashin
> > >
> > > IOMMU subsystem may contain state that is in gigabytes. Majority of that
> > > state is iommu page tables. Yet, there is currently, no way to observe
> > > how much memory is actually used by the iommu subsystem.
> > >
> > > This patch series solves this problem by adding both observability to
> > > all pages that are allocated by IOMMU, and also accountability, so
> > > admins can limit the amount of it via cgroups.
> > >
> > > The system-wide observability is using /proc/meminfo:
> > > SecPageTables: 438176 kB
> > >
> > > Contains IOMMU and KVM memory.
> > >
> > > Per-node observability:
> > > /sys/devices/system/node/nodeN/meminfo
> > > Node N SecPageTables: 422204 kB
> > >
> > > Contains IOMMU and KVM memory in the given NUMA node.
> > >
> > > Per-node IOMMU only observability:
> > > /sys/devices/system/node/nodeN/vmstat
> > > nr_iommu_pages 105555
> > >
> > > Contains number of pages IOMMU allocated in the given node.
> >
> > Does it make sense to have a KVM-only entry there as well?
> >
> > In that case, if SecPageTables in /proc/meminfo is found to be
> > suspiciously high, it should be easy to tell which component is
> > contributing most usage through vmstat. I understand that users can do
> > the subtraction, but we wouldn't want userspace depending on that, in
> > case a third class of "secondary" page tables emerges that we want to
> > add to SecPageTables. The in-kernel implementation can do the
> > subtraction for now if it makes sense though.
>
> Hi Yosry,
>
> Yes, another counter for KVM could be added.
> On the other hand KVM
> only can be computed by subtracting one from another as there are only
> two types of secondary page tables, KVM and IOMMU:
>
> /sys/devices/system/node/node0/meminfo
> Node 0 SecPageTables: 422204 kB
>
> /sys/devices/system/node/nodeN/vmstat
> nr_iommu_pages 105555
>
> KVM only = SecPageTables - nr_iommu_pages * PAGE_SIZE / 1024
>

Right, but as I mentioned above, if userspace starts depending on this
equation, we won't be able to add any more classes of "secondary" page
tables to SecPageTables. I'd like to avoid that if possible. We can do
the subtraction in the kernel.
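
To make the concern concrete, here is a rough sketch of what a userspace
tool relying on that equation might look like. It is only an illustration:
the function names are hypothetical, nr_iommu_pages is the counter proposed
in this series, and the 4096-byte default page size is an assumption rather
than something read from the system.

```python
import re


def parse_sec_page_tables_kb(meminfo_text):
    """Return the SecPageTables value (in kB) from per-node meminfo text."""
    match = re.search(r"SecPageTables:\s+(\d+)\s+kB", meminfo_text)
    if match is None:
        raise ValueError("SecPageTables not found")
    return int(match.group(1))


def parse_nr_iommu_pages(vmstat_text):
    """Return nr_iommu_pages (a page count) from per-node vmstat text."""
    for line in vmstat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "nr_iommu_pages":
            return int(fields[1])
    raise ValueError("nr_iommu_pages not found")


def kvm_only_kb(sec_page_tables_kb, nr_iommu_pages, page_size=4096):
    """KVM only = SecPageTables - nr_iommu_pages * PAGE_SIZE / 1024.

    This hard-codes the assumption that KVM and IOMMU are the only two
    classes of secondary page tables. If a third class were ever folded
    into SecPageTables, this result would silently include it.
    """
    iommu_kb = nr_iommu_pages * page_size // 1024
    return sec_page_tables_kb - iommu_kb
```

The two inputs come from separate files read at different times, so the
result is not an atomic snapshot. With the sample figures quoted above and
4 KiB pages, 105555 pages is 422220 kB, which exceeds the 422204 kB
SecPageTables sample, so the subtraction comes out at -16 kB. Presumably
the two samples were not taken at the same instant or the page size
differs, but either way a dedicated in-kernel counter would avoid both
problems.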