Re: [PATCH v4 01/10] iommu/vt-d: add wrapper functions for page allocations

From: Robin Murphy <robin.murphy@arm.com>
To: Pasha Tatashin <pasha.tatashin@soleen.com>,
	akpm@linux-foundation.org, alim.akhtar@samsung.com,
	alyssa@rosenzweig.io, asahi@lists.linux.dev,
	baolu.lu@linux.intel.com, bhelgaas@google.com,
	cgroups@vger.kernel.org, corbet@lwn.net, david@redhat.com,
	dwmw2@infradead.org, hannes@cmpxchg.org, heiko@sntech.de,
	iommu@lists.linux.dev, jernej.skrabec@gmail.com,
	jonathanh@nvidia.com, joro@8bytes.org,
	krzysztof.kozlowski@linaro.org, linux-doc@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-rockchip@lists.infradead.org,
	linux-samsung-soc@vger.kernel.org, linux-sunxi@lists.linux.dev,
	linux-tegra@vger.kernel.org, lizefan.x@bytedance.com,
	marcan@marcan.st, mhiramat@kernel.org, m.szyprowski@samsung.com,
	paulmck@kernel.org, rdunlap@infradead.org, samuel@sholland.org,
	suravee.suthikulpanit@amd.com, sven@svenpeter.dev,
	thierry.reding@gmail.com, tj@kernel.org,
	tomas.mudrunka@gmail.com, vdumpa@nvidia.com, wens@csie.org,
	will@kernel.org, yu-cheng.yu@intel.com, rientjes@google.com,
	bagasdotme@gmail.com, mkoutny@suse.com
Subject: Re: [PATCH v4 01/10] iommu/vt-d: add wrapper functions for page allocations
Date: Fri, 9 Feb 2024 13:44:20 +0000
Message-ID: <8ce2cd7b-7702-45aa-b4c8-25a01c27ed83@arm.com>
In-Reply-To: <20240207174102.1486130-2-pasha.tatashin@soleen.com>

On 2024-02-07 5:40 pm, Pasha Tatashin wrote:
[...]
> diff --git a/drivers/iommu/iommu-pages.h b/drivers/iommu/iommu-pages.h
> new file mode 100644
> index 000000000000..c412d0aaa399
> --- /dev/null
> +++ b/drivers/iommu/iommu-pages.h
> @@ -0,0 +1,204 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2024, Google LLC.
> + * Pasha Tatashin <pasha.tatashin@soleen.com>
> + */
> +
> +#ifndef __IOMMU_PAGES_H
> +#define __IOMMU_PAGES_H
> +
> +#include <linux/vmstat.h>
> +#include <linux/gfp.h>
> +#include <linux/mm.h>
> +
> +/*
> + * All page allocation that are performed in the IOMMU subsystem must use one of

"All page allocations" is too broad; as before, this is only about 
pagetable allocations, or I guess for the full nuance, allocations of 
pagetables and other per-iommu_domain configuration structures which are 
reasonable to report as "pagetables" to userspace.

> + * the functions below.  This is necessary for the proper accounting as IOMMU
> + * state can be rather large, i.e. multiple gigabytes in size.
> + */
> +
> +/**
> + * __iommu_alloc_pages_node - allocate a zeroed page of a given order from
> + * specific NUMA node.
> + * @nid: memory NUMA node id
> + * @gfp: buddy allocator flags
> + * @order: page order
> + *
> + * returns the head struct page of the allocated page.
> + */
> +static inline struct page *__iommu_alloc_pages_node(int nid, gfp_t gfp,
> +						    int order)
> +{
> +	struct page *page;
> +
> +	page = alloc_pages_node(nid, gfp | __GFP_ZERO, order);
> +	if (unlikely(!page))
> +		return NULL;
> +
> +	return page;
> +}

All 3 invocations of this only use the returned struct page to trivially 
derive page_address(), so we really don't need it; just clean up these 
callsites a bit more.
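
To illustrate the point above, here is a minimal userspace sketch of what 
such a callsite cleanup might look like. Everything prefixed mock_ and both 
callsite_* functions are hypothetical stand-ins invented for this example 
(the struct page is modelled as an opaque handle over a calloc() buffer); 
none of it is the actual kernel or VT-d code.

```c
#include <stdlib.h>

#define MOCK_PAGE_SIZE 4096UL

/* Opaque stand-in for struct page; here it simply aliases the buffer. */
struct mock_page;

/* Stand-in for alloc_pages_node(): returns zeroed memory or NULL. */
static struct mock_page *mock_alloc_pages_node(int nid, int order)
{
	(void)nid;	/* NUMA placement is not modelled in this sketch */
	return calloc(1UL << order, MOCK_PAGE_SIZE);
}

/* Stand-in for page_address(): identity mapping in this mock. */
static void *mock_page_address(struct mock_page *page)
{
	return page;
}

/* Before: the callsite only wants the address, yet handles the page. */
static void *callsite_before(int nid, int order)
{
	struct mock_page *page = mock_alloc_pages_node(nid, order);

	if (!page)
		return NULL;

	return mock_page_address(page);
}

/* Wrapper that already returns a virtual address. */
static void *mock_iommu_alloc_pages_node(int nid, int order)
{
	struct mock_page *page = mock_alloc_pages_node(nid, order);

	return page ? mock_page_address(page) : NULL;
}

/* After: no struct page in sight at the callsite. */
static void *callsite_after(int nid, int order)
{
	return mock_iommu_alloc_pages_node(nid, order);
}
```

Both callsites hand back the same kind of pointer; the second just no 
longer needs a page-returning helper to exist.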

> +
> +/**
> + * __iommu_alloc_pages - allocate a zeroed page of a given order.
> + * @gfp: buddy allocator flags
> + * @order: page order
> + *
> + * returns the head struct page of the allocated page.
> + */
> +static inline struct page *__iommu_alloc_pages(gfp_t gfp, int order)
> +{
> +	struct page *page;
> +
> +	page = alloc_pages(gfp | __GFP_ZERO, order);
> +	if (unlikely(!page))
> +		return NULL;
> +
> +	return page;
> +}

Same for the single invocation of this one.

> +
> +/**
> + * __iommu_alloc_page_node - allocate a zeroed page at specific NUMA node.
> + * @nid: memory NUMA node id
> + * @gfp: buddy allocator flags
> + *
> + * returns the struct page of the allocated page.
> + */
> +static inline struct page *__iommu_alloc_page_node(int nid, gfp_t gfp)
> +{
> +	return __iommu_alloc_pages_node(nid, gfp, 0);
> +}

There are no users of this at all.

> +
> +/**
> + * __iommu_alloc_page - allocate a zeroed page
> + * @gfp: buddy allocator flags
> + *
> + * returns the struct page of the allocated page.
> + */
> +static inline struct page *__iommu_alloc_page(gfp_t gfp)
> +{
> +	return __iommu_alloc_pages(gfp, 0);
> +}
> +
> +/**
> + * __iommu_free_pages - free page of a given order
> + * @page: head struct page of the page
> + * @order: page order
> + */
> +static inline void __iommu_free_pages(struct page *page, int order)
> +{
> +	if (!page)
> +		return;
> +
> +	__free_pages(page, order);
> +}
> +
> +/**
> + * __iommu_free_page - free page
> + * @page: struct page of the page
> + */
> +static inline void __iommu_free_page(struct page *page)
> +{
> +	__iommu_free_pages(page, 0);
> +}

Beyond one more trivial Intel cleanup for __iommu_alloc_pages(), these 3 
are then only used by tegra-smmu, so honestly I'd be inclined to just 
open-code the page_address()/virt_to_page() conversions there as 
appropriate (once again I think the whole thing could in fact be 
refactored to not use struct pages at all, because all it's ever 
ultimately doing with them is page_address(), but that would be a bigger 
job so definitely out-of-scope for this series).

> +
> +/**
> + * iommu_alloc_pages_node - allocate a zeroed page of a given order from
> + * specific NUMA node.
> + * @nid: memory NUMA node id
> + * @gfp: buddy allocator flags
> + * @order: page order
> + *
> + * returns the virtual address of the allocated page
> + */
> +static inline void *iommu_alloc_pages_node(int nid, gfp_t gfp, int order)
> +{
> +	struct page *page = __iommu_alloc_pages_node(nid, gfp, order);
> +
> +	if (unlikely(!page))
> +		return NULL;

As a general point I'd prefer to fold these checks into the accounting 
function itself rather than repeat them all over.
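
One possible reading of that suggestion, sketched in userspace C. The 
accounting helpers, the counter and every mock_ name are assumptions made 
up for illustration (the real accounting only arrives later in the series 
and may look quite different); the point is only that the NULL checks live 
in one place on each path.

```c
#include <stdlib.h>

#define MOCK_PAGE_SIZE 4096UL

/* Opaque stand-in for struct page; it aliases the buffer in this mock. */
struct mock_page;

/* Hypothetical counter standing in for whatever stat gets updated. */
static long mock_nr_accounted;

/* Stand-in for alloc_pages(); can be told to fail for demonstration. */
static struct mock_page *mock_alloc_pages(int order, int simulate_failure)
{
	if (simulate_failure)
		return NULL;

	return calloc(1UL << order, MOCK_PAGE_SIZE);
}

/*
 * Hypothetical accounting helper: the only place that checks for a
 * failed allocation. Returns the virtual address, or NULL.
 */
static void *mock_account_alloc(struct mock_page *page, int order)
{
	if (!page)
		return NULL;

	mock_nr_accounted += 1L << order;
	return page;	/* page_address() is the identity in this mock */
}

/* Hypothetical counterpart for the free path; tolerates NULL. */
static struct mock_page *mock_account_free(void *virt, int order)
{
	if (!virt)
		return NULL;

	mock_nr_accounted -= 1L << order;
	return virt;	/* virt_to_page() is the identity in this mock */
}

/* The wrappers themselves then carry no checks of their own. */
static void *mock_iommu_alloc_pages(int order, int simulate_failure)
{
	return mock_account_alloc(mock_alloc_pages(order, simulate_failure),
				  order);
}

static void mock_iommu_free_pages(void *virt, int order)
{
	free(mock_account_free(virt, order));	/* free(NULL) is a no-op */
}
```

With this shape a failed allocation falls straight through as NULL and is 
never counted, and freeing NULL leaves the counter untouched.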

> +
> +	return page_address(page);
> +}
> +
> +/**
> + * iommu_alloc_pages - allocate a zeroed page of a given order
> + * @gfp: buddy allocator flags
> + * @order: page order
> + *
> + * returns the virtual address of the allocated page
> + */
> +static inline void *iommu_alloc_pages(gfp_t gfp, int order)
> +{
> +	struct page *page = __iommu_alloc_pages(gfp, order);
> +
> +	if (unlikely(!page))
> +		return NULL;
> +
> +	return page_address(page);
> +}
> +
> +/**
> + * iommu_alloc_page_node - allocate a zeroed page at specific NUMA node.
> + * @nid: memory NUMA node id
> + * @gfp: buddy allocator flags
> + *
> + * returns the virtual address of the allocated page
> + */
> +static inline void *iommu_alloc_page_node(int nid, gfp_t gfp)
> +{
> +	return iommu_alloc_pages_node(nid, gfp, 0);
> +}

TBH I'm not entirely convinced that saving 4 characters per invocation 
times 11 invocations makes this wrapper worthwhile :/

> +
> +/**
> + * iommu_alloc_page - allocate a zeroed page
> + * @gfp: buddy allocator flags
> + *
> + * returns the virtual address of the allocated page
> + */
> +static inline void *iommu_alloc_page(gfp_t gfp)
> +{
> +	return iommu_alloc_pages(gfp, 0);
> +}
> +
> +/**
> + * iommu_free_pages - free page of a given order
> + * @virt: virtual address of the page to be freed.
> + * @order: page order
> + */
> +static inline void iommu_free_pages(void *virt, int order)
> +{
> +	if (!virt)
> +		return;
> +
> +	__iommu_free_pages(virt_to_page(virt), order);
> +}
> +
> +/**
> + * iommu_free_page - free page
> + * @virt: virtual address of the page to be freed.
> + */
> +static inline void iommu_free_page(void *virt)
> +{
> +	iommu_free_pages(virt, 0);
> +}
> +
> +/**
> + * iommu_free_pages_list - free a list of pages.
> + * @page: the head of the lru list to be freed.
> + *
> + * There are no locking requirement for these pages, as they are going to be
> + * put on a free list as soon as refcount reaches 0. Pages are put on this LRU
> + * list once they are removed from the IOMMU page tables. However, they can
> + * still be access through debugfs.
> + */
> +static inline void iommu_free_pages_list(struct list_head *page)

Nit: I'd be inclined to call this iommu_put_pages_list for consistency.

> +{
> +	while (!list_empty(page)) {
> +		struct page *p = list_entry(page->prev, struct page, lru);
> +
> +		list_del(&p->lru);
> +		put_page(p);
> +	}
> +}

I realise now you've also missed the common freelist freeing sites in 
iommu-dma.

Thanks,
Robin.

> +
> +#endif	/* __IOMMU_PAGES_H */

Thread overview: 21+ messages
2024-02-07 17:40 [PATCH v4 00/10] IOMMU memory observability Pasha Tatashin
2024-02-07 17:40 ` [PATCH v4 01/10] iommu/vt-d: add wrapper functions for page allocations Pasha Tatashin
2024-02-09 13:44   ` Robin Murphy [this message]
2024-02-10  2:21     ` Pasha Tatashin
2024-02-13 17:26       ` Robin Murphy
2024-02-16  1:05         ` Pasha Tatashin
2024-02-07 17:40 ` [PATCH v4 02/10] iommu/amd: use page allocation function provided by iommu-pages.h Pasha Tatashin
2024-02-07 17:40 ` [PATCH v4 03/10] iommu/io-pgtable-arm: " Pasha Tatashin
2024-02-07 17:40 ` [PATCH v4 04/10] iommu/io-pgtable-dart: " Pasha Tatashin
2024-02-07 17:40 ` [PATCH v4 05/10] iommu/exynos: " Pasha Tatashin
2024-02-09 11:26   ` Marek Szyprowski
2024-02-09 19:00     ` Pasha Tatashin
2024-02-07 17:40 ` [PATCH v4 06/10] iommu/rockchip: " Pasha Tatashin
2024-02-07 17:40 ` [PATCH v4 07/10] iommu/sun50i: " Pasha Tatashin
2024-02-09 10:55   ` Jernej Škrabec
2024-02-09 19:01     ` Pasha Tatashin
2024-02-07 17:41 ` [PATCH v4 08/10] iommu/tegra-smmu: " Pasha Tatashin
2024-02-07 17:41 ` [PATCH v4 09/10] iommu: observability of the IOMMU allocations Pasha Tatashin
2024-02-09 11:17   ` Robin Murphy
2024-02-07 17:41 ` [PATCH v4 10/10] iommu: account IOMMU allocated memory Pasha Tatashin
2024-02-09 10:50 ` [PATCH v4 00/10] IOMMU memory observability Joerg Roedel