From: Tony Battersby <tonyb@cybernetics.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: iommu@lists.linux-foundation.org, kernel-team@fb.com,
Matthew Wilcox <willy@infradead.org>,
Keith Busch <kbusch@kernel.org>,
Andy Shevchenko <andy.shevchenko@gmail.com>,
Robin Murphy <robin.murphy@arm.com>,
Tony Lindgren <tony@atomide.com>
Subject: [PATCH v6 10/11] dmapool: improve scalability of dma_pool_alloc
Date: Tue, 7 Jun 2022 14:46:23 -0400
Message-ID: <6a4fb3c3-e627-6266-7c49-322253abefb9@cybernetics.com>
In-Reply-To: <340ff8ef-9ff5-7175-c234-4132bbdfc5f7@cybernetics.com>

dma_pool_alloc() scales poorly when allocating a large number of pages
because it does a linear scan of all previously-allocated pages before
allocating a new one. Improve its scalability by maintaining a separate
list of pages that have free blocks ready to (re)allocate. Finding a
page with a free block improves from O(n) in the number of pages in the
pool to O(1).

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
---
Changes since v5:
pool_free_page() no longer exists.
Updated big O usage in description.
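
For readers who want to experiment with the idea outside the kernel, here is a minimal userspace sketch of the avail-list bookkeeping described above. It is not the kernel code: the list helpers are simplified stand-ins for the list_head API, and model_pool, model_page and BLOCKS_PER_PAGE are hypothetical names invented for illustration. The per-page free-block chain is reduced to a plain counter.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified circular doubly-linked list, modeled on struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

static int list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

static void list_add_front(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink and re-initialise so list_is_empty(n) is true afterwards. */
static void list_unlink_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

#define BLOCKS_PER_PAGE 4	/* assumption: fixed block count per page */

struct model_page {
	struct list_node avail_link;	/* first member, so the cast below is valid */
	unsigned int in_use;
};

struct model_pool {
	struct list_node avail_list;	/* pages with at least one free block */
};

static void model_pool_init(struct model_pool *pool)
{
	list_init(&pool->avail_list);
}

/* A freshly added page has free blocks, so it goes on the avail list. */
static void model_page_add(struct model_pool *pool, struct model_page *page)
{
	page->in_use = 0;
	list_init(&page->avail_link);
	list_add_front(&page->avail_link, &pool->avail_list);
}

/* O(1): take a block from the first available page, or return NULL. */
static struct model_page *model_alloc(struct model_pool *pool)
{
	struct model_page *page;

	if (list_is_empty(&pool->avail_list))
		return NULL;	/* a real pool would allocate a new page here */

	page = (struct model_page *)pool->avail_list.next;
	page->in_use++;
	if (page->in_use == BLOCKS_PER_PAGE)
		list_unlink_init(&page->avail_link);	/* page is now full */
	return page;
}

/* Freeing a block makes the page available again if it was full. */
static void model_free(struct model_pool *pool, struct model_page *page)
{
	assert(page->in_use > 0);
	if (list_is_empty(&page->avail_link))
		list_add_front(&page->avail_link, &pool->avail_list);
	page->in_use--;
}
```

In this model the allocation path never walks the full set of pages: it either takes the head of avail_list or reports that a new page is needed, which is the property the description relies on.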
mm/dmapool.c | 26 ++++++++++++++++++++++----
1 file changed, 22 insertions(+), 4 deletions(-)
diff --git a/mm/dmapool.c b/mm/dmapool.c
index 4e075feb038f..fc9ae0683c20 100644
--- a/mm/dmapool.c
+++ b/mm/dmapool.c
@@ -17,6 +17,10 @@
  * least 'size' bytes. Free blocks are tracked in an unsorted singly-linked
  * list of free blocks within the page. Used blocks aren't tracked, but we
  * keep a count of how many are currently allocated from each page.
+ *
+ * The avail_page_list keeps track of pages that have one or more free blocks
+ * available to (re)allocate. Pages are moved in and out of avail_page_list
+ * as their blocks are allocated and freed.
  */
 
 #include <linux/device.h>
@@ -42,6 +46,7 @@
 
 struct dma_pool { /* the pool */
 	struct list_head page_list;
+	struct list_head avail_page_list;
 	spinlock_t lock;
 	struct device *dev;
 	unsigned int size;
@@ -54,6 +59,7 @@ struct dma_pool { /* the pool */
 
 struct dma_page { /* cacheable header for 'allocation' bytes */
 	struct list_head page_list;
+	struct list_head avail_page_link;
 	void *vaddr;
 	dma_addr_t dma;
 	unsigned int in_use;
@@ -155,6 +161,7 @@ struct dma_pool *dma_pool_create(const char *name, struct device *dev,
 	retval->dev = dev;
 
 	INIT_LIST_HEAD(&retval->page_list);
+	INIT_LIST_HEAD(&retval->avail_page_list);
 	spin_lock_init(&retval->lock);
 	retval->size = size;
 	retval->boundary = boundary;
@@ -311,10 +318,11 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
 	might_alloc(mem_flags);
 
 	spin_lock_irqsave(&pool->lock, flags);
-	list_for_each_entry(page, &pool->page_list, page_list) {
-		if (page->offset < pool->allocation)
-			goto ready;
-	}
+	page = list_first_entry_or_null(&pool->avail_page_list,
+					struct dma_page,
+					avail_page_link);
+	if (page)
+		goto ready;
 
 	/* pool_alloc_page() might sleep, so temporarily drop &pool->lock */
 	spin_unlock_irqrestore(&pool->lock, flags);
@@ -326,10 +334,13 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
 	spin_lock_irqsave(&pool->lock, flags);
 
 	list_add(&page->page_list, &pool->page_list);
+	list_add(&page->avail_page_link, &pool->avail_page_list);
  ready:
 	page->in_use++;
 	offset = page->offset;
 	page->offset = *(int *)(page->vaddr + offset);
+	if (page->offset >= pool->allocation)
+		list_del_init(&page->avail_page_link);
 	retval = offset + page->vaddr;
 	*handle = offset + page->dma;
 #ifdef DMAPOOL_DEBUG
@@ -451,6 +462,13 @@ void dma_pool_free(struct dma_pool *pool, void *vaddr, dma_addr_t dma)
 		memset(vaddr, 0, pool->size);
 #endif
 
+	/*
+	 * list_empty() on the page tests if the page is already linked into
+	 * avail_page_list to avoid adding it more than once.
+	 */
+	if (list_empty(&page->avail_page_link))
+		list_add(&page->avail_page_link, &pool->avail_page_list);
+
 	page->in_use--;
 	*(int *)vaddr = page->offset;
 	page->offset = offset;
--
2.25.1
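
A note on the list_empty() test in dma_pool_free(): it only works as an "is this page already linked?" check because the alloc path unlinks with list_del_init(), which leaves the entry pointing at itself. The kernel's plain list_del() poisons the entry's pointers instead, so list_empty() on that entry would return false. The userspace sketch below illustrates the difference under simplified assumptions; struct node and the node_*() helpers are hypothetical stand-ins, and node_del() merely leaves stale pointers rather than poisoning them.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct node {
	struct node *next, *prev;
};

static void node_init(struct node *n)
{
	n->next = n;
	n->prev = n;
}

/* Mirrors list_empty() applied to an entry: true only if self-linked. */
static int node_is_unlinked(const struct node *n)
{
	return n->next == n;
}

static void node_add(struct node *n, struct node *head)
{
	assert(n != head);
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Plain unlink: n keeps stale pointers to its old neighbours. */
static void node_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* Unlink and re-initialise, in the spirit of list_del_init(). */
static void node_del_init(struct node *n)
{
	node_del(n);
	node_init(n);
}

/* Free-path guard: link n only if it is not already linked. */
static int node_add_once(struct node *n, struct node *head)
{
	if (!node_is_unlinked(n))
		return 0;	/* already on a list (or looks like it) */
	node_add(n, head);
	return 1;
}

/* Count the entries reachable from head. */
static size_t list_length(const struct node *head)
{
	size_t len = 0;
	const struct node *pos;

	for (pos = head->next; pos != head; pos = pos->next)
		len++;
	return len;
}
```

With node_del_init() the guard in node_add_once() links the entry exactly once per unlink. After a plain node_del() the entry still looks linked, so the guard would refuse to re-add it and the page would never return to the available list.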