From: Yunsheng Lin <linyunsheng@huawei.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: <davem@davemloft.net>, <kuba@kernel.org>, <pabeni@redhat.com>,
<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>
Subject: Re: [RFC v11 01/14] mm: page_frag: add a test module for page_frag
Date: Tue, 23 Jul 2024 21:19:02 +0800
Message-ID: <c7497b53-2dd7-4176-bb70-4a14558d90ab@huawei.com>
In-Reply-To: <CAKgT0UcsBGKR+AGU6wDUpXY48FnEA4hdvvti-YC87=8zfGPLdg@mail.gmail.com>
On 2024/7/22 1:34, Alexander Duyck wrote:
> On Fri, Jul 19, 2024 at 2:36 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>
>> Basing on the lib/objpool.c, change it to something like a
>> ptrpool, so that we can utilize that to test the correctness
>> and performance of the page_frag.
>>
>> The testing is done by ensuring that the fragment allocated
>> from a frag_frag_cache instance is pushed into a ptrpool
>> instance in a kthread binded to a specified cpu, and a kthread
>> binded to a specified cpu will pop the fragment from the
>> ptrpool and free the fragment.
>>
>> We may refactor out the common part between objpool and ptrpool
>> if this ptrpool thing turns out to be helpful for other place.
>>
>> CC: Alexander Duyck <alexander.duyck@gmail.com>
>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
>> ---
>> mm/Kconfig.debug | 8 +
>> mm/Makefile | 1 +
>> mm/page_frag_test.c | 393 ++++++++++++++++++++++++++++++++++++++++++++
>> 3 files changed, 402 insertions(+)
>> create mode 100644 mm/page_frag_test.c
>
> I might have missed it somewhere. Is there any reason why this isn't
> in the selftests/mm/ directory? Seems like that would be a better fit
> for this.
>
>> diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
>> index afc72fde0f03..1ebcd45f47d4 100644
>> --- a/mm/Kconfig.debug
>> +++ b/mm/Kconfig.debug
>> @@ -142,6 +142,14 @@ config DEBUG_PAGE_REF
>> kernel code. However the runtime performance overhead is virtually
>> nil until the tracepoints are actually enabled.
>>
>> +config DEBUG_PAGE_FRAG_TEST
>
> This isn't a "DEBUG" feature. This is a test feature.
>
>> + tristate "Test module for page_frag"
>> + default n
>> + depends on m && DEBUG_KERNEL
>
> I am not sure it is valid to have a tristate depend on being built as a module.
Perhaps I was copying the wrong pattern from TEST_OBJPOOL in lib/Kconfig.debug.
Would mm/dmapool_test.c and DMAPOOL_TEST be a more appropriate pattern for the
page_frag test module?
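A minimal sketch of what such an entry could look like if it were a plain
tristate test option rather than a DEBUG_* one. The symbol name
PAGE_FRAG_TEST and its placement are assumptions for illustration only;
the prompt and help text are taken from the patch above:

```kconfig
# Hypothetical symbol name; not part of the posted patch.
config PAGE_FRAG_TEST
	tristate "Test module for page_frag"
	help
	  This builds the "page_frag_test" module that is used to test the
	  correctness and performance of page_frag's implementation.
```

With a plain tristate, the "depends on m" line is no longer needed, since
choosing 'm' already builds the test as a module.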
>
> I think if you can set it up as a selftest it will have broader use as
> you could compile it against any target kernel going forward and add
> it as a module rather than having to build it as a part of a debug
> kernel.
It seems everything under tools/testing/selftests/mm/ is a userspace testing
tool, while kernel test modules seem to live in the same directory as the
code being tested?
>
>> + help
>> + This builds the "page_frag_test" module that is used to test the
>> + correctness and performance of page_frag's implementation.
>> +
>> config DEBUG_RODATA_TEST
>> bool "Testcase for the marking rodata read-only"
...
>> +
>> + /*
>> + * here we allocate percpu-slot & objs together in a single
>> + * allocation to make it more compact, taking advantage of
>> + * warm caches and TLB hits. in default vmalloc is used to
>> + * reduce the pressure of kernel slab system. as we know,
>> + * minimal size of vmalloc is one page since vmalloc would
>> + * always align the requested size to page size
>> + */
>> + if (gfp & GFP_ATOMIC)
>> + slot = kmalloc_node(size, gfp, cpu_to_node(i));
>> + else
>> + slot = __vmalloc_node(size, sizeof(void *), gfp,
>> + cpu_to_node(i),
>> + __builtin_return_address(0));
>
> When would anyone ever call this with atomic? This is just for your
> test isn't it?
>
>> + if (!slot)
>> + return -ENOMEM;
>> +
>> + memset(slot, 0, size);
>> + pool->cpu_slots[i] = slot;
>> +
>> + objpool_init_percpu_slot(pool, slot);
>> + }
>> +
>> + return 0;
>> +}
...
>> +/* release whole objpool forcely */
>> +static void objpool_free(struct objpool_head *pool)
>> +{
>> + if (!pool->cpu_slots)
>> + return;
>> +
>> + /* release percpu slots */
>> + objpool_fini_percpu_slots(pool);
>> +}
>> +
>
> Why add all this extra objpool overhead? This seems like overkill for
> what should be a simple test. Seems like you should just need a simple
> array located on one of your CPUs. I'm not sure what is with all the
> extra overhead being added here.
As mentioned in the commit log:
"We may refactor out the common part between objpool and ptrpool
if this ptrpool thing turns out to be helpful for other place."

The next thing I am trying to do is to use ptrpool to optimize the pcp
for the mm subsystem, so I would rather not tailor ptrpool to
page_frag_test, and it doesn't seem to affect the testing that much.
>
>> +static struct objpool_head ptr_pool;
>> +static int nr_objs = 512;
>> +static atomic_t nthreads;
>> +static struct completion wait;
>> +static struct page_frag_cache test_frag;
>> +
>> +static int nr_test = 5120000;
>> +module_param(nr_test, int, 0);
>> +MODULE_PARM_DESC(nr_test, "number of iterations to test");
>> +
>> +static bool test_align;
>> +module_param(test_align, bool, 0);
>> +MODULE_PARM_DESC(test_align, "use align API for testing");
>> +
>> +static int test_alloc_len = 2048;
>> +module_param(test_alloc_len, int, 0);
>> +MODULE_PARM_DESC(test_alloc_len, "alloc len for testing");
>> +
>> +static int test_push_cpu;
>> +module_param(test_push_cpu, int, 0);
>> +MODULE_PARM_DESC(test_push_cpu, "test cpu for pushing fragment");
>> +
>> +static int test_pop_cpu;
>> +module_param(test_pop_cpu, int, 0);
>> +MODULE_PARM_DESC(test_pop_cpu, "test cpu for popping fragment");
>> +
>> +static int page_frag_pop_thread(void *arg)
>> +{
>> + struct objpool_head *pool = arg;
>> + int nr = nr_test;
>> +
>> + pr_info("page_frag pop test thread begins on cpu %d\n",
>> + smp_processor_id());
>> +
>> + while (nr > 0) {
>> + void *obj = objpool_pop(pool);
>> +
>> + if (obj) {
>> + nr--;
>> + page_frag_free(obj);
>> + } else {
>> + cond_resched();
>> + }
>> + }
>> +
>> + if (atomic_dec_and_test(&nthreads))
>> + complete(&wait);
>> +
>> + pr_info("page_frag pop test thread exits on cpu %d\n",
>> + smp_processor_id());
>> +
>> + return 0;
>> +}
>> +
>> +static int page_frag_push_thread(void *arg)
>> +{
>> + struct objpool_head *pool = arg;
>> + int nr = nr_test;
>> +
>> + pr_info("page_frag push test thread begins on cpu %d\n",
>> + smp_processor_id());
>> +
>> + while (nr > 0) {
>> + void *va;
>> + int ret;
>> +
>> + if (test_align) {
>> + va = page_frag_alloc_align(&test_frag, test_alloc_len,
>> + GFP_KERNEL, SMP_CACHE_BYTES);
>> +
>> + WARN_ONCE((unsigned long)va & (SMP_CACHE_BYTES - 1),
>> + "unaligned va returned\n");
>> + } else {
>> + va = page_frag_alloc(&test_frag, test_alloc_len, GFP_KERNEL);
>> + }
>> +
>> + if (!va)
>> + continue;
>> +
>> + ret = objpool_push(va, pool);
>> + if (ret) {
>> + page_frag_free(va);
>> + cond_resched();
>> + } else {
>> + nr--;
>> + }
>> + }
>> +
>> + pr_info("page_frag push test thread exits on cpu %d\n",
>> + smp_processor_id());
>> +
>> + if (atomic_dec_and_test(&nthreads))
>> + complete(&wait);
>> +
>> + return 0;
>> +}
>> +
>
> So looking over these functions they seem to overlook how the network
> stack works in many cases. One of the main motivations for the page
> frags approach is page recycling. For example with GRO enabled the
> headers allocated to record the frags might be freed for all but the
> first. As such you can end up with 17 fragments being allocated, and
> 16 freed within the same thread as NAPI will just be recycling the
> buffers.
>
> With this setup it doesn't seem very likely to be triggered since you
> are operating in two threads. One test you might want to look at
> adding is a test where you are allocating and freeing in the same
> thread at a fairly constant rate to test against the "ideal" scenario.
I am not sure the above is still the "ideal" scenario: as you mentioned, most
drivers are moving to page_pool for rx, so page_frag is really mostly used
for tx, or for skb->data on rx.
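For reference, the same-thread pattern described above (allocate a burst of
fragments, free all but the first right away, then free the first) could be
sketched roughly as below. This is a userspace mock only, not code from the
patch: mock_frag_alloc(), mock_frag_free(), struct mock_frag_cache and
same_thread_test() are hypothetical stand-ins built on malloc()/free() so
that the loop shape can be compiled outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a frag cache: it only counts calls. */
struct mock_frag_cache {
	unsigned long allocs;
	unsigned long frees;
};

/* Hypothetical stand-in for the allocation call, backed by malloc(). */
static void *mock_frag_alloc(struct mock_frag_cache *nc, size_t len)
{
	void *va = malloc(len);

	if (va)
		nc->allocs++;
	return va;
}

/* Hypothetical stand-in for the free call, backed by free(). */
static void mock_frag_free(struct mock_frag_cache *nc, void *va)
{
	free(va);
	nc->frees++;
}

/*
 * Allocate and free in the same thread. Each iteration allocates 'burst'
 * fragments: the first one is held until the end of the iteration, the
 * others are freed immediately after being allocated.
 * Returns the number of iterations completed.
 */
static int same_thread_test(struct mock_frag_cache *nc, int nr_bursts,
			    int burst, size_t len)
{
	int done = 0;

	while (done < nr_bursts) {
		void *first = mock_frag_alloc(nc, len);
		int i;

		if (!first)
			break;

		for (i = 1; i < burst; i++) {
			void *va = mock_frag_alloc(nc, len);

			if (!va)
				break;	/* end this burst early */
			mock_frag_free(nc, va);
		}

		mock_frag_free(nc, first);
		done++;
	}

	return done;
}
```

In the test module itself the two stand-ins would presumably map to
page_frag_alloc(&test_frag, test_alloc_len, GFP_KERNEL) and
page_frag_free(va), with a burst of 17 matching the GRO example above.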
>