linux-mm.kvack.org archive mirror
From: Yunsheng Lin <linyunsheng@huawei.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: <davem@davemloft.net>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>
Subject: Re: [RFC v11 01/14] mm: page_frag: add a test module for page_frag
Date: Tue, 23 Jul 2024 21:19:02 +0800
Message-ID: <c7497b53-2dd7-4176-bb70-4a14558d90ab@huawei.com>
In-Reply-To: <CAKgT0UcsBGKR+AGU6wDUpXY48FnEA4hdvvti-YC87=8zfGPLdg@mail.gmail.com>

On 2024/7/22 1:34, Alexander Duyck wrote:
> On Fri, Jul 19, 2024 at 2:36 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>
>> Based on lib/objpool.c, change it to something like a
>> ptrpool, so that we can utilize that to test the correctness
>> and performance of the page_frag.
>>
>> The testing is done by ensuring that the fragment allocated
>> from a page_frag_cache instance is pushed into a ptrpool
>> instance in a kthread bound to a specified cpu, and a kthread
>> bound to a specified cpu will pop the fragment from the
>> ptrpool and free the fragment.
>>
>> We may refactor out the common part between objpool and ptrpool
>> if this ptrpool thing turns out to be helpful for other place.
>>
>> CC: Alexander Duyck <alexander.duyck@gmail.com>
>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
>> ---
>>  mm/Kconfig.debug    |   8 +
>>  mm/Makefile         |   1 +
>>  mm/page_frag_test.c | 393 ++++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 402 insertions(+)
>>  create mode 100644 mm/page_frag_test.c
> 
> I might have missed it somewhere. Is there any reason why this isn't
> in the selftests/mm/ directory? Seems like that would be a better fit
> for this.
> 
>> diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
>> index afc72fde0f03..1ebcd45f47d4 100644
>> --- a/mm/Kconfig.debug
>> +++ b/mm/Kconfig.debug
>> @@ -142,6 +142,14 @@ config DEBUG_PAGE_REF
>>           kernel code.  However the runtime performance overhead is virtually
>>           nil until the tracepoints are actually enabled.
>>
>> +config DEBUG_PAGE_FRAG_TEST
> 
> This isn't a "DEBUG" feature. This is a test feature.
> 
>> +       tristate "Test module for page_frag"
>> +       default n
>> +       depends on m && DEBUG_KERNEL
> 
> I am not sure it is valid to have a tristate depend on being built as a module.

Perhaps I copied the wrong pattern from TEST_OBJPOOL in lib/Kconfig.debug.
Would mm/dmapool_test.c and DMAPOOL_TEST be a more appropriate pattern
for the page_frag test module?

> 
> I think if you can set it up as a selftest it will have broader use as
> you could compile it against any target kernel going forward and add
> it as a module rather than having to build it as a part of a debug
> kernel.

It seems everything under tools/testing/selftests/mm/ is a userspace
testing tool, while kernel test modules seem to live in the same
directory as the code being tested?

> 
>> +       help
>> +         This builds the "page_frag_test" module that is used to test the
>> +         correctness and performance of page_frag's implementation.
>> +
>>  config DEBUG_RODATA_TEST
>>      bool "Testcase for the marking rodata read-only"

...

>> +
>> +               /*
>> +                * here we allocate percpu-slot & objs together in a single
>> +                * allocation to make it more compact, taking advantage of
>> +                * warm caches and TLB hits. in default vmalloc is used to
>> +                * reduce the pressure of kernel slab system. as we know,
>> +                * minimal size of vmalloc is one page since vmalloc would
>> +                * always align the requested size to page size
>> +                */
>> +               if (gfp & GFP_ATOMIC)
>> +                       slot = kmalloc_node(size, gfp, cpu_to_node(i));
>> +               else
>> +                       slot = __vmalloc_node(size, sizeof(void *), gfp,
>> +                                             cpu_to_node(i),
>> +                                             __builtin_return_address(0));
> 
> When would anyone ever call this with atomic? This is just for your
> test isn't it?
> 
>> +               if (!slot)
>> +                       return -ENOMEM;
>> +
>> +               memset(slot, 0, size);
>> +               pool->cpu_slots[i] = slot;
>> +
>> +               objpool_init_percpu_slot(pool, slot);
>> +       }
>> +
>> +       return 0;
>> +}

...

>> +/* release whole objpool forcely */
>> +static void objpool_free(struct objpool_head *pool)
>> +{
>> +       if (!pool->cpu_slots)
>> +               return;
>> +
>> +       /* release percpu slots */
>> +       objpool_fini_percpu_slots(pool);
>> +}
>> +
> 
> Why add all this extra objpool overhead? This seems like overkill for
> what should be a simple test. Seems like you should just need a simple
> array located on one of your CPUs. I'm not sure what is with all the
> extra overhead being added here.

As mentioned in the commit log:
"We may refactor out the common part between objpool and ptrpool
if this ptrpool thing turns out to be helpful for other place."

The next thing I am trying to do is to use ptrpool to optimize the pcp
for the mm subsystem, so I would rather not tailor the ptrpool to
page_frag_test, and it doesn't seem to affect the testing that much.
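
For comparison, the "simple array" alternative suggested above could
look roughly like the userspace sketch below. This is not code from the
patch: struct frag_ring, ring_init(), ring_push() and ring_pop() are
hypothetical names, C11 atomics stand in for the kernel's primitives,
and it assumes exactly one pushing thread and one popping thread.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 512u	/* power of two; mirrors nr_objs = 512 in the patch */

/* Hypothetical single-producer/single-consumer ring of pointers. */
struct frag_ring {
	void *slots[RING_SIZE];
	atomic_uint head;	/* next slot to fill, written only by the pusher */
	atomic_uint tail;	/* next slot to drain, written only by the popper */
};

static void ring_init(struct frag_ring *r)
{
	atomic_init(&r->head, 0);
	atomic_init(&r->tail, 0);
}

/* Returns 0 on success, -ENOMEM when the ring is full. */
static int ring_push(struct frag_ring *r, void *ptr)
{
	unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail == RING_SIZE)
		return -ENOMEM;

	r->slots[head & (RING_SIZE - 1)] = ptr;
	/* Publish the slot only after the pointer has been stored. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return 0;
}

/* Returns the oldest pointer, or NULL when the ring is empty. */
static void *ring_pop(struct frag_ring *r)
{
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);
	void *ptr;

	if (head == tail)
		return NULL;

	ptr = r->slots[tail & (RING_SIZE - 1)];
	/* Hand the slot back to the pusher once it has been read. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return ptr;
}
```

A ring like this covers the one-pusher/one-popper test, but unlike the
ptrpool it has no per-cpu slots, which is the part I would like to keep
for the pcp work.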

> 
>> +static struct objpool_head ptr_pool;
>> +static int nr_objs = 512;
>> +static atomic_t nthreads;
>> +static struct completion wait;
>> +static struct page_frag_cache test_frag;
>> +
>> +static int nr_test = 5120000;
>> +module_param(nr_test, int, 0);
>> +MODULE_PARM_DESC(nr_test, "number of iterations to test");
>> +
>> +static bool test_align;
>> +module_param(test_align, bool, 0);
>> +MODULE_PARM_DESC(test_align, "use align API for testing");
>> +
>> +static int test_alloc_len = 2048;
>> +module_param(test_alloc_len, int, 0);
>> +MODULE_PARM_DESC(test_alloc_len, "alloc len for testing");
>> +
>> +static int test_push_cpu;
>> +module_param(test_push_cpu, int, 0);
>> +MODULE_PARM_DESC(test_push_cpu, "test cpu for pushing fragment");
>> +
>> +static int test_pop_cpu;
>> +module_param(test_pop_cpu, int, 0);
>> +MODULE_PARM_DESC(test_pop_cpu, "test cpu for popping fragment");
>> +
>> +static int page_frag_pop_thread(void *arg)
>> +{
>> +       struct objpool_head *pool = arg;
>> +       int nr = nr_test;
>> +
>> +       pr_info("page_frag pop test thread begins on cpu %d\n",
>> +               smp_processor_id());
>> +
>> +       while (nr > 0) {
>> +               void *obj = objpool_pop(pool);
>> +
>> +               if (obj) {
>> +                       nr--;
>> +                       page_frag_free(obj);
>> +               } else {
>> +                       cond_resched();
>> +               }
>> +       }
>> +
>> +       if (atomic_dec_and_test(&nthreads))
>> +               complete(&wait);
>> +
>> +       pr_info("page_frag pop test thread exits on cpu %d\n",
>> +               smp_processor_id());
>> +
>> +       return 0;
>> +}
>> +
>> +static int page_frag_push_thread(void *arg)
>> +{
>> +       struct objpool_head *pool = arg;
>> +       int nr = nr_test;
>> +
>> +       pr_info("page_frag push test thread begins on cpu %d\n",
>> +               smp_processor_id());
>> +
>> +       while (nr > 0) {
>> +               void *va;
>> +               int ret;
>> +
>> +               if (test_align) {
>> +                       va = page_frag_alloc_align(&test_frag, test_alloc_len,
>> +                                                  GFP_KERNEL, SMP_CACHE_BYTES);
>> +
>> +                       WARN_ONCE((unsigned long)va & (SMP_CACHE_BYTES - 1),
>> +                                 "unaligned va returned\n");
>> +               } else {
>> +                       va = page_frag_alloc(&test_frag, test_alloc_len, GFP_KERNEL);
>> +               }
>> +
>> +               if (!va)
>> +                       continue;
>> +
>> +               ret = objpool_push(va, pool);
>> +               if (ret) {
>> +                       page_frag_free(va);
>> +                       cond_resched();
>> +               } else {
>> +                       nr--;
>> +               }
>> +       }
>> +
>> +       pr_info("page_frag push test thread exits on cpu %d\n",
>> +               smp_processor_id());
>> +
>> +       if (atomic_dec_and_test(&nthreads))
>> +               complete(&wait);
>> +
>> +       return 0;
>> +}
>> +
> 
> So looking over these functions they seem to overlook how the network
> stack works in many cases. One of the main motivations for the page
> frags approach is page recycling. For example with GRO enabled the
> headers allocated to record the frags might be freed for all but the
> first. As such you can end up with 17 fragments being allocated, and
> 16 freed within the same thread as NAPI will just be recycling the
> buffers.
> 
> With this setup it doesn't seem very likely to be triggered since you
> are operating in two threads. One test you might want to look at
> adding is a test where you are allocating and freeing in the same
> thread at a fairly constant rate to test against the "ideal" scenario.

I am not sure the above is still the "ideal" scenario. As you mentioned,
most drivers are moving to page_pool for rx, so the page frag is really
mostly used for tx, or for skb->data on rx.
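
To make the recycling argument concrete, here is a simplified userspace
model of the same-thread case. It is only a sketch, not the kernel
implementation: all names are hypothetical, a 4096-byte aligned chunk
stands in for the backing page, a plain counter stands in for the page
refcount (so it is single-threaded only), and alignment requests are
not modeled.

```c
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096u	/* stand-in for the backing page; power of two */
#define HDR_SPACE  64u		/* room reserved for the header below */

/* Header at the start of every chunk (hypothetical layout). */
struct chunk_hdr {
	unsigned int refs;	/* fragments in flight, plus 1 held by the cache */
};

struct frag_cache_model {
	struct chunk_hdr *chunk;	/* current chunk, or NULL */
	size_t offset;			/* next free byte in the chunk */
	unsigned int chunks_allocated;	/* counts trips to the allocator */
};

static void *model_frag_alloc(struct frag_cache_model *c, size_t len)
{
	void *va;

	if (len == 0 || len > CHUNK_SIZE - HDR_SPACE)
		return NULL;

	if (c->chunk && c->offset + len > CHUNK_SIZE) {
		/* Chunk exhausted: drop the reference held by the cache. */
		if (--c->chunk->refs == 0) {
			/* All fragments were already freed: recycle in place. */
			c->chunk->refs = 1;
			c->offset = HDR_SPACE;
		} else {
			/* Fragments still in flight: the last free releases it. */
			c->chunk = NULL;
		}
	}

	if (!c->chunk) {
		c->chunk = aligned_alloc(CHUNK_SIZE, CHUNK_SIZE);
		if (!c->chunk)
			return NULL;
		c->chunk->refs = 1;
		c->offset = HDR_SPACE;
		c->chunks_allocated++;
	}

	va = (char *)c->chunk + c->offset;
	c->offset += len;
	c->chunk->refs++;
	return va;
}

static void model_frag_free(void *va)
{
	/* Chunks are CHUNK_SIZE-aligned, so masking finds the header. */
	struct chunk_hdr *hdr =
		(struct chunk_hdr *)((uintptr_t)va & ~(uintptr_t)(CHUNK_SIZE - 1));

	if (--hdr->refs == 0)
		free(hdr);
}

/* Drop the cache's own reference, e.g. at the end of a test run. */
static void model_cache_drain(struct frag_cache_model *c)
{
	if (c->chunk && --c->chunk->refs == 0)
		free(c->chunk);
	c->chunk = NULL;
}
```

In this model, allocating and freeing a 2048-byte fragment in the same
thread keeps reusing a single chunk, while holding fragments (as happens
when the popping side lags behind) forces a new chunk each time the
current one is exhausted.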

> 


