From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Shivank Garg <shivankg@amd.com>
Cc: <akpm@linux-foundation.org>, <david@redhat.com>, <ziy@nvidia.com>,
<willy@infradead.org>, <matthew.brost@intel.com>,
<joshua.hahnjy@gmail.com>, <rakie.kim@sk.com>, <byungchul@sk.com>,
<gourry@gourry.net>, <ying.huang@linux.alibaba.com>,
<apopple@nvidia.com>, <lorenzo.stoakes@oracle.com>,
<Liam.Howlett@oracle.com>, <vbabka@suse.cz>, <rppt@kernel.org>,
<surenb@google.com>, <mhocko@suse.com>, <vkoul@kernel.org>,
<lucas.demarchi@intel.com>, <rdunlap@infradead.org>,
<jgg@ziepe.ca>, <kuba@kernel.org>, <justonli@chromium.org>,
<ivecera@redhat.com>, <dave.jiang@intel.com>,
<dan.j.williams@intel.com>, <rientjes@google.com>,
<Raghavendra.KodsaraThimmappa@amd.com>, <bharata@amd.com>,
<alirad.malek@zptcorp.com>, <yiannis@zptcorp.com>,
<weixugc@google.com>, <linux-kernel@vger.kernel.org>,
<linux-mm@kvack.org>
Subject: Re: [RFC V3 7/9] dcbm: add dma core batch migrator for batch page offloading
Date: Thu, 2 Oct 2025 12:38:31 +0100 [thread overview]
Message-ID: <20251002123831.000000f9@huawei.com> (raw)
In-Reply-To: <20250923174752.35701-8-shivankg@amd.com>
On Tue, 23 Sep 2025 17:47:42 +0000
Shivank Garg <shivankg@amd.com> wrote:
> The dcbm (DMA core batch migrator) provides a generic interface using
> DMAEngine for end-to-end testing of the batch page migration offload
> feature.
>
> Enable DCBM offload:
> echo 1 > /sys/kernel/dcbm/offloading
> echo NR_DMA_CHAN_TO_USE > /sys/kernel/dcbm/nr_dma_chan
>
> Disable DCBM offload:
> echo 0 > /sys/kernel/dcbm/offloading
>
> Signed-off-by: Shivank Garg <shivankg@amd.com>
Hi Shivank,
Some minor comments inline.
J
> ---
> drivers/migoffcopy/Kconfig | 8 +
> drivers/migoffcopy/Makefile | 1 +
> drivers/migoffcopy/dcbm/Makefile | 1 +
> drivers/migoffcopy/dcbm/dcbm.c | 415 +++++++++++++++++++++++++++++++
> 4 files changed, 425 insertions(+)
> create mode 100644 drivers/migoffcopy/dcbm/Makefile
> create mode 100644 drivers/migoffcopy/dcbm/dcbm.c
>
> diff --git a/drivers/migoffcopy/Kconfig b/drivers/migoffcopy/Kconfig
> index e73698af3e72..c1b2eff7650d 100644
> --- a/drivers/migoffcopy/Kconfig
> +++ b/drivers/migoffcopy/Kconfig
> @@ -6,4 +6,12 @@ config MTCOPY_CPU
> Interface MT COPY CPU driver for batch page migration
> offloading. Say Y if you want to try offloading with
> MultiThreaded CPU copy APIs.
I'd put a blank line here.
> +config DCBM_DMA
> + bool "DMA Core Batch Migrator"
> + depends on OFFC_MIGRATION && DMA_ENGINE
> + default n
no need to say this. Everything is default n.
> + help
> + Interface DMA driver for batch page migration offloading.
> + Say Y if you want to try offloading with DMAEngine APIs
> + based driver.
Similar comment on the 'try'
>
> diff --git a/drivers/migoffcopy/Makefile b/drivers/migoffcopy/Makefile
> index 0a3c356d67e6..dedc86ff54c1 100644
> --- a/drivers/migoffcopy/Makefile
> +++ b/drivers/migoffcopy/Makefile
> @@ -1 +1,2 @@
> obj-$(CONFIG_MTCOPY_CPU) += mtcopy/
> +obj-$(CONFIG_DCBM_DMA) += dcbm/
> diff --git a/drivers/migoffcopy/dcbm/Makefile b/drivers/migoffcopy/dcbm/Makefile
> new file mode 100644
> index 000000000000..56ba47cce0f1
> --- /dev/null
> +++ b/drivers/migoffcopy/dcbm/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_DCBM_DMA) += dcbm.o
> diff --git a/drivers/migoffcopy/dcbm/dcbm.c b/drivers/migoffcopy/dcbm/dcbm.c
> new file mode 100644
> index 000000000000..87a58c0c3b9b
> --- /dev/null
> +++ b/drivers/migoffcopy/dcbm/dcbm.c
> +/**
> + * folios_copy_dma - Copy folios using DMA engine
> + * @dst_list: Destination folio list
> + * @src_list: Source folio list
> + * @nr_folios: Number of folios to copy
> + *
> + * Return: 0. Fallback to CPU copy on any error.
> + */
> +static int folios_copy_dma(struct list_head *dst_list,
> + struct list_head *src_list,
> + unsigned int nr_folios)
> +{
> + struct dma_work *works;
> + struct list_head *src_pos = src_list->next;
> + struct list_head *dst_pos = dst_list->next;
> + int i, folios_per_chan, ret = 0;
> + dma_cap_mask_t mask;
> + int actual_channels = 0;
> + int max_channels;
> +
> + max_channels = min3(nr_dma_channels, nr_folios, MAX_DMA_CHANNELS);
> +
> + works = kcalloc(max_channels, sizeof(*works), GFP_KERNEL);
> + if (!works)
> + goto fallback;
> +
> + dma_cap_zero(mask);
> + dma_cap_set(DMA_MEMCPY, mask);
> +
> + for (i = 0; i < max_channels; i++) {
> + works[actual_channels].chan = dma_request_chan_by_mask(&mask);
> + if (IS_ERR(works[actual_channels].chan))
> + break;
> + init_completion(&works[actual_channels].done);
> + actual_channels++;
> + }
> +
> + if (actual_channels == 0) {
> + kfree(works);
> + goto fallback;
> + }
> +
> + for (i = 0; i < actual_channels; i++) {
> + folios_per_chan = nr_folios * (i + 1) / actual_channels -
> + (nr_folios * i) / actual_channels;
> + if (folios_per_chan == 0)
> + continue;
> +
> + ret = setup_sg_tables(&works[i], &src_pos, &dst_pos, folios_per_chan);
> + if (ret)
> + goto cleanup;
> + }
Indent issues here.
> +
> + for (i = 0; i < actual_channels; i++) {
> + ret = submit_dma_transfers(&works[i]);
> + if (ret) {
> + dev_err(dmaengine_get_dma_device(works[i].chan),
> + "Failed to submit transfers for channel %d\n", i);
> + goto cleanup;
> + }
> + }
> +
> + for (i = 0; i < actual_channels; i++) {
> + if (atomic_read(&works[i].pending) > 0)
> + dma_async_issue_pending(works[i].chan);
> + }
> +
> + for (i = 0; i < actual_channels; i++) {
> + if (atomic_read(&works[i].pending) > 0) {
I'd flip logic to something like.
if (!atomic_read(&works[i].pending)
continue;
if (!wait_for_...
Just to reduce the deep indent.
> + if (!wait_for_completion_timeout(&works[i].done, msecs_to_jiffies(10000))) {
> + dev_err(dmaengine_get_dma_device(works[i].chan),
> + "DMA timeout on channel %d\n", i);
> + ret = -ETIMEDOUT;
> + goto cleanup;
> + }
> + }
> + }
> +
> +cleanup:
> + cleanup_dma_work(works, actual_channels);
> + if (ret)
> + goto fallback;
This goto goto dance is probably not worth it. I'd just duplicate the
cleanup_dma_work() call to have a copy in the error path and one in the non error
path. Then you just end up with a conventional error block of labels + stuff to do.
> + return 0;
> +fallback:
> + /* Fall back to CPU copy */
> + pr_err("DCBM: Falling back to CPU copy\n");
> + folios_mc_copy(dst_list, src_list, nr_folios);
> + return 0;
> +}
> +static ssize_t offloading_store(struct kobject *kobj,
> + struct kobj_attribute *attr,
> + const char *buf, size_t count)
> +{
> + bool enable;
> + int ret;
> +
> + ret = kstrtobool(buf, &enable);
> + if (ret)
> + return ret;
> +
> + mutex_lock(&dcbm_mutex);
> +
> + if (enable == offloading_enabled) {
> + pr_err("migration offloading is already %s\n",
> + enable ? "ON" : "OFF");
To me that's not an error. Pointless, but not worth moaning about.
Just exit saying nothing.
> + goto out;
> + }
> +
> + if (enable) {
> + start_offloading(&dma_migrator);
> + offloading_enabled = true;
> + pr_info("migration offloading is now ON\n");
> + } else {
> + stop_offloading();
> + offloading_enabled = false;
> + pr_info("migration offloading is now OFF\n");
> + }
> +out:
> + mutex_unlock(&dcbm_mutex);
Perhaps guard and direct return above
> + return count;
> +}
> +static struct kobj_attribute offloading_attr = __ATTR_RW(offloading);
> +static struct kobj_attribute nr_dma_chan_attr = __ATTR_RW(nr_dma_chan);
> +
> +static struct attribute *dcbm_attrs[] = {
> + &offloading_attr.attr,
> + &nr_dma_chan_attr.attr,
> + NULL,
Trivial but doesn't need a trailing comma given this is a terminating entry
and nothing should ever come after it.
> +};
> +ATTRIBUTE_GROUPS(dcbm);
> +
> +static struct kobject *dcbm_kobj;
> +
> +static int __init dcbm_init(void)
> +{
> + int ret;
> +
> + dcbm_kobj = kobject_create_and_add("dcbm", kernel_kobj);
> + if (!dcbm_kobj)
> + return -ENOMEM;
> +
> + ret = sysfs_create_groups(dcbm_kobj, dcbm_groups);
Why use a group here and separate files in the CPU thread one?
I'd prefer a group there as well given slightly simpler error
handling as seen here.
> + if (ret) {
> + kobject_put(dcbm_kobj);
> + return ret;
> + }
> +
> + pr_info("DMA Core Batch Migrator initialized\n");
> + return 0;
> +}
> +
next prev parent reply other threads:[~2025-10-02 11:38 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-23 17:47 [RFC V3 0/9] Accelerate page migration with batch copying and hardware offload Shivank Garg
2025-09-23 17:47 ` [RFC V3 1/9] mm/migrate: factor out code in move_to_new_folio() and migrate_folio_move() Shivank Garg
2025-10-02 10:30 ` Jonathan Cameron
2025-09-23 17:47 ` [RFC V3 2/9] mm/migrate: revive MIGRATE_NO_COPY in migrate_mode Shivank Garg
2025-09-23 17:47 ` [RFC V3 3/9] mm: Introduce folios_mc_copy() for batch copying folios Shivank Garg
2025-09-23 17:47 ` [RFC V3 4/9] mm/migrate: add migrate_folios_batch_move to batch the folio move operations Shivank Garg
2025-10-02 11:03 ` Jonathan Cameron
2025-10-16 9:17 ` Garg, Shivank
2025-09-23 17:47 ` [RFC V3 5/9] mm: add support for copy offload for folio Migration Shivank Garg
2025-10-02 11:10 ` Jonathan Cameron
2025-10-16 9:40 ` Garg, Shivank
2025-09-23 17:47 ` [RFC V3 6/9] mtcopy: introduce multi-threaded page copy routine Shivank Garg
2025-10-02 11:29 ` Jonathan Cameron
2025-10-20 8:28 ` Byungchul Park
2025-11-06 6:27 ` Garg, Shivank
2025-11-12 2:12 ` Byungchul Park
2025-09-23 17:47 ` [RFC V3 7/9] dcbm: add dma core batch migrator for batch page offloading Shivank Garg
2025-10-02 11:38 ` Jonathan Cameron [this message]
2025-10-16 9:59 ` Garg, Shivank
2025-09-23 17:47 ` [RFC V3 8/9] adjust NR_MAX_BATCHED_MIGRATION for testing Shivank Garg
2025-09-23 17:47 ` [RFC V3 9/9] mtcopy: spread threads across die " Shivank Garg
2025-09-24 1:49 ` [RFC V3 0/9] Accelerate page migration with batch copying and hardware offload Huang, Ying
2025-09-24 2:03 ` Zi Yan
2025-09-24 3:11 ` Huang, Ying
2025-09-24 3:22 ` Zi Yan
2025-10-02 17:10 ` Garg, Shivank
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251002123831.000000f9@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=Liam.Howlett@oracle.com \
--cc=Raghavendra.KodsaraThimmappa@amd.com \
--cc=akpm@linux-foundation.org \
--cc=alirad.malek@zptcorp.com \
--cc=apopple@nvidia.com \
--cc=bharata@amd.com \
--cc=byungchul@sk.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=david@redhat.com \
--cc=gourry@gourry.net \
--cc=ivecera@redhat.com \
--cc=jgg@ziepe.ca \
--cc=joshua.hahnjy@gmail.com \
--cc=justonli@chromium.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=lucas.demarchi@intel.com \
--cc=matthew.brost@intel.com \
--cc=mhocko@suse.com \
--cc=rakie.kim@sk.com \
--cc=rdunlap@infradead.org \
--cc=rientjes@google.com \
--cc=rppt@kernel.org \
--cc=shivankg@amd.com \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=vkoul@kernel.org \
--cc=weixugc@google.com \
--cc=willy@infradead.org \
--cc=yiannis@zptcorp.com \
--cc=ying.huang@linux.alibaba.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox