From: Alex Sierra <alex.sierra@amd.com>
To: <akpm@linux-foundation.org>, <Felix.Kuehling@amd.com>,
<linux-mm@kvack.org>, <rcampbell@nvidia.com>,
<linux-ext4@vger.kernel.org>, <linux-xfs@vger.kernel.org>
Cc: <amd-gfx@lists.freedesktop.org>,
<dri-devel@lists.freedesktop.org>, <hch@lst.de>, <jgg@nvidia.com>,
<jglisse@redhat.com>, <apopple@nvidia.com>, <willy@infradead.org>
Subject: [PATCH v1 0/9] Add MEMORY_DEVICE_COHERENT for coherent device memory mapping
Date: Mon, 15 Nov 2021 13:30:17 -0600 [thread overview]
Message-ID: <20211115193026.27568-1-alex.sierra@amd.com> (raw)
This patch series introduces MEMORY_DEVICE_COHERENT, a type of memory
owned by a device that can be mapped into CPU page tables like
MEMORY_DEVICE_GENERIC and can also be migrated like
MEMORY_DEVICE_PRIVATE.
Christoph, the suggestion to incorporate Ralph Campbell’s refcount
cleanup patch into our hardware page migration patchset originally came
from you, but it proved impractical to do things in that order because
the refcount cleanup introduced a bug with wide ranging structural
implications. Instead, we amended Ralph’s patch so that it could be
applied after merging the migration work. As we saw from the recent
discussion, merging the refcount work is going to take some time and
cooperation between multiple development groups, while the migration
work is ready now and is needed now. So we propose to merge this
patchset first and continue to work with Ralph and others to merge the
refcount cleanup separately, when it is ready.
This patch series is mostly self-contained except for a few places where
it needs to update other subsystems to handle the new memory type.
System stability and performance are not affected according to our
ongoing testing, including xfstests.
How it works: The system BIOS advertises the GPU device memory
(aka VRAM) as SPM (special purpose memory) in the UEFI system address
map.
The amdgpu driver registers the memory with devmap as
MEMORY_DEVICE_COHERENT using devm_memremap_pages. The initial user for
this hardware page migration capability is the Frontier supercomputer
project. This functionality is not AMD-specific. We expect other GPU
vendors to find this functionality useful, and possibly other hardware
types in the future.
Our test nodes in the lab are similar to the Frontier configuration,
with .5 TB of system memory plus 256 GB of device memory split across
4 GPUs, all in a single coherent address space. Page migration is
expected to improve application efficiency significantly. We will
report empirical results as they become available.
We extended hmm_test to cover migration of MEMORY_DEVICE_COHERENT. This
patch set builds on HMM and our SVM memory manager already merged in
5.15.
Alex Sierra (9):
mm: add zone device coherent type memory support
mm: add device coherent vma selection for memory migration
drm/amdkfd: add SPM support for SVM
drm/amdkfd: coherent type as sys mem on migration to ram
lib: test_hmm add ioctl to get zone device type
lib: test_hmm add module param for zone device type
lib: add support for device coherent type in test_hmm
tools: update hmm-test to support device coherent type
tools: update test_hmm script to support SP config
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 34 ++-
include/linux/memremap.h | 8 +
include/linux/migrate.h | 1 +
include/linux/mm.h | 9 +
lib/test_hmm.c | 270 +++++++++++++++++------
lib/test_hmm_uapi.h | 21 +-
mm/memcontrol.c | 6 +-
mm/memory-failure.c | 6 +-
mm/memremap.c | 5 +-
mm/migrate.c | 30 ++-
tools/testing/selftests/vm/hmm-tests.c | 156 +++++++++++--
tools/testing/selftests/vm/test_hmm.sh | 20 +-
12 files changed, 446 insertions(+), 120 deletions(-)
--
2.32.0
next reply other threads:[~2021-11-15 19:31 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-15 19:30 Alex Sierra [this message]
2021-11-15 19:30 ` [PATCH v1 1/9] mm: add zone device coherent type memory support Alex Sierra
2021-11-17 9:50 ` Christoph Hellwig
2021-11-18 6:53 ` Alistair Popple
2021-11-18 18:59 ` Felix Kuehling
2021-11-22 2:40 ` Alistair Popple
2021-11-22 17:16 ` Felix Kuehling
2021-11-23 7:10 ` Alistair Popple
2021-11-15 19:30 ` [PATCH v1 2/9] mm: add device coherent vma selection for memory migration Alex Sierra
2021-11-18 6:54 ` Alistair Popple
2021-11-15 19:30 ` [PATCH v1 3/9] drm/amdkfd: add SPM support for SVM Alex Sierra
2021-11-15 19:30 ` [PATCH v1 4/9] drm/amdkfd: coherent type as sys mem on migration to ram Alex Sierra
2021-11-15 19:30 ` [PATCH v1 5/9] lib: test_hmm add ioctl to get zone device type Alex Sierra
2021-11-15 19:30 ` [PATCH v1 6/9] lib: test_hmm add module param for " Alex Sierra
2021-11-18 7:19 ` Alistair Popple
2021-11-15 19:30 ` [PATCH v1 7/9] lib: add support for device coherent type in test_hmm Alex Sierra
2021-11-25 2:12 ` Felix Kuehling
2021-11-15 19:30 ` [PATCH v1 8/9] tools: update hmm-test to support device coherent type Alex Sierra
2021-11-15 19:30 ` [PATCH v1 9/9] tools: update test_hmm script to support SP config Alex Sierra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211115193026.27568-1-alex.sierra@amd.com \
--to=alex.sierra@amd.com \
--cc=Felix.Kuehling@amd.com \
--cc=akpm@linux-foundation.org \
--cc=amd-gfx@lists.freedesktop.org \
--cc=apopple@nvidia.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=hch@lst.de \
--cc=jgg@nvidia.com \
--cc=jglisse@redhat.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-xfs@vger.kernel.org \
--cc=rcampbell@nvidia.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox