From: Matthew Brost <matthew.brost@intel.com>
To: Balbir Singh <balbirs@nvidia.com>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-mm@kvack.org, "David Hildenbrand" <david@redhat.com>,
	"Zi Yan" <ziy@nvidia.com>,
	"Joshua Hahn" <joshua.hahnjy@gmail.com>,
	"Rakie Kim" <rakie.kim@sk.com>,
	"Byungchul Park" <byungchul@sk.com>,
	"Gregory Price" <gourry@gourry.net>,
	"Ying Huang" <ying.huang@linux.alibaba.com>,
	"Alistair Popple" <apopple@nvidia.com>,
	"Oscar Salvador" <osalvador@suse.de>,
	"Lorenzo Stoakes" <lorenzo.stoakes@oracle.com>,
	"Baolin Wang" <baolin.wang@linux.alibaba.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	"Nico Pache" <npache@redhat.com>,
	"Ryan Roberts" <ryan.roberts@arm.com>,
	"Dev Jain" <dev.jain@arm.com>, "Barry Song" <baohua@kernel.org>,
	"Lyude Paul" <lyude@redhat.com>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>,
	"Ralph Campbell" <rcampbell@nvidia.com>,
	"Mika Penttilä" <mpenttil@redhat.com>,
	"Francois Dugast" <francois.dugast@intel.com>
Subject: Re: [v7 00/16] mm: support device-private THP
Date: Wed, 19 Nov 2025 19:15:02 -0800
Message-ID: <aR6HtvxhmVxUvd+h@lstrano-desk.jf.intel.com>
In-Reply-To: <24d8d39b-5ebe-4f29-93ff-3f7ca2a9b1cc@nvidia.com>

On Thu, Nov 20, 2025 at 01:59:09PM +1100, Balbir Singh wrote:
> On 11/20/25 13:50, Balbir Singh wrote:
> > On 11/20/25 13:40, Matthew Brost wrote:
> >> On Wed, Nov 12, 2025 at 10:52:43AM +1100, Balbir Singh wrote:
> >>> On 11/12/25 10:43, Andrew Morton wrote:
> >>>> On Thu, 9 Oct 2025 03:33:33 -0700 Matthew Brost <matthew.brost@intel.com> wrote:
> >>>>
> >>>>>>>> This patch series introduces support for Transparent Huge Page
> >>>>>>>> (THP) migration in zone device-private memory. The implementation enables
> >>>>>>>> efficient migration of large folios between system memory and
> >>>>>>>> device-private memory
> >>>>>>>
> >>>>>>> Lots of chatter for the v6 series, but none for v7.  I hope that's a
> >>>>>>> good sign.
> >>>>>>>
> >>>>>>
> >>>>>> I hope so too, I've tried to address the comments in v6.
> >>>>>>
> >>>>>
> >>>>> Circling back to this series, we will integrate and test this version.
> >>>>
> >>>> How'd it go?
> >>>>
> >>
> >> My apologies for the delay—I got distracted by other tasks in Xe (my
> >> driver) and was out for a bit. Unfortunately, this series breaks
> >> something in the existing core MM code for the Xe SVM implementation. I
> >> have an extensive test case that hammers on SVM, which fully passes
> >> prior to applying this series, but fails randomly with the series
> >> applied (to drm-tip-rc6) due to the below kernel lockup.
> >>
> >> I've tried to trace where the migration PTE gets installed but not
> >> removed or isolate a test case which causes this failure but no luck so
> >> far. I'll keep digging as I have time.
> >>
> >> Beyond that, if I enable Xe SVM + THP, it seems to mostly work (though
> >> the same issue as above eventually occurs), but I do need two additional
> >> core MM patches—one is new code required for Xe, and the other could be
> >> considered a bug fix. Those patches can be included when Xe merges SVM THP
> >> support, but we need to at least not break Xe SVM before this series merges.
> >>
> >> Stack trace:
> >>
> >> INFO: task kworker/u65:2:1642 blocked for more than 30 seconds.
> >> [  212.624286]       Tainted: G S      W           6.18.0-rc6-xe+ #1719
> >> [  212.630561] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> [  212.638285] task:kworker/u65:2   state:D stack:0     pid:1642  tgid:1642  ppid:2      task_flags:0x4208060 flags:0x00080000
> >> [  212.638288] Workqueue: xe_page_fault_work_queue xe_pagefault_queue_work [xe]
> >> [  212.638323] Call Trace:
> >> [  212.638324]  <TASK>
> >> [  212.638325]  __schedule+0x4b0/0x990
> >> [  212.638330]  schedule+0x22/0xd0
> >> [  212.638331]  io_schedule+0x41/0x60
> >> [  212.638333]  migration_entry_wait_on_locked+0x1d8/0x2d0
> >> [  212.638336]  ? __pfx_wake_page_function+0x10/0x10
> >> [  212.638339]  migration_entry_wait+0xd2/0xe0
> >> [  212.638341]  hmm_vma_walk_pmd+0x7c9/0x8d0
> >> [  212.638343]  walk_pgd_range+0x51d/0xa40
> >> [  212.638345]  __walk_page_range+0x75/0x1e0
> >> [  212.638347]  walk_page_range_mm+0x138/0x1f0
> >> [  212.638349]  hmm_range_fault+0x59/0xa0
> >> [  212.638351]  drm_gpusvm_get_pages+0x194/0x7b0 [drm_gpusvm_helper]
> >> [  212.638354]  drm_gpusvm_range_get_pages+0x2d/0x40 [drm_gpusvm_helper]
> >> [  212.638355]  __xe_svm_handle_pagefault+0x259/0x900 [xe]
> >> [  212.638375]  ? update_load_avg+0x7f/0x6c0
> >> [  212.638377]  ? update_curr+0x13d/0x170
> >> [  212.638379]  xe_svm_handle_pagefault+0x37/0x90 [xe]
> >> [  212.638396]  xe_pagefault_queue_work+0x2da/0x3c0 [xe]
> >> [  212.638420]  process_one_work+0x16e/0x2e0
> >> [  212.638422]  worker_thread+0x284/0x410
> >> [  212.638423]  ? __pfx_worker_thread+0x10/0x10
> >> [  212.638425]  kthread+0xec/0x210
> >> [  212.638427]  ? __pfx_kthread+0x10/0x10
> >> [  212.638428]  ? __pfx_kthread+0x10/0x10
> >> [  212.638430]  ret_from_fork+0xbd/0x100
> >> [  212.638433]  ? __pfx_kthread+0x10/0x10
> >> [  212.638434]  ret_from_fork_asm+0x1a/0x30
> >> [  212.638436]  </TASK>
> >>
> > 
> > Hi, Matt
> > 
> > Thanks for the report, two questions
> > 
> > 1. Are you using mm/mm-unstable, we've got some fixes in there (including fixes to remove_migration_pmd())

remove_migration_pmd() - that handles PMD entries, but this is a PTE
migration entry.

> >    - Generally a left behind migration entry is a symptom of a failed migration that did not clean up
> >      after itself.

I'm on drm-tip as I generally need the latest version of my driver
because of the speed we move at.

Yes, I agree it looks like somehow a migration PTE is not getting
properly removed.

I'm happy to cherry pick any patches that you think might be helpful
into my tree.

> > 2. The stack trace is from hmm_range_fault(), not something that this code touches.
> > 

Agree this is a symptom of the above issue.
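
For triage, a hung-task report like the one quoted above can be checked
mechanically for this symptom. The following is a minimal Python sketch,
not part of the series or of any kernel tooling: the function names and
the regular expressions are assumptions based only on the format of the
trace above, and the sample text is an excerpt of that trace.

```python
import re

# Hung-task header, e.g.:
#   INFO: task kworker/u65:2:1642 blocked for more than 30 seconds.
HUNG_RE = re.compile(r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than")

# Call-trace frame, e.g.:
#   [  212.638341]  migration_entry_wait+0xd2/0xe0
# Frames prefixed with "? " are unreliable stack entries and are skipped.
FRAME_RE = re.compile(
    r"\]\s+(?P<unsure>\? )?(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

SAMPLE = """\
INFO: task kworker/u65:2:1642 blocked for more than 30 seconds.
[  212.638323] Call Trace:
[  212.638324]  <TASK>
[  212.638325]  __schedule+0x4b0/0x990
[  212.638330]  schedule+0x22/0xd0
[  212.638331]  io_schedule+0x41/0x60
[  212.638333]  migration_entry_wait_on_locked+0x1d8/0x2d0
[  212.638336]  ? __pfx_wake_page_function+0x10/0x10
[  212.638339]  migration_entry_wait+0xd2/0xe0
[  212.638341]  hmm_vma_walk_pmd+0x7c9/0x8d0
"""


def hung_tasks(log_text):
    """Return a list of (comm, pid, [symbols]) tuples, one per hung-task report."""
    reports = []
    current = None
    for line in log_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            current = (header.group("comm"), int(header.group("pid")), [])
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None and not frame.group("unsure"):
            current[2].append(frame.group("sym"))
    return reports


def waits_on_migration_entry(symbols):
    """True if any reliable frame is a migration_entry_wait*() function."""
    return any(sym.startswith("migration_entry_wait") for sym in symbols)
```

Feeding it the dmesg output of a failing run would at least separate this
hang from unrelated hung tasks; it says nothing about which migration left
the entry behind.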

> > The stack trace shows your code is seeing a migration entry and waiting on it.
> > Can you please provide a reproducer for the issue? In the form of a test in hmm-tests.c
> > 

That will be my plan. Right now I'm opening up my test, which runs 1000s
of variations of SVM tests, and the test that hangs is not consistent.
Some of these are threaded or multi-process, so it might be a timing
issue, which could be hard to reproduce in hmm-tests.c. I'll do my best
here.
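
One generic way to narrow down an intermittent hang like this is to rerun a
single test variation in a loop with a timeout and record the first
iteration that does not finish. A minimal Python sketch follows; the helper
name is hypothetical and the command in the usage line is a stand-in, not
the actual Xe SVM test. It also assumes the test process can still be
killed on timeout: a task stuck in D state may not die on SIGKILL, in which
case the harness itself would block while reaping it.

```python
import subprocess
import sys


def run_until_hang(cmd, iterations, timeout_s):
    """Run cmd up to `iterations` times.

    Returns (index, reason) for the first run that times out or exits
    non-zero, or None if every run passes.
    """
    for i in range(iterations):
        try:
            result = subprocess.run(
                cmd,
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run() kills the child before raising this.
            return i, "timeout"
        if result.returncode != 0:
            return i, "failed"
    return None


if __name__ == "__main__":
    # Stand-in command; replace with the real test binary and its arguments.
    print(run_until_hang([sys.executable, "-c", "pass"], iterations=3, timeout_s=10))
```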

> > Have you been able to bisect the issue?
> 

That is my next step along with isolating a test case.
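
For the bisect itself, `git bisect run` judges each commit by the exit code
of a script: 0 means good, 125 means skip, and any other code from 1 to 127
means bad. Since the hang is intermittent, a commit can only be called good
after several clean runs. The sketch below is a hypothetical predicate
along those lines. It assumes each bisect step has already been built and
booted by some outer wrapper, and the run count and timeout are
placeholders.

```python
import subprocess
import sys

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def bisect_status(test_cmd, runs, timeout_s):
    """Classify the kernel under test from repeated runs of test_cmd."""
    for _ in range(runs):
        try:
            rc = subprocess.run(test_cmd, timeout=timeout_s).returncode
        except OSError:
            return SKIP  # test could not be started; this commit cannot be judged
        except subprocess.TimeoutExpired:
            return BAD  # treat a timeout as the hang
        if rc != 0:
            return BAD
    return GOOD


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: git bisect run python3 predicate.py <test> [args...]
    sys.exit(bisect_status(sys.argv[1:], runs=20, timeout_s=300))
```

A timing-dependent failure can still slip through a finite number of runs,
so a "good" verdict from this predicate is only as strong as the run count.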

> Also could you please try with 10b9feee2d0d ("mm/hmm: populate PFNs from PMD swap entry")
> reverted?
> 

I can try, but I highly doubt this is related. The hanging HMM code is in
the PTE walk step after this; also, I am not even enabling THP device
pages in my SVM code to reproduce this.

Matt

> > 
> > Balbir
> > 
> > 
> >> Matt 
> >>
> >>>> Balbir, what's the status here?  It's been a month and this series
> >>>> still has a "needs a new version" feeling to it.  If so, very soon
> >>>> please.
> >>>>
> >>>
> >>> I don't think this needs a new revision, I've been testing frequently
> >>> at my end to see if I can catch any regressions. I have a patch update for
> >>> mm-migrate_device-add-thp-splitting-during-migration.patch, it can be applied
> >>> on top or I can send a new version of the patch. I was waiting
> >>> on any feedback before I sent the patch out, but I'll do it now.
> >>>
> >>>> TODOs which I have noted are
> >>>>
> >>>> https://lkml.kernel.org/r/aOePfeoDuRW+prFq@lstrano-desk.jf.intel.com
> >>>
> >>> This was a clarification on the HMM patch mentioned in the changelog
> >>>
> >>>> https://lkml.kernel.org/r/CABzRoyZZ8QLF5PSeDCVxgcnQmF9kFQ3RZdNq0Deik3o9OrK+BQ@mail.gmail.com
> >>>
> >>> That's a minor comment on not using a temporary declaration, I don't think we need it, let me know if you feel strongly
> >>>
> >>>> https://lkml.kernel.org/r/D2A4B724-E5EF-46D3-9D3F-EBAD9B22371E@nvidia.com
> >>>
> >>> I have a patch for this, which I posted, I can do an update and resend it if required (the one mentioned above)
> >>>
> >>>> https://lkml.kernel.org/r/62073ca1-5bb6-49e8-b8d4-447c5e0e582e@
> >>>>
> >>>
> >>> I can't seem to open this
> >>>
> >>>> plus a general re-read of the
> >>>> mm-migrate_device-add-thp-splitting-during-migration.patch review
> >>>> discussion.
> >>>>
> >>> That's the patch I have
> >>>
> >>> Thanks for following up
> >>> Balbir
> > 
> 

