On 11/24/22 6:43 PM, David Hildenbrand wrote:
> On 24.11.22 11:21, Gavin Shan wrote:
>> On 11/24/22 6:09 PM, David Hildenbrand wrote:
>>> On 24.11.22 10:55, Gavin Shan wrote:
>>>> The issue is reported when removing memory through virtio_mem device.
>>>> The transparent huge page, experienced copy-on-write fault, is wrongly
>>>> regarded as pinned. The transparent huge page is escaped from being
>>>> isolated in isolate_migratepages_block(). The transparent huge page
>>>> can't be migrated and the corresponding memory block can't be put
>>>> into offline state.
>>>>
>>>> Fix it by replacing page_mapcount() with total_mapcount(). With this,
>>>> the transparent huge page can be isolated and migrated, and the memory
>>>> block can be put into offline state. Besides, The page's refcount is
>>>> increased a bit earlier to avoid the page is released when the check
>>>> is executed.
>>>
>>> Did you look into handling pages that are in the swapcache case as well?
>>>
>>> See is_refcount_suitable() in mm/khugepaged.c.
>>>
>>> Should be easy to reproduce, let me know if you need inspiration.
>>>
>>
>> Nope, I didn't look into the case. Please elaborate the details so that
>> I can reproduce it firstly.
>
>
> A simple reproducer would be (on a system with ordinary swap (not zram))
>
> 1) mmap a region (MAP_ANON|MAP_PRIVATE) that can hold a THP
>
> 2) Enable THP for that region (MADV_HUGEPAGE)
>
> 3) Populate a THP (e.g., write access)
>
> 4) PTE-map the THP, for example, using MADV_FREE on the last subpage
>
> 5) Trigger swapout of the THP, for example, using MADV_PAGEOUT
>
> 6) Read-access to some subpages to fault them in from the swapcache
>
>
> Now you'd have a THP, which
>
> 1) Is partially PTE-mapped into the page table
> 2) Is in the swapcache (each subpage should have one reference from the swapcache)
>
>
> Now we could test, if alloc_contig_range() will still succeed (e.g., using virtio-mem).
>

Thanks for the details. Steps (4) and (5) can actually be combined.
Swapping out part of the THP (e.g. one sub-page) forces the THP to be split.

I followed your steps in the attached program, and memory hot-remove through
virtio-mem works with or without this patch.

   # numactl -p 1 testsuite mm swap -k
   Any key to split THP
   Any key to swap sub-pages
   Any key to read the swapped sub-pages
   Page[000]: 0xffffffffffffffff
   Page[001]: 0xffffffffffffffff
     :
   Page[255]: 0xffffffffffffffff
   Any key to exit                      // hold here and the program doesn't exit

   (qemu) qom-set vm1 requested-size 0
   [ 356.005396] virtio_mem virtio1: plugged size: 0x40000000
   [ 356.005996] virtio_mem virtio1: requested size: 0x0
   [ 356.350299] Fallback order for Node 0: 0 1
   [ 356.350810] Fallback order for Node 1: 1 0
   [ 356.351260] Built 2 zonelists, mobility grouping on. Total pages: 491343
   [ 356.351998] Policy zone: DMA

Thanks,
Gavin
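
For reference, the six reproducer steps quoted above could be sketched roughly
as below. This is not the attached program, only a minimal illustration. The
names map_thp_aligned() and thp_swapcache_repro() and the THP_SIZE constant
are hypothetical, and the sketch assumes a 2 MiB PMD-sized THP, a kernel with
MADV_PAGEOUT (5.4 or later), and ordinary swap being configured. Whether a
THP is actually allocated, split or swapped out is not checked here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: 2 MiB PMD-sized THP (e.g. x86-64, or arm64 with 4K pages). */
#define THP_SIZE (2UL << 20)

/* Fallbacks for older libc headers; generic Linux UAPI values. */
#ifndef MADV_FREE
#define MADV_FREE 8
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/* Step 1: anonymous private mapping, over-allocated and trimmed so that
 * the returned THP_SIZE region is THP-aligned. Returns NULL on failure. */
static char *map_thp_aligned(void)
{
    size_t len = 2 * THP_SIZE;
    char *raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    char *aligned = (char *)(((uintptr_t)raw + THP_SIZE - 1) &
                             ~(uintptr_t)(THP_SIZE - 1));
    if (aligned > raw)
        munmap(raw, (size_t)(aligned - raw));
    munmap(aligned + THP_SIZE, (size_t)(raw + len - (aligned + THP_SIZE)));
    return aligned;
}

/* Runs steps 1-6. Returns the number of sub-pages read back with the
 * expected contents, or -1 if the mapping could not be created. The
 * mapping is deliberately left in place so the caller can pause and
 * trigger memory hot-unplug while the THP is in this state. */
static long thp_swapcache_repro(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0 && THP_SIZE % (size_t)page_size == 0);
    long nr_pages = (long)(THP_SIZE / (size_t)page_size);

    char *buf = map_thp_aligned();                      /* step 1 */
    if (!buf)
        return -1;

    /* Step 2: advisory only; may fail if THP is not configured. */
    if (madvise(buf, THP_SIZE, MADV_HUGEPAGE))
        perror("madvise(MADV_HUGEPAGE)");

    memset(buf, 0xff, THP_SIZE);                        /* step 3 */

    /* Step 4: MADV_FREE on the last sub-page to PTE-map the THP. */
    if (madvise(buf + THP_SIZE - page_size, (size_t)page_size, MADV_FREE))
        perror("madvise(MADV_FREE)");

    /* Step 5: request swapout of the whole range. */
    if (madvise(buf, THP_SIZE, MADV_PAGEOUT))
        perror("madvise(MADV_PAGEOUT)");

    /* Step 6: read every second sub-page to fault it back in. The last
     * sub-page is skipped because MADV_FREE may have discarded it. */
    long verified = 0;
    for (long i = 0; i < nr_pages - 1; i += 2) {
        unsigned char v = *(volatile unsigned char *)(buf + i * page_size);
        if (v == 0xff)
            verified++;
    }
    return verified;
}
```

A caller would invoke thp_swapcache_repro() from main() and then block (for
example on getchar()) while the hot-unplug request is issued from the QEMU
monitor, similar to the "Any key to exit" hold shown above.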