From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4c274ac4-17d7-4d37-aeff-9517731d0c9c@redhat.com>
Date: Fri, 4 Jul 2025 14:10:25 +0300
Subject: Re: [v1 resend 03/12] mm/thp: zone_device awareness in THP handling code
From: Mika Penttilä <mpenttil@redhat.com>
To: Balbir Singh, linux-mm@kvack.org
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, Karol Herbst,
 Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter, Jérôme Glisse,
 Shuah Khan, David Hildenbrand, Barry Song, Baolin Wang, Ryan Roberts,
 Matthew Wilcox, Peter Xu, Zi Yan, Kefeng Wang, Jane Chu, Alistair Popple,
 Donet Tom
In-Reply-To: <20250703233511.2028395-4-balbirs@nvidia.com>
References: <20250703233511.2028395-1-balbirs@nvidia.com>
 <20250703233511.2028395-4-balbirs@nvidia.com>
Content-Type: text/plain; charset=UTF-8

On 7/4/25 02:35, Balbir Singh wrote:
> Make the THP handling code in the mm subsystem aware of zone
> device pages. Although the code is designed to be generic when
> it comes to handling splitting of pages, it is designed to work
> for THP page sizes corresponding to HPAGE_PMD_NR.
>
> Modify page_vma_mapped_walk() to return true when a zone
> device huge entry is present, enabling try_to_migrate()
> and other migration code paths to appropriately process the
> entry.
>
> pmd_pfn() does not work well with zone device entries; use
> pfn_pmd_entry_to_swap() for checking and comparison of
> zone device entries.
>
> try_to_map_unused_to_zeropage() does not apply to zone device
> entries; zone device entries are ignored in the call.
>
> Cc: Karol Herbst
> Cc: Lyude Paul
> Cc: Danilo Krummrich
> Cc: David Airlie
> Cc: Simona Vetter
> Cc: "Jérôme Glisse"
> Cc: Shuah Khan
> Cc: David Hildenbrand
> Cc: Barry Song
> Cc: Baolin Wang
> Cc: Ryan Roberts
> Cc: Matthew Wilcox
> Cc: Peter Xu
> Cc: Zi Yan
> Cc: Kefeng Wang
> Cc: Jane Chu
> Cc: Alistair Popple
> Cc: Donet Tom
>
> Signed-off-by: Balbir Singh
> ---
>  mm/huge_memory.c     | 153 +++++++++++++++++++++++++++++++------------
>  mm/migrate.c         |   2 +
>  mm/page_vma_mapped.c |  10 +++
>  mm/pgtable-generic.c |   6 ++
>  mm/rmap.c            |  19 +++++-
>  5 files changed, 146 insertions(+), 44 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index ce130225a8e5..e6e390d0308f 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1711,7 +1711,8 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>          if (unlikely(is_swap_pmd(pmd))) {
>                  swp_entry_t entry = pmd_to_swp_entry(pmd);
>
> -                VM_BUG_ON(!is_pmd_migration_entry(pmd));
> +                VM_BUG_ON(!is_pmd_migration_entry(pmd) &&
> +                          !is_device_private_entry(entry));
>                  if (!is_readable_migration_entry(entry)) {
>                          entry = make_readable_migration_entry(
>                                                  swp_offset(entry));
> @@ -2222,10 +2223,17 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>          } else if (thp_migration_supported()) {
>                  swp_entry_t entry;
>
> -                VM_BUG_ON(!is_pmd_migration_entry(orig_pmd));
>                  entry = pmd_to_swp_entry(orig_pmd);
>                  folio = pfn_swap_entry_folio(entry);
>                  flush_needed = 0;
> +
> +                VM_BUG_ON(!is_pmd_migration_entry(*pmd) &&
> +                          !folio_is_device_private(folio));
> +
> +                if (folio_is_device_private(folio)) {
> +                        folio_remove_rmap_pmd(folio, folio_page(folio, 0), vma);
> +                        WARN_ON_ONCE(folio_mapcount(folio) < 0);
> +                }
>          } else
>                  WARN_ONCE(1, "Non present huge pmd without pmd migration enabled!");
>
> @@ -2247,6 +2255,15 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>                          folio_mark_accessed(folio);
>          }
>
> +        /*
> +         * Do a folio put on zone device private pages after
> +         * changes to mm_counter, because the folio_put() will
> +         * clean folio->mapping and the folio_test_anon() check
> +         * will not be usable.
> +         */
> +        if (folio_is_device_private(folio))
> +                folio_put(folio);
> +
>          spin_unlock(ptl);
>          if (flush_needed)
>                  tlb_remove_page_size(tlb, &folio->page, HPAGE_PMD_SIZE);
> @@ -2375,7 +2392,8 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>                  struct folio *folio = pfn_swap_entry_folio(entry);
>                  pmd_t newpmd;
>
> -                VM_BUG_ON(!is_pmd_migration_entry(*pmd));
> +                VM_BUG_ON(!is_pmd_migration_entry(*pmd) &&
> +                          !folio_is_device_private(folio));
>                  if (is_writable_migration_entry(entry)) {
>                          /*
>                           * A protection check is difficult so
> @@ -2388,9 +2406,11 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>                          newpmd = swp_entry_to_pmd(entry);
>                          if (pmd_swp_soft_dirty(*pmd))
>                                  newpmd = pmd_swp_mksoft_dirty(newpmd);
> -                } else {
> +                } else if (is_writable_device_private_entry(entry)) {
> +                        newpmd = swp_entry_to_pmd(entry);
> +                        entry = make_device_exclusive_entry(swp_offset(entry));
> +                } else
>                          newpmd = *pmd;
> -                }
>
>                  if (uffd_wp)
>                          newpmd = pmd_swp_mkuffd_wp(newpmd);
> @@ -2842,16 +2862,20 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>          struct page *page;
>          pgtable_t pgtable;
>          pmd_t old_pmd, _pmd;
> -        bool young, write, soft_dirty, pmd_migration = false, uffd_wp = false;
> -        bool anon_exclusive = false, dirty = false;
> +        bool young, write, soft_dirty, uffd_wp = false;
> +        bool anon_exclusive = false, dirty = false, present = false;
>          unsigned long addr;
>          pte_t *pte;
>          int i;
> +        swp_entry_t swp_entry;
>
>          VM_BUG_ON(haddr & ~HPAGE_PMD_MASK);
>          VM_BUG_ON_VMA(vma->vm_start > haddr, vma);
>          VM_BUG_ON_VMA(vma->vm_end < haddr + HPAGE_PMD_SIZE, vma);
> -        VM_BUG_ON(!is_pmd_migration_entry(*pmd) && !pmd_trans_huge(*pmd));
> +
> +        VM_BUG_ON(!is_pmd_migration_entry(*pmd) && !pmd_trans_huge(*pmd)
> +                  && !(is_swap_pmd(*pmd) &&
> +                       is_device_private_entry(pmd_to_swp_entry(*pmd))));
>
>          count_vm_event(THP_SPLIT_PMD);
>
> @@ -2899,20 +2923,25 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>                  return __split_huge_zero_page_pmd(vma, haddr, pmd);
>          }
>
> -        pmd_migration = is_pmd_migration_entry(*pmd);
> -        if (unlikely(pmd_migration)) {
> -                swp_entry_t entry;
>
> +        present = pmd_present(*pmd);
> +        if (unlikely(!present)) {
> +                swp_entry = pmd_to_swp_entry(*pmd);
>                  old_pmd = *pmd;
> -                entry = pmd_to_swp_entry(old_pmd);
> -                page = pfn_swap_entry_to_page(entry);
> -                write = is_writable_migration_entry(entry);
> +
> +                folio = pfn_swap_entry_folio(swp_entry);
> +                VM_BUG_ON(!is_migration_entry(swp_entry) &&
> +                          !is_device_private_entry(swp_entry));
> +                page = pfn_swap_entry_to_page(swp_entry);
> +                write = is_writable_migration_entry(swp_entry);
> +
>                  if (PageAnon(page))
> -                        anon_exclusive = is_readable_exclusive_migration_entry(entry);
> -                young = is_migration_entry_young(entry);
> -                dirty = is_migration_entry_dirty(entry);
> +                        anon_exclusive =
> +                                is_readable_exclusive_migration_entry(swp_entry);
>                  soft_dirty = pmd_swp_soft_dirty(old_pmd);
>                  uffd_wp = pmd_swp_uffd_wp(old_pmd);
> +                young = is_migration_entry_young(swp_entry);
> +                dirty = is_migration_entry_dirty(swp_entry);
>          } else {
>                  /*
>                   * Up to this point the pmd is present and huge and userland has
> @@ -2996,30 +3025,45 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>           * Note that NUMA hinting access restrictions are not transferred to
>           * avoid any possibility of altering permissions across VMAs.
>           */
> -        if (freeze || pmd_migration) {
> +        if (freeze || !present) {
>                  for (i = 0, addr = haddr; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE) {
>                          pte_t entry;
> -                        swp_entry_t swp_entry;
> -
> -                        if (write)
> -                                swp_entry = make_writable_migration_entry(
> -                                                        page_to_pfn(page + i));
> -                        else if (anon_exclusive)
> -                                swp_entry = make_readable_exclusive_migration_entry(
> -                                                        page_to_pfn(page + i));
> -                        else
> -                                swp_entry = make_readable_migration_entry(
> -                                                        page_to_pfn(page + i));
> -                        if (young)
> -                                swp_entry = make_migration_entry_young(swp_entry);
> -                        if (dirty)
> -                                swp_entry = make_migration_entry_dirty(swp_entry);
> -                        entry = swp_entry_to_pte(swp_entry);
> -                        if (soft_dirty)
> -                                entry = pte_swp_mksoft_dirty(entry);
> -                        if (uffd_wp)
> -                                entry = pte_swp_mkuffd_wp(entry);
> -
> +                        if (freeze || is_migration_entry(swp_entry)) {
> +                                if (write)
> +                                        swp_entry = make_writable_migration_entry(
> +                                                                page_to_pfn(page + i));
> +                                else if (anon_exclusive)
> +                                        swp_entry = make_readable_exclusive_migration_entry(
> +                                                                page_to_pfn(page + i));
> +                                else
> +                                        swp_entry = make_readable_migration_entry(
> +                                                                page_to_pfn(page + i));
> +                                if (young)
> +                                        swp_entry = make_migration_entry_young(swp_entry);
> +                                if (dirty)
> +                                        swp_entry = make_migration_entry_dirty(swp_entry);
> +                                entry = swp_entry_to_pte(swp_entry);
> +                                if (soft_dirty)
> +                                        entry = pte_swp_mksoft_dirty(entry);
> +                                if (uffd_wp)
> +                                        entry = pte_swp_mkuffd_wp(entry);
> +                        } else {
> +                                VM_BUG_ON(!is_device_private_entry(swp_entry));
> +                                if (write)
> +                                        swp_entry = make_writable_device_private_entry(
> +                                                                page_to_pfn(page + i));
> +                                else if (anon_exclusive)
> +                                        swp_entry = make_device_exclusive_entry(
> +                                                                page_to_pfn(page + i));
> +                                else
> +                                        swp_entry = make_readable_device_private_entry(
> +                                                                page_to_pfn(page + i));
> +                                entry = swp_entry_to_pte(swp_entry);
> +                                if (soft_dirty)
> +                                        entry = pte_swp_mksoft_dirty(entry);
> +                                if (uffd_wp)
> +                                        entry = pte_swp_mkuffd_wp(entry);
> +                        }
>                          VM_WARN_ON(!pte_none(ptep_get(pte + i)));
>                          set_pte_at(mm, addr, pte + i, entry);
>                  }
> @@ -3046,7 +3090,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>          }
>          pte_unmap(pte);
>
> -        if (!pmd_migration)
> +        if (present)
>                  folio_remove_rmap_pmd(folio, page, vma);
>          if (freeze)
>                  put_page(page);
> @@ -3058,8 +3102,11 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>  void split_huge_pmd_locked(struct vm_area_struct *vma, unsigned long address,
>                             pmd_t *pmd, bool freeze)
>  {
> +
>          VM_WARN_ON_ONCE(!IS_ALIGNED(address, HPAGE_PMD_SIZE));
> -        if (pmd_trans_huge(*pmd) || is_pmd_migration_entry(*pmd))
> +        if (pmd_trans_huge(*pmd) || is_pmd_migration_entry(*pmd) ||
> +            (is_swap_pmd(*pmd) &&
> +             is_device_private_entry(pmd_to_swp_entry(*pmd))))
>                  __split_huge_pmd_locked(vma, pmd, address, freeze);
>  }
>
> @@ -3238,6 +3285,9 @@ static void lru_add_split_folio(struct folio *folio, struct folio *new_folio,
>          VM_BUG_ON_FOLIO(folio_test_lru(new_folio), folio);
>          lockdep_assert_held(&lruvec->lru_lock);
>
> +        if (folio_is_device_private(folio))
> +                return;
> +
>          if (list) {
>                  /* page reclaim is reclaiming a huge page */
>                  VM_WARN_ON(folio_test_lru(folio));
> @@ -3252,6 +3302,7 @@ static void lru_add_split_folio(struct folio *folio, struct folio *new_folio,
>                  list_add_tail(&new_folio->lru, &folio->lru);
>                  folio_set_lru(new_folio);
>          }
> +
>  }
>
>  /* Racy check whether the huge page can be split */
> @@ -3543,6 +3594,10 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>                                          ((mapping || swap_cache) ?
>                                           folio_nr_pages(release) : 0));
>
> +                if (folio_is_device_private(release))
> +                        percpu_ref_get_many(&release->pgmap->ref,
> +                                        (1 << new_order) - 1);

The pgmap refcount should not be modified here; the count should remain
the same after the split as well (see the sketch at the end of this mail).

> +
>                  lru_add_split_folio(origin_folio, release, lruvec,
>                                  list);
>
> @@ -4596,7 +4651,10 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
>                  return 0;
>
>          flush_cache_range(vma, address, address + HPAGE_PMD_SIZE);
> -        pmdval = pmdp_invalidate(vma, address, pvmw->pmd);
> +        if (!folio_is_device_private(folio))
> +                pmdval = pmdp_invalidate(vma, address, pvmw->pmd);
> +        else
> +                pmdval = pmdp_huge_clear_flush(vma, address, pvmw->pmd);
>
>          /* See folio_try_share_anon_rmap_pmd(): invalidate PMD first. */
>          anon_exclusive = folio_test_anon(folio) && PageAnonExclusive(page);
> @@ -4646,6 +4704,17 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
>          entry = pmd_to_swp_entry(*pvmw->pmd);
>          folio_get(folio);
>          pmde = folio_mk_pmd(folio, READ_ONCE(vma->vm_page_prot));
> +
> +        if (unlikely(folio_is_device_private(folio))) {
> +                if (pmd_write(pmde))
> +                        entry = make_writable_device_private_entry(
> +                                                page_to_pfn(new));
> +                else
> +                        entry = make_readable_device_private_entry(
> +                                                page_to_pfn(new));
> +                pmde = swp_entry_to_pmd(entry);
> +        }
> +
>          if (pmd_swp_soft_dirty(*pvmw->pmd))
>                  pmde = pmd_mksoft_dirty(pmde);
>          if (is_writable_migration_entry(entry))
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 767f503f0875..0b6ecf559b22 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -200,6 +200,8 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
>
>          if (PageCompound(page))
>                  return false;
> +        if (folio_is_device_private(folio))
> +                return false;
>          VM_BUG_ON_PAGE(!PageAnon(page), page);
>          VM_BUG_ON_PAGE(!PageLocked(page), page);
>          VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index e981a1a292d2..ff8254e52de5 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -277,6 +277,16 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>                           * cannot return prematurely, while zap_huge_pmd() has
>                           * cleared *pmd but not decremented compound_mapcount().
>                           */
> +                        swp_entry_t entry;
> +
> +                        if (!thp_migration_supported())
> +                                return not_found(pvmw);
> +                        entry = pmd_to_swp_entry(pmde);
> +                        if (is_device_private_entry(entry)) {
> +                                pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> +                                return true;
> +                        }
> +
>                          if ((pvmw->flags & PVMW_SYNC) &&
>                              thp_vma_suitable_order(vma, pvmw->address,
>                                                     PMD_ORDER) &&
> diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
> index 567e2d084071..604e8206a2ec 100644
> --- a/mm/pgtable-generic.c
> +++ b/mm/pgtable-generic.c
> @@ -292,6 +292,12 @@ pte_t *___pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp)
>                  *pmdvalp = pmdval;
>          if (unlikely(pmd_none(pmdval) || is_pmd_migration_entry(pmdval)))
>                  goto nomap;
> +        if (is_swap_pmd(pmdval)) {
> +                swp_entry_t entry = pmd_to_swp_entry(pmdval);
> +
> +                if (is_device_private_entry(entry))
> +                        goto nomap;
> +        }
>          if (unlikely(pmd_trans_huge(pmdval)))
>                  goto nomap;
>          if (unlikely(pmd_bad(pmdval))) {
> diff --git a/mm/rmap.c b/mm/rmap.c
> index bd83724d14b6..da1e5b03e1fe 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -2336,8 +2336,23 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
>                          break;
>                  }
>  #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
> -                subpage = folio_page(folio,
> -                                pmd_pfn(*pvmw.pmd) - folio_pfn(folio));
> +                /*
> +                 * Zone device private folios do not work well with
> +                 * pmd_pfn() on some architectures due to pte
> +                 * inversion.
> +                 */
> +                if (folio_is_device_private(folio)) {
> +                        swp_entry_t entry = pmd_to_swp_entry(*pvmw.pmd);
> +                        unsigned long pfn = swp_offset_pfn(entry);
> +
> +                        subpage = folio_page(folio, pfn
> +                                                - folio_pfn(folio));
> +                } else {
> +                        subpage = folio_page(folio,
> +                                        pmd_pfn(*pvmw.pmd)
> +                                                - folio_pfn(folio));
> +                }
> +
>                  VM_BUG_ON_FOLIO(folio_test_hugetlb(folio) ||
>                                  !folio_test_pmd_mappable(folio), folio);
>
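
To spell out the accounting argument behind the pgmap comment above: the
split only redistributes the same device-private pages over more, smaller
folios, so the number of pages whose lifetime the pgmap tracks does not
change. A minimal sketch of that invariant (the helper name below is made
up purely for illustration, it is not part of this patch or of the kernel):

/*
 * Illustration only: splitting an order-old_order folio into
 * 1 << (old_order - new_order) folios of order new_order covers
 * exactly the same set of pages, so no pgmap reference adjustment
 * should be needed on the split path.
 */
static inline bool split_preserves_device_page_count(unsigned int old_order,
                                                     unsigned int new_order)
{
        unsigned long pages_before = 1UL << old_order;
        unsigned long new_folios   = 1UL << (old_order - new_order);
        unsigned long pages_after  = new_folios << new_order;

        return pages_before == pages_after;     /* always true */
}

If the reference count is meant to track device pages, the
percpu_ref_get_many() above would change it at split time even though the
set of pages is unchanged.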