linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully
@ 2022-10-24  8:34 Baolin Wang
  2022-10-24  8:34 ` [PATCH v3 2/2] mm: migrate: Try again if THP split is failed due to page refcnt Baolin Wang
  2022-10-24 17:21 ` [PATCH v3 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully Yang Shi
  0 siblings, 2 replies; 3+ messages in thread
From: Baolin Wang @ 2022-10-24  8:34 UTC (permalink / raw)
  To: akpm
  Cc: david, ying.huang, ziy, shy828301, apopple, baolin.wang,
	jingshan, linux-mm, linux-kernel

During THP migration, if THPs are not migrated but they are split and all
subpages are migrated successfully, migrate_pages() will still return the
number of THP pages that were not migrated.  This will confuse the callers
of migrate_pages().  For example, the longterm pinning will failed though
all pages are migrated successfully.

Thus we should return 0 to indicate that all pages are migrated in this
case

Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: <stable@vger.kernel.org>
---
Changes from v2:
 - Add Fixes tag suggested by Yang Shi and Huang, Ying.
 - Drop 'nr_thp_split' validation suggested by Alistair.
 - Add reviewed tag from Alistair.
 - Update the commit message suggested by Andrew.
Changes from v1:
 - Fix the return value of migrate_pages() instead of fixing the
   callers' validation.
---
 mm/migrate.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/migrate.c b/mm/migrate.c
index 8e5eb6e..2eb16f8 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	 */
 	list_splice(&ret_pages, from);
 
+	/*
+	 * Return 0 in case all subpages of fail-to-migrate THPs are
+	 * migrated successfully.
+	 */
+	if (list_empty(from))
+		rc = 0;
+
 	count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
 	count_vm_events(PGMIGRATE_FAIL, nr_failed_pages);
 	count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
-- 
1.8.3.1



^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v3 2/2] mm: migrate: Try again if THP split is failed due to page refcnt
  2022-10-24  8:34 [PATCH v3 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully Baolin Wang
@ 2022-10-24  8:34 ` Baolin Wang
  2022-10-24 17:21 ` [PATCH v3 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully Yang Shi
  1 sibling, 0 replies; 3+ messages in thread
From: Baolin Wang @ 2022-10-24  8:34 UTC (permalink / raw)
  To: akpm
  Cc: david, ying.huang, ziy, shy828301, apopple, baolin.wang,
	jingshan, linux-mm, linux-kernel

When creating a virtual machine, we will use memfd_create() to get
a file descriptor which can be used to create share memory mappings
using the mmap function, meanwhile the mmap() will set the MAP_POPULATE
flag to allocate physical pages for the virtual machine.

When allocating physical pages for the guest, the host can fallback to
allocate some CMA pages for the guest when over half of the zone's free
memory is in the CMA area.

In guest os, when the application wants to do some data transaction with
DMA, our QEMU will call VFIO_IOMMU_MAP_DMA ioctl to do longterm-pin and
create IOMMU mappings for the DMA pages. However, when calling
VFIO_IOMMU_MAP_DMA ioctl to pin the physical pages, we found it will be
failed to longterm-pin sometimes.

After some invetigation, we found the pages used to do DMA mapping can
contain some CMA pages, and these CMA pages will cause a possible
failure of the longterm-pin, due to failed to migrate the CMA pages.
The reason of migration failure may be temporary reference count or
memory allocation failure. So that will cause the VFIO_IOMMU_MAP_DMA
ioctl returns error, which makes the application failed to start.

I observed one migration failure case (which is not easy to reproduce) is
that, the 'thp_migration_fail' count is 1 and the 'thp_split_page_failed'
count is also 1.

That means when migrating a THP which is in CMA area, but can not allocate
a new THP due to memory fragmentation, so it will split the THP. However
THP split is also failed, probably the reason is temporary reference count
of this THP. And the temporary reference count can be caused by dropping
page caches (I observed the drop caches operation in the system), but we
can not drop the shmem page caches due to they are already dirty at that time.

Especially for THP split failure, which is caused by temporary reference
count, we can try again to mitigate the failure of migration in this case
according to previous discussion [1].

[1] https://lore.kernel.org/all/470dc638-a300-f261-94b4-e27250e42f96@redhat.com/
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
---
Changes from v2:
 - Add reviewed tag from Ying. Thanks.
Changes from v1:
 - Use another variable to save the return value of THP split.
---
 mm/huge_memory.c |  4 ++--
 mm/migrate.c     | 19 ++++++++++++++++---
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ad17c8d..a79f03b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2666,7 +2666,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	 * split PMDs
 	 */
 	if (!can_split_folio(folio, &extra_pins)) {
-		ret = -EBUSY;
+		ret = -EAGAIN;
 		goto out_unlock;
 	}
 
@@ -2716,7 +2716,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 			xas_unlock(&xas);
 		local_irq_enable();
 		remap_page(folio, folio_nr_pages(folio));
-		ret = -EBUSY;
+		ret = -EAGAIN;
 	}
 
 out_unlock:
diff --git a/mm/migrate.c b/mm/migrate.c
index 2eb16f8..14562ab 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1506,9 +1506,22 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				if (is_thp) {
 					nr_thp_failed++;
 					/* THP NUMA faulting doesn't split THP to retry. */
-					if (!nosplit && !try_split_thp(page, &thp_split_pages)) {
-						nr_thp_split++;
-						break;
+					if (!nosplit) {
+						int ret = try_split_thp(page, &thp_split_pages);
+
+						if (!ret) {
+							nr_thp_split++;
+							break;
+						} else if (reason == MR_LONGTERM_PIN &&
+							   ret == -EAGAIN) {
+							/*
+							 * Try again to split THP to mitigate
+							 * the failure of longterm pinning.
+							 */
+							thp_retry++;
+							nr_retry_pages += nr_subpages;
+							break;
+						}
 					}
 				} else if (!no_subpage_counting) {
 					nr_failed++;
-- 
1.8.3.1



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v3 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully
  2022-10-24  8:34 [PATCH v3 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully Baolin Wang
  2022-10-24  8:34 ` [PATCH v3 2/2] mm: migrate: Try again if THP split is failed due to page refcnt Baolin Wang
@ 2022-10-24 17:21 ` Yang Shi
  1 sibling, 0 replies; 3+ messages in thread
From: Yang Shi @ 2022-10-24 17:21 UTC (permalink / raw)
  To: Baolin Wang
  Cc: akpm, david, ying.huang, ziy, apopple, jingshan, linux-mm, linux-kernel

On Mon, Oct 24, 2022 at 1:34 AM Baolin Wang
<baolin.wang@linux.alibaba.com> wrote:
>
> During THP migration, if THPs are not migrated but they are split and all
> subpages are migrated successfully, migrate_pages() will still return the
> number of THP pages that were not migrated.  This will confuse the callers
> of migrate_pages().  For example, the longterm pinning will failed though
> all pages are migrated successfully.
>
> Thus we should return 0 to indicate that all pages are migrated in this
> case
>
> Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Reviewed-by: Alistair Popple <apopple@nvidia.com>
> Cc: <stable@vger.kernel.org>

Reviewed-by: Yang Shi <shy828301@gmail.com>

> ---
> Changes from v2:
>  - Add Fixes tag suggested by Yang Shi and Huang, Ying.
>  - Drop 'nr_thp_split' validation suggested by Alistair.
>  - Add reviewed tag from Alistair.
>  - Update the commit message suggested by Andrew.
> Changes from v1:
>  - Fix the return value of migrate_pages() instead of fixing the
>    callers' validation.
> ---
>  mm/migrate.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 8e5eb6e..2eb16f8 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>          */
>         list_splice(&ret_pages, from);
>
> +       /*
> +        * Return 0 in case all subpages of fail-to-migrate THPs are
> +        * migrated successfully.
> +        */
> +       if (list_empty(from))
> +               rc = 0;
> +
>         count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
>         count_vm_events(PGMIGRATE_FAIL, nr_failed_pages);
>         count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
> --
> 1.8.3.1
>


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-10-24 17:21 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-24  8:34 [PATCH v3 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully Baolin Wang
2022-10-24  8:34 ` [PATCH v3 2/2] mm: migrate: Try again if THP split is failed due to page refcnt Baolin Wang
2022-10-24 17:21 ` [PATCH v3 1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully Yang Shi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox