From: Alistair Popple <apopple@nvidia.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-mm@kvack.org, Shigeru Yoshida <syoshida@redhat.com>,
David Hildenbrand <david@redhat.com>,
Jason Gunthorpe <jgg@nvidia.com>,
Minchan Kim <minchan@kernel.org>,
Pasha Tatashin <pasha.tatashin@soleen.com>
Subject: Re: [PATCH] mm/gup: stop leaking pinned pages in low memory conditions
Date: Thu, 17 Oct 2024 09:13:32 +1100 [thread overview]
Message-ID: <87ttdbofpw.fsf@nvdebian.thelocal> (raw)
In-Reply-To: <20241016202242.456953-1-jhubbard@nvidia.com>
John Hubbard <jhubbard@nvidia.com> writes:
> If a driver tries to call any of the pin_user_pages*(FOLL_LONGTERM)
> family of functions, and requests "too many" pages, then the call will
> erroneously leave pages pinned. This is visible in user space as an
> actual memory leak.
>
> Repro is trivial: just make enough pin_user_pages(FOLL_LONGTERM) calls
> to exhaust memory.
>
> The root cause of the problem is this sequence, within
> __gup_longterm_locked():
>
> __get_user_pages_locked()
> rc = check_and_migrate_movable_pages()
>
> ...which gets retried in a loop. The loop error handling is incomplete,
> clearly due to a somewhat unusual and complicated tri-state error API.
> But anyway, if -ENOMEM, or in fact, any unexpected error is returned
> from check_and_migrate_movable_pages(), then __gup_longterm_locked()
> happily returns the error, while leaving the pages pinned.
>
> In the failed case, which is an app that requests (via a device driver)
> 30720000000 bytes to be pinned, and then exits, I see this:
>
> $ grep foll /proc/vmstat
> nr_foll_pin_acquired 7502048
> nr_foll_pin_released 2048
>
> And after applying this patch, it returns to balanced pins:
>
> $ grep foll /proc/vmstat
> nr_foll_pin_acquired 7502048
> nr_foll_pin_released 7502048
>
> Fix this by unpinning the pages that __get_user_pages_locked() has
> pinned, in such error cases.
>
> Fixes: 24a95998e9ba ("mm/gup.c: simplify and fix check_and_migrate_movable_pages() return codes")
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: Shigeru Yoshida <syoshida@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Jason Gunthorpe <jgg@nvidia.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
> mm/gup.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index a82890b46a36..24acf53c8294 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2492,6 +2492,17 @@ static long __gup_longterm_locked(struct mm_struct *mm,
>
> /* FOLL_LONGTERM implies FOLL_PIN */
> rc = check_and_migrate_movable_pages(nr_pinned_pages, pages);
> +
> + /*
> + * The __get_user_pages_locked() call happens before we know
> + * that whether it's possible to successfully complete the whole
> + * operation. To compensate for this, if we get an unexpected
> + * error (such as -ENOMEM) then we must unpin everything, before
> + * erroring out.
> + */
> + if (rc != -EAGAIN && rc != 0)
> + unpin_user_pages(pages, nr_pinned_pages);
I know this is going to fall out of the loop in the next line but given
this is an error handling case it would be nice if this was made
explicit here with a break statement. It looks correct to me though so:
Reviewed-by: Alistair Popple <apopple@nvidia.com>
> +
> } while (rc == -EAGAIN);
> memalloc_pin_restore(flags);
> return rc ? rc : nr_pinned_pages;
next prev parent reply other threads:[~2024-10-16 22:17 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-16 20:22 John Hubbard
2024-10-16 21:57 ` Andrew Morton
2024-10-16 22:05 ` John Hubbard
2024-10-16 22:41 ` Andrew Morton
2024-10-16 22:13 ` Alistair Popple [this message]
2024-10-16 22:22 ` John Hubbard
2024-10-17 8:51 ` David Hildenbrand
2024-10-17 8:53 ` David Hildenbrand
2024-10-17 17:06 ` John Hubbard
2024-10-17 17:10 ` David Hildenbrand
2024-10-17 16:54 ` John Hubbard
2024-10-17 21:28 ` Alistair Popple
2024-10-17 21:47 ` David Hildenbrand
2024-10-17 21:57 ` John Hubbard
2024-10-17 22:03 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ttdbofpw.fsf@nvdebian.thelocal \
--to=apopple@nvidia.com \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=jgg@nvidia.com \
--cc=jhubbard@nvidia.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=minchan@kernel.org \
--cc=pasha.tatashin@soleen.com \
--cc=syoshida@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox