From: Mike Kravetz <mike.kravetz@oracle.com>
To: Rik van Riel <riel@surriel.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kernel-team@meta.com, stable@kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@redhat.com>
Subject: Re: [PATCH v2] mm,madvise,hugetlb: fix unexpected data loss with MADV_DONTNEED on hugetlbfs
Date: Fri, 21 Oct 2022 16:53:54 -0700 [thread overview]
Message-ID: <Y1MxEk5ewlJeaW28@monkey> (raw)
In-Reply-To: <20221021192805.366ad573@imladris.surriel.com>
On 10/21/22 19:28, Rik van Riel wrote:
> A common use case for hugetlbfs is for the application to create
> memory pools backed by huge pages, which then get handed over to
> some malloc library (eg. jemalloc) for further management.
>
> That malloc library may be doing MADV_DONTNEED calls on memory
> that is no longer needed, expecting those calls to happen on
> PAGE_SIZE boundaries.
>
> However, currently the MADV_DONTNEED code rounds up any such
> requests to HPAGE_PMD_SIZE boundaries. This leads to undesired
> outcomes when jemalloc expects a 4kB MADV_DONTNEED, but 2MB of
> memory get zeroed out, instead.
>
> Use of pre-built shared libraries means that user code does not
> always know the page size of every memory arena in use.
>
> Avoid unexpected data loss with MADV_DONTNEED by rounding up
> only to PAGE_SIZE (in do_madvise), and rounding down to huge
> page granularity.
>
> That way programs will only get as much memory zeroed out as
> they requested.
>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: stable@kernel.org
> Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings")
I do hate changing behavior, but in hindsight this is the right behavior.
Especially, since it can cause unexpected data loss.
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
--
Mike Kravetz
> ---
> v2: split out the most urgent fix for stable backports
>
> mm/madvise.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 2baa93ca2310..c7105ec6d08c 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -813,7 +813,14 @@ static bool madvise_dontneed_free_valid_vma(struct vm_area_struct *vma,
> if (start & ~huge_page_mask(hstate_vma(vma)))
> return false;
>
> - *end = ALIGN(*end, huge_page_size(hstate_vma(vma)));
> + /*
> + * Madvise callers expect the length to be rounded up to PAGE_SIZE
> + * boundaries, and may be unaware that this VMA uses huge pages.
> + * Avoid unexpected data loss by rounding down the number of
> + * huge pages freed.
> + */
> + *end = ALIGN_DOWN(*end, huge_page_size(hstate_vma(vma)));
> +
> return true;
> }
>
> @@ -828,6 +835,9 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
> if (!madvise_dontneed_free_valid_vma(vma, start, &end, behavior))
> return -EINVAL;
>
> + if (start == end)
> + return 0;
> +
> if (!userfaultfd_remove(vma, start, end)) {
> *prev = NULL; /* mmap_lock has been dropped, prev is stale */
>
> --
> 2.37.2
>
>
prev parent reply other threads:[~2022-10-21 23:54 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-21 23:28 Rik van Riel
2022-10-21 23:53 ` Mike Kravetz [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y1MxEk5ewlJeaW28@monkey \
--to=mike.kravetz@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=riel@surriel.com \
--cc=stable@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox