From: John Hubbard <jhubbard@nvidia.com>
To: <jglisse@redhat.com>, <linux-mm@kvack.org>
Cc: <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Dan Williams <dan.j.williams@intel.com>
Subject: Re: [PATCH v2 10/11] mm/hmm: add helpers for driver to safely take the mmap_sem v2
Date: Thu, 28 Mar 2019 13:54:01 -0700
Message-ID: <9df742eb-61ca-3629-a5f4-8ad1244ff840@nvidia.com>
In-Reply-To: <20190325144011.10560-11-jglisse@redhat.com>

On 3/25/19 7:40 AM, jglisse@redhat.com wrote:
> From: Jérôme Glisse <jglisse@redhat.com>
>
> The device driver context which holds reference to mirror and thus to
> core hmm struct might outlive the mm against which it was created. To
> avoid every driver to check for that case provide an helper that check
> if mm is still alive and take the mmap_sem in read mode if so. If the
> mm have been destroy (mmu_notifier release call back did happen) then
> we return -EINVAL so that calling code knows that it is trying to do
> something against a mm that is no longer valid.
>
> Changes since v1:
> - removed bunch of useless check (if API is use with bogus argument
> better to fail loudly so user fix their code)
>
> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> ---
> include/linux/hmm.h | 50 ++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 47 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index f3b919b04eda..5f9deaeb9d77 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -438,6 +438,50 @@ struct hmm_mirror {
> int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm);
> void hmm_mirror_unregister(struct hmm_mirror *mirror);
>
> +/*
> + * hmm_mirror_mm_down_read() - lock the mmap_sem in read mode
> + * @mirror: the HMM mm mirror for which we want to lock the mmap_sem
> + * Returns: -EINVAL if the mm is dead, 0 otherwise (lock taken).
> + *
> + * The device driver context which holds reference to mirror and thus to core
> + * hmm struct might outlive the mm against which it was created. To avoid every
> + * driver to check for that case provide an helper that check if mm is still
> + * alive and take the mmap_sem in read mode if so. If the mm have been destroy
> + * (mmu_notifier release call back did happen) then we return -EINVAL so that
> + * calling code knows that it is trying to do something against a mm that is
> + * no longer valid.
> + */
> +static inline int hmm_mirror_mm_down_read(struct hmm_mirror *mirror)

Hi Jerome,

Let's please not do this. There are at least two problems here:

1. The hmm_mirror_mm_down_read() wrapper around down_read() requires a
return value. This is counter to how locking is normally done: callers do
not normally have to check the return value of most locks (other than
trylocks). And sure enough, your own code below doesn't check the return value.
That is a pretty good illustration of why not to do this.

2. This is a weird place to randomly check for semi-unrelated state, such
as "is HMM still alive". By that I mean, if you have to detect a problem
at down_read() time, then the problem could have existed both before and
after the call to this wrapper. So it is providing a false sense of security,
and it is therefore actually undesirable to add the code.

If you insist on having this wrapper, I think it should have approximately
this form:

    void hmm_mirror_mm_down_read(...)
    {
            WARN_ON(...)
            down_read(...)
    }

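To make that shape concrete, here is a minimal userspace sketch, not kernel
code. The names mock_sem, mock_mirror, MOCK_WARN_ON and mock_mirror_down_read
are hypothetical stand-ins, and the condition being warned about (a "dead"
flag) is only an assumption for illustration; the form above deliberately
leaves it unspecified.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; none of these are kernel APIs. */
struct mock_sem { int readers; };
struct mock_mirror { bool dead; struct mock_sem sem; };

static int mock_warnings; /* counts how many times the warning fired */

/* Loosely modeled on WARN_ON(): report the condition, never change control flow. */
#define MOCK_WARN_ON(cond) \
	do { \
		if (cond) { \
			mock_warnings++; \
			fprintf(stderr, "WARNING: %s\n", #cond); \
		} \
	} while (0)

void mock_down_read(struct mock_sem *sem) { sem->readers++; }

/* Returns void: the lock is always taken, so callers have nothing to check. */
void mock_mirror_down_read(struct mock_mirror *mirror)
{
	MOCK_WARN_ON(mirror->dead); /* assumed condition, flagged loudly */
	mock_down_read(&mirror->sem);
}
```

The point of the sketch is only the signature: because the wrapper cannot
fail, a caller converted from a plain down_read() keeps the same locking
behavior, and a bad state shows up as a warning rather than as a silently
skipped lock.
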
> +{
> +	struct mm_struct *mm;
> +
> +	/* Sanity check ... */
> +	if (!mirror || !mirror->hmm)
> +		return -EINVAL;
> +	/*
> +	 * Before trying to take the mmap_sem make sure the mm is still
> +	 * alive as device driver context might outlive the mm lifetime.

Let's find another way, and a better place, to solve this problem.
Ref counting?

> +	 *
> +	 * FIXME: should we also check for mm that outlive its owning
> +	 * task ?
> +	 */
> +	mm = READ_ONCE(mirror->hmm->mm);
> +	if (mirror->hmm->dead || !mm)
> +		return -EINVAL;
> +
> +	down_read(&mm->mmap_sem);
> +	return 0;
> +}
> +
> +/*
> + * hmm_mirror_mm_up_read() - unlock the mmap_sem from read mode
> + * @mirror: the HMM mm mirror for which we want to lock the mmap_sem
> + */
> +static inline void hmm_mirror_mm_up_read(struct hmm_mirror *mirror)
> +{
> +	up_read(&mirror->hmm->mm->mmap_sem);
> +}
> +
>
> /*
> * To snapshot the CPU page table you first have to call hmm_range_register()
> @@ -463,7 +507,7 @@ void hmm_mirror_unregister(struct hmm_mirror *mirror);
> * if (ret)
> * return ret;
> *
> - * down_read(mm->mmap_sem);
> + * hmm_mirror_mm_down_read(mirror);

See? The normal down_read() code never needs to check a return value, so when
someone does a "simple" upgrade, it introduces a fatal bug here: if the wrapper
returns early, then the caller proceeds without having acquired the mmap_sem.

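The failure mode can be reproduced in a small userspace sketch. Everything in
it is a hypothetical stand-in (mock_sem, mock_mirror, the snapshot_* callers);
the "lock" is just a reader counter, which is enough to show whether a caller
ran with the lock held and whether its release was balanced.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these are kernel APIs. */
struct mock_sem { int readers; };
struct mock_mirror { bool dead; struct mock_sem sem; };

void mock_down_read(struct mock_sem *sem) { sem->readers++; }
void mock_up_read(struct mock_sem *sem) { sem->readers--; }

/* Same shape as the proposed wrapper: it can return without locking. */
int mock_mirror_down_read(struct mock_mirror *mirror)
{
	if (mirror->dead)
		return -EINVAL;
	mock_down_read(&mirror->sem);
	return 0;
}

/* Mechanical conversion from down_read(): the return value is dropped. */
int snapshot_unchecked(struct mock_mirror *mirror)
{
	int held;

	(void)mock_mirror_down_read(mirror);
	held = mirror->sem.readers;   /* 0 means we are running unlocked */
	mock_up_read(&mirror->sem);   /* unbalanced release in that case */
	return held;
}

/* What every caller would have to remember to write instead. */
int snapshot_checked(struct mock_mirror *mirror)
{
	int ret = mock_mirror_down_read(mirror);

	if (ret)
		return ret;
	/* ... work under the lock ... */
	mock_up_read(&mirror->sem);
	return 0;
}
```

With a dead mirror, snapshot_unchecked() does its work with zero readers
recorded and then drives the counter negative, while snapshot_checked()
returns -EINVAL and leaves the counter untouched. The correct version is easy
to write, but nothing forces callers to write it.
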
> * again:
> *
> * if (!hmm_range_wait_until_valid(&range, TIMEOUT)) {
> @@ -476,13 +520,13 @@ void hmm_mirror_unregister(struct hmm_mirror *mirror);
> *
> * ret = hmm_range_snapshot(&range); or hmm_range_fault(&range);
> * if (ret == -EAGAIN) {
> - * down_read(mm->mmap_sem);
> + * hmm_mirror_mm_down_read(mirror);

Same problem here.

thanks,
--
John Hubbard
NVIDIA