From: Yang Shi <yang.shi@linux.alibaba.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: chrubis@suse.cz, vbabka@suse.cz, kirill@shutemov.name,
akpm@linux-foundation.org, stable@vger.kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified
Date: Wed, 20 Mar 2019 11:31:50 -0700
Message-ID: <3c880e88-6eb7-cd6d-fbf3-394b89355e10@linux.alibaba.com>
In-Reply-To: <20190320081643.3c4m5tec5vx653sn@d104.suse.de>

On 3/20/19 1:16 AM, Oscar Salvador wrote:
> On Wed, Mar 20, 2019 at 02:35:56AM +0800, Yang Shi wrote:
>> Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()")
>> Reported-by: Cyril Hrubis <chrubis@suse.cz>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: stable@vger.kernel.org
>> Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Hi Yang, thanks for the patch.
>
> Some observations below.
>
>> }
>> page = pmd_page(*pmd);
>> @@ -473,8 +480,15 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
>> ret = 1;
>> flags = qp->flags;
>> /* go to thp migration */
>> - if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
>> + if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
>> + if (!vma_migratable(walk->vma)) {
>> + ret = -EIO;
>> + goto unlock;
>> + }
>> +
>> migrate_page_add(page, qp->pagelist, flags);
>> + } else
>> + ret = -EIO;
> if (!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) ||
> !vma_migratable(walk->vma)) {
> ret = -EIO;
> goto unlock;
> }
>
> migrate_page_add(page, qp->pagelist, flags);
> unlock:
> spin_unlock(ptl);
> out:
> return ret;
>
> seems more clean to me?
Yes, that looks cleaner.
>
>
>> unlock:
>> spin_unlock(ptl);
>> out:
>> @@ -499,8 +513,10 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
>> ptl = pmd_trans_huge_lock(pmd, vma);
>> if (ptl) {
>> ret = queue_pages_pmd(pmd, ptl, addr, end, walk);
>> - if (ret)
>> + if (ret > 0)
>> return 0;
>> + else if (ret < 0)
>> + return ret;
> I would go with the following, but that's a matter of taste I guess.
>
> if (ret < 0)
> return ret;
> else
> return 0;
No, this is not correct. queue_pages_pmd() may return 0, which means the
THP was split. In that case the code should fall through to the pte loop
instead of returning.
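
To spell out the three cases, here is a small user-space model of that
branch. It is only a sketch, not kernel code: struct thp_branch_result and
handle_pmd_ret() are hypothetical names made up for illustration, and the
mapping of return values follows the hunk quoted above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical result of the THP branch; not a kernel type. */
struct thp_branch_result {
	bool fall_through;	/* true: continue to the pte loop */
	int ret;		/* value returned when !fall_through */
};

/*
 * Models the patched branch in queue_pages_pte_range():
 *   pmd_ret > 0  -> the THP was handled as a whole, return 0
 *   pmd_ret < 0  -> error such as -EIO, propagate it
 *   pmd_ret == 0 -> the THP was split, fall through to the pte loop
 */
static struct thp_branch_result handle_pmd_ret(int pmd_ret)
{
	struct thp_branch_result r = { .fall_through = false, .ret = 0 };

	if (pmd_ret > 0)
		r.ret = 0;
	else if (pmd_ret < 0)
		r.ret = pmd_ret;
	else
		r.fall_through = true;

	return r;
}
```

With "if (ret < 0) return ret; else return 0;" the third case disappears,
so the ptes of a freshly split THP would never be walked.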
>
>> }
>>
>> if (pmd_trans_unstable(pmd))
>> @@ -521,11 +537,16 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
>> continue;
>> if (!queue_pages_required(page, qp))
>> continue;
>> - migrate_page_add(page, qp->pagelist, flags);
>> + if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
>> + if (!vma_migratable(vma))
>> + break;
>> + migrate_page_add(page, qp->pagelist, flags);
>> + } else
>> + break;
> I might be missing something, but AFAICS neither vma nor flags is going to change
> while we are in queue_pages_pte_range(), so, could not we move the check just
> above the loop?
> In that way, 1) we only perform the check once and 2) if we enter the loop
> we know that we are going to do some work, so, something like:
>
> index af171ccb56a2..7c0e44389826 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -487,6 +487,9 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
> if (pmd_trans_unstable(pmd))
> return 0;
>
> + if (!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) || !vma_migratable(vma))
> + return -EIO;
That does not look correct to me. We need to check whether there is an
existing page on a node that is not allowed by the policy, which is what
queue_pages_required() does.
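
A rough user-space model of the difference, as a sketch under stated
assumptions rather than kernel code: pages are reduced to an array of node
ids, page_misplaced() stands in for queue_pages_required(), and the
SKETCH_MF_* flags and both walk_*() helpers are hypothetical names used
only for this illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits; stand-ins for the MPOL_MF_* flags. */
#define SKETCH_MF_STRICT	(1u << 0)
#define SKETCH_MF_MOVE		(1u << 1)
#define SKETCH_MF_MOVE_ALL	(1u << 2)

/* Stand-in for queue_pages_required(): page sits on a disallowed node. */
static bool page_misplaced(int page_node, unsigned long allowed_nodes)
{
	return !(allowed_nodes & (1ul << page_node));
}

/* Per-page check, as in the patch: fail only if a misplaced page exists. */
static int walk_per_page_check(const int *page_nodes, size_t n,
			       unsigned long allowed_nodes,
			       unsigned int flags, bool migratable)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!page_misplaced(page_nodes[i], allowed_nodes))
			continue;
		if (!(flags & (SKETCH_MF_MOVE | SKETCH_MF_MOVE_ALL)) ||
		    !migratable)
			break;
		/* The kernel would call migrate_page_add() here. */
	}

	return i != n ? -EIO : 0;
}

/* Check hoisted above the loop: fails before looking at any page. */
static int walk_hoisted_check(const int *page_nodes, size_t n,
			      unsigned long allowed_nodes,
			      unsigned int flags, bool migratable)
{
	(void)page_nodes;
	(void)n;
	(void)allowed_nodes;

	if (!(flags & (SKETCH_MF_MOVE | SKETCH_MF_MOVE_ALL)) || !migratable)
		return -EIO;

	return 0;
}
```

With only the strict flag set and every page already on an allowed node,
the per-page version returns 0 while the hoisted version returns -EIO,
even though nothing violates the policy.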
Thanks,
Yang
> +
> pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
> for (; addr != end; pte++, addr += PAGE_SIZE) {
> if (!pte_present(*pte))
>
>
>> }
>> pte_unmap_unlock(pte - 1, ptl);
>> cond_resched();
>> - return 0;
>> + return addr != end ? -EIO : 0;
> If we can do the above, we can leave the return value as it was.
>