linux-mm.kvack.org archive mirror
From: Yang Shi <yang.shi@linux.alibaba.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: mgorman@techsingularity.net, vbabka@suse.cz,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] mm: mempolicy: remove MPOL_MF_LAZY
Date: Thu, 21 Mar 2019 10:25:08 -0700
Message-ID: <60ef6b4a-4f24-567f-af2f-50d97a2672d6@linux.alibaba.com>
In-Reply-To: <20190321165112.GU8696@dhcp22.suse.cz>



On 3/21/19 9:51 AM, Michal Hocko wrote:
> On Thu 21-03-19 09:21:39, Yang Shi wrote:
>>
>> On 3/21/19 7:57 AM, Michal Hocko wrote:
>>> On Wed 20-03-19 08:27:39, Yang Shi wrote:
>>>> MPOL_MF_LAZY was added by commit b24f53a0bea3 ("mm: mempolicy: Add
>>>> MPOL_MF_LAZY"), then it was disabled by commit a720094ded8c ("mm:
>>>> mempolicy: Hide MPOL_NOOP and MPOL_MF_LAZY from userspace for now")
>>>> right away in 2012.  So, it is never ever exported to userspace.
>>>>
>>>> And, it looks nobody is interested in revisiting it since it was
>>>> disabled 7 years ago.  So, it sounds pointless to still keep it around.
>>> The above changelog owes us a lot of explanation about why this is
>>> safe and backward compatible. I am also not sure you can change
>>> MPOL_MF_INTERNAL because somebody still might use the flag from
>>> userspace and we want to guarantee it will have the exact same semantic.
>> Since MPOL_MF_LAZY is never exported to userspace (Mel helped to confirm
>> this in the other thread), I suppose it should be safe and backward
>> compatible to userspace.
> You didn't get my point. The flag is exported to the userspace and
> nothing in the syscall entry path checks and masks it. So we really have
> to preserve the semantic of the flag bit for ever.

Thanks, I see your point. Yes, it is exported to userspace in some sense,
since it is in the uapi header. But it was never documented, and
MPOL_MF_VALID excludes it. mbind() does check and mask it: it
returns -EINVAL if MPOL_MF_LAZY or any other undefined/invalid flag is
set. See the snippets below, from the uapi header and do_mbind():

...
#define MPOL_MF_VALID	(MPOL_MF_STRICT   |	\
			 MPOL_MF_MOVE     |	\
			 MPOL_MF_MOVE_ALL)

	if (flags & ~(unsigned long)MPOL_MF_VALID)
		return -EINVAL;

So I don't think any application would really pass the flag to mbind(),
unless it is deliberately testing for -EINVAL. If only a test program is
affected, that should not be considered a regression.

>
>> I'm also not sure if anyone use MPOL_MF_INTERNAL or not and how they use it
>> in their applications, but how about keeping it unchanged?
> You really have to. Because it is an offset of other MPOL flags for
> internal usage.
>
> That being said. Considering that we really have to preserve
> MPOL_MF_LAZY value (we cannot even rename it because it is in uapi
> headers and we do not want to break compilation). What is the point of
> this change? Why is it an improvement? Yes, nobody is probably using
> this because this is not respected in anything but the preferred mem
> policy. At least that is the case from my quick glance. I might be still
> wrong as it is quite easy to overlook all the consequences. So the risk
> is non trivial while the benefit is not really clear to me. If you see
> one, _document_ it. "Mel said it is not in use" is not a justification,
> with all due respect.

As I elaborated above, the mbind() syscall does check it and treats it as
an invalid flag. MPOL_PREFERRED doesn't use it either; it just uses
MPOL_F_MOF directly.

Thanks,
Yang

>
>> Thanks,
>> Yang
>>
>>>> Cc: Mel Gorman <mgorman@techsingularity.net>
>>>> Cc: Michal Hocko <mhocko@suse.com>
>>>> Cc: Vlastimil Babka <vbabka@suse.cz>
>>>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>>>> ---
>>>> Hi folks,
>>>> I'm not sure if you still would like to revisit it later. And, I may be
>>>> not the first one to try to remove it. IMHO, it sounds pointless to still
>>>> keep it around if nobody is interested in it.
>>>>
>>>>    include/uapi/linux/mempolicy.h |  3 +--
>>>>    mm/mempolicy.c                 | 13 -------------
>>>>    2 files changed, 1 insertion(+), 15 deletions(-)
>>>>
>>>> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
>>>> index 3354774..eb52a7a 100644
>>>> --- a/include/uapi/linux/mempolicy.h
>>>> +++ b/include/uapi/linux/mempolicy.h
>>>> @@ -45,8 +45,7 @@ enum {
>>>>    #define MPOL_MF_MOVE	 (1<<1)	/* Move pages owned by this process to conform
>>>>    				   to policy */
>>>>    #define MPOL_MF_MOVE_ALL (1<<2)	/* Move every page to conform to policy */
>>>> -#define MPOL_MF_LAZY	 (1<<3)	/* Modifies '_MOVE:  lazy migrate on fault */
>>>> -#define MPOL_MF_INTERNAL (1<<4)	/* Internal flags start here */
>>>> +#define MPOL_MF_INTERNAL (1<<3)	/* Internal flags start here */
>>>>    #define MPOL_MF_VALID	(MPOL_MF_STRICT   | 	\
>>>>    			 MPOL_MF_MOVE     | 	\
>>>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>>>> index af171cc..67886f4 100644
>>>> --- a/mm/mempolicy.c
>>>> +++ b/mm/mempolicy.c
>>>> @@ -593,15 +593,6 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
>>>>    	qp->prev = vma;
>>>> -	if (flags & MPOL_MF_LAZY) {
>>>> -		/* Similar to task_numa_work, skip inaccessible VMAs */
>>>> -		if (!is_vm_hugetlb_page(vma) &&
>>>> -			(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)) &&
>>>> -			!(vma->vm_flags & VM_MIXEDMAP))
>>>> -			change_prot_numa(vma, start, endvma);
>>>> -		return 1;
>>>> -	}
>>>> -
>>>>    	/* queue pages from current vma */
>>>>    	if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
>>>>    		return 0;
>>>> @@ -1181,9 +1172,6 @@ static long do_mbind(unsigned long start, unsigned long len,
>>>>    	if (IS_ERR(new))
>>>>    		return PTR_ERR(new);
>>>> -	if (flags & MPOL_MF_LAZY)
>>>> -		new->flags |= MPOL_F_MOF;
>>>> -
>>>>    	/*
>>>>    	 * If we are using the default policy then operation
>>>>    	 * on discontinuous address spaces is okay after all
>>>> @@ -1226,7 +1214,6 @@ static long do_mbind(unsigned long start, unsigned long len,
>>>>    		int nr_failed = 0;
>>>>    		if (!list_empty(&pagelist)) {
>>>> -			WARN_ON_ONCE(flags & MPOL_MF_LAZY);
>>>>    			nr_failed = migrate_pages(&pagelist, new_page, NULL,
>>>>    				start, MIGRATE_SYNC, MR_MEMPOLICY_MBIND);
>>>>    			if (nr_failed)
>>>> -- 
>>>> 1.8.3.1
>>>>

