linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org, Zi Yan <zi.yan@cs.rutgers.edu>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Reale <ar@linux.vnet.ibm.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 1/3] mm, numa: rework do_pages_move
Date: Wed, 3 Jan 2018 10:52:11 +0100	[thread overview]
Message-ID: <20180103095211.GC11319@dhcp22.suse.cz> (raw)
In-Reply-To: <32bec0c9-60e2-0362-9446-feb4de1b119c@linux.vnet.ibm.com>

On Wed 03-01-18 15:06:49, Anshuman Khandual wrote:
> On 01/03/2018 02:28 PM, Michal Hocko wrote:
> > On Wed 03-01-18 14:12:17, Anshuman Khandual wrote:
> >> On 12/08/2017 09:45 PM, Michal Hocko wrote:
[...]
> >>> @@ -1593,79 +1556,80 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
> >>>  			 const int __user *nodes,
> >>>  			 int __user *status, int flags)
> >>>  {
> >>> -	struct page_to_node *pm;
> >>> -	unsigned long chunk_nr_pages;
> >>> -	unsigned long chunk_start;
> >>> -	int err;
> >>> -
> >>> -	err = -ENOMEM;
> >>> -	pm = (struct page_to_node *)__get_free_page(GFP_KERNEL);
> >>> -	if (!pm)
> >>> -		goto out;
> >>> +	int chunk_node = NUMA_NO_NODE;
> >>> +	LIST_HEAD(pagelist);
> >>> +	int chunk_start, i;
> >>> +	int err = 0, err1;
> >>
> >> err init might not be required, its getting assigned to -EFAULT right away.
> > 
> > No, nr_pages might be 0 AFAICS.
> 
> Right but there is another err = 0 after the for loop.

No we have 
out_flush:
	/* Make sure we do not overwrite the existing error */
	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
	if (!err1)
		err1 = store_status(status, start, current_node, i - start);
	if (!err)
		err = err1;

This is obviously not an act of beauty and probably a subject to a
cleanup but I just wanted this thing to be working first. Further
cleanups can go on top.

> > [...]
> >>> +		if (chunk_node == NUMA_NO_NODE) {
> >>> +			chunk_node = node;
> >>> +			chunk_start = i;
> >>> +		} else if (node != chunk_node) {
> >>> +			err = do_move_pages_to_node(mm, &pagelist, chunk_node);
> >>> +			if (err)
> >>> +				goto out;
> >>> +			err = store_status(status, chunk_start, chunk_node, i - chunk_start);
> >>> +			if (err)
> >>> +				goto out;
> >>> +			chunk_start = i;
> >>> +			chunk_node = node;
> >>>  		}
> 
> [...]
> 
> >>> +		err = do_move_pages_to_node(mm, &pagelist, chunk_node);
> >>> +		if (err)
> >>> +			goto out;
> >>> +		if (i > chunk_start) {
> >>> +			err = store_status(status, chunk_start, chunk_node, i - chunk_start);
> >>> +			if (err)
> >>> +				goto out;
> >>> +		}
> >>> +		chunk_node = NUMA_NO_NODE;
> >>
> >> This block of code is bit confusing.
> > 
> > I believe this is easier to grasp when looking at the resulting code.
> >>
> >> 1) Why attempt to migrate when just one page could not be isolated ?
> >> 2) 'i' is always greater than chunk_start except the starting page
> >> 3) Why reset chunk_node as NUMA_NO_NODE ?
> > 
> > This is all about flushing the pending state on an error and
> > distinguising a fresh batch.
> 
> Okay. Will test it out on a multi node system once I get hold of one.

Thanks. I have been testing this specific code path with the following
simple test program and numactl -m0. The code is rather crude so I've
always modified it manually to test different scenarios (this one keeps
every 1k page on the node node to test batching.
---
#include <sys/mman.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <numaif.h>

int main()
{
        unsigned long nr_pages = 10000;
        size_t length = nr_pages << 12, i;
        unsigned char *addr = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
        void *addrs[nr_pages];
        int nodes[nr_pages];
        int status[nr_pages];
        char cmd[128];
        char ch;

        if (addr == MAP_FAILED)
                return 1;

        madvise(addr, length, MADV_NOHUGEPAGE);

        for (i = 0; i < length; i += 4096)
                addr[i] = 1;
        for (i = 0; i < nr_pages; i++)
        {
                addrs[i] = &addr[i * 4096];
                if (i%1024)
                        nodes[i] = 1;
                else
                        nodes[i] = 0;
                status[i] = 0;
        }
        snprintf(cmd, sizeof(cmd)-1, "grep %lx /proc/%d/numa_maps", addr, getpid());
        system(cmd);
        snprintf(cmd, sizeof(cmd)-1, "grep %lx -A20 /proc/%d/smaps", addr, getpid());
        system(cmd);
        read(0, &ch, 1);
        if (move_pages(0, nr_pages, addrs, nodes, status, MPOL_MF_MOVE)) {
                printf("move_pages: err:%d\n", errno);
        }
        snprintf(cmd, sizeof(cmd)-1, "grep %lx /proc/%d/numa_maps", addr, getpid());
        system(cmd);
        snprintf(cmd, sizeof(cmd)-1, "grep %lx -A20 /proc/%d/smaps", addr, getpid());
        system(cmd);
        return 0;
}

---

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2018-01-03  9:52 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-07 12:48 [RFC PATCH] mm: unclutter THP migration Michal Hocko
2017-12-07 14:10 ` Zi Yan
2017-12-07 14:34   ` Michal Hocko
2017-12-08 16:15     ` [RFC PATCH 0/3] " Michal Hocko
2017-12-08 16:15       ` [RFC PATCH 1/3] mm, numa: rework do_pages_move Michal Hocko
2017-12-13 12:07         ` Kirill A. Shutemov
2017-12-13 12:17           ` Michal Hocko
2017-12-13 12:47             ` Kirill A. Shutemov
2017-12-13 14:10               ` Michal Hocko
2017-12-13 14:27                 ` Kirill A. Shutemov
2017-12-13 14:39         ` Michal Hocko
2017-12-14 15:35           ` Kirill A. Shutemov
2017-12-15  9:28             ` Michal Hocko
2017-12-15  9:51               ` Kirill A. Shutemov
2017-12-15  9:57                 ` Michal Hocko
2018-01-02 11:25         ` Anshuman Khandual
2018-01-02 12:12           ` Michal Hocko
2018-01-03  3:11             ` Anshuman Khandual
2018-01-03  8:42         ` Anshuman Khandual
2018-01-03  8:58           ` Michal Hocko
2018-01-03  9:36             ` Anshuman Khandual
2018-01-03  9:52               ` Michal Hocko [this message]
2017-12-08 16:15       ` [RFC PATCH 2/3] mm, migrate: remove reason argument from new_page_t Michal Hocko
2017-12-27  2:12         ` Zi Yan
2017-12-29 11:32           ` Michal Hocko
2017-12-08 16:15       ` [RFC PATCH 3/3] mm: unclutter THP migration Michal Hocko
2017-12-13 12:20         ` Kirill A. Shutemov
2017-12-27  2:19         ` Zi Yan
2017-12-29 11:36           ` Michal Hocko
2017-12-29 15:45             ` Zi Yan
2017-12-31  9:07               ` Michal Hocko
2017-12-31 13:09                 ` Zi Yan
2017-12-19 12:07       ` [RFC PATCH 0/3] " Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180103095211.GC11319@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=ar@linux.vnet.ibm.com \
    --cc=khandual@linux.vnet.ibm.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=vbabka@suse.cz \
    --cc=zi.yan@cs.rutgers.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox