* Re: [PATCH v5 05/14] mm, compaction: move pageblock checks up from isolate_migratepages_range()
[not found] ` <1406553101-29326-6-git-send-email-vbabka@suse.cz>
@ 2014-07-29 0:29 ` David Rientjes
2014-07-29 9:27 ` Vlastimil Babka
0 siblings, 1 reply; 7+ messages in thread
From: David Rientjes @ 2014-07-29 0:29 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, linux-kernel, linux-mm, Minchan Kim, Joonsoo Kim,
Michal Nazarewicz, Naoya Horiguchi, Christoph Lameter,
Rik van Riel, Mel Gorman, Zhang Yanfei
On Mon, 28 Jul 2014, Vlastimil Babka wrote:
> isolate_migratepages_range() is the main function of the compaction scanner,
> called either on a single pageblock by isolate_migratepages() during regular
> compaction, or on an arbitrary range by CMA's __alloc_contig_migrate_range().
> It currently performs two pageblock-wide compaction suitability checks, and
> because of the CMA callpath, it tracks if it crossed a pageblock boundary in
> order to repeat those checks.
>
> However, closer inspection shows that those checks are always true for CMA:
> - isolation_suitable() is true because CMA sets cc->ignore_skip_hint to true
> - migrate_async_suitable() check is skipped because CMA uses sync compaction
>
> We can therefore move the compaction-specific checks to isolate_migratepages()
> and simplify isolate_migratepages_range(). Furthermore, we can mimic the
> freepage scanner family of functions, which has isolate_freepages_block()
> function called both by compaction from isolate_freepages() and by CMA from
> isolate_freepages_range(), where each use case adds its own specific glue code.
> This allows further code simplification.
>
> Therefore, we rename isolate_migratepages_range() to isolate_freepages_block()
s/isolate_freepages_block/isolate_migratepages_block/
I read your commit description before looking at the patch and was very
nervous about the direction you were going if that was true :) I'm
relieved to see it was just a typo.
> and limit its functionality to a single pageblock (or its subset). For CMA,
> a new different isolate_migratepages_range() is created as a CMA-specific
> wrapper for the _block() function. The checks specific to compaction are moved
> to isolate_migratepages(). As part of the unification of these two families of
> functions, we remove the redundant zone parameter where applicable, since zone
> pointer is already passed in cc->zone.
>
> Furthermore, going back to compact_zone() and compact_finished() when pageblock
> is found unsuitable (now by isolate_migratepages()) is wasteful - the checks
> are meant to skip pageblocks quickly. The patch therefore also introduces a
> simple loop into isolate_migratepages() so that it does not return immediately
> on failed pageblock checks, but keeps going until isolate_migratepages_block()
> gets called once. Similarly to isolate_freepages(), the function periodically
> checks if it needs to reschedule or abort async compaction.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Minchan Kim <minchan@kernel.org>
> Acked-by: Mel Gorman <mgorman@suse.de>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Michal Nazarewicz <mina86@mina86.com>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: David Rientjes <rientjes@google.com>
> ---
> mm/compaction.c | 234 +++++++++++++++++++++++++++++++++-----------------------
> mm/internal.h | 4 +-
> mm/page_alloc.c | 3 +-
> 3 files changed, 140 insertions(+), 101 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 0a871e5..bac6e37 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -132,7 +132,7 @@ void reset_isolation_suitable(pg_data_t *pgdat)
> */
> static void update_pageblock_skip(struct compact_control *cc,
> struct page *page, unsigned long nr_isolated,
> - bool set_unsuitable, bool migrate_scanner)
> + bool migrate_scanner)
> {
> struct zone *zone = cc->zone;
> unsigned long pfn;
> @@ -146,12 +146,7 @@ static void update_pageblock_skip(struct compact_control *cc,
> if (nr_isolated)
> return;
>
> - /*
> - * Only skip pageblocks when all forms of compaction will be known to
> - * fail in the near future.
> - */
> - if (set_unsuitable)
> - set_pageblock_skip(page);
> + set_pageblock_skip(page);
>
> pfn = page_to_pfn(page);
>
> @@ -180,7 +175,7 @@ static inline bool isolation_suitable(struct compact_control *cc,
>
> static void update_pageblock_skip(struct compact_control *cc,
> struct page *page, unsigned long nr_isolated,
> - bool set_unsuitable, bool migrate_scanner)
> + bool migrate_scanner)
> {
> }
> #endif /* CONFIG_COMPACTION */
> @@ -345,8 +340,7 @@ isolate_fail:
>
> /* Update the pageblock-skip if the whole pageblock was scanned */
> if (blockpfn == end_pfn)
> - update_pageblock_skip(cc, valid_page, total_isolated, true,
> - false);
> + update_pageblock_skip(cc, valid_page, total_isolated, false);
>
> count_compact_events(COMPACTFREE_SCANNED, nr_scanned);
> if (total_isolated)
> @@ -451,40 +445,34 @@ static bool too_many_isolated(struct zone *zone)
> }
>
> /**
> - * isolate_migratepages_range() - isolate all migrate-able pages in range.
> - * @zone: Zone pages are in.
> + * isolate_migratepages_block() - isolate all migrate-able pages within
> + * a single pageblock
> * @cc: Compaction control structure.
> - * @low_pfn: The first PFN of the range.
> - * @end_pfn: The one-past-the-last PFN of the range.
> - * @unevictable: true if it allows to isolate unevictable pages
> + * @low_pfn: The first PFN to isolate
> + * @end_pfn: The one-past-the-last PFN to isolate, within same pageblock
> + * @isolate_mode: Isolation mode to be used.
> *
> * Isolate all pages that can be migrated from the range specified by
> - * [low_pfn, end_pfn). Returns zero if there is a fatal signal
> - * pending), otherwise PFN of the first page that was not scanned
> - * (which may be both less, equal to or more then end_pfn).
> - *
> - * Assumes that cc->migratepages is empty and cc->nr_migratepages is
> - * zero.
> + * [low_pfn, end_pfn). The range is expected to be within same pageblock.
> + * Returns zero if there is a fatal signal pending, otherwise PFN of the
> + * first page that was not scanned (which may be both less, equal to or more
> + * than end_pfn).
> *
> - * Apart from cc->migratepages and cc->nr_migratetypes this function
> - * does not modify any cc's fields, in particular it does not modify
> - * (or read for that matter) cc->migrate_pfn.
> + * The pages are isolated on cc->migratepages list (not required to be empty),
> + * and cc->nr_migratepages is updated accordingly. The cc->migrate_pfn field
> + * is neither read nor updated.
> */
> -unsigned long
> -isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> - unsigned long low_pfn, unsigned long end_pfn, bool unevictable)
> +static unsigned long
> +isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> + unsigned long end_pfn, isolate_mode_t isolate_mode)
> {
> - unsigned long last_pageblock_nr = 0, pageblock_nr;
> + struct zone *zone = cc->zone;
> unsigned long nr_scanned = 0, nr_isolated = 0;
> struct list_head *migratelist = &cc->migratepages;
> struct lruvec *lruvec;
> unsigned long flags;
> bool locked = false;
> struct page *page = NULL, *valid_page = NULL;
> - bool set_unsuitable = true;
> - const isolate_mode_t mode = (cc->mode == MIGRATE_ASYNC ?
> - ISOLATE_ASYNC_MIGRATE : 0) |
> - (unevictable ? ISOLATE_UNEVICTABLE : 0);
>
> /*
> * Ensure that there are not too many pages isolated from the LRU
> @@ -515,19 +503,6 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> }
> }
>
> - /*
> - * migrate_pfn does not necessarily start aligned to a
> - * pageblock. Ensure that pfn_valid is called when moving
> - * into a new MAX_ORDER_NR_PAGES range in case of large
> - * memory holes within the zone
> - */
> - if ((low_pfn & (MAX_ORDER_NR_PAGES - 1)) == 0) {
> - if (!pfn_valid(low_pfn)) {
> - low_pfn += MAX_ORDER_NR_PAGES - 1;
> - continue;
> - }
> - }
> -
> if (!pfn_valid_within(low_pfn))
> continue;
> nr_scanned++;
> @@ -545,28 +520,6 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> if (!valid_page)
> valid_page = page;
>
> - /* If isolation recently failed, do not retry */
> - pageblock_nr = low_pfn >> pageblock_order;
> - if (last_pageblock_nr != pageblock_nr) {
> - int mt;
> -
> - last_pageblock_nr = pageblock_nr;
> - if (!isolation_suitable(cc, page))
> - goto next_pageblock;
> -
> - /*
> - * For async migration, also only scan in MOVABLE
> - * blocks. Async migration is optimistic to see if
> - * the minimum amount of work satisfies the allocation
> - */
> - mt = get_pageblock_migratetype(page);
> - if (cc->mode == MIGRATE_ASYNC &&
> - !migrate_async_suitable(mt)) {
> - set_unsuitable = false;
> - goto next_pageblock;
> - }
> - }
> -
> /*
> * Skip if free. page_order cannot be used without zone->lock
> * as nothing prevents parallel allocations or buddy merging.
> @@ -601,8 +554,11 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> */
> if (PageTransHuge(page)) {
> if (!locked)
> - goto next_pageblock;
> - low_pfn += (1 << compound_order(page)) - 1;
> + low_pfn = ALIGN(low_pfn + 1,
> + pageblock_nr_pages) - 1;
> + else
> + low_pfn += (1 << compound_order(page)) - 1;
> +
Hmm, any reason not to always advance and align low_pfn to
pageblock_nr_pages? I don't see how pageblock_order > HPAGE_PMD_ORDER
would make sense if encountering thp.
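In other words, something along these lines (just an untested sketch of that
alternative, not a real patch):

		if (PageTransHuge(page)) {
			/* Sketch: always skip to the end of the pageblock */
			low_pfn = ALIGN(low_pfn + 1, pageblock_nr_pages) - 1;
			continue;
		}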
> continue;
> }
>
> @@ -632,7 +588,7 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> lruvec = mem_cgroup_page_lruvec(page, zone);
>
> /* Try isolate the page */
> - if (__isolate_lru_page(page, mode) != 0)
> + if (__isolate_lru_page(page, isolate_mode) != 0)
> continue;
>
> VM_BUG_ON_PAGE(PageTransCompound(page), page);
> @@ -651,11 +607,6 @@ isolate_success:
> ++low_pfn;
> break;
> }
> -
> - continue;
> -
> -next_pageblock:
> - low_pfn = ALIGN(low_pfn + 1, pageblock_nr_pages) - 1;
> }
>
> acct_isolated(zone, locked, cc);
> @@ -668,8 +619,7 @@ next_pageblock:
> * if the whole pageblock was scanned without isolating any page.
> */
> if (low_pfn == end_pfn)
> - update_pageblock_skip(cc, valid_page, nr_isolated,
> - set_unsuitable, true);
> + update_pageblock_skip(cc, valid_page, nr_isolated, true);
>
> trace_mm_compaction_isolate_migratepages(nr_scanned, nr_isolated);
>
> @@ -680,15 +630,61 @@ next_pageblock:
> return low_pfn;
> }
>
> +/**
> + * isolate_migratepages_range() - isolate migrate-able pages in a PFN range
> + * @start_pfn: The first PFN to start isolating.
> + * @end_pfn: The one-past-last PFN.
Need to specify @cc?
> + *
> + * Returns zero if isolation fails fatally due to e.g. pending signal.
> + * Otherwise, function returns one-past-the-last PFN of isolated page
> + * (which may be greater than end_pfn if end fell in a middle of a THP page).
> + */
> +unsigned long
> +isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
> + unsigned long end_pfn)
> +{
> + unsigned long pfn, block_end_pfn;
> +
> + /* Scan block by block. First and last block may be incomplete */
> + pfn = start_pfn;
> + block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
> +
> + for (; pfn < end_pfn; pfn = block_end_pfn,
> + block_end_pfn += pageblock_nr_pages) {
> +
> + block_end_pfn = min(block_end_pfn, end_pfn);
> +
> + /* Skip whole pageblock in case of a memory hole */
> + if (!pfn_valid(pfn))
> + continue;
> +
> + pfn = isolate_migratepages_block(cc, pfn, block_end_pfn,
> + ISOLATE_UNEVICTABLE);
> +
> + /*
> + * In case of fatal failure, release everything that might
> + * have been isolated in the previous iteration, and signal
> + * the failure back to caller.
> + */
> + if (!pfn) {
> + putback_movable_pages(&cc->migratepages);
> + cc->nr_migratepages = 0;
> + break;
> + }
> + }
> +
> + return pfn;
> +}
> +
> #endif /* CONFIG_COMPACTION || CONFIG_CMA */
> #ifdef CONFIG_COMPACTION
> /*
> * Based on information in the current compact_control, find blocks
> * suitable for isolating free pages from and then isolate them.
> */
> -static void isolate_freepages(struct zone *zone,
> - struct compact_control *cc)
> +static void isolate_freepages(struct compact_control *cc)
> {
> + struct zone *zone = cc->zone;
> struct page *page;
> unsigned long block_start_pfn; /* start of current pageblock */
> unsigned long block_end_pfn; /* end of current pageblock */
> @@ -806,7 +802,7 @@ static struct page *compaction_alloc(struct page *migratepage,
> */
> if (list_empty(&cc->freepages)) {
> if (!cc->contended)
> - isolate_freepages(cc->zone, cc);
> + isolate_freepages(cc);
>
> if (list_empty(&cc->freepages))
> return NULL;
> @@ -840,34 +836,81 @@ typedef enum {
> } isolate_migrate_t;
>
> /*
> - * Isolate all pages that can be migrated from the block pointed to by
> - * the migrate scanner within compact_control.
> + * Isolate all pages that can be migrated from the first suitable block,
> + * starting at the block pointed to by the migrate scanner pfn within
> + * compact_control.
> */
> static isolate_migrate_t isolate_migratepages(struct zone *zone,
> struct compact_control *cc)
> {
> unsigned long low_pfn, end_pfn;
> + struct page *page;
> + const isolate_mode_t isolate_mode =
> + (cc->mode == MIGRATE_ASYNC ? ISOLATE_ASYNC_MIGRATE : 0);
>
> - /* Do not scan outside zone boundaries */
> - low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);
> + /*
> + * Start at where we last stopped, or beginning of the zone as
> + * initialized by compact_zone()
> + */
> + low_pfn = cc->migrate_pfn;
>
> /* Only scan within a pageblock boundary */
> end_pfn = ALIGN(low_pfn + 1, pageblock_nr_pages);
>
> - /* Do not cross the free scanner or scan within a memory hole */
> - if (end_pfn > cc->free_pfn || !pfn_valid(low_pfn)) {
> - cc->migrate_pfn = end_pfn;
> - return ISOLATE_NONE;
> - }
> + /*
> + * Iterate over whole pageblocks until we find the first suitable.
> + * Do not cross the free scanner.
> + */
> + for (; end_pfn <= cc->free_pfn;
> + low_pfn = end_pfn, end_pfn += pageblock_nr_pages) {
> +
> + /*
> + * This can potentially iterate a massively long zone with
> + * many pageblocks unsuitable, so periodically check if we
> + * need to schedule, or even abort async compaction.
> + */
> + if (!(low_pfn % (SWAP_CLUSTER_MAX * pageblock_nr_pages))
> + && compact_should_abort(cc))
> + break;
> +
> + /* Skip whole pageblock in case of a memory hole */
> + if (!pfn_valid(low_pfn))
> + continue;
> +
> + page = pfn_to_page(low_pfn);
> +
> + /* If isolation recently failed, do not retry */
> + if (!isolation_suitable(cc, page))
> + continue;
> +
> + /*
> + * For async compaction, also only scan in MOVABLE blocks.
> + * Async compaction is optimistic to see if the minimum amount
> + * of work satisfies the allocation.
> + */
> + if (cc->mode == MIGRATE_ASYNC &&
> + !migrate_async_suitable(get_pageblock_migratetype(page)))
> + continue;
> +
> + /* Perform the isolation */
> + low_pfn = isolate_migratepages_block(cc, low_pfn, end_pfn,
> + isolate_mode);
Hmm, why would we want to unconditionally set pageblock_skip if no pages
could be isolated from a pageblock when
isolate_mode == ISOLATE_ASYNC_MIGRATE? It seems like it would erroneously
skip pageblocks for cases where isolate_mode == 0.
> +
> + if (!low_pfn || cc->contended)
> + return ISOLATE_ABORT;
>
> - /* Perform the isolation */
> - low_pfn = isolate_migratepages_range(zone, cc, low_pfn, end_pfn, false);
> - if (!low_pfn || cc->contended)
> - return ISOLATE_ABORT;
> + /*
> + * Either we isolated something and proceed with migration. Or
> + * we failed and compact_zone should decide if we should
> + * continue or not.
> + */
> + break;
> + }
>
> + /* Record where migration scanner will be restarted */
> cc->migrate_pfn = low_pfn;
>
> - return ISOLATE_SUCCESS;
> + return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
> }
>
> static int compact_finished(struct zone *zone,
> @@ -1040,9 +1083,6 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
> ;
> }
>
> - if (!cc->nr_migratepages)
> - continue;
> -
> err = migrate_pages(&cc->migratepages, compaction_alloc,
> compaction_free, (unsigned long)cc, cc->mode,
> MR_COMPACTION);
> diff --git a/mm/internal.h b/mm/internal.h
> index a1b651b..5a0738f 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -154,8 +154,8 @@ unsigned long
> isolate_freepages_range(struct compact_control *cc,
> unsigned long start_pfn, unsigned long end_pfn);
> unsigned long
> -isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> - unsigned long low_pfn, unsigned long end_pfn, bool unevictable);
> +isolate_migratepages_range(struct compact_control *cc,
> + unsigned long low_pfn, unsigned long end_pfn);
>
> #endif
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 91191fb..f424752 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6292,8 +6292,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>
> if (list_empty(&cc->migratepages)) {
> cc->nr_migratepages = 0;
> - pfn = isolate_migratepages_range(cc->zone, cc,
> - pfn, end, true);
> + pfn = isolate_migratepages_range(cc, pfn, end);
> if (!pfn) {
> ret = -EINTR;
> break;
* Re: [PATCH v5 05/14] mm, compaction: move pageblock checks up from isolate_migratepages_range()
2014-07-29 0:29 ` [PATCH v5 05/14] mm, compaction: move pageblock checks up from isolate_migratepages_range() David Rientjes
@ 2014-07-29 9:27 ` Vlastimil Babka
2014-07-29 23:02 ` David Rientjes
0 siblings, 1 reply; 7+ messages in thread
From: Vlastimil Babka @ 2014-07-29 9:27 UTC (permalink / raw)
To: David Rientjes
Cc: Andrew Morton, linux-kernel, linux-mm, Minchan Kim, Joonsoo Kim,
Michal Nazarewicz, Naoya Horiguchi, Christoph Lameter,
Rik van Riel, Mel Gorman, Zhang Yanfei
On 07/29/2014 02:29 AM, David Rientjes wrote:
> On Mon, 28 Jul 2014, Vlastimil Babka wrote:
>
>> isolate_migratepages_range() is the main function of the compaction scanner,
>> called either on a single pageblock by isolate_migratepages() during regular
>> compaction, or on an arbitrary range by CMA's __alloc_contig_migrate_range().
>> It currently performs two pageblock-wide compaction suitability checks, and
>> because of the CMA callpath, it tracks if it crossed a pageblock boundary in
>> order to repeat those checks.
>>
>> However, closer inspection shows that those checks are always true for CMA:
>> - isolation_suitable() is true because CMA sets cc->ignore_skip_hint to true
>> - migrate_async_suitable() check is skipped because CMA uses sync compaction
>>
>> We can therefore move the compaction-specific checks to isolate_migratepages()
>> and simplify isolate_migratepages_range(). Furthermore, we can mimic the
>> freepage scanner family of functions, which has isolate_freepages_block()
>> function called both by compaction from isolate_freepages() and by CMA from
>> isolate_freepages_range(), where each use case adds its own specific glue code.
>> This allows further code simplification.
>>
>> Therefore, we rename isolate_migratepages_range() to isolate_freepages_block()
>
> s/isolate_freepages_block/isolate_migratepages_block/
>
> I read your commit description before looking at the patch and was very
> nervous about the direction you were going if that was true :) I'm
> relieved to see it was just a typo.
Ah, thanks :)
>> @@ -601,8 +554,11 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
>> */
>> if (PageTransHuge(page)) {
>> if (!locked)
>> - goto next_pageblock;
>> - low_pfn += (1 << compound_order(page)) - 1;
>> + low_pfn = ALIGN(low_pfn + 1,
>> + pageblock_nr_pages) - 1;
>> + else
>> + low_pfn += (1 << compound_order(page)) - 1;
>> +
>
> Hmm, any reason not to always advance and align low_pfn to
> pageblock_nr_pages? I don't see how pageblock_order > HPAGE_PMD_ORDER
> would make sense if encountering thp.
I think PageTransHuge() might be true even for non-THP compound pages,
which might actually be of lower order, and we wouldn't want to skip the
whole pageblock.
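For example, with the typical pageblock_order of 9, the ALIGN() in the
!locked branch can skip up to 511 pfns at once, while a hypothetical order-3
compound page on the LRU only needs low_pfn += 7, so unconditionally
aligning could skip pages we still want to scan.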
>> @@ -680,15 +630,61 @@ next_pageblock:
>> return low_pfn;
>> }
>>
>> +/**
>> + * isolate_migratepages_range() - isolate migrate-able pages in a PFN range
>> + * @start_pfn: The first PFN to start isolating.
>> + * @end_pfn: The one-past-last PFN.
>
> Need to specify @cc?
OK.
>>
>> /*
>> - * Isolate all pages that can be migrated from the block pointed to by
>> - * the migrate scanner within compact_control.
>> + * Isolate all pages that can be migrated from the first suitable block,
>> + * starting at the block pointed to by the migrate scanner pfn within
>> + * compact_control.
>> */
>> static isolate_migrate_t isolate_migratepages(struct zone *zone,
>> struct compact_control *cc)
>> {
>> unsigned long low_pfn, end_pfn;
>> + struct page *page;
>> + const isolate_mode_t isolate_mode =
>> + (cc->mode == MIGRATE_ASYNC ? ISOLATE_ASYNC_MIGRATE : 0);
>>
>> - /* Do not scan outside zone boundaries */
>> - low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);
>> + /*
>> + * Start at where we last stopped, or beginning of the zone as
>> + * initialized by compact_zone()
>> + */
>> + low_pfn = cc->migrate_pfn;
>>
>> /* Only scan within a pageblock boundary */
>> end_pfn = ALIGN(low_pfn + 1, pageblock_nr_pages);
>>
>> - /* Do not cross the free scanner or scan within a memory hole */
>> - if (end_pfn > cc->free_pfn || !pfn_valid(low_pfn)) {
>> - cc->migrate_pfn = end_pfn;
>> - return ISOLATE_NONE;
>> - }
>> + /*
>> + * Iterate over whole pageblocks until we find the first suitable.
>> + * Do not cross the free scanner.
>> + */
>> + for (; end_pfn <= cc->free_pfn;
>> + low_pfn = end_pfn, end_pfn += pageblock_nr_pages) {
>> +
>> + /*
>> + * This can potentially iterate a massively long zone with
>> + * many pageblocks unsuitable, so periodically check if we
>> + * need to schedule, or even abort async compaction.
>> + */
>> + if (!(low_pfn % (SWAP_CLUSTER_MAX * pageblock_nr_pages))
>> + && compact_should_abort(cc))
>> + break;
>> +
>> + /* Skip whole pageblock in case of a memory hole */
>> + if (!pfn_valid(low_pfn))
>> + continue;
>> +
>> + page = pfn_to_page(low_pfn);
>> +
>> + /* If isolation recently failed, do not retry */
>> + if (!isolation_suitable(cc, page))
>> + continue;
>> +
>> + /*
>> + * For async compaction, also only scan in MOVABLE blocks.
>> + * Async compaction is optimistic to see if the minimum amount
>> + * of work satisfies the allocation.
>> + */
>> + if (cc->mode == MIGRATE_ASYNC &&
>> + !migrate_async_suitable(get_pageblock_migratetype(page)))
>> + continue;
>> +
>> + /* Perform the isolation */
>> + low_pfn = isolate_migratepages_block(cc, low_pfn, end_pfn,
>> + isolate_mode);
>
> Hmm, why would we want to unconditionally set pageblock_skip if no pages
> could be isolated from a pageblock when
> isolate_mode == ISOLATE_ASYNC_MIGRATE? It seems like it would erroneously
> skip pageblocks for cases where isolate_mode == 0.
Well pageblock_skip is a single bit and you don't know if the next
attempt will be async or sync. So now you would maybe skip needlessly if
the next attempt would be sync. If we changed that, you wouldn't skip if
the next attempt would be async again. Could be that one way is better
than the other, but I'm not sure, and I would consider it separately.
The former patch 15 (quick skip pageblock that won't be fully migrated)
could perhaps change the balance here.
But I hope this patch doesn't change this particular thing, right?
* Re: [PATCH v5 05/14] mm, compaction: move pageblock checks up from isolate_migratepages_range()
2014-07-29 9:27 ` Vlastimil Babka
@ 2014-07-29 23:02 ` David Rientjes
2014-07-29 23:21 ` Kirill A. Shutemov
2014-07-30 9:39 ` Vlastimil Babka
0 siblings, 2 replies; 7+ messages in thread
From: David Rientjes @ 2014-07-29 23:02 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, linux-kernel, linux-mm, Minchan Kim, Joonsoo Kim,
Michal Nazarewicz, Naoya Horiguchi, Christoph Lameter,
Rik van Riel, Mel Gorman, Zhang Yanfei
On Tue, 29 Jul 2014, Vlastimil Babka wrote:
> > > @@ -601,8 +554,11 @@ isolate_migratepages_range(struct zone *zone, struct
> > > compact_control *cc,
> > > */
> > > if (PageTransHuge(page)) {
> > > if (!locked)
> > > - goto next_pageblock;
> > > - low_pfn += (1 << compound_order(page)) - 1;
> > > + low_pfn = ALIGN(low_pfn + 1,
> > > + pageblock_nr_pages) - 1;
> > > + else
> > > + low_pfn += (1 << compound_order(page)) - 1;
> > > +
> >
> > Hmm, any reason not to always advance and align low_pfn to
> > pageblock_nr_pages? I don't see how pageblock_order > HPAGE_PMD_ORDER
> > would make sense if encountering thp.
>
> I think PageTransHuge() might be true even for non-THP compound pages which
> might be actually of lower order and we wouldn't want to skip the whole
> pageblock.
>
Hmm, I'm confused at how that could be true, could you explain what
memory other than thp can return true for PageTransHuge()? Are you simply
referring to possible checks on tail pages where we would need to look at
PageHead() instead? If so, I'm not sure how we could possibly encounter
such a condition within this iteration.
> > > @@ -680,15 +630,61 @@ next_pageblock:
> > > return low_pfn;
> > > }
> > >
> > > +/**
> > > + * isolate_migratepages_range() - isolate migrate-able pages in a PFN
> > > range
> > > + * @start_pfn: The first PFN to start isolating.
> > > + * @end_pfn: The one-past-last PFN.
> >
> > Need to specify @cc?
>
> OK.
>
> > >
> > > /*
> > > - * Isolate all pages that can be migrated from the block pointed to by
> > > - * the migrate scanner within compact_control.
> > > + * Isolate all pages that can be migrated from the first suitable block,
> > > + * starting at the block pointed to by the migrate scanner pfn within
> > > + * compact_control.
> > > */
> > > static isolate_migrate_t isolate_migratepages(struct zone *zone,
> > > struct compact_control *cc)
> > > {
> > > unsigned long low_pfn, end_pfn;
> > > + struct page *page;
> > > + const isolate_mode_t isolate_mode =
> > > + (cc->mode == MIGRATE_ASYNC ? ISOLATE_ASYNC_MIGRATE : 0);
> > >
> > > - /* Do not scan outside zone boundaries */
> > > - low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);
> > > + /*
> > > + * Start at where we last stopped, or beginning of the zone as
> > > + * initialized by compact_zone()
> > > + */
> > > + low_pfn = cc->migrate_pfn;
> > >
> > > /* Only scan within a pageblock boundary */
> > > end_pfn = ALIGN(low_pfn + 1, pageblock_nr_pages);
> > >
> > > - /* Do not cross the free scanner or scan within a memory hole */
> > > - if (end_pfn > cc->free_pfn || !pfn_valid(low_pfn)) {
> > > - cc->migrate_pfn = end_pfn;
> > > - return ISOLATE_NONE;
> > > - }
> > > + /*
> > > + * Iterate over whole pageblocks until we find the first suitable.
> > > + * Do not cross the free scanner.
> > > + */
> > > + for (; end_pfn <= cc->free_pfn;
> > > + low_pfn = end_pfn, end_pfn += pageblock_nr_pages) {
> > > +
> > > + /*
> > > + * This can potentially iterate a massively long zone with
> > > + * many pageblocks unsuitable, so periodically check if we
> > > + * need to schedule, or even abort async compaction.
> > > + */
> > > + if (!(low_pfn % (SWAP_CLUSTER_MAX * pageblock_nr_pages))
> > > + && compact_should_abort(cc))
> > > + break;
> > > +
> > > + /* Skip whole pageblock in case of a memory hole */
> > > + if (!pfn_valid(low_pfn))
> > > + continue;
> > > +
> > > + page = pfn_to_page(low_pfn);
> > > +
> > > + /* If isolation recently failed, do not retry */
> > > + if (!isolation_suitable(cc, page))
> > > + continue;
> > > +
> > > + /*
> > > + * For async compaction, also only scan in MOVABLE blocks.
> > > + * Async compaction is optimistic to see if the minimum amount
> > > + * of work satisfies the allocation.
> > > + */
> > > + if (cc->mode == MIGRATE_ASYNC &&
> > > + !migrate_async_suitable(get_pageblock_migratetype(page)))
> > > + continue;
> > > +
> > > + /* Perform the isolation */
> > > + low_pfn = isolate_migratepages_block(cc, low_pfn, end_pfn,
> > > + isolate_mode);
> >
> > Hmm, why would we want to unconditionally set pageblock_skip if no pages
> > could be isolated from a pageblock when
> > isolate_mode == ISOLATE_ASYNC_MIGRATE? It seems like it would erroneously
> > skip pageblocks for cases where isolate_mode == 0.
>
> Well pageblock_skip is a single bit and you don't know if the next attempt
> will be async or sync. So now you would maybe skip needlessly if the next
> attempt would be sync. If we changed that, you wouldn't skip if the next
> attempt would be async again. Could be that one way is better than the other,
> but I'm not sure, and I would consider it separately.
> The former patch 15 (quick skip pageblock that won't be fully migrated) could
> perhaps change the balance here.
>
That's why we have two separate per-zone cached start pfns, though, right?
The next call to async compaction should start from where the previous
caller left off so there would be no need to set pageblock skip in that
case until we have checked all memory. Or are you considering the case of
concurrent async compaction?
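For reference, the caching I mean is roughly this pair of fields in struct
zone (a sketch from memory, so the exact names and layout may differ):

	/* pfn where the migration scanner should restart, per mode */
	unsigned long	compact_cached_migrate_pfn[2];	/* [0] async, [1] sync */
	/* pfn where the free scanner should restart */
	unsigned long	compact_cached_free_pfn;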
* Re: [PATCH v5 05/14] mm, compaction: move pageblock checks up from isolate_migratepages_range()
2014-07-29 23:02 ` David Rientjes
@ 2014-07-29 23:21 ` Kirill A. Shutemov
2014-07-29 23:51 ` David Rientjes
2014-07-30 9:39 ` Vlastimil Babka
1 sibling, 1 reply; 7+ messages in thread
From: Kirill A. Shutemov @ 2014-07-29 23:21 UTC (permalink / raw)
To: David Rientjes
Cc: Vlastimil Babka, Andrew Morton, linux-kernel, linux-mm,
Minchan Kim, Joonsoo Kim, Michal Nazarewicz, Naoya Horiguchi,
Christoph Lameter, Rik van Riel, Mel Gorman, Zhang Yanfei
On Tue, Jul 29, 2014 at 04:02:09PM -0700, David Rientjes wrote:
> Hmm, I'm confused at how that could be true, could you explain what
> memory other than thp can return true for PageTransHuge()?
PageTransHuge() will be true for the head of any compound page if THP is
enabled at compile time: hugetlbfs, slab, whatever.
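For reference, with CONFIG_TRANSPARENT_HUGEPAGE the check is roughly just
this (a simplified sketch; details may differ between trees):

static inline int PageTransHuge(struct page *page)
{
	VM_BUG_ON_PAGE(PageTail(page), page);
	return PageHead(page);
}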
--
Kirill A. Shutemov
* Re: [PATCH v5 05/14] mm, compaction: move pageblock checks up from isolate_migratepages_range()
2014-07-29 23:21 ` Kirill A. Shutemov
@ 2014-07-29 23:51 ` David Rientjes
2014-07-30 9:27 ` Vlastimil Babka
0 siblings, 1 reply; 7+ messages in thread
From: David Rientjes @ 2014-07-29 23:51 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Vlastimil Babka, Andrew Morton, linux-kernel, linux-mm,
Minchan Kim, Joonsoo Kim, Michal Nazarewicz, Naoya Horiguchi,
Christoph Lameter, Rik van Riel, Mel Gorman, Zhang Yanfei
On Wed, 30 Jul 2014, Kirill A. Shutemov wrote:
> > Hmm, I'm confused at how that could be true, could you explain what
> > memory other than thp can return true for PageTransHuge()?
>
> PageTransHuge() will be true for any head of compound page if THP is
> enabled compile time: hugetlbfs, slab, whatever.
>
I meant in the context of the patch :) Since PageLRU is set, that
discounts slab, so we're left with thp or hugetlbfs. Logically, both
should have sizes that are >= the size of the pageblock itself, so I'm not
sure why we don't unconditionally align up to pageblock_nr_pages here. Is
there a legitimate configuration where a pageblock will span multiple
pages of HPAGE_PMD_ORDER?
* Re: [PATCH v5 05/14] mm, compaction: move pageblock checks up from isolate_migratepages_range()
2014-07-29 23:51 ` David Rientjes
@ 2014-07-30 9:27 ` Vlastimil Babka
0 siblings, 0 replies; 7+ messages in thread
From: Vlastimil Babka @ 2014-07-30 9:27 UTC (permalink / raw)
To: David Rientjes, Kirill A. Shutemov
Cc: Andrew Morton, linux-kernel, linux-mm, Minchan Kim, Joonsoo Kim,
Michal Nazarewicz, Naoya Horiguchi, Christoph Lameter,
Rik van Riel, Mel Gorman, Zhang Yanfei
On 07/30/2014 01:51 AM, David Rientjes wrote:
> On Wed, 30 Jul 2014, Kirill A. Shutemov wrote:
>
>>> Hmm, I'm confused at how that could be true, could you explain what
>>> memory other than thp can return true for PageTransHuge()?
>>
>> PageTransHuge() will be true for any head of compound page if THP is
>> enabled compile time: hugetlbfs, slab, whatever.
>>
>
> I was meaning in the context of the patch :) Since PageLRU is set, that
> discounts slab so we're left with thp or hugetlbfs. Logically, both
> should have sizes that are >= the size of the pageblock itself so I'm not
> sure why we don't unconditionally align up to pageblock_nr_pages here. Is
> there a legitimiate configuration where a pageblock will span multiple
> pages of HPAGE_PMD_ORDER?
I think Joonsoo mentioned in some previous iteration that some arches
may have this. But I have no idea.
But perhaps we could use HPAGE_PMD_ORDER instead of compound_order()?
In the locked case we know that PageLRU could not have changed, so it still
has to be a huge page and we know its possible order.
In the !locked case, I'm now not even sure the current code is safe enough.
What if we pass the PageLRU check, but before the PageTransHuge() check a
compound page (THP or otherwise) materializes and we are at one of its tail
pages? Then, with DEBUG_VM enabled, this check in PageTransHuge() could
fire: VM_BUG_ON_PAGE(PageTail(page), page);
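To be concrete about the HPAGE_PMD_ORDER idea, the hunk could become
something like this (an untested sketch only, and it ignores the tail-page
race above):

		if (PageTransHuge(page)) {
			if (!locked) {
				low_pfn = ALIGN(low_pfn + 1,
						pageblock_nr_pages) - 1;
			} else {
				/*
				 * Under the lock PageLRU is stable, so this
				 * must be THP and its order is known without
				 * reading compound_order().
				 */
				low_pfn += (1 << HPAGE_PMD_ORDER) - 1;
			}
			continue;
		}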
* Re: [PATCH v5 05/14] mm, compaction: move pageblock checks up from isolate_migratepages_range()
2014-07-29 23:02 ` David Rientjes
2014-07-29 23:21 ` Kirill A. Shutemov
@ 2014-07-30 9:39 ` Vlastimil Babka
1 sibling, 0 replies; 7+ messages in thread
From: Vlastimil Babka @ 2014-07-30 9:39 UTC (permalink / raw)
To: David Rientjes
Cc: Andrew Morton, linux-kernel, linux-mm, Minchan Kim, Joonsoo Kim,
Michal Nazarewicz, Naoya Horiguchi, Christoph Lameter,
Rik van Riel, Mel Gorman, Zhang Yanfei
On 07/30/2014 01:02 AM, David Rientjes wrote:
>>>>
>>>> /*
>>>> - * Isolate all pages that can be migrated from the block pointed to by
>>>> - * the migrate scanner within compact_control.
>>>> + * Isolate all pages that can be migrated from the first suitable block,
>>>> + * starting at the block pointed to by the migrate scanner pfn within
>>>> + * compact_control.
>>>> */
>>>> static isolate_migrate_t isolate_migratepages(struct zone *zone,
>>>> struct compact_control *cc)
>>>> {
>>>> unsigned long low_pfn, end_pfn;
>>>> + struct page *page;
>>>> + const isolate_mode_t isolate_mode =
>>>> + (cc->mode == MIGRATE_ASYNC ? ISOLATE_ASYNC_MIGRATE : 0);
>>>>
>>>> - /* Do not scan outside zone boundaries */
>>>> - low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);
>>>> + /*
>>>> + * Start at where we last stopped, or beginning of the zone as
>>>> + * initialized by compact_zone()
>>>> + */
>>>> + low_pfn = cc->migrate_pfn;
>>>>
>>>> /* Only scan within a pageblock boundary */
>>>> end_pfn = ALIGN(low_pfn + 1, pageblock_nr_pages);
>>>>
>>>> - /* Do not cross the free scanner or scan within a memory hole */
>>>> - if (end_pfn > cc->free_pfn || !pfn_valid(low_pfn)) {
>>>> - cc->migrate_pfn = end_pfn;
>>>> - return ISOLATE_NONE;
>>>> - }
>>>> + /*
>>>> + * Iterate over whole pageblocks until we find the first suitable.
>>>> + * Do not cross the free scanner.
>>>> + */
>>>> + for (; end_pfn <= cc->free_pfn;
>>>> + low_pfn = end_pfn, end_pfn += pageblock_nr_pages) {
>>>> +
>>>> + /*
>>>> + * This can potentially iterate a massively long zone with
>>>> + * many pageblocks unsuitable, so periodically check if we
>>>> + * need to schedule, or even abort async compaction.
>>>> + */
>>>> + if (!(low_pfn % (SWAP_CLUSTER_MAX * pageblock_nr_pages))
>>>> + && compact_should_abort(cc))
>>>> + break;
>>>> +
>>>> + /* Skip whole pageblock in case of a memory hole */
>>>> + if (!pfn_valid(low_pfn))
>>>> + continue;
>>>> +
>>>> + page = pfn_to_page(low_pfn);
>>>> +
>>>> + /* If isolation recently failed, do not retry */
>>>> + if (!isolation_suitable(cc, page))
>>>> + continue;
>>>> +
>>>> + /*
>>>> + * For async compaction, also only scan in MOVABLE blocks.
>>>> + * Async compaction is optimistic to see if the minimum amount
>>>> + * of work satisfies the allocation.
>>>> + */
>>>> + if (cc->mode == MIGRATE_ASYNC &&
>>>> + !migrate_async_suitable(get_pageblock_migratetype(page)))
>>>> + continue;
>>>> +
>>>> + /* Perform the isolation */
>>>> + low_pfn = isolate_migratepages_block(cc, low_pfn, end_pfn,
>>>> + isolate_mode);
>>>
>>> Hmm, why would we want to unconditionally set pageblock_skip if no pages
>>> could be isolated from a pageblock when
>>> isolate_mode == ISOLATE_ASYNC_MIGRATE? It seems like it erroneously skip
>>> pageblocks for cases when isolate_mode == 0.
>>
>> Well pageblock_skip is a single bit and you don't know if the next attempt
>> will be async or sync. So now you would maybe skip needlessly if the next
>> attempt would be sync. If we changed that, you wouldn't skip if the next
>> attempt would be async again. Could be that one way is better than other but
>> I'm not sure, and would consider it separately.
>> The former patch 15 (quick skip pageblock that won't be fully migrated) could
>> perhaps change the balance here.
>>
>
> That's why we have two separate per-zone cached start pfns, though, right?
> The next call to async compaction should start from where the previous
> caller left off so there would be no need to set pageblock skip in that
> case until we have checked all memory. Or are you considering the case of
> concurrent async compaction?
Ah, well the lifecycle of cached pfns and pageblock_skip is not generally
in sync. It may be that the cached pfns are reset but the pageblock_skip
bits remain, so this would be one async pass setting hints for the next
async pass.
But maybe we've already reduced the impact of sync compaction enough that
it could now ignore pageblock_skip completely and leave those hints only
for async compaction.
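The hint check itself is just isolation_suitable(), roughly (a sketch, not
the exact source):

static inline bool isolation_suitable(struct compact_control *cc,
					struct page *page)
{
	if (cc->ignore_skip_hint)
		return true;

	return !get_pageblock_skip(page);
}

so ignoring the hints for sync compaction would amount to also returning
true there when cc->mode != MIGRATE_ASYNC.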