linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andy Whitcroft <apw@shadowen.org>
To: KUROSAWA Takahiro <kurosawa@valinux.co.jp>
Cc: ckrm-tech@lists.sourceforge.net, linux-mm@kvack.org
Subject: Re: [PATCH 1/2] Add the pzone
Date: Thu, 19 Jan 2006 18:04:43 +0000	[thread overview]
Message-ID: <43CFD4BB.4070704@shadowen.org> (raw)
In-Reply-To: <20060119080413.24736.27946.sendpatchset@debian>

KUROSAWA Takahiro wrote:
> This patch implements the pzone (pseudo zone).  A pzone can be used
> for reserving pages in a zone.  Pzones are implemented by extending
> the zone structure and act almost the same as the conventional zones;
> we can specify pzones in a zonelist for __alloc_pages() and the vmscan
> code works on pzones with few modifications.
> 
> Signed-off-by: KUROSAWA Takahiro <kurosawa@valinux.co.jp>
[...]
> -/* Page flags: | [SECTION] | [NODE] | ZONE | ... | FLAGS | */
> -#define SECTIONS_PGOFF		((sizeof(unsigned long)*8) - SECTIONS_WIDTH)
> +/* Page flags: | [PZONE] | [SECTION] | [NODE] | ZONE | ... | FLAGS | */
> +#define PZONE_BIT_PGOFF		((sizeof(unsigned long)*8) - PZONE_BIT_WIDTH)
> +#define SECTIONS_PGOFF		(PZONE_BIT_PGOFF - SECTIONS_WIDTH)
>  #define NODES_PGOFF		(SECTIONS_PGOFF - NODES_WIDTH)
>  #define ZONES_PGOFF		(NODES_PGOFF - ZONES_WIDTH)

In general this PZONE bit is really a part of the zone number.  Much of
the order of these bits is chosen to obtain the cheapest extraction of
the most used bits, particularly the node/zone conbination or section
number on the left.  I would say put the PZONE_BIT next to ZONE
probabally to the right of it?  [See below for more reasons to put it
there.]

> @@ -431,6 +438,7 @@ void put_page(struct page *page);
>   * sections we define the shift as 0; that plus a 0 mask ensures
>   * the compiler will optimise away reference to them.
>   */
> +#define PZONE_BIT_PGSHIFT	(PZONE_BIT_PGOFF * (PZONE_BIT_WIDTH != 0))
>  #define SECTIONS_PGSHIFT	(SECTIONS_PGOFF * (SECTIONS_WIDTH != 0))
>  #define NODES_PGSHIFT		(NODES_PGOFF * (NODES_WIDTH != 0))
>  #define ZONES_PGSHIFT		(ZONES_PGOFF * (ZONES_WIDTH != 0))
> @@ -443,10 +451,11 @@ void put_page(struct page *page);
>  #endif
>  #define ZONETABLE_PGSHIFT	ZONES_PGSHIFT
>  
> -#if SECTIONS_WIDTH+NODES_WIDTH+ZONES_WIDTH > FLAGS_RESERVED
> -#error SECTIONS_WIDTH+NODES_WIDTH+ZONES_WIDTH > FLAGS_RESERVED
> +#if PZONE_BIT_WIDTH+SECTIONS_WIDTH+NODES_WIDTH+ZONES_WIDTH > FLAGS_RESERVED
> +#error PZONE_BIT_WIDTH+SECTIONS_WIDTH+NODES_WIDTH+ZONES_WIDTH > FLAGS_RESERVED
>  #endif

Do we have any bits left in the reserve on 32 bit machines?  The reserve
at last look was only 8 bits and there was little if any headroom in the
rest of the flags word to extend it; if memory serves at least 22 of the
24 remaining bits was accounted for.  Has this been tested on any such
machines?

> +#define PZONE_BIT_MASK		((1UL << PZONE_BIT_WIDTH) - 1)
>  #define ZONES_MASK		((1UL << ZONES_WIDTH) - 1)
>  #define NODES_MASK		((1UL << NODES_WIDTH) - 1)
>  #define SECTIONS_MASK		((1UL << SECTIONS_WIDTH) - 1)
[...]
> +#ifdef CONFIG_PSEUDO_ZONE
> +static inline int page_in_pzone(struct page *page)
> +{
> +	return (page->flags >> PZONE_BIT_PGSHIFT) & PZONE_BIT_MASK;
> +}
> +
> +static inline struct zone *page_zone(struct page *page)
> +{
> +	int idx;
> +
> +	idx = (page->flags >> ZONETABLE_PGSHIFT) & ZONETABLE_MASK;
> +	if (page_in_pzone(page))
> +		return pzone_table[idx].zone;
> +	return zone_table[idx];
> +}

Could we not do this all without changing the structure of the zone
table at all by placing the PZONE_BIT either to the left or right of the
ZONE in the flags.  Then ZONETABLE_MASK could be extended to cover it
when pzone's are enabled and the pzone's could be added to zone_table
instead of their own pzone_table?  This would mean the code above could
be unmodified and much simpler.

> +
> +static inline unsigned long page_to_nid(struct page *page)
> +{
> +	return page_zone(page)->zone_pgdat->node_id;
> +}
[...]
> +#ifdef CONFIG_PSEUDO_ZONE
> +#define MAX_NR_PZONES		1024

You seem to be allowing for 1024 pzone's here?  But in
pzone_setup_page_flags() you place the pzone_idx (an offset into the
pzone_table) into the ZONE field of the page flags.  This field is
typically only two bits wide?  I don't see this being increased in this
patch, nor is there space for it generally to get much bigger not on 32
bit kernels anyhow (see comments about bits earlier)?

#define ZONES_SHIFT             2       /* ceil(log2(MAX_NR_ZONES)) */

Cheers.

-apw

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2006-01-19 18:04 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-01-19  8:04 [PATCH 0/2] Pzone based CKRM memory resource controller KUROSAWA Takahiro
2006-01-19  8:04 ` [PATCH 1/2] Add the pzone KUROSAWA Takahiro
2006-01-19 18:04   ` Andy Whitcroft [this message]
2006-01-19 23:42     ` KUROSAWA Takahiro
2006-01-20  9:17       ` Andy Whitcroft
2006-01-20  7:08   ` KAMEZAWA Hiroyuki
2006-01-20  8:22     ` KUROSAWA Takahiro
2006-01-20  8:30       ` KAMEZAWA Hiroyuki
2006-01-19  8:04 ` [PATCH 2/2] Add CKRM memory resource controller using pzones KUROSAWA Takahiro
2006-01-31  2:30 ` [PATCH 0/8] Pzone based CKRM memory resource controller KUROSAWA Takahiro
2006-01-31  2:30   ` [PATCH 1/8] Add the __GFP_NOLRU flag KUROSAWA Takahiro
2006-01-31 18:18     ` [ckrm-tech] " Dave Hansen
2006-02-01  5:06       ` KUROSAWA Takahiro
2006-01-31  2:30   ` [PATCH 2/8] Keep the number of zones while zone iterator loop KUROSAWA Takahiro
2006-01-31  2:30   ` [PATCH 3/8] Add for_each_zone_in_node macro KUROSAWA Takahiro
2006-01-31  2:30   ` [PATCH 4/8] Extract zone specific routines as functions KUROSAWA Takahiro
2006-01-31  2:30   ` [PATCH 5/8] Add the pzone_create() function KUROSAWA Takahiro
2006-01-31  2:30   ` [PATCH 6/8] Add the pzone_destroy() function KUROSAWA Takahiro
2006-01-31  2:30   ` [PATCH 7/8] Make the number of pages in pzones resizable KUROSAWA Takahiro
2006-01-31  2:30   ` [PATCH 8/8] Add a CKRM memory resource controller using pzones KUROSAWA Takahiro
2006-02-01  2:58   ` [ckrm-tech] [PATCH 0/8] Pzone based CKRM memory resource controller chandra seetharaman
2006-02-01  5:39     ` KUROSAWA Takahiro
2006-02-01  6:16       ` Hirokazu Takahashi
2006-02-02  1:26       ` chandra seetharaman
2006-02-02  3:54         ` KUROSAWA Takahiro
2006-02-03  0:37           ` chandra seetharaman
2006-02-03  0:51             ` KUROSAWA Takahiro
2006-02-03  1:01               ` chandra seetharaman
2006-02-01  3:07   ` chandra seetharaman
2006-02-01  5:54     ` KUROSAWA Takahiro
2006-02-03  1:33     ` KUROSAWA Takahiro
2006-02-03  9:37       ` KUROSAWA Takahiro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=43CFD4BB.4070704@shadowen.org \
    --to=apw@shadowen.org \
    --cc=ckrm-tech@lists.sourceforge.net \
    --cc=kurosawa@valinux.co.jp \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox