On Thu, Apr 14, 2011 at 9:16 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:

> On Thu, 14 Apr 2011 20:35:00 -0700
> Ying Han wrote:
>
> > On Thu, Apr 14, 2011 at 5:04 PM, KAMEZAWA Hiroyuki
> > <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> >
> > > On Thu, 14 Apr 2011 15:54:20 -0700
> > > Ying Han wrote:
> > >
> > > > There is a kswapd kernel thread for each numa node. We will add a
> > > > different kswapd for each memcg. The kswapd is sleeping in the wait
> > > > queue headed at the kswapd_wait field of a kswapd descriptor. The
> > > > kswapd descriptor stores information of the node or memcg, and it
> > > > allows the global and per-memcg background reclaim to share common
> > > > reclaim algorithms.
> > > >
> > > > This patch adds the kswapd descriptor and moves the per-node kswapd
> > > > to use the new structure.
> > >
> > > No objections to your direction, but some comments.
> > >
> > > > changelog v2..v1:
> > > > 1. dynamically allocate the kswapd descriptor and initialize the
> > > >    wait_queue_head of pgdat at kswapd_run.
> > > > 2. add a helper macro is_node_kswapd to distinguish per-node and
> > > >    per-cgroup kswapd descriptors.
> > > >
> > > > changelog v3..v2:
> > > > 1. move the struct mem_cgroup *kswapd_mem in the kswapd struct to a
> > > >    later patch.
> > > > 2. rename thr in kswapd_run to something else.
> > > > Signed-off-by: Ying Han
> > > > ---
> > > >  include/linux/mmzone.h |    3 +-
> > > >  include/linux/swap.h   |    7 ++++
> > > >  mm/page_alloc.c        |    1 -
> > > >  mm/vmscan.c            |   95 ++++++++++++++++++++++++++++++++++++------------
> > > >  4 files changed, 80 insertions(+), 26 deletions(-)
> > > >
> > > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > > > index 628f07b..6cba7d2 100644
> > > > --- a/include/linux/mmzone.h
> > > > +++ b/include/linux/mmzone.h
> > > > @@ -640,8 +640,7 @@ typedef struct pglist_data {
> > > >  	unsigned long node_spanned_pages; /* total size of physical page
> > > >  					     range, including holes */
> > > >  	int node_id;
> > > > -	wait_queue_head_t kswapd_wait;
> > > > -	struct task_struct *kswapd;
> > > > +	wait_queue_head_t *kswapd_wait;
> > > >  	int kswapd_max_order;
> > > >  	enum zone_type classzone_idx;
> > >
> > > I think pg_data_t should include struct kswapd in it, as
> > >
> > > struct pglist_data {
> > > 	.....
> > > 	struct kswapd kswapd;
> > > };
> > >
> > > and you can add a macro as
> > >
> > > #define kswapd_waitqueue(kswapd) (&(kswapd)->kswapd_wait)
> > >
> > > if it looks better.
> > >
> > > Why I recommend this is I think it's better to have 'struct kswapd'
> > > on the same page as pg_data_t or struct memcg.
> > > Do you have benefits to kmalloc() struct kswapd on demand ?
> >
> > So that we don't end up having a kswapd struct on memcgs which don't
> > have per-memcg kswapd enabled. I don't see that one of the two
> > approaches is strongly better than the other. If it is ok, I would
> > like to keep it as it is for this version. Hope this is ok for now.
>
> My intention is to remove kswapd_spinlock. Can we remove it with
> dynamic allocation ? IOW, does static allocation still require the
> spinlock ?

Thank you for pointing that out; it got me thinking a little harder
about this. I don't think we need the spinlock in this patch.
This is something I inherited from another kswapd patch we did, where we
allow one kswapd to reclaim from multiple pgdats. We need the spinlock
there to protect the list of pgdats per kswapd. However, we have a
one-to-one mapping here, so we can get rid of the lock. I will remove it
in the next post.

--Ying

> Thanks,
> -Kame