linux-mm.kvack.org archive mirror
From: Minchan Kim <minchan@kernel.org>
To: Chulmin Kim <cmlaika.kim@samsung.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: Re: [PATCH v4 11/12] zsmalloc: page migration support
Date: Tue, 3 May 2016 10:43:05 +0900
Message-ID: <20160503014305.GC2272@bbox>
In-Reply-To: <20160503004359.GA2272@bbox>

On Tue, May 03, 2016 at 09:43:59AM +0900, Minchan Kim wrote:
> Good morning, Chulmin
> 
> On Tue, May 03, 2016 at 08:33:16AM +0900, Chulmin Kim wrote:
> > Hello, Minchan!
> > 
> > On 2016-04-27 16:48, Minchan Kim wrote:
> > >This patch introduces run-time migration feature for zspage.
> > >
> > >For migration, VM uses page.lru field so it would be better to not use
> > >page.next field for own purpose. For that, firstly, we can get first
> > >object offset of the page via runtime calculation instead of
> > >page->index so we can use page->index as link for page chaining.
> > >In case of huge object, it stores handle rather than page chaining.
> > >To identify a huge object, we use the PG_owner_priv_1 flag.
> > >
> > >For migration, it supports three functions
> > >
> > >* zs_page_isolate
> > >
> > >It isolates a zspage, which includes a subpage the VM wants to migrate,
> > >from its class so that no one can allocate a new object from the zspage,
> > >if it's the first isolation on subpages of the zspage. Thus, further
> > >isolation on other subpages cannot isolate the zspage from the class list.
> > >
> > >* zs_page_migrate
> > >
> > >First of all, it holds the write-side zspage->lock to prevent migration
> > >of other subpages in the zspage. Then, it locks all objects in the page
> > >the VM wants to migrate. The reason we should lock all objects in the
> > >page is the race between zs_map_object and zs_page_migrate.
> > >
> > >zs_map_object				zs_page_migrate
> > >
> > >pin_tag(handle)
> > >obj = handle_to_obj(handle)
> > >obj_to_location(obj, &page, &obj_idx);
> > >
> > >					write_lock(&zspage->lock)
> > >					if (!trypin_tag(handle))
> > >						goto unpin_object
> > >
> > >zspage = get_zspage(page);
> > >read_lock(&zspage->lock);
> > >
> > >If zs_page_migrate doesn't do trypin_tag, zs_map_object's page can
> > >be stale, which leads to a crash.
> > >
> > >If it locks all of the objects successfully, it copies the content from
> > >the old page to the new one and, finally, creates a new page chain with
> > >the new page. If it's the last isolated page in the zspage, it puts the
> > >zspage back to the class.
> > >
> > >* zs_page_putback
> > >
> > >It returns isolated zspage to right fullness_group list if it fails to
> > >migrate a page.
> > >
> > >Lastly, this patch introduces asynchronous zspage free. We need it
> > >because we need page_lock to clear PG_movable but, unfortunately, the
> > >zs_free path should be atomic, so the approach is to try to grab
> > >page_lock with preemption disabled. If it gets page_lock on all of the
> > >pages successfully, it can free the zspage in that context. Otherwise,
> > >it queues the free request and frees the zspage via a workqueue in
> > >process context.
> > >
> > >Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> > >Signed-off-by: Minchan Kim <minchan@kernel.org>
> > >---
> > >  include/uapi/linux/magic.h |   1 +
> > >  mm/zsmalloc.c              | 552 +++++++++++++++++++++++++++++++++++++++------
> > >  2 files changed, 487 insertions(+), 66 deletions(-)
> > >
> > >diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
> > >index e1fbe72c39c0..93b1affe4801 100644
> > >--- a/include/uapi/linux/magic.h
> > >+++ b/include/uapi/linux/magic.h
> > >@@ -79,5 +79,6 @@
> > >  #define NSFS_MAGIC		0x6e736673
> > >  #define BPF_FS_MAGIC		0xcafe4a11
> > >  #define BALLOON_KVM_MAGIC	0x13661366
> > >+#define ZSMALLOC_MAGIC		0x58295829
> > >
> > >  #endif /* __LINUX_MAGIC_H__ */
> > >diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > >index 8d82e44c4644..042793015ecf 100644
> > >--- a/mm/zsmalloc.c
> > >+++ b/mm/zsmalloc.c
> > >@@ -17,15 +17,14 @@
> > >   *
> > >   * Usage of struct page fields:
> > >   *	page->private: points to zspage
> > >- *	page->index: offset of the first object starting in this page.
> > >- *		For the first page, this is always 0, so we use this field
> > >- *		to store handle for huge object.
> > >- *	page->next: links together all component pages of a zspage
> > >+ *	page->freelist: links together all component pages of a zspage
> > >+ *		For the huge page, this is always 0, so we use this field
> > >+ *		to store handle.
> > >   *
> > >   * Usage of struct page flags:
> > >   *	PG_private: identifies the first component page
> > >   *	PG_private2: identifies the last component page
> > >- *
> > >+ *	PG_owner_priv_1: indentifies the huge component page
> > >   */
> > >
> > >  #include <linux/module.h>
> > >@@ -47,6 +46,10 @@
> > >  #include <linux/debugfs.h>
> > >  #include <linux/zsmalloc.h>
> > >  #include <linux/zpool.h>
> > >+#include <linux/mount.h>
> > >+#include <linux/migrate.h>
> > >+
> > >+#define ZSPAGE_MAGIC	0x58
> > >
> > >  /*
> > >   * This must be power of 2 and greater than of equal to sizeof(link_free).
> > >@@ -128,8 +131,33 @@
> > >   *  ZS_MIN_ALLOC_SIZE and ZS_SIZE_CLASS_DELTA must be multiple of ZS_ALIGN
> > >   *  (reason above)
> > >   */
> > >+
> > >+/*
> > >+ * A zspage's class index and fullness group
> > >+ * are encoded in its (first)page->mapping
> > >+ */
> > >+#define FULLNESS_BITS	2
> > >+#define CLASS_BITS	8
> > >+#define ISOLATED_BITS	3
> > >+#define MAGIC_VAL_BITS	8
> > >+
> > >+
> > >  #define ZS_SIZE_CLASS_DELTA	(PAGE_SIZE >> CLASS_BITS)
> > >
> > >+struct zspage {
> > >+	struct {
> > >+		unsigned int fullness:FULLNESS_BITS;
> > >+		unsigned int class:CLASS_BITS;
> > >+		unsigned int isolated:ISOLATED_BITS;
> > >+		unsigned int magic:MAGIC_VAL_BITS;
> > >+	};
> > >+	unsigned int inuse;
> > >+	unsigned int freeobj;
> > >+	struct page *first_page;
> > >+	struct list_head list; /* fullness list */
> > >+	rwlock_t lock;
> > >+};
> > >+
> > >  /*
> > >   * We do not maintain any list for completely empty or full pages
> > >   */
> > >@@ -161,6 +189,8 @@ struct zs_size_stat {
> > >  static struct dentry *zs_stat_root;
> > >  #endif
> > >
> > >+static struct vfsmount *zsmalloc_mnt;
> > >+
> > >  /*
> > >   * number of size_classes
> > >   */
> > >@@ -243,24 +273,10 @@ struct zs_pool {
> > >  #ifdef CONFIG_ZSMALLOC_STAT
> > >  	struct dentry *stat_dentry;
> > >  #endif
> > >-};
> > >-
> > >-/*
> > >- * A zspage's class index and fullness group
> > >- * are encoded in its (first)page->mapping
> > >- */
> > >-#define FULLNESS_BITS	2
> > >-#define CLASS_BITS	8
> > >-
> > >-struct zspage {
> > >-	struct {
> > >-		unsigned int fullness:FULLNESS_BITS;
> > >-		unsigned int class:CLASS_BITS;
> > >-	};
> > >-	unsigned int inuse;
> > >-	unsigned int freeobj;
> > >-	struct page *first_page;
> > >-	struct list_head list; /* fullness list */
> > >+	struct inode *inode;
> > >+	spinlock_t free_lock;
> > >+	struct work_struct free_work;
> > >+	struct list_head free_zspage;
> > >  };
> > >
> > >  struct mapping_area {
> > >@@ -312,8 +328,11 @@ static struct zspage *cache_alloc_zspage(struct zs_pool *pool, gfp_t flags)
> > >  	struct zspage *zspage;
> > >
> > >  	zspage = kmem_cache_alloc(pool->zspage_cachep, flags & ~__GFP_HIGHMEM);
> > >-	if (zspage)
> > >+	if (zspage) {
> > >  		memset(zspage, 0, sizeof(struct zspage));
> > >+		zspage->magic = ZSPAGE_MAGIC;
> > >+		rwlock_init(&zspage->lock);
> > 
> > +              INIT_LIST_HEAD(&zspage->list);
> > 
> > If there is no special intention here,
> > I think we need the list initialization.
> 
> Intention was that I just watned to add unncessary instruction there

                     I just don't want to add an unnecessary instruction there
Typo. :)
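
For readers following the INIT_LIST_HEAD() question above, here is a
minimal userspace sketch of what is at stake. It is not code from the
patch: struct zspage_sketch and the helper names are hypothetical, and
the list helpers only mirror what the kernel's INIT_LIST_HEAD() and
list_empty() do. It shows that a list_head cleared by memset() does not
look empty to a list_empty()-style check, while an initialized one does.
Whether the patch really needs the initialization depends on how
zspage->list is used before its first list_add(), which this message
does not show.

```c
#include <assert.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Mirrors INIT_LIST_HEAD(): an empty list points at itself. */
void init_list_head_sketch(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Mirrors list_empty(): true only if next points back at the head. */
int list_empty_sketch(const struct list_head *head)
{
	return head->next == head;
}

/* Hypothetical, trimmed-down zspage: only the fields needed here. */
struct zspage_sketch {
	unsigned int inuse;
	struct list_head list;
};

/* memset() alone leaves next/prev as NULL, so the check reports "not empty". */
int zeroed_list_is_empty(void)
{
	struct zspage_sketch z;

	memset(&z, 0, sizeof(z));
	return list_empty_sketch(&z.list);
}

/* With explicit initialization the same check reports "empty". */
int initialized_list_is_empty(void)
{
	struct zspage_sketch z;

	memset(&z, 0, sizeof(z));
	init_list_head_sketch(&z.list);
	return list_empty_sketch(&z.list);
}

/* Runs both cases and aborts if either differs from the description. */
void check_list_init_sketch(void)
{
	assert(zeroed_list_is_empty() == 0);
	assert(initialized_list_is_empty() == 1);
}
```

Calling check_list_init_sketch() from a small main() exercises both
cases. The trade-off discussed in the thread is the cost of the extra
stores in the allocation path against the safety of having a
well-formed list head before anything inspects or deletes from it.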


