* [RFC v1 1/6] gcma: introduce contiguous memory allocator
2014-11-11 15:00 [RFC v1 0/6] introduce gcma SeongJae Park
@ 2014-11-11 15:00 ` SeongJae Park
2014-11-11 15:00 ` [RFC v1 2/6] gcma: utilize reserved memory as swap cache SeongJae Park
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2014-11-11 15:00 UTC (permalink / raw)
To: akpm; +Cc: lauraa, minchan, sergey.senozhatsky, linux-mm, SeongJae Park
This patch introduces a simple contiguous memory allocator: a bitmap-based
allocator that manages a reserved contiguous memory area.
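As a quick reference, a minimal usage sketch of the interface added here could
look like the snippet below. The wrapper functions and the single module-wide
area are hypothetical illustrations, not part of this patch, and error handling
is trimmed:
```
#include <linux/gcma.h>

static struct gcma *my_gcma;	/* hypothetical: one area for this example */

/* Hand a previously reserved, page-backed PFN range over to gcma. */
static int __init my_gcma_setup(unsigned long start_pfn, unsigned long nr_pages)
{
	/* Builds the bitmap that tracks per-page allocation state. */
	return gcma_init(start_pfn, nr_pages, &my_gcma);
}

/* Claim a sub-range of the area, use it, then give it back. */
static int my_use_contig(unsigned long start_pfn, unsigned long nr_pages)
{
	int ret;

	ret = gcma_alloc_contig(my_gcma, start_pfn, nr_pages);
	if (ret)
		return ret;

	/* ... use pfn_to_page(start_pfn) .. pfn_to_page(start_pfn + nr_pages - 1) ... */

	gcma_free_contig(my_gcma, start_pfn, nr_pages);
	return 0;
}
```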
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
---
include/linux/gcma.h | 26 ++++++++
mm/gcma.c | 173 +++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 199 insertions(+)
create mode 100644 include/linux/gcma.h
create mode 100644 mm/gcma.c
diff --git a/include/linux/gcma.h b/include/linux/gcma.h
new file mode 100644
index 0000000..3016968
--- /dev/null
+++ b/include/linux/gcma.h
@@ -0,0 +1,26 @@
+/*
+ * gcma.h - Guaranteed Contiguous Memory Allocator
+ *
+ * GCMA aims for contiguous memory allocation with success and fast
+ * latency guarantee.
+ * It reserves large amount of memory and let it be allocated to the
+ * contiguous memory request.
+ *
+ * Copyright (C) 2014 LG Electronics Inc.,
+ * Copyright (C) 2014 Minchan Kim <minchan@kernel.org>
+ * Copyright (C) 2014 SeongJae Park <sj38.park@gmail.com>
+ */
+
+#ifndef _LINUX_GCMA_H
+#define _LINUX_GCMA_H
+
+struct gcma;
+
+int gcma_init(unsigned long start_pfn, unsigned long size,
+ struct gcma **res_gcma);
+int gcma_alloc_contig(struct gcma *gcma,
+ unsigned long start_pfn, unsigned long size);
+void gcma_free_contig(struct gcma *gcma,
+ unsigned long start_pfn, unsigned long size);
+
+#endif /* _LINUX_GCMA_H */
diff --git a/mm/gcma.c b/mm/gcma.c
new file mode 100644
index 0000000..20a8473
--- /dev/null
+++ b/mm/gcma.c
@@ -0,0 +1,173 @@
+/*
+ * gcma.c - Guaranteed Contiguous Memory Allocator
+ *
+ * GCMA aims for contiguous memory allocation with success and fast
+ * latency guarantee.
+ * It reserves large amount of memory and let it be allocated to the
+ * contiguous memory request.
+ *
+ * Copyright (C) 2014 LG Electronics Inc.,
+ * Copyright (C) 2014 Minchan Kim <minchan@kernel.org>
+ * Copyright (C) 2014 SeongJae Park <sj38.park@gmail.com>
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/highmem.h>
+#include <linux/gcma.h>
+
+struct gcma {
+ spinlock_t lock;
+ unsigned long *bitmap;
+ unsigned long base_pfn, size;
+ struct list_head list;
+};
+
+struct gcma_info {
+ spinlock_t lock; /* protect list */
+ struct list_head head;
+};
+
+static struct gcma_info ginfo = {
+ .head = LIST_HEAD_INIT(ginfo.head),
+ .lock = __SPIN_LOCK_UNLOCKED(ginfo.lock),
+};
+
+/*
+ * gcma_init - initializes a contiguous memory area
+ *
+ * @start_pfn start pfn of contiguous memory area
+ * @size number of pages in the contiguous memory area
+ * @res_gcma pointer to store the created gcma region
+ *
+ * Returns 0 on success, error code on failure.
+ */
+int gcma_init(unsigned long start_pfn, unsigned long size,
+ struct gcma **res_gcma)
+{
+ int bitmap_size = BITS_TO_LONGS(size) * sizeof(long);
+ struct gcma *gcma;
+
+ gcma = kmalloc(sizeof(*gcma), GFP_KERNEL);
+ if (!gcma)
+ goto out;
+
+ gcma->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
+ if (!gcma->bitmap)
+ goto free_cma;
+
+ gcma->size = size;
+ gcma->base_pfn = start_pfn;
+ spin_lock_init(&gcma->lock);
+
+ spin_lock(&ginfo.lock);
+ list_add(&gcma->list, &ginfo.head);
+ spin_unlock(&ginfo.lock);
+
+ *res_gcma = gcma;
+ pr_info("initialized gcma area [%lu, %lu]\n",
+ start_pfn, start_pfn + size);
+ return 0;
+
+free_cma:
+ kfree(gcma);
+out:
+ return -ENOMEM;
+}
+
+static struct page *gcma_alloc_page(struct gcma *gcma)
+{
+ unsigned long bit;
+ unsigned long *bitmap = gcma->bitmap;
+ struct page *page = NULL;
+
+ spin_lock(&gcma->lock);
+ bit = bitmap_find_next_zero_area(bitmap, gcma->size, 0, 1, 0);
+ if (bit >= gcma->size) {
+ spin_unlock(&gcma->lock);
+ goto out;
+ }
+
+ bitmap_set(bitmap, bit, 1);
+ page = pfn_to_page(gcma->base_pfn + bit);
+ spin_unlock(&gcma->lock);
+
+out:
+ return page;
+}
+
+static void gcma_free_page(struct gcma *gcma, struct page *page)
+{
+ unsigned long pfn, offset;
+
+ pfn = page_to_pfn(page);
+
+ spin_lock(&gcma->lock);
+ offset = pfn - gcma->base_pfn;
+
+ bitmap_clear(gcma->bitmap, offset, 1);
+ spin_unlock(&gcma->lock);
+}
+
+/*
+ * gcma_alloc_contig - allocates contiguous pages
+ *
+ * @start_pfn start pfn of requiring contiguous memory area
+ * @size size of the requiring contiguous memory area
+ *
+ * Returns 0 on success, error code on failure.
+ */
+int gcma_alloc_contig(struct gcma *gcma, unsigned long start_pfn,
+ unsigned long size)
+{
+ unsigned long offset;
+
+ spin_lock(&gcma->lock);
+ offset = start_pfn - gcma->base_pfn;
+
+ if (bitmap_find_next_zero_area(gcma->bitmap, gcma->size, offset,
+ size, 0) != 0) {
+ spin_unlock(&gcma->lock);
+ pr_warn("already allocated region required: %lu, %lu",
+ start_pfn, size);
+ return -EINVAL;
+ }
+
+ bitmap_set(gcma->bitmap, offset, size);
+ spin_unlock(&gcma->lock);
+
+ return 0;
+}
+
+/*
+ * gcma_free_contig - free allocated contiguous pages
+ *
+ * @start_pfn start pfn of freeing contiguous memory area
+ * @size number of pages in freeing contiguous memory area
+ */
+void gcma_free_contig(struct gcma *gcma,
+ unsigned long start_pfn, unsigned long size)
+{
+ unsigned long offset;
+
+ spin_lock(&gcma->lock);
+ offset = start_pfn - gcma->base_pfn;
+ bitmap_clear(gcma->bitmap, offset, size);
+ spin_unlock(&gcma->lock);
+}
+
+static int __init init_gcma(void)
+{
+ pr_info("loading gcma\n");
+
+ return 0;
+}
+
+module_init(init_gcma);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Minchan Kim <minchan@kernel.org>");
+MODULE_AUTHOR("SeongJae Park <sj38.park@gmail.com>");
+MODULE_DESCRIPTION("Guaranteed Contiguous Memory Allocator");
--
1.9.1
* [RFC v1 2/6] gcma: utilize reserved memory as swap cache
2014-11-11 15:00 [RFC v1 0/6] introduce gcma SeongJae Park
2014-11-11 15:00 ` [RFC v1 1/6] gcma: introduce contiguous memory allocator SeongJae Park
@ 2014-11-11 15:00 ` SeongJae Park
2014-11-11 15:00 ` [RFC v1 3/6] gcma: evict frontswap pages in LRU order when memory is full SeongJae Park
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2014-11-11 15:00 UTC (permalink / raw)
To: akpm; +Cc: lauraa, minchan, sergey.senozhatsky, linux-mm, SeongJae Park
GCMA reserves an amount of memory during boot, and that memory should always
be available to the guest of the area. However, the guest does not need it
all the time, so this patch uses the reserved memory as a swap cache via
write-through frontswap for memory efficiency.
Whenever the guest declares that it needs the area, we can discard the whole
swap cache because, with write-through frontswap, every piece of data is
already on the swap device; a later swap-in that misses in the cache simply
falls back to reading the swap device. This keeps allocation latency for the
guest very small.
The drawback of this approach is that it could degrade system performance due
to earlier swapout caused by the reservation if the user makes the GCMA area
big (e.g., 1/3 of system memory) and the swap-cache hit ratio is low.
It is a trade-off for guaranteed, low-latency contiguous memory allocation.
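As a rough sketch of the swap-out flow this relies on (my reading of
frontswap's write-through mode; simplified, not the exact call chain):
```
/*
 * Swap-out with write-through frontswap (simplified):
 *
 *   swap_writepage(page)
 *     -> frontswap_store(page)    gcma keeps a copy in the reserved area,
 *                                 but write-through mode makes the store
 *                                 report "not stored" to the swap layer
 *     -> normal swap writeback    so the page is still written to the
 *                                 swap device as usual
 *
 * Because an on-disk copy therefore always exists, gcma may drop its cached
 * copy at any moment; a swap-in that misses in frontswap simply reads the
 * page back from the swap device.
 */
```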
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
---
include/linux/gcma.h | 2 +-
mm/gcma.c | 330 ++++++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 330 insertions(+), 2 deletions(-)
diff --git a/include/linux/gcma.h b/include/linux/gcma.h
index 3016968..d733a9b 100644
--- a/include/linux/gcma.h
+++ b/include/linux/gcma.h
@@ -4,7 +4,7 @@
* GCMA aims for contiguous memory allocation with success and fast
* latency guarantee.
* It reserves large amount of memory and let it be allocated to the
- * contiguous memory request.
+ * contiguous memory request and utilize them as swap cache.
*
* Copyright (C) 2014 LG Electronics Inc.,
* Copyright (C) 2014 Minchan Kim <minchan@kernel.org>
diff --git a/mm/gcma.c b/mm/gcma.c
index 20a8473..ddfc0d8 100644
--- a/mm/gcma.c
+++ b/mm/gcma.c
@@ -4,7 +4,7 @@
* GCMA aims for contiguous memory allocation with success and fast
* latency guarantee.
* It reserves large amount of memory and let it be allocated to the
- * contiguous memory request.
+ * contiguous memory request and utilize as swap cache using frontswap.
*
* Copyright (C) 2014 LG Electronics Inc.,
* Copyright (C) 2014 Minchan Kim <minchan@kernel.org>
@@ -15,6 +15,7 @@
#include <linux/module.h>
#include <linux/slab.h>
+#include <linux/frontswap.h>
#include <linux/highmem.h>
#include <linux/gcma.h>
@@ -35,6 +36,42 @@ static struct gcma_info ginfo = {
.lock = __SPIN_LOCK_UNLOCKED(ginfo.lock),
};
+struct swap_slot_entry {
+ struct gcma *gcma;
+ struct rb_node rbnode;
+ pgoff_t offset;
+ struct page *page;
+ atomic_t refcount;
+};
+
+struct frontswap_tree {
+ struct rb_root rbroot;
+ spinlock_t lock;
+};
+
+static struct frontswap_tree *gcma_swap_trees[MAX_SWAPFILES];
+static struct kmem_cache *swap_slot_entry_cache;
+
+static struct frontswap_tree *swap_tree(struct page *page)
+{
+ return (struct frontswap_tree *)page->mapping;
+}
+
+static void set_swap_tree(struct page *page, struct frontswap_tree *tree)
+{
+ page->mapping = (struct address_space *)tree;
+}
+
+static struct swap_slot_entry *swap_slot(struct page *page)
+{
+ return (struct swap_slot_entry *)page->index;
+}
+
+static void set_swap_slot(struct page *page, struct swap_slot_entry *slot)
+{
+ page->index = (pgoff_t)slot;
+}
+
/*
* gcma_init - initializes a contiguous memory area
*
@@ -112,6 +149,286 @@ static void gcma_free_page(struct gcma *gcma, struct page *page)
}
/*
+ * In the case that a entry with the same offset is found, a pointer to
+ * the existing entry is stored in dupentry and the function returns -EEXIST.
+ */
+static int frontswap_rb_insert(struct rb_root *root,
+ struct swap_slot_entry *entry,
+ struct swap_slot_entry **dupentry)
+{
+ struct rb_node **link = &root->rb_node, *parent = NULL;
+ struct swap_slot_entry *myentry;
+
+ while (*link) {
+ parent = *link;
+ myentry = rb_entry(parent, struct swap_slot_entry, rbnode);
+ if (myentry->offset > entry->offset)
+ link = &(*link)->rb_left;
+ else if (myentry->offset < entry->offset)
+ link = &(*link)->rb_right;
+ else {
+ *dupentry = myentry;
+ return -EEXIST;
+ }
+ }
+ rb_link_node(&entry->rbnode, parent, link);
+ rb_insert_color(&entry->rbnode, root);
+ return 0;
+}
+
+static void frontswap_rb_erase(struct rb_root *root,
+ struct swap_slot_entry *entry)
+{
+ if (!RB_EMPTY_NODE(&entry->rbnode)) {
+ rb_erase(&entry->rbnode, root);
+ RB_CLEAR_NODE(&entry->rbnode);
+ }
+}
+
+static struct swap_slot_entry *frontswap_rb_search(struct rb_root *root,
+ pgoff_t offset)
+{
+ struct rb_node *node = root->rb_node;
+ struct swap_slot_entry *entry;
+
+ while (node) {
+ entry = rb_entry(node, struct swap_slot_entry, rbnode);
+ if (entry->offset > offset)
+ node = node->rb_left;
+ else if (entry->offset < offset)
+ node = node->rb_right;
+ else
+ return entry;
+ }
+ return NULL;
+}
+
+/* Allocates a page from gcma areas using round-robin way */
+static struct page *frontswap_alloc_page(struct gcma **res_gcma)
+{
+ struct page *page;
+ struct gcma *gcma;
+
+ spin_lock(&ginfo.lock);
+ gcma = list_first_entry(&ginfo.head, struct gcma, list);
+ list_move_tail(&gcma->list, &ginfo.head);
+
+ list_for_each_entry(gcma, &ginfo.head, list) {
+ page = gcma_alloc_page(gcma);
+ if (page) {
+ *res_gcma = gcma;
+ goto out;
+ }
+ }
+
+out:
+ spin_unlock(&ginfo.lock);
+ *res_gcma = gcma;
+ return page;
+}
+
+static void frontswap_free_entry(struct swap_slot_entry *entry)
+{
+ gcma_free_page(entry->gcma, entry->page);
+ kmem_cache_free(swap_slot_entry_cache, entry);
+}
+
+/* Caller should hold frontswap tree spinlock */
+static void swap_slot_entry_get(struct swap_slot_entry *entry)
+{
+ atomic_inc(&entry->refcount);
+}
+
+/*
+ * Caller should hold frontswap tree spinlock.
+ * Remove from the tree and free it, if nobody reference the entry.
+ */
+static void swap_slot_entry_put(struct frontswap_tree *tree,
+ struct swap_slot_entry *entry)
+{
+ int refcount = atomic_dec_return(&entry->refcount);
+
+ BUG_ON(refcount < 0);
+
+ if (refcount == 0) {
+ frontswap_rb_erase(&tree->rbroot, entry);
+ frontswap_free_entry(entry);
+ }
+}
+
+/* Caller should hold frontswap tree spinlock */
+static struct swap_slot_entry *frontswap_find_get(struct frontswap_tree *tree,
+ pgoff_t offset)
+{
+ struct swap_slot_entry *entry;
+ struct rb_root *root = &tree->rbroot;
+
+ assert_spin_locked(&tree->lock);
+ entry = frontswap_rb_search(root, offset);
+ if (entry)
+ swap_slot_entry_get(entry);
+
+ return entry;
+}
+
+void gcma_frontswap_init(unsigned type)
+{
+ struct frontswap_tree *tree;
+
+ tree = kzalloc(sizeof(struct frontswap_tree), GFP_KERNEL);
+ if (!tree) {
+ pr_warn("front swap tree for type %d failed to alloc\n", type);
+ return;
+ }
+
+ tree->rbroot = RB_ROOT;
+ spin_lock_init(&tree->lock);
+ gcma_swap_trees[type] = tree;
+}
+
+int gcma_frontswap_store(unsigned type, pgoff_t offset,
+ struct page *page)
+{
+ struct swap_slot_entry *entry, *dupentry;
+ struct gcma *gcma;
+ struct page *gcma_page = NULL;
+ struct frontswap_tree *tree = gcma_swap_trees[type];
+ u8 *src, *dst;
+ int ret;
+
+ if (!tree) {
+ WARN(1, "frontswap tree for type %d is not exist\n",
+ type);
+ return -ENODEV;
+ }
+
+ gcma_page = frontswap_alloc_page(&gcma);
+ if (!gcma_page)
+ return -ENOMEM;
+
+ entry = kmem_cache_alloc(swap_slot_entry_cache, GFP_NOIO);
+ if (!entry) {
+ gcma_free_page(gcma, gcma_page);
+ return -ENOMEM;
+ }
+
+ entry->gcma = gcma;
+ entry->page = gcma_page;
+ entry->offset = offset;
+ atomic_set(&entry->refcount, 1);
+ RB_CLEAR_NODE(&entry->rbnode);
+
+ set_swap_tree(gcma_page, tree);
+ set_swap_slot(gcma_page, entry);
+
+ /* copy from orig data to gcma-page */
+ src = kmap_atomic(page);
+ dst = kmap_atomic(gcma_page);
+ memcpy(dst, src, PAGE_SIZE);
+ kunmap_atomic(src);
+ kunmap_atomic(dst);
+
+ spin_lock(&tree->lock);
+ do {
+ /*
+ * Though this duplication scenario may happen rarely by
+ * race of swap layer, we handle this case here rather
+ * than fix swap layer because handling the possibility of
+ * duplicates is part of the tmem ABI.
+ */
+ ret = frontswap_rb_insert(&tree->rbroot, entry, &dupentry);
+ if (ret == -EEXIST) {
+ frontswap_rb_erase(&tree->rbroot, dupentry);
+ swap_slot_entry_put(tree, dupentry);
+ }
+ } while (ret == -EEXIST);
+ spin_unlock(&tree->lock);
+
+ return ret;
+}
+
+/*
+ * Returns 0 if success,
+ * Returns non-zero if failed.
+ */
+int gcma_frontswap_load(unsigned type, pgoff_t offset,
+ struct page *page)
+{
+ struct frontswap_tree *tree = gcma_swap_trees[type];
+ struct swap_slot_entry *entry;
+ struct page *gcma_page;
+ u8 *src, *dst;
+
+ if (!tree) {
+ WARN(1, "tree for type %d not exist\n", type);
+ return -1;
+ }
+
+ spin_lock(&tree->lock);
+ entry = frontswap_find_get(tree, offset);
+ spin_unlock(&tree->lock);
+ if (!entry)
+ return -1;
+
+ gcma_page = entry->page;
+ src = kmap_atomic(gcma_page);
+ dst = kmap_atomic(page);
+ memcpy(dst, src, PAGE_SIZE);
+ kunmap_atomic(src);
+ kunmap_atomic(dst);
+
+ spin_lock(&tree->lock);
+ swap_slot_entry_put(tree, entry);
+ spin_unlock(&tree->lock);
+
+ return 0;
+}
+
+void gcma_frontswap_invalidate_page(unsigned type, pgoff_t offset)
+{
+ struct frontswap_tree *tree = gcma_swap_trees[type];
+ struct swap_slot_entry *entry;
+
+ spin_lock(&tree->lock);
+ entry = frontswap_rb_search(&tree->rbroot, offset);
+ if (!entry) {
+ spin_unlock(&tree->lock);
+ return;
+ }
+
+ swap_slot_entry_put(tree, entry);
+ spin_unlock(&tree->lock);
+}
+
+void gcma_frontswap_invalidate_area(unsigned type)
+{
+ struct frontswap_tree *tree = gcma_swap_trees[type];
+ struct swap_slot_entry *entry, *n;
+
+ if (!tree)
+ return;
+
+ spin_lock(&tree->lock);
+ rbtree_postorder_for_each_entry_safe(entry, n, &tree->rbroot, rbnode) {
+ frontswap_rb_erase(&tree->rbroot, entry);
+ swap_slot_entry_put(tree, entry);
+ }
+ tree->rbroot = RB_ROOT;
+ spin_unlock(&tree->lock);
+
+ kfree(tree);
+ gcma_swap_trees[type] = NULL;
+}
+
+static struct frontswap_ops gcma_frontswap_ops = {
+ .init = gcma_frontswap_init,
+ .store = gcma_frontswap_store,
+ .load = gcma_frontswap_load,
+ .invalidate_page = gcma_frontswap_invalidate_page,
+ .invalidate_area = gcma_frontswap_invalidate_area
+};
+
+/*
* gcma_alloc_contig - allocates contiguous pages
*
* @start_pfn start pfn of requiring contiguous memory area
@@ -162,6 +479,17 @@ static int __init init_gcma(void)
{
pr_info("loading gcma\n");
+ swap_slot_entry_cache = KMEM_CACHE(swap_slot_entry, 0);
+ if (swap_slot_entry_cache == NULL)
+ return -ENOMEM;
+
+ /*
+ * By writethough mode, GCMA could discard all of pages in an instant
+ * instead of slow writing pages out to the swap device.
+ */
+ frontswap_writethrough(true);
+ frontswap_register_ops(&gcma_frontswap_ops);
+
return 0;
}
--
1.9.1
* [RFC v1 3/6] gcma: evict frontswap pages in LRU order when memory is full
2014-11-11 15:00 [RFC v1 0/6] introduce gcma SeongJae Park
2014-11-11 15:00 ` [RFC v1 1/6] gcma: introduce contiguous memory allocator SeongJae Park
2014-11-11 15:00 ` [RFC v1 2/6] gcma: utilize reserved memory as swap cache SeongJae Park
@ 2014-11-11 15:00 ` SeongJae Park
2014-11-11 15:00 ` [RFC v1 4/6] gcma: discard swap cache pages to meet successful GCMA allocation SeongJae Park
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2014-11-11 15:00 UTC (permalink / raw)
To: akpm; +Cc: lauraa, minchan, sergey.senozhatsky, linux-mm, SeongJae Park
GCMA uses free pages of the reserved space as swap cache, so over time we may
run short of free space and need to drain some swap cache pages to make room
for newly swapped-out pages.
For that, GCMA manages the swap cache in LRU order so that active pages can be
kept in memory whenever possible. This should give a higher swap-cache hit
ratio than random eviction.
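Concretely, the LRU bookkeeping added by this patch can be summarized as
follows (a condensed view of the diff below, not new code):
```
/*
 * LRU maintenance for frontswap-backed pages (all under slru_lock):
 *
 *   store:  list_add(&page->lru, &slru_list)    new page becomes MRU
 *   load:   list_move(&page->lru, &slru_list)   accessed page becomes MRU
 *   evict:  walk slru_list from the tail        coldest pages are dropped,
 *           each entry pinned with atomic_inc_not_zero() so a concurrent
 *           free cannot race with the eviction
 */
```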
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
---
mm/gcma.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 88 insertions(+), 5 deletions(-)
diff --git a/mm/gcma.c b/mm/gcma.c
index ddfc0d8..d459116 100644
--- a/mm/gcma.c
+++ b/mm/gcma.c
@@ -19,6 +19,9 @@
#include <linux/highmem.h>
#include <linux/gcma.h>
+/* XXX: What's the ideal? */
+#define NR_EVICT_BATCH 32
+
struct gcma {
spinlock_t lock;
unsigned long *bitmap;
@@ -49,9 +52,13 @@ struct frontswap_tree {
spinlock_t lock;
};
+static LIST_HEAD(slru_list); /* LRU list of swap cache */
+static spinlock_t slru_lock; /* protect slru_list */
static struct frontswap_tree *gcma_swap_trees[MAX_SWAPFILES];
static struct kmem_cache *swap_slot_entry_cache;
+static unsigned long evict_frontswap_pages(unsigned long nr_pages);
+
static struct frontswap_tree *swap_tree(struct page *page)
{
return (struct frontswap_tree *)page->mapping;
@@ -209,6 +216,7 @@ static struct page *frontswap_alloc_page(struct gcma **res_gcma)
struct page *page;
struct gcma *gcma;
+retry:
spin_lock(&ginfo.lock);
gcma = list_first_entry(&ginfo.head, struct gcma, list);
list_move_tail(&gcma->list, &ginfo.head);
@@ -216,13 +224,18 @@ static struct page *frontswap_alloc_page(struct gcma **res_gcma)
list_for_each_entry(gcma, &ginfo.head, list) {
page = gcma_alloc_page(gcma);
if (page) {
- *res_gcma = gcma;
- goto out;
+ spin_unlock(&ginfo.lock);
+ goto got;
}
}
-
-out:
spin_unlock(&ginfo.lock);
+
+ /* Failed to alloc a page from entire gcma. Evict adequate LRU
+ * frontswap slots and try allocation again */
+ if (evict_frontswap_pages(NR_EVICT_BATCH))
+ goto retry;
+
+got:
*res_gcma = gcma;
return page;
}
@@ -240,7 +253,7 @@ static void swap_slot_entry_get(struct swap_slot_entry *entry)
}
/*
- * Caller should hold frontswap tree spinlock.
+ * Caller should hold frontswap tree spinlock and slru_lock.
* Remove from the tree and free it, if nobody reference the entry.
*/
static void swap_slot_entry_put(struct frontswap_tree *tree,
@@ -251,11 +264,67 @@ static void swap_slot_entry_put(struct frontswap_tree *tree,
BUG_ON(refcount < 0);
if (refcount == 0) {
+ struct page *page = entry->page;
+
frontswap_rb_erase(&tree->rbroot, entry);
+ list_del(&page->lru);
+
frontswap_free_entry(entry);
}
}
+/*
+ * evict_frontswap_pages - evict @nr_pages LRU frontswap backed pages
+ *
+ * @nr_pages number of LRU pages to be evicted
+ *
+ * Returns number of successfully evicted pages
+ */
+static unsigned long evict_frontswap_pages(unsigned long nr_pages)
+{
+ struct frontswap_tree *tree;
+ struct swap_slot_entry *entry;
+ struct page *page, *n;
+ unsigned long evicted = 0;
+ LIST_HEAD(free_pages);
+
+ spin_lock(&slru_lock);
+ list_for_each_entry_safe_reverse(page, n, &slru_list, lru) {
+ entry = swap_slot(page);
+
+ /*
+ * the entry could be free by other thread in the while.
+ * check whether the situation occurred and avoid others to
+ * free it by compare reference count and increase it
+ * atomically.
+ */
+ if (!atomic_inc_not_zero(&entry->refcount))
+ continue;
+
+ list_move(&page->lru, &free_pages);
+ if (++evicted >= nr_pages)
+ break;
+ }
+ spin_unlock(&slru_lock);
+
+ list_for_each_entry_safe(page, n, &free_pages, lru) {
+ tree = swap_tree(page);
+ entry = swap_slot(page);
+
+ spin_lock(&tree->lock);
+ spin_lock(&slru_lock);
+ /* drop refcount increased by above loop */
+ swap_slot_entry_put(tree, entry);
+ /* free entry if the entry is still in tree */
+ if (frontswap_rb_search(&tree->rbroot, entry->offset))
+ swap_slot_entry_put(tree, entry);
+ spin_unlock(&slru_lock);
+ spin_unlock(&tree->lock);
+ }
+
+ return evicted;
+}
+
/* Caller should hold frontswap tree spinlock */
static struct swap_slot_entry *frontswap_find_get(struct frontswap_tree *tree,
pgoff_t offset)
@@ -339,9 +408,15 @@ int gcma_frontswap_store(unsigned type, pgoff_t offset,
ret = frontswap_rb_insert(&tree->rbroot, entry, &dupentry);
if (ret == -EEXIST) {
frontswap_rb_erase(&tree->rbroot, dupentry);
+ spin_lock(&slru_lock);
swap_slot_entry_put(tree, dupentry);
+ spin_unlock(&slru_lock);
}
} while (ret == -EEXIST);
+
+ spin_lock(&slru_lock);
+ list_add(&gcma_page->lru, &slru_list);
+ spin_unlock(&slru_lock);
spin_unlock(&tree->lock);
return ret;
@@ -378,7 +453,10 @@ int gcma_frontswap_load(unsigned type, pgoff_t offset,
kunmap_atomic(dst);
spin_lock(&tree->lock);
+ spin_lock(&slru_lock);
+ list_move(&gcma_page->lru, &slru_list);
swap_slot_entry_put(tree, entry);
+ spin_unlock(&slru_lock);
spin_unlock(&tree->lock);
return 0;
@@ -396,7 +474,9 @@ void gcma_frontswap_invalidate_page(unsigned type, pgoff_t offset)
return;
}
+ spin_lock(&slru_lock);
swap_slot_entry_put(tree, entry);
+ spin_unlock(&slru_lock);
spin_unlock(&tree->lock);
}
@@ -411,7 +491,9 @@ void gcma_frontswap_invalidate_area(unsigned type)
spin_lock(&tree->lock);
rbtree_postorder_for_each_entry_safe(entry, n, &tree->rbroot, rbnode) {
frontswap_rb_erase(&tree->rbroot, entry);
+ spin_lock(&slru_lock);
swap_slot_entry_put(tree, entry);
+ spin_unlock(&slru_lock);
}
tree->rbroot = RB_ROOT;
spin_unlock(&tree->lock);
@@ -479,6 +561,7 @@ static int __init init_gcma(void)
{
pr_info("loading gcma\n");
+ spin_lock_init(&slru_lock);
swap_slot_entry_cache = KMEM_CACHE(swap_slot_entry, 0);
if (swap_slot_entry_cache == NULL)
return -ENOMEM;
--
1.9.1
* [RFC v1 4/6] gcma: discard swap cache pages to meet successful GCMA allocation
2014-11-11 15:00 [RFC v1 0/6] introduce gcma SeongJae Park
` (2 preceding siblings ...)
2014-11-11 15:00 ` [RFC v1 3/6] gcma: evict frontswap pages in LRU order when memory is full SeongJae Park
@ 2014-11-11 15:00 ` SeongJae Park
2014-11-11 15:00 ` [RFC v1 5/6] gcma: export statistical data on debugfs SeongJae Park
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2014-11-11 15:00 UTC (permalink / raw)
To: akpm; +Cc: lauraa, minchan, sergey.senozhatsky, linux-mm, SeongJae Park
GCMA's goal is to allocate contiguous memory successfully at any time while
also using the reserved memory space efficiently.
For memory efficiency, we allow the reserved space to be used as swap cache,
so we must be able to drain those swap cache pages whenever a GCMA user wants
contiguous memory, at any time.
We simply discard swap cache pages if needed.
That is safe because we use frontswap in write-through mode, so all of the
data is already on disk.
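For orientation, a condensed view of what the reworked allocation path does
per page (a summary of the diff below; locking order and the
isolate_interrupted() retry are simplified):
```
/*
 * gcma_alloc_contig(), for each pfn in the requested range:
 *
 *   free in the bitmap      -> set the bit and mark the page GF_ISOLATED
 *   on the swap LRU         -> pin it, clear GF_SWAP_LRU, mark GF_RECLAIMING
 *                              and queue it for discard
 *   transient race (store / -> mark GF_RECLAIMING so whoever holds it
 *   invalidate in flight)      isolates the page when freeing it
 *
 * The queued swap-cache entries are then dropped (safe under write-through
 * frontswap) and any pfns not yet isolated are retried.
 */
```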
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
---
mm/gcma.c | 192 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 181 insertions(+), 11 deletions(-)
diff --git a/mm/gcma.c b/mm/gcma.c
index d459116..9c07128 100644
--- a/mm/gcma.c
+++ b/mm/gcma.c
@@ -80,6 +80,50 @@ static void set_swap_slot(struct page *page, struct swap_slot_entry *slot)
}
/*
+ * Flags for status of a page in gcma
+ *
+ * GF_SWAP_LRU
+ * The page is being used for frontswap and hang on frontswap LRU list.
+ * It can be drained for contiguous memory allocation anytime.
+ * Protected by slru_lock.
+ *
+ * GF_RECLAIMING
+ * The page is being draining for contiguous memory allocation.
+ * Frontswap guests should not use it.
+ * Protected by slru_lock.
+ *
+ * GF_ISOLATED
+ * The page is isolated for contiguous memory allocation.
+ * GCMA guests can use the page safely while frontswap guests should not.
+ * Protected by gcma->lock.
+ */
+enum gpage_flags {
+ GF_SWAP_LRU = 0x1,
+ GF_RECLAIMING = 0x2,
+ GF_ISOLATED = 0x4,
+};
+
+static int gpage_flag(struct page *page, int flag)
+{
+ return page->private & flag;
+}
+
+static void set_gpage_flag(struct page *page, int flag)
+{
+ page->private |= flag;
+}
+
+static void clear_gpage_flag(struct page *page, int flag)
+{
+ page->private &= ~flag;
+}
+
+static void clear_gpage_flagall(struct page *page)
+{
+ page->private = 0;
+}
+
+/*
* gcma_init - initializes a contiguous memory area
*
* @start_pfn start pfn of contiguous memory area
@@ -137,11 +181,13 @@ static struct page *gcma_alloc_page(struct gcma *gcma)
bitmap_set(bitmap, bit, 1);
page = pfn_to_page(gcma->base_pfn + bit);
spin_unlock(&gcma->lock);
+ clear_gpage_flagall(page);
out:
return page;
}
+/* Caller should hold slru_lock */
static void gcma_free_page(struct gcma *gcma, struct page *page)
{
unsigned long pfn, offset;
@@ -151,7 +197,18 @@ static void gcma_free_page(struct gcma *gcma, struct page *page)
spin_lock(&gcma->lock);
offset = pfn - gcma->base_pfn;
- bitmap_clear(gcma->bitmap, offset, 1);
+ if (likely(!gpage_flag(page, GF_RECLAIMING))) {
+ bitmap_clear(gcma->bitmap, offset, 1);
+ } else {
+ /*
+ * The page should be safe to be used for a thread which
+ * reclaimed the page.
+ * To prevent further allocation from other thread,
+ * set bitmap and mark the page as isolated.
+ */
+ bitmap_set(gcma->bitmap, offset, 1);
+ set_gpage_flag(page, GF_ISOLATED);
+ }
spin_unlock(&gcma->lock);
}
@@ -301,6 +358,7 @@ static unsigned long evict_frontswap_pages(unsigned long nr_pages)
if (!atomic_inc_not_zero(&entry->refcount))
continue;
+ clear_gpage_flag(page, GF_SWAP_LRU);
list_move(&page->lru, &free_pages);
if (++evicted >= nr_pages)
break;
@@ -377,7 +435,9 @@ int gcma_frontswap_store(unsigned type, pgoff_t offset,
entry = kmem_cache_alloc(swap_slot_entry_cache, GFP_NOIO);
if (!entry) {
+ spin_lock(&slru_lock);
gcma_free_page(gcma, gcma_page);
+ spin_unlock(&slru_lock);
return -ENOMEM;
}
@@ -415,6 +475,7 @@ int gcma_frontswap_store(unsigned type, pgoff_t offset,
} while (ret == -EEXIST);
spin_lock(&slru_lock);
+ set_gpage_flag(gcma_page, GF_SWAP_LRU);
list_add(&gcma_page->lru, &slru_list);
spin_unlock(&slru_lock);
spin_unlock(&tree->lock);
@@ -454,7 +515,8 @@ int gcma_frontswap_load(unsigned type, pgoff_t offset,
spin_lock(&tree->lock);
spin_lock(&slru_lock);
- list_move(&gcma_page->lru, &slru_list);
+ if (likely(gpage_flag(gcma_page, GF_SWAP_LRU)))
+ list_move(&gcma_page->lru, &slru_list);
swap_slot_entry_put(tree, entry);
spin_unlock(&slru_lock);
spin_unlock(&tree->lock);
@@ -511,6 +573,43 @@ static struct frontswap_ops gcma_frontswap_ops = {
};
/*
+ * Return 0 if [start_pfn, end_pfn] is isolated.
+ * Otherwise, return first unisolated pfn from the start_pfn.
+ */
+static unsigned long isolate_interrupted(struct gcma *gcma,
+ unsigned long start_pfn, unsigned long end_pfn)
+{
+ unsigned long offset;
+ unsigned long *bitmap;
+ unsigned long pfn, ret = 0;
+ struct page *page;
+
+ spin_lock(&gcma->lock);
+
+ for (pfn = start_pfn; pfn < end_pfn; pfn++) {
+ int set;
+
+ offset = pfn - gcma->base_pfn;
+ bitmap = gcma->bitmap + offset / BITS_PER_LONG;
+
+ set = test_bit(pfn % BITS_PER_LONG, bitmap);
+ if (!set) {
+ ret = pfn;
+ break;
+ }
+
+ page = pfn_to_page(pfn);
+ if (!gpage_flag(page, GF_ISOLATED)) {
+ ret = pfn;
+ break;
+ }
+
+ }
+ spin_unlock(&gcma->lock);
+ return ret;
+}
+
+/*
* gcma_alloc_contig - allocates contiguous pages
*
* @start_pfn start pfn of requiring contiguous memory area
@@ -521,21 +620,92 @@ static struct frontswap_ops gcma_frontswap_ops = {
int gcma_alloc_contig(struct gcma *gcma, unsigned long start_pfn,
unsigned long size)
{
+ LIST_HEAD(free_pages);
+ struct page *page, *n;
+ struct swap_slot_entry *entry;
unsigned long offset;
+ unsigned long *bitmap;
+ struct frontswap_tree *tree;
+ unsigned long pfn;
+ unsigned long orig_start = start_pfn;
- spin_lock(&gcma->lock);
- offset = start_pfn - gcma->base_pfn;
+retry:
+ for (pfn = start_pfn; pfn < start_pfn + size; pfn++) {
+ spin_lock(&gcma->lock);
+
+ offset = pfn - gcma->base_pfn;
+ bitmap = gcma->bitmap + offset / BITS_PER_LONG;
+ page = pfn_to_page(pfn);
+
+ if (!test_bit(offset % BITS_PER_LONG, bitmap)) {
+ /* set a bit for prevent allocation for frontswap */
+ bitmap_set(gcma->bitmap, offset, 1);
+ set_gpage_flag(page, GF_ISOLATED);
+ spin_unlock(&gcma->lock);
+ continue;
+ }
+
+ /* Someone is using the page so it's complicated :( */
+ spin_unlock(&gcma->lock);
+ spin_lock(&slru_lock);
+ /*
+ * If the page is in LRU, we can get swap_slot_entry from
+ * the page with no problem.
+ */
+ if (gpage_flag(page, GF_SWAP_LRU)) {
+ BUG_ON(gpage_flag(page, GF_RECLAIMING));
+
+ entry = swap_slot(page);
+ if (atomic_inc_not_zero(&entry->refcount)) {
+ clear_gpage_flag(page, GF_SWAP_LRU);
+ set_gpage_flag(page, GF_RECLAIMING);
+ list_move(&page->lru, &free_pages);
+ spin_unlock(&slru_lock);
+ continue;
+ }
+ }
- if (bitmap_find_next_zero_area(gcma->bitmap, gcma->size, offset,
- size, 0) != 0) {
+ /*
+ * Someone is allocating the page but it's not yet in LRU
+ * in case of frontswap_store or it was deleted from LRU
+ * but not yet from gcma's bitmap in case of
+ * frontswap_invalidate. Anycase, the race is small so retry
+ * after a while will see success. Below isolate_interrupted
+ * handles it.
+ */
+ spin_lock(&gcma->lock);
+ if (!test_bit(offset % BITS_PER_LONG, bitmap)) {
+ bitmap_set(gcma->bitmap, offset, 1);
+ set_gpage_flag(page, GF_ISOLATED);
+ } else {
+ set_gpage_flag(page, GF_RECLAIMING);
+ }
spin_unlock(&gcma->lock);
- pr_warn("already allocated region required: %lu, %lu",
- start_pfn, size);
- return -EINVAL;
+ spin_unlock(&slru_lock);
}
- bitmap_set(gcma->bitmap, offset, size);
- spin_unlock(&gcma->lock);
+ /*
+ * Since we increased refcount of the page above, we can access
+ * swap_slot_entry with safe
+ */
+ list_for_each_entry_safe(page, n, &free_pages, lru) {
+ tree = swap_tree(page);
+ entry = swap_slot(page);
+
+ spin_lock(&tree->lock);
+ spin_lock(&slru_lock);
+ /* drop refcount increased by above loop */
+ swap_slot_entry_put(tree, entry);
+ /* free entry if the entry is still in tree */
+ if (frontswap_rb_search(&tree->rbroot, entry->offset))
+ swap_slot_entry_put(tree, entry);
+ spin_unlock(&slru_lock);
+ spin_unlock(&tree->lock);
+ }
+
+ start_pfn = isolate_interrupted(gcma, orig_start, orig_start + size);
+ if (start_pfn)
+ goto retry;
return 0;
}
--
1.9.1
* [RFC v1 5/6] gcma: export statistical data on debugfs
2014-11-11 15:00 [RFC v1 0/6] introduce gcma SeongJae Park
` (3 preceding siblings ...)
2014-11-11 15:00 ` [RFC v1 4/6] gcma: discard swap cache pages to meet successful GCMA allocation SeongJae Park
@ 2014-11-11 15:00 ` SeongJae Park
2014-11-11 15:00 ` [RFC v1 6/6] gcma: integrate gcma under cma interface SeongJae Park
2014-11-11 18:57 ` [RFC v1 0/6] introduce gcma Christoph Lameter
6 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2014-11-11 15:00 UTC (permalink / raw)
To: akpm; +Cc: lauraa, minchan, sergey.senozhatsky, linux-mm, SeongJae Park
Export the number of stored / loaded / evicted / reclaimed pages of gcma's
frontswap backend on debugfs to let users see how gcma is working internally.
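Assuming debugfs is mounted at its usual location, the counters added below
should appear as the following read-only files (paths derived from the
debugfs_create_dir()/debugfs_create_atomic_t() calls in the diff):
```
/sys/kernel/debug/gcma/stored_pages
/sys/kernel/debug/gcma/loaded_pages
/sys/kernel/debug/gcma/evicted_pages
/sys/kernel/debug/gcma/reclaimed_pages
```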
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
---
mm/gcma.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)
diff --git a/mm/gcma.c b/mm/gcma.c
index 9c07128..65395ec 100644
--- a/mm/gcma.c
+++ b/mm/gcma.c
@@ -57,6 +57,12 @@ static spinlock_t slru_lock; /* protect slru_list */
static struct frontswap_tree *gcma_swap_trees[MAX_SWAPFILES];
static struct kmem_cache *swap_slot_entry_cache;
+/* For statistics */
+static atomic_t gcma_stored_pages = ATOMIC_INIT(0);
+static atomic_t gcma_loaded_pages = ATOMIC_INIT(0);
+static atomic_t gcma_evicted_pages = ATOMIC_INIT(0);
+static atomic_t gcma_reclaimed_pages = ATOMIC_INIT(0);
+
static unsigned long evict_frontswap_pages(unsigned long nr_pages);
static struct frontswap_tree *swap_tree(struct page *page)
@@ -380,6 +386,7 @@ static unsigned long evict_frontswap_pages(unsigned long nr_pages)
spin_unlock(&tree->lock);
}
+ atomic_add(evicted, &gcma_evicted_pages);
return evicted;
}
@@ -480,6 +487,7 @@ int gcma_frontswap_store(unsigned type, pgoff_t offset,
spin_unlock(&slru_lock);
spin_unlock(&tree->lock);
+ atomic_inc(&gcma_stored_pages);
return ret;
}
@@ -521,6 +529,7 @@ int gcma_frontswap_load(unsigned type, pgoff_t offset,
spin_unlock(&slru_lock);
spin_unlock(&tree->lock);
+ atomic_inc(&gcma_loaded_pages);
return 0;
}
@@ -659,6 +668,7 @@ retry:
if (atomic_inc_not_zero(&entry->refcount)) {
clear_gpage_flag(page, GF_SWAP_LRU);
set_gpage_flag(page, GF_RECLAIMING);
+ atomic_inc(&gcma_reclaimed_pages);
list_move(&page->lru, &free_pages);
spin_unlock(&slru_lock);
continue;
@@ -679,6 +689,7 @@ retry:
set_gpage_flag(page, GF_ISOLATED);
} else {
set_gpage_flag(page, GF_RECLAIMING);
+ atomic_inc(&gcma_reclaimed_pages);
}
spin_unlock(&gcma->lock);
spin_unlock(&slru_lock);
@@ -727,6 +738,40 @@ void gcma_free_contig(struct gcma *gcma,
spin_unlock(&gcma->lock);
}
+#ifdef CONFIG_DEBUG_FS
+#include <linux/debugfs.h>
+
+static struct dentry *gcma_debugfs_root;
+
+static int __init gcma_debugfs_init(void)
+{
+ if (!debugfs_initialized())
+ return -ENODEV;
+
+ gcma_debugfs_root = debugfs_create_dir("gcma", NULL);
+ if (!gcma_debugfs_root)
+ return -ENOMEM;
+
+ debugfs_create_atomic_t("stored_pages", S_IRUGO,
+ gcma_debugfs_root, &gcma_stored_pages);
+ debugfs_create_atomic_t("loaded_pages", S_IRUGO,
+ gcma_debugfs_root, &gcma_loaded_pages);
+ debugfs_create_atomic_t("evicted_pages", S_IRUGO,
+ gcma_debugfs_root, &gcma_evicted_pages);
+ debugfs_create_atomic_t("reclaimed_pages", S_IRUGO,
+ gcma_debugfs_root, &gcma_reclaimed_pages);
+
+ pr_info("gcma debufs init\n");
+ return 0;
+}
+#else
+static int __init gcma_debugfs_init(void)
+{
+ return 0;
+}
+#endif
+
+
static int __init init_gcma(void)
{
pr_info("loading gcma\n");
@@ -743,6 +788,7 @@ static int __init init_gcma(void)
frontswap_writethrough(true);
frontswap_register_ops(&gcma_frontswap_ops);
+ gcma_debugfs_init();
return 0;
}
--
1.9.1
* [RFC v1 6/6] gcma: integrate gcma under cma interface
2014-11-11 15:00 [RFC v1 0/6] introduce gcma SeongJae Park
` (4 preceding siblings ...)
2014-11-11 15:00 ` [RFC v1 5/6] gcma: export statistical data on debugfs SeongJae Park
@ 2014-11-11 15:00 ` SeongJae Park
2014-11-11 18:57 ` [RFC v1 0/6] introduce gcma Christoph Lameter
6 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2014-11-11 15:00 UTC (permalink / raw)
To: akpm; +Cc: lauraa, minchan, sergey.senozhatsky, linux-mm, SeongJae Park
Currently, cma reserves a large contiguous memory area during early boot and
lets the area be used by others for movable pages only. Then, if those movable
pages are needed for a contiguous memory allocation, cma migrates and/or
discards them.
This mechanism has two weaknesses:
1) Because anyone in the kernel can pin movable pages, contiguous memory
allocation can fail due to migration failure.
2) Because of migration / reclaim overhead, the latency can be extremely high.
In short, cma guarantees neither success nor low latency for contiguous memory
allocation. The problem was discussed in detail in [1] and [2].
gcma, introduced by the patches above, guarantees success and low latency for
contiguous memory allocation. The gcma concept, implementation and performance
evaluation are presented in detail in [2].
This patch lets cma clients use gcma easily through the friendly cma interface
by integrating gcma under that interface.
After this patch, clients can declare a contiguous memory area to be managed
internally in the gcma way instead of the cma way by calling
gcma_declare_contiguous(). After the declaration, clients can use the area
through the familiar cma interface while it works in the gcma way.
For example, the following code snippet makes two contiguous regions: one
region works as cma and the other works as gcma.
```
struct cma *cma, *gcma;
cma_declare_contiguous(base, size, limit, 0, 0, fixed, &cma);
gcma_declare_contiguous(gcma_base, size, gcma_limit, 0, 0, fixed, &gcma);
cma_alloc(cma, 1024, 0); /* alloc in cma way */
cma_alloc(gcma, 1024, 0); /* alloc in gcma way */
```
[1] https://lkml.org/lkml/2013/10/30/16
[2] http://sched.co/1qZcBAO
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
---
include/linux/cma.h | 4 ++
include/linux/gcma.h | 21 ++++++++++
mm/Kconfig | 15 +++++++
mm/Makefile | 2 +
mm/cma.c | 110 ++++++++++++++++++++++++++++++++++++++++-----------
5 files changed, 129 insertions(+), 23 deletions(-)
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 371b930..f81d0dd 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -22,6 +22,10 @@ extern int __init cma_declare_contiguous(phys_addr_t size,
phys_addr_t base, phys_addr_t limit,
phys_addr_t alignment, unsigned int order_per_bit,
bool fixed, struct cma **res_cma);
+extern int __init gcma_declare_contiguous(phys_addr_t size,
+ phys_addr_t base, phys_addr_t limit,
+ phys_addr_t alignment, unsigned int order_per_bit,
+ bool fixed, struct cma **res_cma);
extern struct page *cma_alloc(struct cma *cma, int count, unsigned int align);
extern bool cma_release(struct cma *cma, struct page *pages, int count);
#endif
diff --git a/include/linux/gcma.h b/include/linux/gcma.h
index d733a9b..dedbd0f 100644
--- a/include/linux/gcma.h
+++ b/include/linux/gcma.h
@@ -16,6 +16,25 @@
struct gcma;
+#ifndef CONFIG_GCMA
+
+inline int gcma_init(unsigned long start_pfn, unsigned long size,
+ struct gcma **res_gcma)
+{
+ return 0;
+}
+
+inline int gcma_alloc_contig(struct gcma *gcma,
+ unsigned long start, unsigned long end)
+{
+ return 0;
+}
+
+void gcma_free_contig(struct gcma *gcma,
+ unsigned long pfn, unsigned long nr_pages) { }
+
+#else
+
int gcma_init(unsigned long start_pfn, unsigned long size,
struct gcma **res_gcma);
int gcma_alloc_contig(struct gcma *gcma,
@@ -23,4 +42,6 @@ int gcma_alloc_contig(struct gcma *gcma,
void gcma_free_contig(struct gcma *gcma,
unsigned long start_pfn, unsigned long size);
+#endif
+
#endif /* _LINUX_GCMA_H */
diff --git a/mm/Kconfig b/mm/Kconfig
index 886db21..1b232e3 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -519,6 +519,21 @@ config CMA_AREAS
If unsure, leave the default value "7".
+config GCMA
+ bool "Guaranteed Contiguous Memory Allocator (EXPERIMENTAL)"
+ default n
+ select FRONTSWAP
+ select CMA
+ help
+ A contiguous memory allocator which guarantees success and
+ predictable latency for allocation request.
+ It carves out large amount of memory and let them be allocated
+ to the contiguous memory request while it can be used as backend
+ for frontswap.
+
+ This is marked experimental because it is a new feature that
+ interacts heavily with memory reclaim.
+
config MEM_SOFT_DIRTY
bool "Track memory changes"
depends on CHECKPOINT_RESTORE && HAVE_ARCH_SOFT_DIRTY && PROC_FS
diff --git a/mm/Makefile b/mm/Makefile
index 632ae77..ecff2c7 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -33,6 +33,7 @@ obj-$(CONFIG_HAVE_MEMBLOCK) += memblock.o
obj-$(CONFIG_SWAP) += page_io.o swap_state.o swapfile.o
obj-$(CONFIG_FRONTSWAP) += frontswap.o
obj-$(CONFIG_ZSWAP) += zswap.o
+obj-$(CONFIG_GCMA) += gcma.o
obj-$(CONFIG_HAS_DMA) += dmapool.o
obj-$(CONFIG_HUGETLBFS) += hugetlb.o
obj-$(CONFIG_NUMA) += mempolicy.o
@@ -64,3 +65,4 @@ obj-$(CONFIG_ZBUD) += zbud.o
obj-$(CONFIG_ZSMALLOC) += zsmalloc.o
obj-$(CONFIG_GENERIC_EARLY_IOREMAP) += early_ioremap.o
obj-$(CONFIG_CMA) += cma.o
+obj-$(CONFIG_GCMA) += gcma.o
diff --git a/mm/cma.c b/mm/cma.c
index c17751c..b085288 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -32,6 +32,9 @@
#include <linux/slab.h>
#include <linux/log2.h>
#include <linux/cma.h>
+#include <linux/gcma.h>
+
+#define IS_GCMA ((struct gcma *)(void *)0xFF)
struct cma {
unsigned long base_pfn;
@@ -39,6 +42,7 @@ struct cma {
unsigned long *bitmap;
unsigned int order_per_bit; /* Order of pages represented by one bit */
struct mutex lock;
+ struct gcma *gcma;
};
static struct cma cma_areas[MAX_CMA_AREAS];
@@ -83,26 +87,25 @@ static void cma_clear_bitmap(struct cma *cma, unsigned long pfn, int count)
mutex_unlock(&cma->lock);
}
-static int __init cma_activate_area(struct cma *cma)
+/*
+ * Return reserved pages for CMA to buddy allocator for using those pages
+ * as movable pages.
+ * Return 0 if it's called successfully. Otherwise, non-zero.
+ */
+static int free_reserved_pages(unsigned long pfn, unsigned long count)
{
- int bitmap_size = BITS_TO_LONGS(cma_bitmap_maxno(cma)) * sizeof(long);
- unsigned long base_pfn = cma->base_pfn, pfn = base_pfn;
- unsigned i = cma->count >> pageblock_order;
+ int ret = 0;
+ unsigned long base_pfn;
struct zone *zone;
- cma->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
-
- if (!cma->bitmap)
- return -ENOMEM;
-
- WARN_ON_ONCE(!pfn_valid(pfn));
+ count = count >> pageblock_order;
zone = page_zone(pfn_to_page(pfn));
do {
- unsigned j;
+ unsigned i;
base_pfn = pfn;
- for (j = pageblock_nr_pages; j; --j, pfn++) {
+ for (i = pageblock_nr_pages; i; --i, pfn++) {
WARN_ON_ONCE(!pfn_valid(pfn));
/*
* alloc_contig_range requires the pfn range
@@ -110,18 +113,40 @@ static int __init cma_activate_area(struct cma *cma)
* simple by forcing the entire CMA resv range
* to be in the same zone.
*/
- if (page_zone(pfn_to_page(pfn)) != zone)
- goto err;
+ if (page_zone(pfn_to_page(pfn)) != zone) {
+ ret = -EINVAL;
+ break;
+ }
}
init_cma_reserved_pageblock(pfn_to_page(base_pfn));
- } while (--i);
+ } while (--count);
+ return ret;
+}
+
+static int __init cma_activate_area(struct cma *cma)
+{
+ int bitmap_size = BITS_TO_LONGS(cma_bitmap_maxno(cma)) * sizeof(long);
+ unsigned long base_pfn = cma->base_pfn, pfn = base_pfn;
+ int fail;
+
+ cma->bitmap = kzalloc(bitmap_size, GFP_KERNEL);
+
+ if (!cma->bitmap)
+ return -ENOMEM;
+
+ WARN_ON_ONCE(!pfn_valid(pfn));
+
+ if (cma->gcma == IS_GCMA)
+ fail = gcma_init(cma->base_pfn, cma->count, &cma->gcma);
+ else
+ fail = free_reserved_pages(cma->base_pfn, cma->count);
+ if (fail != 0) {
+ kfree(cma->bitmap);
+ return -EINVAL;
+ }
mutex_init(&cma->lock);
return 0;
-
-err:
- kfree(cma->bitmap);
- return -EINVAL;
}
static int __init cma_init_reserved_areas(void)
@@ -140,7 +165,7 @@ static int __init cma_init_reserved_areas(void)
core_initcall(cma_init_reserved_areas);
/**
- * cma_declare_contiguous() - reserve custom contiguous area
+ * __declare_contiguous() - reserve custom contiguous area
* @base: Base address of the reserved area optional, use 0 for any
* @size: Size of the reserved area (in bytes),
* @limit: End address of the reserved memory (optional, 0 for any).
@@ -157,7 +182,7 @@ core_initcall(cma_init_reserved_areas);
* If @fixed is true, reserve contiguous area at exactly @base. If false,
* reserve in range from @base to @limit.
*/
-int __init cma_declare_contiguous(phys_addr_t base,
+int __init __declare_contiguous(phys_addr_t base,
phys_addr_t size, phys_addr_t limit,
phys_addr_t alignment, unsigned int order_per_bit,
bool fixed, struct cma **res_cma)
@@ -235,6 +260,36 @@ err:
}
/**
+ * gcma_declare_contiguous() - same as cma_declare_contiguous() except result
+ * cma's is_gcma field setting.
+ */
+int __init gcma_declare_contiguous(phys_addr_t base,
+ phys_addr_t size, phys_addr_t limit,
+ phys_addr_t alignment, unsigned int order_per_bit,
+ bool fixed, struct cma **res_cma)
+{
+ int ret = 0;
+ ret = __declare_contiguous(base, size, limit, alignment,
+ order_per_bit, fixed, res_cma);
+ if (ret >= 0)
+ (*res_cma)->gcma = IS_GCMA;
+
+ return ret;
+}
+
+int __init cma_declare_contiguous(phys_addr_t base,
+ phys_addr_t size, phys_addr_t limit,
+ phys_addr_t alignment, unsigned int order_per_bit,
+ bool fixed, struct cma **res_cma)
+{
+ int ret = 0;
+ ret = __declare_contiguous(base, size, limit, alignment,
+ order_per_bit, fixed, res_cma);
+
+ return ret;
+}
+
+/**
* cma_alloc() - allocate pages from contiguous area
* @cma: Contiguous memory region for which the allocation is performed.
* @count: Requested number of pages.
@@ -281,7 +336,12 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
mutex_lock(&cma_mutex);
- ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA);
+
+ if (cma->gcma)
+ ret = gcma_alloc_contig(cma->gcma, pfn, count);
+ else
+ ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA);
+
mutex_unlock(&cma_mutex);
if (ret == 0) {
page = pfn_to_page(pfn);
@@ -328,7 +388,11 @@ bool cma_release(struct cma *cma, struct page *pages, int count)
VM_BUG_ON(pfn + count > cma->base_pfn + cma->count);
- free_contig_range(pfn, count);
+ if (cma->gcma)
+ gcma_free_contig(cma->gcma, pfn, count);
+ else
+ free_contig_range(pfn, count);
+
cma_clear_bitmap(cma, pfn, count);
return true;
--
1.9.1
* Re: [RFC v1 0/6] introduce gcma
2014-11-11 15:00 [RFC v1 0/6] introduce gcma SeongJae Park
` (5 preceding siblings ...)
2014-11-11 15:00 ` [RFC v1 6/6] gcma: integrate gcma under cma interface SeongJae Park
@ 2014-11-11 18:57 ` Christoph Lameter
2014-11-12 7:02 ` SeongJae Park
6 siblings, 1 reply; 9+ messages in thread
From: Christoph Lameter @ 2014-11-11 18:57 UTC (permalink / raw)
To: SeongJae Park; +Cc: akpm, lauraa, minchan, sergey.senozhatsky, linux-mm
On Wed, 12 Nov 2014, SeongJae Park wrote:
> Difference with cma is choice and operation of 2nd-class client. In gcma,
> 2nd-class client should allocate pages from the reserved area only if the
> allocated pages meet the following conditions.
How about making CMA configurable in some fashion to be able to specify
the type of 2nd class clients? Clean page-cache pages can also be rather
easily evicted (see zone-reclaim). You could migrate them out when they
are dirtied so that you do not have the high writeback latency from the
CMA reserved area if it needs to be evicted later.
* Re: [RFC v1 0/6] introduce gcma
2014-11-11 18:57 ` [RFC v1 0/6] introduce gcma Christoph Lameter
@ 2014-11-12 7:02 ` SeongJae Park
0 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2014-11-12 7:02 UTC (permalink / raw)
To: Christoph Lameter
Cc: SeongJae Park, akpm, lauraa, minchan, sergey.senozhatsky, linux-mm
Hi Christoph,
On Tue, 11 Nov 2014, Christoph Lameter wrote:
> On Wed, 12 Nov 2014, SeongJae Park wrote:
>
>> Difference with cma is choice and operation of 2nd-class client. In gcma,
>> 2nd-class client should allocate pages from the reserved area only if the
>> allocated pages meet the following conditions.
>
> How about making CMA configurable in some fashion to be able to specify
> the type of 2nd class clients? Clean page-cache pages can also be rather
> easily evicted (see zone-reclaim). You could migrate them out when they
> are dirtied so that you do not have the high writeback latency from the
> CMA reserved area if it needs to be evicted later.
Nice point.
Currently, gcma is integrated inside cma, and the user can decide whether a
specific contiguous memory area works in the cma way (movable pages as the
2nd class) or in the gcma way (out-of-kernel, easy-to-discard pages as the
2nd class).
This is implemented in the 6th change of this RFC, "gcma: integrate gcma under
cma interface".
In short, with this RFC the 2nd-class clients of cma are already configurable
between movable pages and the frontswap backend.
And yes, cleancache would be a great 2nd-class client.
As described in the cover letter, our 2nd-class client candidates are
frontswap and _cleancache_. However, because gcma is still immature, the
current RFC (this patchset) uses only frontswap.
In the future, it will be configurable.
Apologies, I forgot to describe the future plan.
Thanks,
SeongJae Park
>
>