linux-mm.kvack.org archive mirror
* [PATCH 0/3] hwmem: Hardware memory driver
@ 2010-11-16 13:07 Johan Mossberg
  2010-11-16 13:08 ` [PATCH 1/3] hwmem: Add hwmem (part 1) Johan Mossberg
  2010-11-16 14:50 ` [PATCH 0/3] hwmem: Hardware memory driver Michał Nazarewicz
  0 siblings, 2 replies; 12+ messages in thread
From: Johan Mossberg @ 2010-11-16 13:07 UTC (permalink / raw)
  To: linux-mm; +Cc: Johan Mossberg

Hello everyone, 

The following patchset implements a "hardware memory driver". The
main purposes of hwmem are:

* To allocate buffers suitable for use with hardware. Currently
this means contiguous buffers.
* To synchronize the caches for the allocated buffers. This is
achieved by keeping track of when the CPU uses a buffer and when
other hardware uses it; when ownership switches from the CPU to
other hardware, or vice versa, the caches are synchronized.
* To handle sharing of allocated buffers between processes, i.e.
import and export.

Hwmem is available both through a user space API and through a
kernel API.
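
For illustration, here is a minimal user space sketch of the intended
flow (not part of the patches; error handling is omitted and the
/dev/hwmem node path is an assumption based on the misc device name):

#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/hwmem.h>

int hwmem_user_example(void)
{
	struct hwmem_alloc_request areq = {
		.size = 4096,
		.flags = HWMEM_ALLOC_CACHED,
		.default_access = HWMEM_ACCESS_READ | HWMEM_ACCESS_WRITE,
		.mem_type = HWMEM_MEM_CONTIGUOUS_SYS,
	};
	struct hwmem_set_domain_request dreq = {
		.id = 0, /* 0 means the buffer bound to this file instance */
		.domain = HWMEM_DOMAIN_CPU,
		.access = HWMEM_ACCESS_WRITE,
		.region = { .offset = 0, .count = 1, .start = 0,
			    .end = 4096, .size = 4096 },
	};
	int fd = open("/dev/hwmem", O_RDWR); /* assumed device node */
	void *map;

	/* Allocate a buffer whose lifetime follows the fd */
	ioctl(fd, HWMEM_ALLOC_FD_IOC, &areq);

	/* Map it; offset 0 refers to the fd's own buffer */
	map = mmap(NULL, areq.size, PROT_READ | PROT_WRITE, MAP_SHARED,
		   fd, 0);

	/* Claim the buffer for CPU writes; hwmem handles the caches */
	ioctl(fd, HWMEM_SET_DOMAIN_IOC, &dreq);

	/* Fill the buffer, then pin/export it for the hardware */
	memset(map, 0, areq.size);
	return 0;
}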

Here at ST-Ericsson we use hwmem for graphics buffers. Graphics
buffers need to be contiguous due to our hardware, are passed
between processes (usually an application and the window manager)
and are part of use cases where performance is top priority, so we
can't afford to synchronize the caches unnecessarily.

Hwmem and CMA (Contiguous Memory Allocator) overlap to some extent.
Hwmem could use CMA as its allocator and thereby remove the
overlap, but then defragmentation could not be implemented, as CMA
currently has no support for it. We would very much like to see a
discussion about adding defragmentation to CMA.

Best regards
Johan Mossberg
Consultant at ST-Ericsson

Johan Mossberg (3):
  hwmem: Add hwmem (part 1)
  hwmem: Add hwmem (part 2)
  hwmem: Add hwmem to ux500 and mop500

 arch/arm/mach-ux500/board-mop500.c         |    1 +
 arch/arm/mach-ux500/devices.c              |   31 ++
 arch/arm/mach-ux500/include/mach/devices.h |    1 +
 drivers/misc/Kconfig                       |    7 +
 drivers/misc/Makefile                      |    1 +
 drivers/misc/hwmem/Makefile                |    3 +
 drivers/misc/hwmem/cache_handler.c         |  494 ++++++++++++++++++++++
 drivers/misc/hwmem/cache_handler.h         |   60 +++
 drivers/misc/hwmem/cache_handler_u8500.c   |  208 ++++++++++
 drivers/misc/hwmem/hwmem-ioctl.c           |  470 +++++++++++++++++++++
 drivers/misc/hwmem/hwmem-main.c            |  609 ++++++++++++++++++++++++++++
 include/linux/hwmem.h                      |  499 +++++++++++++++++++++++
 12 files changed, 2384 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/hwmem/Makefile
 create mode 100644 drivers/misc/hwmem/cache_handler.c
 create mode 100644 drivers/misc/hwmem/cache_handler.h
 create mode 100644 drivers/misc/hwmem/cache_handler_u8500.c
 create mode 100644 drivers/misc/hwmem/hwmem-ioctl.c
 create mode 100644 drivers/misc/hwmem/hwmem-main.c
 create mode 100644 include/linux/hwmem.h


* [PATCH 1/3] hwmem: Add hwmem (part 1)
  2010-11-16 13:07 [PATCH 0/3] hwmem: Hardware memory driver Johan Mossberg
@ 2010-11-16 13:08 ` Johan Mossberg
  2010-11-16 13:08   ` [PATCH 2/3] hwmem: Add hwmem (part 2) Johan Mossberg
  2010-11-16 14:50 ` [PATCH 0/3] hwmem: Hardware memory driver Michał Nazarewicz
  1 sibling, 1 reply; 12+ messages in thread
From: Johan Mossberg @ 2010-11-16 13:08 UTC (permalink / raw)
  To: linux-mm; +Cc: Johan Mossberg

Add hardware memory driver, part 1.

The main purposes of hwmem are:

* To allocate buffers suitable for use with hardware. Currently
this means contiguous buffers.
* To synchronize the caches for the allocated buffers. This is
achieved by keeping track of when the CPU uses a buffer and when
other hardware uses it; when ownership switches from the CPU to
other hardware, or vice versa, the caches are synchronized.
* To handle sharing of allocated buffers between processes, i.e.
import and export.

Hwmem is available both through a user space API and through a
kernel API.
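
As a rough illustration of the kernel API added here (not part of the
patch; the DMA consumer is hypothetical and error paths are trimmed):

#include <linux/err.h>
#include <linux/hwmem.h>

static int hwmem_kernel_example(void)
{
	struct hwmem_alloc *alloc;
	uint32_t phys_addr;
	int ret;

	alloc = hwmem_alloc(64 * 1024, HWMEM_ALLOC_BUFFERED,
			HWMEM_ACCESS_READ | HWMEM_ACCESS_WRITE,
			HWMEM_MEM_CONTIGUOUS_SYS);
	if (IS_ERR(alloc))
		return PTR_ERR(alloc);

	/* The hardware will read the buffer next, so sync the caches */
	hwmem_set_domain(alloc, HWMEM_ACCESS_READ, HWMEM_DOMAIN_SYNC, NULL);

	/* scattered_phys_addrs may be NULL for contiguous buffers */
	ret = hwmem_pin(alloc, &phys_addr, NULL);
	if (ret == 0) {
		/* start_dma(phys_addr, 64 * 1024); -- hypothetical consumer */
		hwmem_unpin(alloc);
	}

	hwmem_release(alloc);
	return ret;
}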

Signed-off-by: Johan Mossberg <johan.xx.mossberg@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
---
 drivers/misc/Kconfig             |    7 +
 drivers/misc/Makefile            |    1 +
 drivers/misc/hwmem/Makefile      |    3 +
 drivers/misc/hwmem/hwmem-ioctl.c |  470 +++++++++++++++++++++++++++++
 drivers/misc/hwmem/hwmem-main.c  |  609 ++++++++++++++++++++++++++++++++++++++
 include/linux/hwmem.h            |  499 +++++++++++++++++++++++++++++++
 6 files changed, 1589 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/hwmem/Makefile
 create mode 100644 drivers/misc/hwmem/hwmem-ioctl.c
 create mode 100644 drivers/misc/hwmem/hwmem-main.c
 create mode 100644 include/linux/hwmem.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 4d073f1..9a74534 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -452,6 +452,13 @@ config PCH_PHUB
 	  To compile this driver as a module, choose M here: the module will
 	  be called pch_phub.
 
+config HWMEM
+	bool "Hardware memory driver"
+	default n
+	help
+	  Allocates buffers suitable for use with hardware. Also handles
+	  sharing of allocated buffers between processes.
+
 source "drivers/misc/c2port/Kconfig"
 source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 98009cc..50dfbbe 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -41,4 +41,5 @@ obj-$(CONFIG_VMWARE_BALLOON)	+= vmw_balloon.o
 obj-$(CONFIG_ARM_CHARLCD)	+= arm-charlcd.o
 obj-$(CONFIG_PCH_PHUB)		+= pch_phub.o
 obj-y				+= ti-st/
+obj-$(CONFIG_HWMEM)		+= hwmem/
 obj-$(CONFIG_AB8500_PWM)	+= ab8500-pwm.o
diff --git a/drivers/misc/hwmem/Makefile b/drivers/misc/hwmem/Makefile
new file mode 100644
index 0000000..da9080a
--- /dev/null
+++ b/drivers/misc/hwmem/Makefile
@@ -0,0 +1,3 @@
+hwmem-objs := hwmem-main.o hwmem-ioctl.o cache_handler.o cache_handler_u8500.o
+
+obj-$(CONFIG_HWMEM) += hwmem.o
diff --git a/drivers/misc/hwmem/hwmem-ioctl.c b/drivers/misc/hwmem/hwmem-ioctl.c
new file mode 100644
index 0000000..b1fc844
--- /dev/null
+++ b/drivers/misc/hwmem/hwmem-ioctl.c
@@ -0,0 +1,470 @@
+/*
+ * Copyright (C) ST-Ericsson AB 2010
+ *
+ * Hardware memory driver, hwmem
+ *
+ * Author: Marcus Lorentzon <marcus.xm.lorentzon@stericsson.com>
+ * for ST-Ericsson.
+ *
+ * License terms: GNU General Public License (GPL), version 2.
+ */
+
+#include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/idr.h>
+#include <linux/err.h>
+#include <linux/slab.h>
+#include <linux/miscdevice.h>
+#include <linux/uaccess.h>
+#include <linux/mm_types.h>
+#include <linux/hwmem.h>
+#include <linux/device.h>
+#include <linux/sched.h>
+
+/*
+ * TODO:
+ * Count pin/unpin at this level to ensure applications can't
+ * interfere with each other.
+ */
+
+static int hwmem_open(struct inode *inode, struct file *file);
+static int hwmem_ioctl_mmap(struct file *file, struct vm_area_struct *vma);
+static int hwmem_release_fop(struct inode *inode, struct file *file);
+static long hwmem_ioctl(struct file *file, unsigned int cmd,
+	unsigned long arg);
+static unsigned long hwmem_get_unmapped_area(struct file *file,
+	unsigned long addr, unsigned long len, unsigned long pgoff,
+	unsigned long flags);
+
+static const struct file_operations hwmem_fops = {
+	.open = hwmem_open,
+	.mmap = hwmem_ioctl_mmap,
+	.unlocked_ioctl = hwmem_ioctl,
+	.release = hwmem_release_fop,
+	.get_unmapped_area = hwmem_get_unmapped_area,
+};
+
+static struct miscdevice hwmem_device = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = "hwmem",
+	.fops = &hwmem_fops,
+};
+
+struct hwmem_file {
+	struct mutex lock;
+	struct idr idr; /* id -> struct hwmem_alloc*, ref counted */
+	struct hwmem_alloc *fd_alloc; /* Ref counted */
+};
+
+static int create_id(struct hwmem_file *hwfile, struct hwmem_alloc *alloc)
+{
+	int id, ret;
+
+	while (true) {
+		if (idr_pre_get(&hwfile->idr, GFP_KERNEL) == 0)
+			return -ENOMEM;
+
+		ret = idr_get_new_above(&hwfile->idr, alloc, 1, &id);
+		if (ret == 0)
+			break;
+		else if (ret != -EAGAIN)
+			return -ENOMEM;
+	}
+
+	/*
+	 * TODO: This isn't great as we destroy IDR's ability to reuse freed
+	 * IDs. Currently we can use 19 bits for the ID, i.e. 524288 IDs can be
+	 * generated by a hwmem file instance before this function starts
+	 * failing. This should be enough for most scenarios but the final
+	 * solution for this problem is to change IDR so that you can specify
+	 * a maximum ID.
+	 */
+	if (id >= 1 << (31 - PAGE_SHIFT)) {
+		dev_err(hwmem_device.this_device, "ID overflow!\n");
+		idr_remove(&hwfile->idr, id);
+		return -ENOMSG;
+	}
+
+	return id << PAGE_SHIFT;
+}
+
+static void remove_id(struct hwmem_file *hwfile, int id)
+{
+	idr_remove(&hwfile->idr, id >> PAGE_SHIFT);
+}
+
+static struct hwmem_alloc *resolve_id(struct hwmem_file *hwfile, int id)
+{
+	struct hwmem_alloc *alloc;
+
+	alloc = id ? idr_find(&hwfile->idr, id >> PAGE_SHIFT) :
+			hwfile->fd_alloc;
+	if (alloc == NULL)
+		alloc = ERR_PTR(-EINVAL);
+
+	return alloc;
+}
+
+static int alloc(struct hwmem_file *hwfile, struct hwmem_alloc_request *req)
+{
+	int ret = 0;
+	struct hwmem_alloc *alloc;
+
+	alloc = hwmem_alloc(req->size, req->flags, req->default_access,
+			req->mem_type);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	ret = create_id(hwfile, alloc);
+	if (ret < 0)
+		hwmem_release(alloc);
+
+	return ret;
+}
+
+static int alloc_fd(struct hwmem_file *hwfile, struct hwmem_alloc_request *req)
+{
+	struct hwmem_alloc *alloc;
+
+	if (hwfile->fd_alloc)
+		return -EBUSY;
+
+	alloc = hwmem_alloc(req->size, req->flags, req->default_access,
+			req->mem_type);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	hwfile->fd_alloc = alloc;
+
+	return 0;
+}
+
+static int release(struct hwmem_file *hwfile, s32 id)
+{
+	struct hwmem_alloc *alloc;
+
+	alloc = resolve_id(hwfile, id);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	remove_id(hwfile, id);
+	hwmem_release(alloc);
+
+	return 0;
+}
+
+static int hwmem_ioctl_set_domain(struct hwmem_file *hwfile,
+					struct hwmem_set_domain_request *req)
+{
+	struct hwmem_alloc *alloc;
+
+	alloc = resolve_id(hwfile, req->id);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	return hwmem_set_domain(alloc, req->access, req->domain, &req->region);
+}
+
+static int pin(struct hwmem_file *hwfile, struct hwmem_pin_request *req)
+{
+	struct hwmem_alloc *alloc;
+
+	alloc = resolve_id(hwfile, req->id);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	return hwmem_pin(alloc, &req->phys_addr, req->scattered_addrs);
+}
+
+static int unpin(struct hwmem_file *hwfile, s32 id)
+{
+	struct hwmem_alloc *alloc;
+
+	alloc = resolve_id(hwfile, id);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	hwmem_unpin(alloc);
+
+	return 0;
+}
+
+static int set_access(struct hwmem_file *hwfile,
+		struct hwmem_set_access_request *req)
+{
+	struct hwmem_alloc *alloc;
+
+	alloc = resolve_id(hwfile, req->id);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	return hwmem_set_access(alloc, req->access, req->pid);
+}
+
+static int get_info(struct hwmem_file *hwfile,
+		struct hwmem_get_info_request *req)
+{
+	struct hwmem_alloc *alloc;
+
+	alloc = resolve_id(hwfile, req->id);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	hwmem_get_info(alloc, &req->size, &req->mem_type, &req->access);
+
+	return 0;
+}
+
+static int export(struct hwmem_file *hwfile, s32 id)
+{
+	int ret;
+	struct hwmem_alloc *alloc;
+
+	uint32_t size;
+	enum hwmem_mem_type mem_type;
+	enum hwmem_access access;
+
+	alloc = resolve_id(hwfile, id);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	/*
+	 * The user could be about to send the buffer to a driver, but
+	 * there is a chance the current thread group doesn't have import
+	 * rights if it gained access to the buffer via an inter-process fd
+	 * transfer (fork, Android binder); in that case the driver will not
+	 * be able to resolve the buffer name. To avoid this situation we
+	 * give the current thread group import rights. This does not breach
+	 * security as the process already has access to the buffer
+	 * (otherwise it would not be able to get here).
+	 */
+	hwmem_get_info(alloc, &size, &mem_type, &access);
+
+	ret = hwmem_set_access(alloc, (access | HWMEM_ACCESS_IMPORT),
+			task_tgid_nr(current));
+	if (ret < 0)
+		goto error;
+
+	return hwmem_get_name(alloc);
+
+error:
+	return ret;
+}
+
+static int import(struct hwmem_file *hwfile, s32 name)
+{
+	int ret = 0;
+	struct hwmem_alloc *alloc;
+
+	uint32_t size;
+	enum hwmem_mem_type mem_type;
+	enum hwmem_access access;
+
+	alloc = hwmem_resolve_by_name(name);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	/* Check access permissions for process */
+	hwmem_get_info(alloc, &size, &mem_type, &access);
+
+	if (!(access & HWMEM_ACCESS_IMPORT)) {
+		ret = -EPERM;
+		goto error;
+	}
+
+	ret = create_id(hwfile, alloc);
+	if (ret < 0)
+		hwmem_release(alloc);
+
+error:
+	return ret;
+}
+
+static int import_fd(struct hwmem_file *hwfile, s32 name)
+{
+	struct hwmem_alloc *alloc;
+
+	if (hwfile->fd_alloc)
+		return -EBUSY;
+
+	alloc = hwmem_resolve_by_name(name);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	hwfile->fd_alloc = alloc;
+
+	return 0;
+}
+
+static int hwmem_open(struct inode *inode, struct file *file)
+{
+	struct hwmem_file *hwfile;
+
+	hwfile = kzalloc(sizeof(struct hwmem_file), GFP_KERNEL);
+	if (hwfile == NULL)
+		return -ENOMEM;
+
+	idr_init(&hwfile->idr);
+	mutex_init(&hwfile->lock);
+	file->private_data = hwfile;
+
+	return 0;
+}
+
+static int hwmem_ioctl_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct hwmem_file *hwfile = (struct hwmem_file *)file->private_data;
+	struct hwmem_alloc *alloc;
+
+	alloc = resolve_id(hwfile, vma->vm_pgoff << PAGE_SHIFT);
+	if (IS_ERR(alloc))
+		return PTR_ERR(alloc);
+
+	return hwmem_mmap(alloc, vma);
+}
+
+static int hwmem_release_idr_for_each_wrapper(int id, void *ptr, void *data)
+{
+	hwmem_release((struct hwmem_alloc *)ptr);
+
+	return 0;
+}
+
+static int hwmem_release_fop(struct inode *inode, struct file *file)
+{
+	struct hwmem_file *hwfile = (struct hwmem_file *)file->private_data;
+
+	idr_for_each(&hwfile->idr, hwmem_release_idr_for_each_wrapper, NULL);
+	idr_destroy(&hwfile->idr);
+
+	if (hwfile->fd_alloc)
+		hwmem_release(hwfile->fd_alloc);
+
+	mutex_destroy(&hwfile->lock);
+
+	kfree(hwfile);
+
+	return 0;
+}
+
+static long hwmem_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	int ret = -ENOSYS;
+	struct hwmem_file *hwfile = (struct hwmem_file *)file->private_data;
+
+	mutex_lock(&hwfile->lock);
+
+	switch (cmd) {
+	case HWMEM_ALLOC_IOC:
+		{
+			struct hwmem_alloc_request req;
+			if (copy_from_user(&req, (void __user *)arg,
+					sizeof(struct hwmem_alloc_request)))
+				ret = -EFAULT;
+			else
+				ret = alloc(hwfile, &req);
+		}
+		break;
+	case HWMEM_ALLOC_FD_IOC:
+		{
+			struct hwmem_alloc_request req;
+			if (copy_from_user(&req, (void __user *)arg,
+					sizeof(struct hwmem_alloc_request)))
+				ret = -EFAULT;
+			else
+				ret = alloc_fd(hwfile, &req);
+		}
+		break;
+	case HWMEM_RELEASE_IOC:
+		ret = release(hwfile, (s32)arg);
+		break;
+	case HWMEM_SET_DOMAIN_IOC:
+		{
+			struct hwmem_set_domain_request req;
+			if (copy_from_user(&req, (void __user *)arg,
+				sizeof(struct hwmem_set_domain_request)))
+				ret = -EFAULT;
+			else
+				ret = hwmem_ioctl_set_domain(hwfile, &req);
+		}
+		break;
+	case HWMEM_PIN_IOC:
+		{
+			struct hwmem_pin_request req;
+			/*
+			 * TODO: Validate and copy scattered_addrs. Not a
+			 * problem right now as it's never used.
+			 */
+			if (copy_from_user(&req, (void __user *)arg,
+				sizeof(struct hwmem_pin_request)))
+				ret = -EFAULT;
+			else
+				ret = pin(hwfile, &req);
+			if (ret == 0 && copy_to_user((void __user *)arg, &req,
+					sizeof(struct hwmem_pin_request)))
+				ret = -EFAULT;
+		}
+		break;
+	case HWMEM_UNPIN_IOC:
+		ret = unpin(hwfile, (s32)arg);
+		break;
+	case HWMEM_SET_ACCESS_IOC:
+		{
+			struct hwmem_set_access_request req;
+			if (copy_from_user(&req, (void __user *)arg,
+				sizeof(struct hwmem_set_access_request)))
+				ret = -EFAULT;
+			else
+				ret = set_access(hwfile, &req);
+		}
+		break;
+	case HWMEM_GET_INFO_IOC:
+		{
+			struct hwmem_get_info_request req;
+			if (copy_from_user(&req, (void __user *)arg,
+				sizeof(struct hwmem_get_info_request)))
+				ret = -EFAULT;
+			else
+				ret = get_info(hwfile, &req);
+			if (ret == 0 && copy_to_user((void __user *)arg, &req,
+					sizeof(struct hwmem_get_info_request)))
+				ret = -EFAULT;
+		}
+		break;
+	case HWMEM_EXPORT_IOC:
+		ret = export(hwfile, (s32)arg);
+		break;
+	case HWMEM_IMPORT_IOC:
+		ret = import(hwfile, (s32)arg);
+		break;
+	case HWMEM_IMPORT_FD_IOC:
+		ret = import_fd(hwfile, (s32)arg);
+		break;
+	}
+
+	mutex_unlock(&hwfile->lock);
+
+	return ret;
+}
+
+static unsigned long hwmem_get_unmapped_area(struct file *file,
+	unsigned long addr, unsigned long len, unsigned long pgoff,
+	unsigned long flags)
+{
+	/*
+	 * pgoff is not a valid file offset as it contains a buffer id (right
+	 * shifted PAGE_SHIFT bits). To avoid confusing get_unmapped_area we
+	 * don't pass on file or pgoff.
+	 */
+	return current->mm->get_unmapped_area(NULL, addr, len, 0, flags);
+}
+
+int __init hwmem_ioctl_init(void)
+{
+	return misc_register(&hwmem_device);
+}
+
+void __exit hwmem_ioctl_exit(void)
+{
+	misc_deregister(&hwmem_device);
+}
diff --git a/drivers/misc/hwmem/hwmem-main.c b/drivers/misc/hwmem/hwmem-main.c
new file mode 100644
index 0000000..287cab5
--- /dev/null
+++ b/drivers/misc/hwmem/hwmem-main.c
@@ -0,0 +1,609 @@
+/*
+ * Copyright (C) ST-Ericsson AB 2010
+ *
+ * Hardware memory driver, hwmem
+ *
+ * Author: Marcus Lorentzon <marcus.xm.lorentzon@stericsson.com>
+ * for ST-Ericsson.
+ *
+ * License terms: GNU General Public License (GPL), version 2.
+ */
+
+/*
+ * TODO:
+ * - Kernel addresses are non-cached, which could be a problem when using them
+ * for cache synchronization operations; some CPUs might skip the
+ * synchronization operation altogether.
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/dma-mapping.h>
+#include <linux/idr.h>
+#include <linux/mm.h>
+#include <linux/sched.h>
+#include <linux/err.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/pid.h>
+#include <linux/list.h>
+#include <linux/hwmem.h>
+#include "cache_handler.h"
+
+struct hwmem_alloc_threadg_info {
+	struct list_head list;
+
+	struct pid *threadg_pid; /* Ref counted */
+
+	enum hwmem_access access;
+};
+
+struct hwmem_alloc {
+	struct list_head list;
+
+	atomic_t ref_cnt;
+	enum hwmem_alloc_flags flags;
+	u32 start;
+	u32 size;
+	u32 name;
+
+	/* Access control */
+	enum hwmem_access default_access;
+	struct list_head threadg_info_list;
+
+	/* Cache handling */
+	struct cach_buf cach_buf;
+};
+
+static struct platform_device *hwdev;
+
+u32 hwmem_start;
+u32 hwmem_size;
+void *hwmem_kaddr;
+
+static LIST_HEAD(alloc_list);
+static DEFINE_IDR(global_idr);
+static DEFINE_MUTEX(lock);
+
+static void vm_open(struct vm_area_struct *vma);
+static void vm_close(struct vm_area_struct *vma);
+static struct vm_operations_struct vm_ops = {
+	.open = vm_open,
+	.close = vm_close,
+};
+
+/* Helpers */
+
+static void destroy_hwmem_alloc_threadg_info(
+		struct hwmem_alloc_threadg_info *info)
+{
+	if (info->threadg_pid)
+		put_pid(info->threadg_pid);
+
+	kfree(info);
+}
+
+static void clean_hwmem_alloc_threadg_info_list(struct hwmem_alloc *alloc)
+{
+	struct hwmem_alloc_threadg_info *info;
+	struct hwmem_alloc_threadg_info *tmp;
+
+	list_for_each_entry_safe(info, tmp, &(alloc->threadg_info_list), list) {
+		list_del(&info->list);
+		destroy_hwmem_alloc_threadg_info(info);
+	}
+}
+
+static enum hwmem_access get_access(struct hwmem_alloc *alloc)
+{
+	struct hwmem_alloc_threadg_info *info;
+	struct pid *my_pid;
+	bool found = false;
+
+	my_pid = find_get_pid(task_tgid_nr(current));
+	if (!my_pid)
+		return 0;
+
+	list_for_each_entry(info, &(alloc->threadg_info_list), list) {
+		if (info->threadg_pid == my_pid) {
+			found = true;
+			break;
+		}
+	}
+
+	put_pid(my_pid);
+
+	if (found)
+		return info->access;
+	else
+		return alloc->default_access;
+}
+
+static void clear_alloc_mem(struct hwmem_alloc *alloc)
+{
+	u32 offset;
+	void *v_start;
+
+	offset = alloc->start - hwmem_start;
+
+	v_start = (u8 *)hwmem_kaddr + offset;
+
+	/*
+	 * HWMEM_DOMAIN_SYNC is used as hwmem_kaddr is non-cached and any data
+	 * in the CPU caches should be flushed before we start using the
+	 * buffer. Usually the cache handler keeps track of these things but
+	 * since our kernel addresses don't have the cache settings specified
+	 * in the alloc we have to do it manually here.
+	 */
+	cach_set_domain(&alloc->cach_buf, HWMEM_ACCESS_WRITE,
+						HWMEM_DOMAIN_SYNC, NULL);
+
+	memset(v_start, 0, alloc->size);
+}
+
+static void clean_alloc(struct hwmem_alloc *alloc)
+{
+	if (alloc->name) {
+		idr_remove(&global_idr, alloc->name);
+		alloc->name = 0;
+	}
+
+	alloc->flags = 0;
+
+	clean_hwmem_alloc_threadg_info_list(alloc);
+}
+
+static void destroy_alloc(struct hwmem_alloc *alloc)
+{
+	clean_alloc(alloc);
+
+	kfree(alloc);
+}
+
+static void __hwmem_release(struct hwmem_alloc *alloc)
+{
+	struct hwmem_alloc *other;
+
+	clean_alloc(alloc);
+
+	other = list_entry(alloc->list.prev, struct hwmem_alloc, list);
+	if ((alloc->list.prev != &alloc_list) &&
+			atomic_read(&other->ref_cnt) == 0) {
+		other->size += alloc->size;
+		list_del(&alloc->list);
+		destroy_alloc(alloc);
+		alloc = other;
+	}
+	other = list_entry(alloc->list.next, struct hwmem_alloc, list);
+	if ((alloc->list.next != &alloc_list) &&
+			atomic_read(&other->ref_cnt) == 0) {
+		alloc->size += other->size;
+		list_del(&other->list);
+		destroy_alloc(other);
+	}
+}
+
+static struct hwmem_alloc *find_free_alloc_bestfit(u32 size)
+{
+	u32 best_diff = ~0;
+	struct hwmem_alloc *alloc = NULL, *i;
+
+	list_for_each_entry(i, &alloc_list, list) {
+		u32 diff = i->size - size;
+		if (atomic_read(&i->ref_cnt) > 0 || i->size < size)
+			continue;
+		if (diff < best_diff) {
+			alloc = i;
+			best_diff = diff;
+		}
+	}
+
+	return alloc != NULL ? alloc : ERR_PTR(-ENOMEM);
+}
+
+static struct hwmem_alloc *split_allocation(struct hwmem_alloc *alloc,
+							u32 new_alloc_size)
+{
+	struct hwmem_alloc *new_alloc;
+
+	new_alloc = kzalloc(sizeof(struct hwmem_alloc), GFP_KERNEL);
+	if (new_alloc == NULL)
+		return ERR_PTR(-ENOMEM);
+
+	atomic_inc(&new_alloc->ref_cnt);
+	INIT_LIST_HEAD(&new_alloc->threadg_info_list);
+	new_alloc->start = alloc->start;
+	new_alloc->size = new_alloc_size;
+	alloc->size -= new_alloc_size;
+	alloc->start += new_alloc_size;
+
+	list_add_tail(&new_alloc->list, &alloc->list);
+
+	return new_alloc;
+}
+
+static int init_alloc_list(void)
+{
+	struct hwmem_alloc *first_alloc;
+
+	first_alloc = kzalloc(sizeof(struct hwmem_alloc), GFP_KERNEL);
+	if (first_alloc == NULL)
+		return -ENOMEM;
+
+	first_alloc->start = hwmem_start;
+	first_alloc->size = hwmem_size;
+	INIT_LIST_HEAD(&first_alloc->threadg_info_list);
+
+	list_add_tail(&first_alloc->list, &alloc_list);
+
+	return 0;
+}
+
+static void clean_alloc_list(void)
+{
+	while (list_empty(&alloc_list) == 0) {
+		struct hwmem_alloc *i = list_first_entry(&alloc_list,
+						struct hwmem_alloc, list);
+
+		list_del(&i->list);
+
+		destroy_alloc(i);
+	}
+}
+
+/* HWMEM API */
+
+struct hwmem_alloc *hwmem_alloc(u32 size, enum hwmem_alloc_flags flags,
+		enum hwmem_access def_access, enum hwmem_mem_type mem_type)
+{
+	struct hwmem_alloc *alloc;
+
+	if (!hwdev) {
+		printk(KERN_ERR "hwmem: Badly configured\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (size == 0)
+		return ERR_PTR(-EINVAL);
+
+	mutex_lock(&lock);
+
+	size = PAGE_ALIGN(size);
+
+	alloc = find_free_alloc_bestfit(size);
+	if (IS_ERR(alloc)) {
+		dev_info(&hwdev->dev, "Allocation failed, no free slot\n");
+		goto no_slot;
+	}
+
+	if (size < alloc->size) {
+		alloc = split_allocation(alloc, size);
+		if (IS_ERR(alloc))
+			goto split_alloc_failed;
+	} else {
+		atomic_inc(&alloc->ref_cnt);
+	}
+
+	alloc->flags = flags;
+	alloc->default_access = def_access;
+	cach_init_buf(&alloc->cach_buf, alloc->flags,
+		(u32)hwmem_kaddr + (alloc->start - hwmem_start), alloc->start,
+								alloc->size);
+
+	clear_alloc_mem(alloc);
+
+	goto out;
+
+split_alloc_failed:
+no_slot:
+out:
+	mutex_unlock(&lock);
+
+	return alloc;
+}
+EXPORT_SYMBOL(hwmem_alloc);
+
+void hwmem_release(struct hwmem_alloc *alloc)
+{
+	mutex_lock(&lock);
+
+	if (atomic_dec_and_test(&alloc->ref_cnt))
+		__hwmem_release(alloc);
+
+	mutex_unlock(&lock);
+}
+EXPORT_SYMBOL(hwmem_release);
+
+int hwmem_set_domain(struct hwmem_alloc *alloc, enum hwmem_access access,
+		enum hwmem_domain domain, struct hwmem_region *region)
+{
+	mutex_lock(&lock);
+
+	cach_set_domain(&alloc->cach_buf, access, domain, region);
+
+	mutex_unlock(&lock);
+
+	return 0;
+}
+EXPORT_SYMBOL(hwmem_set_domain);
+
+int hwmem_pin(struct hwmem_alloc *alloc, uint32_t *phys_addr,
+					uint32_t *scattered_phys_addrs)
+{
+	mutex_lock(&lock);
+
+	*phys_addr = alloc->start;
+
+	mutex_unlock(&lock);
+
+	return 0;
+}
+EXPORT_SYMBOL(hwmem_pin);
+
+void hwmem_unpin(struct hwmem_alloc *alloc)
+{
+}
+EXPORT_SYMBOL(hwmem_unpin);
+
+static void vm_open(struct vm_area_struct *vma)
+{
+	atomic_inc(&((struct hwmem_alloc *)vma->vm_private_data)->ref_cnt);
+}
+
+static void vm_close(struct vm_area_struct *vma)
+{
+	hwmem_release((struct hwmem_alloc *)vma->vm_private_data);
+}
+
+int hwmem_mmap(struct hwmem_alloc *alloc, struct vm_area_struct *vma)
+{
+	int ret = 0;
+	unsigned long vma_size = vma->vm_end - vma->vm_start;
+	enum hwmem_access access;
+	mutex_lock(&lock);
+
+	access = get_access(alloc);
+
+	/* Check permissions */
+	if ((!(access & HWMEM_ACCESS_WRITE) &&
+				(vma->vm_flags & VM_WRITE)) ||
+			(!(access & HWMEM_ACCESS_READ) &&
+				(vma->vm_flags & VM_READ))) {
+		ret = -EPERM;
+		goto illegal_access;
+	}
+
+	if (vma_size > (unsigned long)alloc->size) {
+		ret = -EINVAL;
+		goto illegal_size;
+	}
+
+	/*
+	 * We don't want Linux to do anything (merging etc) with our VMAs as
+	 * the offset is not necessarily valid
+	 */
+	vma->vm_flags |= VM_SPECIAL;
+	cach_set_pgprot_cache_options(&alloc->cach_buf, &vma->vm_page_prot);
+	vma->vm_private_data = (void *)alloc;
+	atomic_inc(&alloc->ref_cnt);
+	vma->vm_ops = &vm_ops;
+
+	ret = remap_pfn_range(vma, vma->vm_start, alloc->start >> PAGE_SHIFT,
+		min(vma_size, (unsigned long)alloc->size), vma->vm_page_prot);
+	if (ret < 0)
+		goto map_failed;
+
+	goto out;
+
+map_failed:
+	atomic_dec(&alloc->ref_cnt);
+illegal_size:
+illegal_access:
+
+out:
+	mutex_unlock(&lock);
+
+	return ret;
+}
+EXPORT_SYMBOL(hwmem_mmap);
+
+int hwmem_set_access(struct hwmem_alloc *alloc,
+		enum hwmem_access access, pid_t pid_nr)
+{
+	int ret;
+	struct hwmem_alloc_threadg_info *info;
+	struct pid *pid;
+	bool found = false;
+
+	pid = find_get_pid(pid_nr);
+	if (!pid) {
+		ret = -EINVAL;
+		goto error_get_pid;
+	}
+
+	list_for_each_entry(info, &(alloc->threadg_info_list), list) {
+		if (info->threadg_pid == pid) {
+			found = true;
+			break;
+		}
+	}
+
+	if (!found) {
+		info = kmalloc(sizeof(*info), GFP_KERNEL);
+		if (!info) {
+			ret = -ENOMEM;
+			goto error_alloc_info;
+		}
+
+		info->threadg_pid = pid;
+		info->access = access;
+
+		list_add_tail(&(info->list), &(alloc->threadg_info_list));
+	} else {
+		info->access = access;
+	}
+
+	return 0;
+
+error_alloc_info:
+	put_pid(pid);
+error_get_pid:
+	return ret;
+}
+EXPORT_SYMBOL(hwmem_set_access);
+
+void hwmem_get_info(struct hwmem_alloc *alloc, uint32_t *size,
+	enum hwmem_mem_type *mem_type, enum hwmem_access *access)
+{
+	mutex_lock(&lock);
+
+	*size = alloc->size;
+	*mem_type = HWMEM_MEM_CONTIGUOUS_SYS;
+	*access = get_access(alloc);
+
+	mutex_unlock(&lock);
+}
+EXPORT_SYMBOL(hwmem_get_info);
+
+int hwmem_get_name(struct hwmem_alloc *alloc)
+{
+	int ret = 0, name;
+
+	mutex_lock(&lock);
+
+	if (alloc->name != 0) {
+		ret = alloc->name;
+		goto out;
+	}
+
+	while (true) {
+		if (idr_pre_get(&global_idr, GFP_KERNEL) == 0) {
+			ret = -ENOMEM;
+			goto pre_get_id_failed;
+		}
+
+		ret = idr_get_new_above(&global_idr, alloc, 1, &name);
+		if (ret == 0)
+			break;
+		else if (ret != -EAGAIN)
+			goto get_id_failed;
+	}
+
+	alloc->name = name;
+
+	ret = name;
+	goto out;
+
+get_id_failed:
+pre_get_id_failed:
+
+out:
+	mutex_unlock(&lock);
+
+	return ret;
+}
+EXPORT_SYMBOL(hwmem_get_name);
+
+struct hwmem_alloc *hwmem_resolve_by_name(s32 name)
+{
+	struct hwmem_alloc *alloc;
+
+	mutex_lock(&lock);
+
+	alloc = idr_find(&global_idr, name);
+	if (alloc == NULL) {
+		alloc = ERR_PTR(-EINVAL);
+		goto find_failed;
+	}
+	atomic_inc(&alloc->ref_cnt);
+
+	goto out;
+
+find_failed:
+
+out:
+	mutex_unlock(&lock);
+
+	return alloc;
+}
+EXPORT_SYMBOL(hwmem_resolve_by_name);
+
+/* Module */
+
+extern int hwmem_ioctl_init(void);
+extern void hwmem_ioctl_exit(void);
+
+static int __devinit hwmem_probe(struct platform_device *pdev)
+{
+	int ret = 0;
+	struct hwmem_platform_data *platform_data = pdev->dev.platform_data;
+
+	if (hwdev || platform_data->size == 0) {
+		dev_info(&pdev->dev, "hwdev || platform_data->size == 0\n");
+		return -EINVAL;
+	}
+
+	hwdev = pdev;
+	hwmem_start = platform_data->start;
+	hwmem_size = platform_data->size;
+
+	/*
+	 * TODO: This will consume a lot of the kernel's virtual memory space.
+	 * Investigate if a better solution exists.
+	 */
+	hwmem_kaddr = ioremap_nocache(hwmem_start, hwmem_size);
+	if (hwmem_kaddr == NULL) {
+		ret = -ENOMEM;
+		goto ioremap_failed;
+	}
+
+	/*
+	 * No need to flush the caches here. If we can keep track of the cache
+	 * content then none of our memory will be in the caches, if we can't
+	 * keep track of the cache content we always assume all our memory is
+	 * in the caches.
+	 */
+
+	ret = init_alloc_list();
+	if (ret < 0)
+		goto init_alloc_list_failed;
+
+	ret = hwmem_ioctl_init();
+	if (ret)
+		goto ioctl_init_failed;
+
+	dev_info(&pdev->dev, "Hwmem probed, device contains %#x bytes\n",
+			hwmem_size);
+
+	goto out;
+
+ioctl_init_failed:
+	clean_alloc_list();
+init_alloc_list_failed:
+	iounmap(hwmem_kaddr);
+ioremap_failed:
+	hwdev = NULL;
+
+out:
+	return ret;
+}
+
+static struct platform_driver hwmem_driver = {
+	.probe	= hwmem_probe,
+	.driver = {
+		.name	= "hwmem",
+	},
+};
+
+static int __init hwmem_init(void)
+{
+	return platform_driver_register(&hwmem_driver);
+}
+subsys_initcall(hwmem_init);
+
+MODULE_AUTHOR("Marcus Lorentzon <marcus.xm.lorentzon@stericsson.com>");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Hardware memory driver");
+
diff --git a/include/linux/hwmem.h b/include/linux/hwmem.h
new file mode 100644
index 0000000..c3ba179
--- /dev/null
+++ b/include/linux/hwmem.h
@@ -0,0 +1,499 @@
+/*
+ * Copyright (C) ST-Ericsson AB 2010
+ *
+ * ST-Ericsson HW memory driver
+ *
+ * Author: Marcus Lorentzon <marcus.xm.lorentzon@stericsson.com>
+ * for ST-Ericsson.
+ *
+ * License terms: GNU General Public License (GPL), version 2.
+ */
+
+#ifndef _HWMEM_H_
+#define _HWMEM_H_
+
+#if !defined(__KERNEL__) && !defined(_KERNEL)
+#include <stdint.h>
+#include <sys/types.h>
+#else
+#include <linux/types.h>
+#include <linux/mm_types.h>
+#endif
+
+#define HWMEM_DEFAULT_DEVICE_NAME "hwmem"
+
+/**
+ * @brief Flags defining behavior of allocation
+ */
+enum hwmem_alloc_flags {
+	/**
+	 * @brief Buffer will not be cached and not buffered
+	 */
+	HWMEM_ALLOC_UNCACHED             = (0 << 0),
+	/**
+	 * @brief Buffer will be buffered, but not cached
+	 */
+	HWMEM_ALLOC_BUFFERED             = (1 << 0),
+	/**
+	 * @brief Buffer will be cached and buffered, use cache hints to be
+	 * more specific
+	 */
+	HWMEM_ALLOC_CACHED               = (3 << 0),
+	/**
+	 * @brief Buffer should be cached write-back in both level 1 and 2 cache
+	 */
+	HWMEM_ALLOC_CACHE_HINT_WB        = (1 << 2),
+	/**
+	 * @brief Buffer should be cached write-through in both level 1 and
+	 * 2 cache
+	 */
+	HWMEM_ALLOC_CACHE_HINT_WT        = (2 << 2),
+	/**
+	 * @brief Buffer should be cached write-back in level 1 cache
+	 */
+	HWMEM_ALLOC_CACHE_HINT_WB_INNER  = (3 << 2),
+	/**
+	 * @brief Buffer should be cached write-through in level 1 cache
+	 */
+	HWMEM_ALLOC_CACHE_HINT_WT_INNER  = (4 << 2),
+	HWMEM_ALLOC_CACHE_HINT_MASK      = 0x1C,
+};
+
+/**
+ * @brief Flags defining buffer access mode.
+ */
+enum hwmem_access {
+	/**
+	 * @brief Buffer will be read from.
+	 */
+	HWMEM_ACCESS_READ  = (1 << 0),
+	/**
+	 * @brief Buffer will be written to.
+	 */
+	HWMEM_ACCESS_WRITE = (1 << 1),
+	/**
+	 * @brief Buffer will be imported.
+	 */
+	HWMEM_ACCESS_IMPORT = (1 << 2),
+};
+
+/**
+ * @brief Flags defining memory type.
+ */
+enum hwmem_mem_type {
+	/**
+	 * @brief Scattered system memory. Currently not supported!
+	 */
+	HWMEM_MEM_SCATTERED_SYS  = (1 << 0),
+	/**
+	 * @brief Contiguous system memory.
+	 */
+	HWMEM_MEM_CONTIGUOUS_SYS = (1 << 1),
+};
+
+/**
+ * @brief Values defining memory domain.
+ */
+enum hwmem_domain {
+	/**
+	 * @brief This value specifies the neutral memory domain. Setting this
+	 * domain will synchronize all supported memory domains (currently CPU).
+	 */
+	HWMEM_DOMAIN_SYNC = 0,
+	/**
+	 * @brief This value specifies the CPU memory domain.
+	 */
+	HWMEM_DOMAIN_CPU  = 1,
+};
+
+/**
+ * @brief Structure defining a region of a memory buffer.
+ *
+ * A buffer is defined to contain a number of equally sized blocks. Each block
+ * has a part of it included in the region [<start>-<end>). That is
+ * <end>-<start> bytes. Each block is <size> bytes long. Total number of bytes
+ * in the region is (<end> - <start>) * <count>. First byte of the region is
+ * <offset> + <start> bytes into the buffer.
+ *
+ * Here's an example of a region in a graphics buffer (X = buffer, R = region):
+ *
+ * XXXXXXXXXXXXXXXXXXXX \
+ * XXXXXXXXXXXXXXXXXXXX |-- offset = 60
+ * XXXXXXXXXXXXXXXXXXXX /
+ * XXRRRRRRRRXXXXXXXXXX \
+ * XXRRRRRRRRXXXXXXXXXX |-- count = 4
+ * XXRRRRRRRRXXXXXXXXXX |
+ * XXRRRRRRRRXXXXXXXXXX /
+ * XXXXXXXXXXXXXXXXXXXX
+ * --| start = 2
+ * ----------| end = 10
+ * --------------------| size = 20
+ */
+struct hwmem_region {
+	/**
+	 * @brief The first block's offset from beginning of buffer.
+	 */
+	uint32_t offset;
+	/**
+	 * @brief The number of blocks included in this region.
+	 */
+	uint32_t count;
+	/**
+	 * @brief The index of the first byte included in this block.
+	 */
+	uint32_t start;
+	/**
+	 * @brief The index of the last byte included in this block plus one.
+	 */
+	uint32_t end;
+	/**
+	 * @brief The size in bytes of each block.
+	 */
+	uint32_t size;
+};
+
+/* User space API */
+
+/**
+ * @brief Alloc request data.
+ */
+struct hwmem_alloc_request {
+	/**
+	 * @brief [in] Size of requested allocation in bytes. Size will be
+	 * aligned to PAGE_SIZE bytes.
+	 */
+	uint32_t size;
+	/**
+	 * @brief [in] Flags describing requested allocation options.
+	 */
+	uint32_t flags; /* enum hwmem_alloc_flags */
+	/**
+	 * @brief [in] Default access rights for buffer.
+	 */
+	uint32_t default_access; /* enum hwmem_access */
+	/**
+	 * @brief [in] Memory type of the buffer.
+	 */
+	uint32_t mem_type; /* enum hwmem_mem_type */
+};
+
+/**
+ * @brief Set domain request data.
+ */
+struct hwmem_set_domain_request {
+	/**
+	 * @brief [in] Identifier of buffer to be prepared. If 0 is specified
+	 * the buffer associated with the current file instance will be used.
+	 */
+	int32_t id;
+	/**
+	 * @brief [in] Value specifying the new memory domain.
+	 */
+	uint32_t domain; /* enum hwmem_domain */
+	/**
+	 * @brief [in] Flags specifying access mode of the operation.
+	 *
+	 * One of HWMEM_ACCESS_READ and HWMEM_ACCESS_WRITE is required.
+	 * For details, @see enum hwmem_access.
+	 */
+	uint32_t access; /* enum hwmem_access */
+	/**
+	 * @brief [in] The region of bytes to be prepared.
+	 *
+	 * For details, @see struct hwmem_region.
+	 */
+	struct hwmem_region region;
+};
+
+/**
+ * @brief Pin request data.
+ */
+struct hwmem_pin_request {
+	/**
+	 * @brief [in] Identifier of buffer to be pinned. If 0 is specified,
+	 * the buffer associated with the current file instance will be used.
+	 */
+	int32_t id;
+	/**
+	 * @brief [out] Physical address of first word in buffer.
+	 */
+	uint32_t phys_addr;
+	/**
+	 * @brief [in] Pointer to buffer for physical addresses of pinned
+	 * scattered buffer. Buffer must be (buffer_size / page_size) *
+	 * sizeof(uint32_t) bytes.
+	 * This field can be NULL for physically contiguous buffers.
+	 */
+	uint32_t *scattered_addrs;
+};
+
+/**
+ * @brief Set access rights request data.
+ */
+struct hwmem_set_access_request {
+	/**
+	 * @brief [in] Identifier of buffer to set access rights for. If 0 is
+	 * specified, the buffer associated with the current file instance is used.
+	 */
+	int32_t id;
+	/**
+	 * @brief [in] Access value indicating what is allowed.
+	 */
+	uint32_t access; /* enum hwmem_access */
+	/**
+	 * @brief [in] Process ID to set rights for.
+	 */
+	pid_t pid;
+};
+
+/**
+ * @brief Get info request data.
+ */
+struct hwmem_get_info_request {
+	/**
+	 * @brief [in] Identifier of buffer to get info about. If 0 is specified,
+	 * the buffer associated with the current file instance will be used.
+	 */
+	int32_t id;
+	/**
+	 * @brief [out] Size in bytes of buffer.
+	 */
+	uint32_t size;
+	/**
+	 * @brief [out] Memory type of buffer.
+	 */
+	uint32_t mem_type; /* enum hwmem_mem_type */
+	/**
+	 * @brief [out] Access rights for buffer.
+	 */
+	uint32_t access; /* enum hwmem_access */
+};
+
+/**
+ * @brief Allocates <size> number of bytes and returns a buffer identifier.
+ *
+ * Input is a pointer to a hwmem_alloc_request struct.
+ *
+ * @return A buffer identifier on success, or a negative error code.
+ */
+#define HWMEM_ALLOC_IOC _IOW('W', 1, struct hwmem_alloc_request)
+
+/**
+ * @brief Allocates <size> number of bytes and associates the created buffer
+ * with the current file instance.
+ *
+ * If the current file instance is already associated with a buffer the call
+ * will fail. Buffers referenced through files instances shall not be released
+ * with HWMEM_RELEASE_IOC, instead the file instance shall be closed.
+ *
+ * Input is a pointer to a hwmem_alloc_request struct.
+ *
+ * @return Zero on success, or a negative error code.
+ */
+#define HWMEM_ALLOC_FD_IOC _IOW('W', 2, struct hwmem_alloc_request)
+
+/**
+ * @brief Releases buffer.
+ *
+ * Buffers are reference counted and will not be destroyed until the last
+ * reference is released. Buffers allocated with ALLOC_FD_IOC are not allowed.
+ *
+ * Input is the buffer identifier.
+ *
+ * @return Zero on success, or a negative error code.
+ */
+#define HWMEM_RELEASE_IOC _IO('W', 3)
+
+/**
+ * @brief Set the buffer's memory domain and prepares it for access.
+ *
+ * Input is a pointer to a hwmem_set_domain_request struct.
+ *
+ * @return Zero on success, or a negative error code.
+ */
+#define HWMEM_SET_DOMAIN_IOC _IOR('W', 4, struct hwmem_set_domain_request)
+
+/**
+ * @brief Pins the buffer and returns the physical address of the buffer.
+ *
+ * @return Zero on success, or a negative error code.
+ */
+#define HWMEM_PIN_IOC _IOWR('W', 5, struct hwmem_pin_request)
+
+/**
+ * @brief Unpins the buffer.
+ *
+ * @return Zero on success, or a negative error code.
+ */
+#define HWMEM_UNPIN_IOC _IO('W', 6)
+
+/**
+ * @brief Set access rights for buffer.
+ *
+ * @return Zero on success, or a negative error code.
+ */
+#define HWMEM_SET_ACCESS_IOC _IOW('W', 7, struct hwmem_set_access_request)
+
+/**
+ * @brief Get buffer information.
+ *
+ * Input is a pointer to a hwmem_get_info_request struct. If id 0 is
+ * specified the buffer associated with the current file instance will be used.
+ *
+ * @return Zero on success, or a negative error code.
+ */
+#define HWMEM_GET_INFO_IOC _IOWR('W', 8, struct hwmem_get_info_request)
+
+/**
+ * @brief Export the buffer identifier for use in another process.
+ *
+ * The global name will not increase the buffer's reference count and will
+ * therefore not keep the buffer alive.
+ *
+ * Input is the buffer identifier. If 0 is specified the buffer associated with
+ * the current file instance will be exported.
+ *
+ * @return A global buffer name on success, or a negative error code.
+ */
+#define HWMEM_EXPORT_IOC _IO('W', 9)
+
+/**
+ * @brief Import a buffer to allow local access to the buffer.
+ *
+ * Input is the buffer's global name.
+ *
+ * @return The imported buffer's identifier on success, or a negative error code.
+ */
+#define HWMEM_IMPORT_IOC _IO('W', 10)
+
+/**
+ * @brief Import a buffer to allow local access to the buffer using fd.
+ *
+ * Input is the buffer's global name.
+ *
+ * @return Zero on success, or a negative error code.
+ */
+#define HWMEM_IMPORT_FD_IOC _IO('W', 11)
+
+#ifdef __KERNEL__
+
+/* Kernel API */
+
+struct hwmem_alloc;
+
+/**
+ * @brief Allocates <size> number of bytes.
+ *
+ * @param size Number of bytes to allocate. All allocations are page aligned.
+ * @param flags Allocation options.
+ * @param def_access Default buffer access rights.
+ * @param mem_type Memory type.
+ *
+ * @return Pointer to allocation, or a negative error code.
+ */
+struct hwmem_alloc *hwmem_alloc(u32 size, enum hwmem_alloc_flags flags,
+		enum hwmem_access def_access, enum hwmem_mem_type mem_type);
+
+/**
+ * @brief Release a previously allocated buffer.
+ * When last reference is released, the buffer will be freed.
+ *
+ * @param alloc Buffer to be released.
+ */
+void hwmem_release(struct hwmem_alloc *alloc);
+
+/**
+ * @brief Set the buffer domain and prepare it for access.
+ *
+ * @param alloc Buffer to be prepared.
+ * @param access Flags defining memory access mode of the call.
+ * @param domain Value specifying the memory domain.
+ * @param region Structure defining the minimum area of the buffer to be
+ * prepared.
+ *
+ * @return Zero on success, or a negative error code.
+ */
+int hwmem_set_domain(struct hwmem_alloc *alloc, enum hwmem_access access,
+		enum hwmem_domain domain, struct hwmem_region *region);
+
+/**
+ * @brief Pins the buffer.
+ *
+ * @param alloc Buffer to be pinned.
+ * @param phys_addr Reference to variable to receive physical address.
+ * @param scattered_phys_addrs Pointer to buffer to receive physical addresses
+ * of all pages in the scattered buffer. Can be NULL if buffer is contiguous.
+ * Buffer size must be (buffer_size / page_size) * sizeof(uint32_t) bytes.
+ */
+int hwmem_pin(struct hwmem_alloc *alloc, uint32_t *phys_addr,
+					uint32_t *scattered_phys_addrs);
+
+/**
+ * @brief Unpins the buffer.
+ *
+ * @param alloc Buffer to be unpinned.
+ */
+void hwmem_unpin(struct hwmem_alloc *alloc);
+
+/**
+ * @brief Map the buffer to user space.
+ *
+ * @param alloc Buffer to be mapped.
+ */
+int hwmem_mmap(struct hwmem_alloc *alloc, struct vm_area_struct *vma);
+
+/**
+ * @brief Set access rights for buffer.
+ *
+ * @param alloc Buffer to set rights for.
+ * @param access Access value indicating what is allowed.
+ * @param pid Process ID to set rights for.
+ */
+int hwmem_set_access(struct hwmem_alloc *alloc, enum hwmem_access access,
+								pid_t pid);
+
+/**
+ * @brief Get buffer information.
+ *
+ * @param alloc Buffer to get information about.
+ * @param size Pointer to size output variable.
+ * @param mem_type Pointer to memory type output variable.
+ * @param access Pointer to access rights output variable.
+ */
+void hwmem_get_info(struct hwmem_alloc *alloc, uint32_t *size,
+	enum hwmem_mem_type *mem_type, enum hwmem_access *access);
+
+/**
+ * @brief Allocate a global buffer name.
+ * Generated buffer name is valid in all processes. Consecutive calls will get
+ * the same name for the same buffer.
+ *
+ * @param alloc Buffer to be made public.
+ *
+ * @return Positive global name on success, or a negative error code.
+ */
+int hwmem_get_name(struct hwmem_alloc *alloc);
+
+/**
+ * @brief Import the global buffer name to allow local access to the buffer.
+ * This call will add a buffer reference. Resulting buffer should be
+ * released with a call to hwmem_release.
+ *
+ * @param name A valid global buffer name.
+ *
+ * @return Pointer to allocation, or a negative error code.
+ */
+struct hwmem_alloc *hwmem_resolve_by_name(s32 name);
+
+/* Internal */
+
+struct hwmem_platform_data {
+	/* Starting physical address of memory region */
+	unsigned long start;
+	/* Size of memory region */
+	unsigned long size;
+};
+
+#endif
+
+#endif /* _HWMEM_H_ */
-- 
1.6.3.3


* [PATCH 2/3] hwmem: Add hwmem (part 2)
  2010-11-16 13:08 ` [PATCH 1/3] hwmem: Add hwmem (part 1) Johan Mossberg
@ 2010-11-16 13:08   ` Johan Mossberg
  2010-11-16 13:08     ` [PATCH 3/3] hwmem: Add hwmem to ux500 and mop500 Johan Mossberg
  0 siblings, 1 reply; 12+ messages in thread
From: Johan Mossberg @ 2010-11-16 13:08 UTC (permalink / raw)
  To: linux-mm; +Cc: Johan Mossberg

Add hardware memory driver, part 2.

The main purposes of hwmem are:

* To allocate buffers suitable for use with hardware. Currently
this means contiguous buffers.
* To synchronize the caches for the allocated buffers. This is
achieved by keeping track of when the CPU uses a buffer and when
other hardware uses it; when ownership switches from the CPU to
other hardware, or vice versa, the caches are synchronized.
* To handle sharing of allocated buffers between processes, i.e.
import and export.

Hwmem is available both through a user space API and through a
kernel API.
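
To show what the cache handler added here does for a cached buffer,
here is a sketch against the kernel API from patch 1 (not part of the
patch; a NULL region means the whole buffer):

#include <linux/hwmem.h>

static void hwmem_domain_example(struct hwmem_alloc *alloc,
					struct hwmem_region *region)
{
	/*
	 * Claim the region for CPU writes: any deferred invalidates are
	 * performed and the written range is tracked as dirty.
	 */
	hwmem_set_domain(alloc, HWMEM_ACCESS_WRITE, HWMEM_DOMAIN_CPU, region);

	/* ... the CPU fills the region ... */

	/*
	 * Hand it back before other hardware reads it: dirty cache lines
	 * are cleaned and the CPU write buffer is drained.
	 */
	hwmem_set_domain(alloc, HWMEM_ACCESS_READ, HWMEM_DOMAIN_SYNC, region);
}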

Signed-off-by: Johan Mossberg <johan.xx.mossberg@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
---
 drivers/misc/hwmem/cache_handler.c       |  494 ++++++++++++++++++++++++++++++
 drivers/misc/hwmem/cache_handler.h       |   60 ++++
 drivers/misc/hwmem/cache_handler_u8500.c |  208 +++++++++++++
 3 files changed, 762 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/hwmem/cache_handler.c
 create mode 100644 drivers/misc/hwmem/cache_handler.h
 create mode 100644 drivers/misc/hwmem/cache_handler_u8500.c

diff --git a/drivers/misc/hwmem/cache_handler.c b/drivers/misc/hwmem/cache_handler.c
new file mode 100644
index 0000000..831770d
--- /dev/null
+++ b/drivers/misc/hwmem/cache_handler.c
@@ -0,0 +1,494 @@
+/*
+ * Copyright (C) ST-Ericsson AB 2010
+ *
+ * Cache handler
+ *
+ * Author: Johan Mossberg <johan.xx.mossberg@stericsson.com>
+ * for ST-Ericsson.
+ *
+ * License terms: GNU General Public License (GPL), version 2.
+ */
+
+#include <linux/hwmem.h>
+
+#include <asm/pgtable.h>
+
+#include "cache_handler.h"
+
+#define U32_MAX (~(u32)0)
+
+void cachi_set_buf_cache_settings(struct cach_buf *buf,
+					enum hwmem_alloc_flags cache_settings);
+void cachi_set_pgprot_cache_options(struct cach_buf *buf,
+							pgprot_t *pgprot);
+void cachi_drain_cpu_write_buf(void);
+void cachi_invalidate_cpu_cache(u32 virt_start, u32 virt_end, u32 phys_start,
+		u32 phys_end, bool inner_only, bool *flushed_everything);
+void cachi_clean_cpu_cache(u32 virt_start, u32 virt_end, u32 phys_start,
+		u32 phys_end, bool inner_only, bool *cleaned_everything);
+void cachi_flush_cpu_cache(u32 virt_start, u32 virt_end, u32 phys_start,
+		u32 phys_end, bool inner_only, bool *flushed_everything);
+bool cachi_can_keep_track_of_range_in_cpu_cache(void);
+/* Returns 1 if no cache is present */
+u32 cachi_get_cache_granularity(void);
+
+static void sync_buf_pre_cpu(struct cach_buf *buf, enum hwmem_access access,
+						struct hwmem_region *region);
+static void sync_buf_post_cpu(struct cach_buf *buf,
+	enum hwmem_access next_access, struct hwmem_region *next_region);
+
+static void invalidate_cpu_cache(struct cach_buf *buf,
+					struct cach_range *range_2b_used);
+static void clean_cpu_cache(struct cach_buf *buf,
+					struct cach_range *range_2b_used);
+static void flush_cpu_cache(struct cach_buf *buf,
+					struct cach_range *range_2b_used);
+
+static void null_range(struct cach_range *range);
+static void expand_range(struct cach_range *range,
+					struct cach_range *range_2_add);
+/*
+ * Expands range to one of enclosing_range's two edges. The function will
+ * choose which of enclosing_range's edges to expand range to in such a
+ * way that the size of range is minimized. range must be located inside
+ * enclosing_range.
+ */
+static void expand_range_2_edge(struct cach_range *range,
+					struct cach_range *enclosing_range);
+static void shrink_range(struct cach_range *range,
+					struct cach_range *range_2_remove);
+static bool is_non_empty_range(struct cach_range *range);
+static void intersect_range(struct cach_range *range_1,
+		struct cach_range *range_2, struct cach_range *intersection);
+/* Align_up restrictions apply here too */
+static void align_range_up(struct cach_range *range, u32 alignment);
+static void region_2_range(struct hwmem_region *region, u32 buffer_size,
+						struct cach_range *range);
+
+static u32 offset_2_vaddr(struct cach_buf *buf, u32 offset);
+static u32 offset_2_paddr(struct cach_buf *buf, u32 offset);
+
+/* Saturates, might return unaligned values when that happens */
+static u32 align_up(u32 value, u32 alignment);
+static u32 align_down(u32 value, u32 alignment);
+
+static bool is_wb(enum hwmem_alloc_flags cache_settings);
+static bool is_inner_only(enum hwmem_alloc_flags cache_settings);
+
+/*
+ * Exported functions
+ */
+
+void cach_init_buf(struct cach_buf *buf, enum hwmem_alloc_flags cache_settings,
+					u32 vstart, u32 pstart,	u32 size)
+{
+	bool tmp;
+
+	buf->vstart = vstart;
+	buf->pstart = pstart;
+	buf->size = size;
+
+	cachi_set_buf_cache_settings(buf, cache_settings);
+
+	cachi_flush_cpu_cache(offset_2_vaddr(buf, 0),
+		offset_2_vaddr(buf, buf->size), offset_2_paddr(buf, 0),
+				offset_2_paddr(buf, buf->size), false, &tmp);
+	cachi_drain_cpu_write_buf();
+
+	buf->in_cpu_write_buf = false;
+	if (cachi_can_keep_track_of_range_in_cpu_cache())
+		null_range(&buf->range_in_cpu_cache);
+	else {
+		/* Assume worst case, that the entire alloc is in the cache. */
+		buf->range_in_cpu_cache.start = 0;
+		buf->range_in_cpu_cache.end = buf->size;
+		align_range_up(&buf->range_in_cpu_cache,
+						cachi_get_cache_granularity());
+	}
+	null_range(&buf->range_dirty_in_cpu_cache);
+	null_range(&buf->range_invalid_in_cpu_cache);
+}
+
+void cach_set_pgprot_cache_options(struct cach_buf *buf, pgprot_t *pgprot)
+{
+	cachi_set_pgprot_cache_options(buf, pgprot);
+}
+
+void cach_set_domain(struct cach_buf *buf, enum hwmem_access access,
+			enum hwmem_domain domain, struct hwmem_region *region)
+{
+	struct hwmem_region *__region;
+	struct hwmem_region full_region;
+
+	if (region != NULL)
+		__region = region;
+	else {
+		full_region.offset = 0;
+		full_region.count = 1;
+		full_region.start = 0;
+		full_region.end = buf->size;
+		full_region.size = buf->size;
+
+		__region = &full_region;
+	}
+
+	switch (domain) {
+	case HWMEM_DOMAIN_SYNC:
+		sync_buf_post_cpu(buf, access, __region);
+
+		break;
+
+	case HWMEM_DOMAIN_CPU:
+		sync_buf_pre_cpu(buf, access, __region);
+
+		break;
+	}
+}
+
+/*
+ * Local functions
+ */
+
+void __attribute__((weak)) cachi_set_buf_cache_settings(struct cach_buf *buf,
+					enum hwmem_alloc_flags cache_settings)
+{
+	buf->cache_settings = cache_settings & ~HWMEM_ALLOC_CACHE_HINT_MASK;
+
+	if ((cache_settings & HWMEM_ALLOC_CACHED) == HWMEM_ALLOC_CACHED) {
+		/*
+		 * If the alloc is cached we'll use the default setting. We
+		 * don't know what this setting is so we have to assume the
+		 * worst case, ie write back inner and outer.
+		 */
+		buf->cache_settings |= HWMEM_ALLOC_CACHE_HINT_WB;
+	}
+}
+
+void __attribute__((weak)) cachi_set_pgprot_cache_options(struct cach_buf *buf,
+							pgprot_t *pgprot)
+{
+	if ((buf->cache_settings & HWMEM_ALLOC_CACHED) == HWMEM_ALLOC_CACHED)
+		*pgprot = *pgprot; /* To silence compiler and checkpatch */
+	else if (buf->cache_settings & HWMEM_ALLOC_BUFFERED)
+		*pgprot = pgprot_writecombine(*pgprot);
+	else
+		*pgprot = pgprot_noncached(*pgprot);
+}
+
+bool __attribute__((weak)) cachi_can_keep_track_of_range_in_cpu_cache(void)
+{
+	/* We don't know so we go with the safe alternative */
+	return false;
+}
+
+static void sync_buf_pre_cpu(struct cach_buf *buf, enum hwmem_access access,
+						struct hwmem_region *region)
+{
+	bool write = access & HWMEM_ACCESS_WRITE;
+	bool read = access & HWMEM_ACCESS_READ;
+
+	if (!write && !read)
+		return;
+
+	if ((buf->cache_settings & HWMEM_ALLOC_CACHED) == HWMEM_ALLOC_CACHED) {
+		struct cach_range region_range;
+
+		region_2_range(region, buf->size, &region_range);
+
+		if (read || (write && is_wb(buf->cache_settings)))
+			/* Perform deferred invalidates */
+			invalidate_cpu_cache(buf, &region_range);
+		if (read)
+			expand_range(&buf->range_in_cpu_cache, &region_range);
+		if (write && is_wb(buf->cache_settings)) {
+			struct cach_range intersection;
+
+			intersect_range(&buf->range_in_cpu_cache,
+						&region_range, &intersection);
+
+			expand_range(&buf->range_dirty_in_cpu_cache,
+								&intersection);
+		}
+	}
+	if (buf->cache_settings & HWMEM_ALLOC_BUFFERED) {
+		if (write)
+			buf->in_cpu_write_buf = true;
+	}
+}
+
+static void sync_buf_post_cpu(struct cach_buf *buf,
+	enum hwmem_access next_access, struct hwmem_region *next_region)
+{
+	bool write = next_access & HWMEM_ACCESS_WRITE;
+	bool read = next_access & HWMEM_ACCESS_READ;
+	struct cach_range region_range;
+
+	if (!write && !read)
+		return;
+
+	region_2_range(next_region, buf->size, &region_range);
+
+	if (write) {
+		if (cachi_can_keep_track_of_range_in_cpu_cache())
+			flush_cpu_cache(buf, &region_range);
+		else { /* Defer invalidate */
+			struct cach_range intersection;
+
+			intersect_range(&buf->range_in_cpu_cache,
+						&region_range, &intersection);
+
+			expand_range(&buf->range_invalid_in_cpu_cache,
+								&intersection);
+
+			clean_cpu_cache(buf, &region_range);
+		}
+	}
+	if (read)
+		clean_cpu_cache(buf, &region_range);
+
+	if (buf->in_cpu_write_buf) {
+		cachi_drain_cpu_write_buf();
+
+		buf->in_cpu_write_buf = false;
+	}
+}
+
+static void invalidate_cpu_cache(struct cach_buf *buf, struct cach_range *range)
+{
+	struct cach_range intersection;
+
+	intersect_range(&buf->range_invalid_in_cpu_cache, range,
+								&intersection);
+	if (is_non_empty_range(&intersection)) {
+		bool flushed_everything;
+
+		expand_range_2_edge(&intersection,
+					&buf->range_invalid_in_cpu_cache);
+
+		cachi_invalidate_cpu_cache(
+				offset_2_vaddr(buf, intersection.start),
+				offset_2_vaddr(buf, intersection.end),
+				offset_2_paddr(buf, intersection.start),
+				offset_2_paddr(buf, intersection.end),
+				is_inner_only(buf->cache_settings),
+							&flushed_everything);
+
+		if (flushed_everything) {
+			null_range(&buf->range_invalid_in_cpu_cache);
+			null_range(&buf->range_dirty_in_cpu_cache);
+		} else
+			/*
+			 * No need to shrink range_in_cpu_cache as invalidate
+			 * is only used when we can't keep track of what's in
+			 * the CPU cache.
+			 */
+			shrink_range(&buf->range_invalid_in_cpu_cache,
+								&intersection);
+	}
+}
+
+static void clean_cpu_cache(struct cach_buf *buf, struct cach_range *range)
+{
+	struct cach_range intersection;
+
+	intersect_range(&buf->range_dirty_in_cpu_cache, range, &intersection);
+	if (is_non_empty_range(&intersection)) {
+		bool cleaned_everything;
+
+		expand_range_2_edge(&intersection,
+					&buf->range_dirty_in_cpu_cache);
+
+		cachi_clean_cpu_cache(
+				offset_2_vaddr(buf, intersection.start),
+				offset_2_vaddr(buf, intersection.end),
+				offset_2_paddr(buf, intersection.start),
+				offset_2_paddr(buf, intersection.end),
+				is_inner_only(buf->cache_settings),
+							&cleaned_everything);
+
+		if (cleaned_everything)
+			null_range(&buf->range_dirty_in_cpu_cache);
+		else
+			shrink_range(&buf->range_dirty_in_cpu_cache,
+								&intersection);
+	}
+}
+
+static void flush_cpu_cache(struct cach_buf *buf, struct cach_range *range)
+{
+	struct cach_range intersection;
+
+	intersect_range(&buf->range_in_cpu_cache, range, &intersection);
+	if (is_non_empty_range(&intersection)) {
+		bool flushed_everything;
+
+		expand_range_2_edge(&intersection, &buf->range_in_cpu_cache);
+
+		cachi_flush_cpu_cache(
+				offset_2_vaddr(buf, intersection.start),
+				offset_2_vaddr(buf, intersection.end),
+				offset_2_paddr(buf, intersection.start),
+				offset_2_paddr(buf, intersection.end),
+				is_inner_only(buf->cache_settings),
+							&flushed_everything);
+
+		if (flushed_everything) {
+			if (cachi_can_keep_track_of_range_in_cpu_cache())
+				null_range(&buf->range_in_cpu_cache);
+			null_range(&buf->range_dirty_in_cpu_cache);
+			null_range(&buf->range_invalid_in_cpu_cache);
+		} else {
+			if (cachi_can_keep_track_of_range_in_cpu_cache())
+				shrink_range(&buf->range_in_cpu_cache,
+							 &intersection);
+			shrink_range(&buf->range_dirty_in_cpu_cache,
+								&intersection);
+			shrink_range(&buf->range_invalid_in_cpu_cache,
+								&intersection);
+		}
+	}
+}
+
+static void null_range(struct cach_range *range)
+{
+	range->start = U32_MAX;
+	range->end = 0;
+}
+
+static void expand_range(struct cach_range *range,
+						struct cach_range *range_2_add)
+{
+	range->start = min(range->start, range_2_add->start);
+	range->end = max(range->end, range_2_add->end);
+}
+
+/*
+ * Expands range to one of enclosing_range's two edges. The function will
+ * choose which of enclosing_range's edges to expand range to in such a
+ * way that the size of range is minimized. range must be located inside
+ * enclosing_range.
+ */
+static void expand_range_2_edge(struct cach_range *range,
+					struct cach_range *enclosing_range)
+{
+	u32 space_on_low_side = range->start - enclosing_range->start;
+	u32 space_on_high_side = enclosing_range->end - range->end;
+
+	if (space_on_low_side < space_on_high_side)
+		range->start = enclosing_range->start;
+	else
+		range->end = enclosing_range->end;
+}
+
+static void shrink_range(struct cach_range *range,
+					struct cach_range *range_2_remove)
+{
+	if (range_2_remove->start > range->start)
+		range->end = min(range->end, range_2_remove->start);
+	else
+		range->start = max(range->start, range_2_remove->end);
+
+	if (range->start >= range->end)
+		null_range(range);
+}
+
+static bool is_non_empty_range(struct cach_range *range)
+{
+	return range->end > range->start;
+}
+
+static void intersect_range(struct cach_range *range_1,
+		struct cach_range *range_2, struct cach_range *intersection)
+{
+	intersection->start = max(range_1->start, range_2->start);
+	intersection->end = min(range_1->end, range_2->end);
+
+	if (intersection->start >= intersection->end)
+		null_range(intersection);
+}
+
+/* The align_up() restrictions apply here too */
+static void align_range_up(struct cach_range *range, u32 alignment)
+{
+	if (!is_non_empty_range(range))
+		return;
+
+	range->start = align_down(range->start, alignment);
+	range->end = align_up(range->end, alignment);
+}
+
+static void region_2_range(struct hwmem_region *region, u32 buffer_size,
+						struct cach_range *range)
+{
+	/*
+	 * We don't care about invalid regions; instead we limit the region's
+	 * range to the buffer's range. This should work well enough: worst
+	 * case we sync the entire buffer when we get an invalid region, which
+	 * is acceptable.
+	 */
+	range->start = region->offset + region->start;
+	range->end = min(region->offset + (region->count * region->size) -
+				(region->size - region->end), buffer_size);
+	if (range->start >= range->end) {
+		null_range(range);
+		return;
+	}
+
+	align_range_up(range, cachi_get_cache_granularity());
+}
+
+static u32 offset_2_vaddr(struct cach_buf *buf, u32 offset)
+{
+	return buf->vstart + offset;
+}
+
+static u32 offset_2_paddr(struct cach_buf *buf, u32 offset)
+{
+	return buf->pstart + offset;
+}
+
+/* Saturates, might return unaligned values when that happens */
+static u32 align_up(u32 value, u32 alignment)
+{
+	u32 remainder = value % alignment;
+	u32 value_2_add;
+
+	if (remainder == 0)
+		return value;
+
+	value_2_add = alignment - remainder;
+
+	if (value_2_add > U32_MAX - value) /* Will overflow */
+		return U32_MAX;
+
+	return value + value_2_add;
+}
+
+static u32 align_down(u32 value, u32 alignment)
+{
+	u32 remainder = value % alignment;
+	if (remainder == 0)
+		return value;
+
+	return value - remainder;
+}
+
+static bool is_wb(enum hwmem_alloc_flags cache_settings)
+{
+	u32 cache_hints = cache_settings & HWMEM_ALLOC_CACHE_HINT_MASK;
+	if (cache_hints == HWMEM_ALLOC_CACHE_HINT_WB ||
+		cache_hints == HWMEM_ALLOC_CACHE_HINT_WB_INNER)
+		return true;
+	else
+		return false;
+}
+
+static bool is_inner_only(enum hwmem_alloc_flags cache_settings)
+{
+	u32 cache_hints = cache_settings & HWMEM_ALLOC_CACHE_HINT_MASK;
+	if (cache_hints == HWMEM_ALLOC_CACHE_HINT_WT_INNER ||
+		cache_hints == HWMEM_ALLOC_CACHE_HINT_WB_INNER)
+		return true;
+	else
+		return false;
+}
diff --git a/drivers/misc/hwmem/cache_handler.h b/drivers/misc/hwmem/cache_handler.h
new file mode 100644
index 0000000..3c2a71f
--- /dev/null
+++ b/drivers/misc/hwmem/cache_handler.h
@@ -0,0 +1,60 @@
+/*
+ * Copyright (C) ST-Ericsson AB 2010
+ *
+ * Cache handler
+ *
+ * Author: Johan Mossberg <johan.xx.mossberg@stericsson.com>
+ * for ST-Ericsson.
+ *
+ * License terms: GNU General Public License (GPL), version 2.
+ */
+
+/*
+ * The cache handler cannot handle simultaneous execution! The caller has to
+ * ensure such a situation does not occur.
+ */
+
+#ifndef _CACHE_HANDLER_H_
+#define _CACHE_HANDLER_H_
+
+#include <linux/types.h>
+#include <linux/hwmem.h>
+
+/*
+ * To avoid having to duplicate all datatypes we've used the hwmem datatypes.
+ * If someone wants to use the cache handler but not hwmem then we'll have to
+ * define our own datatypes.
+ */
+
+struct cach_range {
+	u32 start; /* Inclusive */
+	u32 end; /* Exclusive */
+};
+
+/*
+ * Internal, do not touch!
+ */
+struct cach_buf {
+	u32 vstart;
+	u32 pstart;
+	u32 size;
+
+	/* Remaining hints are active */
+	enum hwmem_alloc_flags cache_settings;
+
+	bool in_cpu_write_buf;
+	struct cach_range range_in_cpu_cache;
+	struct cach_range range_dirty_in_cpu_cache;
+	struct cach_range range_invalid_in_cpu_cache;
+};
+
+void cach_init_buf(struct cach_buf *buf,
+	enum hwmem_alloc_flags cache_settings, u32 vstart, u32 pstart,
+								u32 size);
+
+void cach_set_pgprot_cache_options(struct cach_buf *buf, pgprot_t *pgprot);
+
+void cach_set_domain(struct cach_buf *buf, enum hwmem_access access,
+			enum hwmem_domain domain, struct hwmem_region *region);
+
+#endif /* _CACHE_HANDLER_H_ */
diff --git a/drivers/misc/hwmem/cache_handler_u8500.c b/drivers/misc/hwmem/cache_handler_u8500.c
new file mode 100644
index 0000000..3c1bc5a
--- /dev/null
+++ b/drivers/misc/hwmem/cache_handler_u8500.c
@@ -0,0 +1,208 @@
+/*
+ * Copyright (C) ST-Ericsson AB 2010
+ *
+ * Cache handler
+ *
+ * Author: Johan Mossberg <johan.xx.mossberg@stericsson.com>
+ * for ST-Ericsson.
+ *
+ * License terms: GNU General Public License (GPL), version 2.
+ */
+
+/* TODO: Move all this stuff to mach */
+
+#include <linux/hwmem.h>
+#include <linux/dma-mapping.h>
+
+#include <asm/pgtable.h>
+#include <asm/cacheflush.h>
+#include <asm/outercache.h>
+#include <asm/system.h>
+
+#include "cache_handler.h"
+
+/*
+ * Values are derived from measurements on HREFP_1.1_V32_OM_S10 running
+ * u8500-android-2.2_r1.1_v0.21.
+ *
+ * A lot of time can be spent trying to figure out the perfect breakpoints but
+ * for now I've chosen the following simple way.
+ *
+ * breakpoint = best_case + (worst_case - best_case) * 0.666
+ * The breakpoint is moved slightly towards the worst case because a full
+ * clean/flush affects the entire system so we should be a bit careful.
+ *
+ * BEST CASE:
+ * Best case is that the cache is empty and the system is idling. The case
+ * where the cache contains only targeted data could be better in some cases
+ * but it's hard to do measurements and calculations for that case so I chose the
+ * easier alternative.
+ *
+ * inner_inv_breakpoint = time_2_range_inv_on_empty_cache(
+ *					complete_flush_on_empty_cache_time)
+ * inner_clean_breakpoint = time_2_range_clean_on_empty_cache(
+ *					complete_clean_on_empty_cache_time)
+ *
+ * outer_inv_breakpoint = time_2_range_inv_on_empty_cache(
+ *					complete_flush_on_empty_cache_time)
+ * outer_clean_breakpoint = time_2_range_clean_on_empty_cache(
+ *					complete_clean_on_empty_cache_time)
+ * outer_flush_breakpoint = time_2_range_flush_on_empty_cache(
+ *					complete_flush_on_empty_cache_time)
+ *
+ * WORST CASE:
+ * Worst case is that the cache is filled with dirty non targeted data that
+ * will be used after the synchronization and the system is under heavy load.
+ *
+ * inner_inv_breakpoint = time_2_range_inv_on_empty_cache(
+ *				complete_flush_on_full_cache_time * 1.5 +
+ *					complete_flush_on_full_cache_time / 2)
+ * Times 1.5 because it runs on both cores half the time. Plus
+ * "complete_flush_on_full_cache_time / 2" because all data has to be read
+ * back, here we assume that both cores can fill their cache simultaneously
+ * (seems to be the case as operations on a full and an empty inner cache take
+ * roughly the same amount of time, i.e. the bus to outer is not the bottleneck).
+ * inner_clean_breakpoint = time_2_range_clean_on_empty_cache(
+ *				complete_clean_on_full_cache_time * 1.5)
+ *
+ * outer_inv_breakpoint = time_2_range_inv_on_empty_cache(
+ *					complete_flush_on_full_cache_time * 2 +
+ *					(complete_flush_on_full_cache_time -
+ *				complete_flush_on_empty_cache_time) * 2)
+ * Plus "(complete_flush_on_full_cache_time -
+ * complete_flush_on_empty_cache_time)" because no one else can work when we
+ * hog the bus with our unnecessary transfer.
+ * outer_clean_breakpoint = time_2_range_clean_on_empty_cache(
+ *					complete_clean_on_full_cache_time +
+ *					(complete_clean_on_full_cache_time -
+ *					complete_clean_on_empty_cache_time))
+ * outer_flush_breakpoint = time_2_range_flush_on_empty_cache(
+ *					complete_flush_on_full_cache_time * 2 +
+ *					(complete_flush_on_full_cache_time -
+ *				complete_flush_on_empty_cache_time) * 2)
+ *
+ * These values might have to be updated if changes are made to the CPU, L2$,
+ * memory bus or memory.
+ */
+/* 36224 */
+static const u32 inner_inv_breakpoint =	21324 + (43697 - 21324) * 0.666;
+/* 28930 */
+static const u32 inner_clean_breakpoint = 21324 + (32744 - 21324) * 0.666;
+/* 485414 */
+static const u32 outer_inv_breakpoint = 68041 + (694727 - 68041) * 0.666;
+/* 254069 */
+static const u32 outer_clean_breakpoint = 68041 + (347363 - 68041) * 0.666;
+/* 485414 */
+static const u32 outer_flush_breakpoint = 68041 + (694727 - 68041) * 0.666;
+
+static bool is_wt(enum hwmem_alloc_flags cache_settings);
+
+void cachi_set_buf_cache_settings(struct cach_buf *buf,
+					enum hwmem_alloc_flags cache_settings)
+{
+	buf->cache_settings = cache_settings & ~HWMEM_ALLOC_CACHE_HINT_MASK;
+
+	if ((cache_settings & HWMEM_ALLOC_CACHED) == HWMEM_ALLOC_CACHED) {
+		if (is_wt(cache_settings))
+			buf->cache_settings |= HWMEM_ALLOC_CACHE_HINT_WT;
+		else
+			buf->cache_settings |= HWMEM_ALLOC_CACHE_HINT_WB;
+	}
+}
+
+void cachi_set_pgprot_cache_options(struct cach_buf *buf, pgprot_t *pgprot)
+{
+	if ((buf->cache_settings & HWMEM_ALLOC_CACHED) == HWMEM_ALLOC_CACHED) {
+		if (is_wt(buf->cache_settings))
+			*pgprot = __pgprot_modify(*pgprot, L_PTE_MT_MASK,
+							L_PTE_MT_WRITETHROUGH);
+		else
+			*pgprot = __pgprot_modify(*pgprot, L_PTE_MT_MASK,
+							L_PTE_MT_WRITEBACK);
+	} else if (buf->cache_settings & HWMEM_ALLOC_BUFFERED)
+		*pgprot = pgprot_writecombine(*pgprot);
+	else
+		*pgprot = pgprot_noncached(*pgprot);
+}
+
+void cachi_drain_cpu_write_buf(void)
+{
+	dsb();
+	outer_cache.sync();
+}
+
+void cachi_invalidate_cpu_cache(u32 virt_start, u32 virt_end, u32 phys_start,
+		u32 phys_end, bool inner_only, bool *flushed_everything)
+{
+	u32 range_size = virt_end - virt_start;
+
+	*flushed_everything = false;
+
+	if (range_size < outer_inv_breakpoint)
+		outer_cache.inv_range(phys_start, phys_end);
+	else
+		outer_cache.flush_all();
+
+	/* Inner invalidate range */
+	dmac_map_area((void *)virt_start, range_size, DMA_FROM_DEVICE);
+}
+
+void cachi_clean_cpu_cache(u32 virt_start, u32 virt_end, u32 phys_start,
+		u32 phys_end, bool inner_only, bool *cleaned_everything)
+{
+	u32 range_size = virt_end - virt_start;
+
+	*cleaned_everything = false;
+
+	/* Inner clean range */
+	dmac_map_area((void *)virt_start, range_size, DMA_TO_DEVICE);
+
+	/*
+	 * There is currently no outer_cache.clean_all() so we use flush
+	 * instead, which is ok as clean is a subset of flush. Clean range
+	 * and flush range take the same amount of time so we can use
+	 * outer_flush_breakpoint here.
+	 */
+	if (range_size < outer_flush_breakpoint)
+		outer_cache.clean_range(phys_start, phys_end);
+	else
+		outer_cache.flush_all();
+}
+
+void cachi_flush_cpu_cache(u32 virt_start, u32 virt_end, u32 phys_start,
+		u32 phys_end, bool inner_only, bool *flushed_everything)
+{
+	u32 range_size = virt_end - virt_start;
+
+	*flushed_everything = false;
+
+	/* Inner clean range */
+	dmac_map_area((void *)virt_start, range_size, DMA_TO_DEVICE);
+
+	if (range_size < outer_flush_breakpoint)
+		outer_cache.flush_range(phys_start, phys_end);
+	else
+		outer_cache.flush_all();
+
+	/* Inner invalidate range */
+	dmac_map_area((void *)virt_start, range_size, DMA_FROM_DEVICE);
+}
+
+u32 cachi_get_cache_granularity(void)
+{
+	return 32;
+}
+
+/*
+ * Local functions
+ */
+
+static bool is_wt(enum hwmem_alloc_flags cache_settings)
+{
+	u32 cache_hints = cache_settings & HWMEM_ALLOC_CACHE_HINT_MASK;
+	if (cache_hints == HWMEM_ALLOC_CACHE_HINT_WT ||
+		cache_hints == HWMEM_ALLOC_CACHE_HINT_WT_INNER)
+		return true;
+	else
+		return false;
+}
-- 
1.6.3.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 3/3] hwmem: Add hwmem to ux500 and mop500
  2010-11-16 13:08   ` [PATCH 2/3] hwmem: Add hwmem (part 2) Johan Mossberg
@ 2010-11-16 13:08     ` Johan Mossberg
  0 siblings, 0 replies; 12+ messages in thread
From: Johan Mossberg @ 2010-11-16 13:08 UTC (permalink / raw)
  To: linux-mm; +Cc: Johan Mossberg

Signed-off-by: Johan Mossberg <johan.xx.mossberg@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
---
 arch/arm/mach-ux500/board-mop500.c         |    1 +
 arch/arm/mach-ux500/devices.c              |   31 ++++++++++++++++++++++++++++
 arch/arm/mach-ux500/include/mach/devices.h |    1 +
 3 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/arch/arm/mach-ux500/board-mop500.c b/arch/arm/mach-ux500/board-mop500.c
index 2c89e70..7a0b604 100644
--- a/arch/arm/mach-ux500/board-mop500.c
+++ b/arch/arm/mach-ux500/board-mop500.c
@@ -304,6 +304,7 @@ static struct ske_keypad_platform_data ske_keypad_board = {
 /* add any platform devices here - TODO */
 static struct platform_device *platform_devs[] __initdata = {
 	&ux500_ske_keypad_device,
+	&ux500_hwmem_device,
 };
 
 #ifdef CONFIG_STE_DMA40
diff --git a/arch/arm/mach-ux500/devices.c b/arch/arm/mach-ux500/devices.c
index ea0a2f9..a8db519 100644
--- a/arch/arm/mach-ux500/devices.c
+++ b/arch/arm/mach-ux500/devices.c
@@ -10,10 +10,41 @@
 #include <linux/interrupt.h>
 #include <linux/io.h>
 #include <linux/amba/bus.h>
+#include <linux/hwmem.h>
 
 #include <mach/hardware.h>
 #include <mach/setup.h>
 
+static struct hwmem_platform_data hwmem_pdata = {
+	.start = 0,
+	.size = 0,
+};
+
+static int __init early_hwmem(char *p)
+{
+	hwmem_pdata.size = memparse(p, &p);
+
+	if (*p != '@')
+		goto no_at;
+
+	hwmem_pdata.start = memparse(p + 1, &p);
+
+	return 0;
+
+no_at:
+	hwmem_pdata.size = 0;
+
+	return -EINVAL;
+}
+early_param("hwmem", early_hwmem);
+
+struct platform_device ux500_hwmem_device = {
+	.name = "hwmem",
+	.dev = {
+		.platform_data = &hwmem_pdata,
+	},
+};
+
 void __init amba_add_devices(struct amba_device *devs[], int num)
 {
 	int i;
diff --git a/arch/arm/mach-ux500/include/mach/devices.h b/arch/arm/mach-ux500/include/mach/devices.h
index 020b636..d5182e2 100644
--- a/arch/arm/mach-ux500/include/mach/devices.h
+++ b/arch/arm/mach-ux500/include/mach/devices.h
@@ -17,6 +17,7 @@ extern struct amba_device ux500_pl031_device;
 
 extern struct platform_device u8500_dma40_device;
 extern struct platform_device ux500_ske_keypad_device;
+extern struct platform_device ux500_hwmem_device;
 
 void dma40_u8500ed_fixup(void);
 
-- 
1.6.3.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] hwmem: Hardware memory driver
  2010-11-16 13:07 [PATCH 0/3] hwmem: Hardware memory driver Johan Mossberg
  2010-11-16 13:08 ` [PATCH 1/3] hwmem: Add hwmem (part 1) Johan Mossberg
@ 2010-11-16 14:50 ` Michał Nazarewicz
  2010-11-16 15:25   ` Johan MOSSBERG
  1 sibling, 1 reply; 12+ messages in thread
From: Michał Nazarewicz @ 2010-11-16 14:50 UTC (permalink / raw)
  To: linux-mm, Johan Mossberg

On Tue, 16 Nov 2010 14:07:59 +0100, Johan Mossberg <johan.xx.mossberg@stericsson.com> wrote:
> The following patchset implements a "hardware memory driver". The
> main purpose of hwmem is:
>
> * To allocate buffers suitable for use with hardware. Currently
> this means contiguous buffers.
> * To synchronize the caches for the allocated buffers. This is
> achieved by keeping track of when the CPU uses a buffer and when
> other hardware uses the buffer, when we switch from CPU to other
> hardware or vice versa the caches are synchronized.
> * To handle sharing of allocated buffers between processes i.e.
> import, export.
>
> Hwmem is available both through a user space API and through a
> kernel API.
>
> Here at ST-Ericsson we use hwmem for graphics buffers. Graphics
> buffers need to be contiguous due to our hardware, are passed
> between processes (usually application and window manager)and are
> part of usecases where performance is top priority so we can't
> afford to synchronize the caches unecessarily.
>
> Hwmem and CMA (Contiguous Memory Allocator) overlap to some extent.
> Hwmem could use CMA as its allocator and thereby remove the overlap
> but then defragmentation can not be implemented as CMA currently
> has no support for this. We would very much like to see a
> discussion about adding defragmentation to CMA.

I would definitely like to see what the two solutions share and try to
merge those parts.

In particular, I'll try to figure out what you mean by defragmentation
and see whether it could be added to CMA.

My idea about CMA is to provide only an allocator framework and let others
interact with user space and/or share resources, which, as I understand,
hwmem does.

PS. I don't follow linux-mm carefully, so it'd be great if you'd Cc me on
     future versions of hwmem.

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH 0/3] hwmem: Hardware memory driver
  2010-11-16 14:50 ` [PATCH 0/3] hwmem: Hardware memory driver Michał Nazarewicz
@ 2010-11-16 15:25   ` Johan MOSSBERG
  2010-11-16 15:33     ` Michał Nazarewicz
  0 siblings, 1 reply; 12+ messages in thread
From: Johan MOSSBERG @ 2010-11-16 15:25 UTC (permalink / raw)
  To: Michał Nazarewicz; +Cc: linux-mm

Michał Nazarewicz wrote: 
> In particular, I'll try to figure out what you mean by defragmentation
> and see whether it could be added to CMA.

I mean the ability to move allocated buffers to free more
contiguous space. To support this in CMA the API(s) would have to
change.
* A buffer's physical address cannot be used to identify it as the
physical address can change.
* Pin/unpin functions would have to be added so that you can pin a
buffer when hardware uses it.
* The allocators need to be able to inform CMA that they have
moved a buffer. This is so that CMA can keep track of what memory
is free so that it can supply the free memory to the kernel for
temporary use there.
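
To make the three points above concrete, here is a rough sketch of what
such additions could look like. All names below are made up purely for
illustration; nothing like this exists in CMA or hwmem today:

/*
 * Hypothetical sketch only -- placeholder names, not an existing API.
 */
#include <linux/types.h>

/* Opaque handle; identifies a buffer instead of its physical address. */
struct cma_buffer;

/* Allocation returns a handle so the buffer can be moved later on. */
struct cma_buffer *cma_buffer_alloc(size_t size, size_t alignment);
void cma_buffer_free(struct cma_buffer *buf);

/*
 * Pin while hardware uses the buffer. The returned physical address is
 * only valid until the matching unpin.
 */
phys_addr_t cma_buffer_pin(struct cma_buffer *buf);
void cma_buffer_unpin(struct cma_buffer *buf);

/*
 * Called by the allocator after it has moved a buffer so that CMA knows
 * which range has become free and which range is now occupied.
 */
void cma_buffer_moved(struct cma_buffer *buf, phys_addr_t old_start,
		      phys_addr_t new_start, size_t size);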

/Johan Mossberg

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] hwmem: Hardware memory driver
  2010-11-16 15:25   ` Johan MOSSBERG
@ 2010-11-16 15:33     ` Michał Nazarewicz
  2010-11-16 16:16       ` Johan MOSSBERG
  0 siblings, 1 reply; 12+ messages in thread
From: Michał Nazarewicz @ 2010-11-16 15:33 UTC (permalink / raw)
  To: Johan MOSSBERG; +Cc: linux-mm

On Tue, 16 Nov 2010 16:25:20 +0100, Johan MOSSBERG <johan.xx.mossberg@stericsson.com> wrote:

> Michał Nazarewicz wrote:
>> In particular, I'll try to figure out what you mean by defragmentation
>> and see whether it could be added to CMA.
>
> I mean the ability to move allocated buffers to free more
> contiguous space. To support this in CMA the API(s) would have to
> change.
> * A buffer's physical address cannot be used to identify it as the
> physical address can change.
> * Pin/unpin functions would have to be added so that you can pin a
> buffer when hardware uses it.
> * The allocators need to be able to inform CMA that they have
> moved a buffer. This is so that CMA can keep track of what memory
> is free so that it can supply the free memory to the kernel for
> temporary use there.

I don't think those are fundamentally against CMA and as such I see
no reason why such calls could not be added to CMA.  Allocators that
do not support defragmentation could just ignore those calls.

In particular, a cma_alloc() could return a pointer to an opaque
struct cma and to get the physical address the user would have to pin
the buffer with, say, cma_pin() and then call cma_phys() to obtain the
physical address.

As a matter of fact, in the version of CMA I'm currently working on,
cma_alloc() returns a pointer to a transparent structure, so the
above would not be a huge change.
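
Purely to illustrate the calling sequence being proposed here (cma_alloc(),
cma_pin(), cma_phys(), cma_unpin() and cma_free() are the names discussed
in this thread, not an existing API, and program_hw()/wait_for_hw() are
stand-ins for real hardware setup), a driver-side sketch could look like:

/* Hypothetical usage sketch only, assuming the proposed calls exist. */
static int blit_example(size_t size)
{
	struct cma *buf;
	phys_addr_t phys;

	/* Opaque handle; the buffer may be moved while it is unpinned. */
	buf = cma_alloc(size, 4096);	/* 4096 = example alignment, bytes */
	if (!buf)
		return -ENOMEM;

	cma_pin(buf);			/* fix the buffer in place */
	phys = cma_phys(buf);		/* address stable only while pinned */
	program_hw(phys, size);
	wait_for_hw();
	cma_unpin(buf);			/* the buffer may be moved again */

	cma_free(buf);
	return 0;
}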

I'm only wondering if treating "unpin" as "free" and pin as another
"alloc" would not suffice?

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH 0/3] hwmem: Hardware memory driver
  2010-11-16 15:33     ` Michał Nazarewicz
@ 2010-11-16 16:16       ` Johan MOSSBERG
  2010-11-16 17:36         ` Michał Nazarewicz
  0 siblings, 1 reply; 12+ messages in thread
From: Johan MOSSBERG @ 2010-11-16 16:16 UTC (permalink / raw)
  To: Michał Nazarewicz; +Cc: linux-mm

Michał Nazarewicz wrote:
> > I mean the ability to move allocated buffers to free more
> > contiguous space. To support this in CMA the API(s) would have to
> > change.
> > * A buffer's physical address cannot be used to identify it as the
> > physical address can change.
> > * Pin/unpin functions would have to be added so that you can pin a
> > buffer when hardware uses it.
> > * The allocators need to be able to inform CMA that they have
> > moved a buffer. This is so that CMA can keep track of what memory
> > is free so that it can supply the free memory to the kernel for
> > temporary use there.
> 
> I don't think those are fundamentally against CMA and as such I see
> no reason why such calls could not be added to CMA.  Allocators that
> do not support defragmentation could just ignore those calls.

Sounds good.

> In particular, a cma_alloc() could return a pointer to an opaque
> struct cma and to get the physical address the user would have to pin
> the buffer with, say, cma_pin() and then call cma_phys() to obtain the
> physical address.

I think cma_phys() is redundant; cma_pin() can return the physical
address. That's how we did it in hwmem.

> I'm only wondering if treating "unpin" as "free" and pin as another
> "alloc" would not suffice?

I don't understand. Wouldn't you lose all the data in the buffer
when you free it? How would we handle something like the desktop
image which is blitted to the display all the time but never
changes? We'd have to keep a scattered version and then copy it
into a temporary contiguous buffer which is not optimal
performance wise. The other alternative would be to keep the
allocation but then we would get fragmentation problems.

/Johan Mossberg

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] hwmem: Hardware memory driver
  2010-11-16 16:16       ` Johan MOSSBERG
@ 2010-11-16 17:36         ` Michał Nazarewicz
  2010-11-17  9:28           ` Johan MOSSBERG
  0 siblings, 1 reply; 12+ messages in thread
From: Michał Nazarewicz @ 2010-11-16 17:36 UTC (permalink / raw)
  To: Johan MOSSBERG; +Cc: linux-mm

On Tue, 16 Nov 2010 17:16:23 +0100, Johan MOSSBERG <johan.xx.mossberg@stericsson.com> wrote:

> Michał Nazarewicz wrote:
>> In particular, a cma_alloc() could return a pointer to an opaque
>> struct cma and to get the physical address the user would have to pin
>> the buffer with, say, cma_pin() and then call cma_phys() to obtain the
>> physical address.

> I think cma_phys() is redundant; cma_pin() can return the physical
> address. That's how we did it in hwmem.

Makes sense.  I'd add cma_phys() for convenience anyway.

>> I'm only wondering if treating "unpin" as "free" and pin as another
>> "alloc" would not suffice?

> I don't understand. Wouldn't you lose all the data in the buffer
> when you free it? How would we handle something like the desktop
> image which is blitted to the display all the time but never
> changes? We'd have to keep a scattered version and then copy it
> into a temporary contiguous buffer which is not optimal
> performance wise. The other alternative would be to keep the
> allocation but then we would get fragmentation problems.

Got it.

Do you want to remap user space mappings when a page is moved during
defragmentation? Or would the user need to unmap the region?  I.e. would
an mmap()ed buffer be pinned?

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH 0/3] hwmem: Hardware memory driver
  2010-11-16 17:36         ` Michał Nazarewicz
@ 2010-11-17  9:28           ` Johan MOSSBERG
  2010-11-19 10:44             ` Michał Nazarewicz
  0 siblings, 1 reply; 12+ messages in thread
From: Johan MOSSBERG @ 2010-11-17  9:28 UTC (permalink / raw)
  To: Michał Nazarewicz; +Cc: linux-mm

Michał Nazarewicz wrote: 
> Do you want to remap user space mappings when a page is moved during
> defragmentation? Or would the user need to unmap the region?  I.e. would
> an mmap()ed buffer be pinned?

Remap, i.e. not pinned. That means that the mapper needs to be
informed before and after a buffer is moved. Maybe add a function
to CMA where you can register a callback function that is called
before and after a buffer is moved? The callback function's
parameters would be buffer, new position and whether it will be
moved or has been moved. CMA would also need this type of
information to be able to evict temporary data from the
destination.
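
As a purely illustrative sketch of such a callback (the names below are
placeholders invented for this example; nothing like this exists in CMA):

#include <linux/types.h>

struct cma;	/* opaque buffer handle, as discussed above */

enum cma_move_stage {
	CMA_MOVE_BEFORE,	/* the buffer is about to be moved */
	CMA_MOVE_AFTER,		/* the buffer has been moved */
};

/* Called with the buffer, its new position and the stage of the move. */
typedef void (*cma_move_notify_t)(struct cma *buf, phys_addr_t new_start,
				  enum cma_move_stage stage, void *data);

/* Would let e.g. hwmem remap user space mappings around a move. */
int cma_set_move_notifier(struct cma *buf, cma_move_notify_t notify,
			  void *data);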

I'm a little bit worried that this approach puts constraints on the
defragmentation algorithm but I can't think of any scenario where
we would run into problems. If a defragmentation algorithm does
temporary moves, and knows it at the time of the move, we would
have to add a flag to the callback that indicates that the move is
temporary so that it is not unnecessarily mapped, but that can be
done when/if the problem occurs. Temporarily moving a buffer to
scattered memory is not supported either but I suppose that can be
solved by adding a flag that indicates that the new position is
scattered, also something that can be done when needed.

/Johan Mossberg

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] hwmem: Hardware memory driver
  2010-11-17  9:28           ` Johan MOSSBERG
@ 2010-11-19 10:44             ` Michał Nazarewicz
  2010-11-19 13:47               ` Johan MOSSBERG
  0 siblings, 1 reply; 12+ messages in thread
From: Michał Nazarewicz @ 2010-11-19 10:44 UTC (permalink / raw)
  To: Johan MOSSBERG; +Cc: linux-mm

On Wed, 17 Nov 2010 10:28:13 +0100, Johan MOSSBERG <johan.xx.mossberg@stericsson.com> wrote:

> Michał Nazarewicz wrote:
>> Do you want to remap user space mappings when a page is moved during
>> defragmentation? Or would the user need to unmap the region?  I.e. would
>> an mmap()ed buffer be pinned?
>
> Remap, i.e. not pinned. That means that the mapper needs to be
> informed before and after a buffer is moved. Maybe add a function
> to CMA where you can register a callback function that is called
> before and after a buffer is moved? The callback function's
> parameters would be buffer, new position and whether it will be
> moved or has been moved. CMA would also need this type of
> information to be able to evict temporary data from the
> destination.

The way I imagine pinning is that the allocator tells CMA that it wants
to use a given region of memory.  This would make CMA remove any kind of
data that is stored there (in the version of CMA I'm about to post that
basically means migrating pages).

> I'm a little bit worried that this approach put constraints on the
> defragmentation algorithm but I can't think of any scenario where
> we would run into problems. If a defragmentation algorithm does
> temporary moves, and knows it at the time of the move, we would
> have to add a flag to the callback that indicates that the move is
> temporary so that it is not unnecessarily mapped, but that can be
> done when/if the problem occurs. Temporarily moving a buffer to
> scattered memory is not supported either but I suppose that can be
> solved by adding a flag that indicates that the new position is
> scattered, also something that can be done when needed.

I think the question at this moment is whether we need such a mechanism
to be implemented at this time.  I would rather wait with the
callback mechanism till the rest of the framework works and we have
an algorithm that actually does the defragmentation.

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH 0/3] hwmem: Hardware memory driver
  2010-11-19 10:44             ` Michał Nazarewicz
@ 2010-11-19 13:47               ` Johan MOSSBERG
  0 siblings, 0 replies; 12+ messages in thread
From: Johan MOSSBERG @ 2010-11-19 13:47 UTC (permalink / raw)
  To: Michał Nazarewicz; +Cc: linux-mm

Michał Nazarewicz wrote:
> >> Do you want to remap user space mappings when a page is moved during
> >> defragmentation? Or would the user need to unmap the region?  I.e. would
> >> an mmap()ed buffer be pinned?
> >
> > Remap, i.e. not pinned. That means that the mapper needs to be
> > informed before and after a buffer is moved. Maybe add a function
> > to CMA where you can register a callback function that is called
> > before and after a buffer is moved? The callback function's
> > parameters would be buffer, new position and whether it will be
> > moved or has been moved. CMA would also need this type of
> > information to be able to evict temporary data from the
> > destination.
> 
> The way I imagine pinning is that the allocator tells CMA that it want
> to use given region of memory.  This would make CMA remove any kind of
> data that is stored there (in the version of CMA I'm about to post that
> basically means migrating pages).

I don't understand what you mean by "pinning", but yes, when the
allocator moves a buffer it will have to inform CMA both what
memory it will use (so that it can be evicted) and what memory it
will no longer use (so that it can be used for other stuff).

> I think the question at this moment is whether we need such a mechanism
> to be implemented at the this time.  I would rather wait with the
> callback mechanism till the rest of the framework works and we have
> an algorithm that actually does the defragmentation.

I agree. So long as we know it can be added without too much
trouble in the future that'll be fine. I actually think the "making
good use of the free memory in regions" feature is more important
than defragmentation. The reason I wanted a discussion about
defragmentation was to make sure the door wasn't closed on adding
support for defragmentation in the future and to make sure it is
taken into consideration when designing and implementing CMA.

/Johan Mossberg

^ permalink raw reply	[flat|nested] 12+ messages in thread
