linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] kmemtrace
@ 2008-08-10 17:14 Eduard - Gabriel Munteanu
  2008-08-10 17:14 ` [PATCH 1/5] kmemtrace: Core implementation Eduard - Gabriel Munteanu
  0 siblings, 1 reply; 32+ messages in thread
From: Eduard - Gabriel Munteanu @ 2008-08-10 17:14 UTC (permalink / raw)
  To: penberg
  Cc: mathieu.desnoyers, cl, linux-mm, linux-kernel, rdunlap, mpm,
	rostedt, tglx

Hi everybody,

As usual, the kmemtrace userspace repo is located at
git://repo.or.cz/kmemtrace-user.git

It's not updated now, but I will rebase it. So re-clone it, don't just
git-rebase it. The changes were too extensive and I'd like to keep the
revision history clean.

Changes in kmemtrace:
- new ABI, supports variable sized packets and it's much shorter (it has
specific fields for allocations)
- we'll use splice() in userspace
- replaced timestamps with sequence numbers, since timestamps don't have a good
enough resolution (though they could be added as an additional feature)
- used relay_reserve() as Mathieu Desnoyers suggested
- moved additional docs into a different commit and documented the replacement
of inline with __always_inline in those commits

Please have a look and let me know what you think.

Eduard - Gabriel Munteanu (5):
  kmemtrace: Core implementation.
  kmemtrace: Additional documentation.
  kmemtrace: SLAB hooks.
  kmemtrace: SLUB hooks.
  kmemtrace: SLOB hooks.

 Documentation/ABI/testing/debugfs-kmemtrace |   71 ++++++
 Documentation/kernel-parameters.txt         |   10 +
 Documentation/vm/kmemtrace.txt              |  126 ++++++++++
 MAINTAINERS                                 |    6 +
 include/linux/kmemtrace.h                   |   85 +++++++
 include/linux/slab_def.h                    |   68 +++++-
 include/linux/slob_def.h                    |    9 +-
 include/linux/slub_def.h                    |   53 ++++-
 init/main.c                                 |    2 +
 lib/Kconfig.debug                           |   28 +++
 mm/Makefile                                 |    2 +-
 mm/kmemtrace.c                              |  335 +++++++++++++++++++++++++++
 mm/slab.c                                   |   71 +++++-
 mm/slob.c                                   |   37 +++-
 mm/slub.c                                   |   66 +++++-
 15 files changed, 933 insertions(+), 36 deletions(-)
 create mode 100644 Documentation/ABI/testing/debugfs-kmemtrace
 create mode 100644 Documentation/vm/kmemtrace.txt
 create mode 100644 include/linux/kmemtrace.h
 create mode 100644 mm/kmemtrace.c

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 1/5] kmemtrace: Core implementation.
  2008-08-10 17:14 [PATCH 0/5] kmemtrace Eduard - Gabriel Munteanu
@ 2008-08-10 17:14 ` Eduard - Gabriel Munteanu
  2008-08-10 17:14   ` [PATCH 2/5] kmemtrace: Additional documentation Eduard - Gabriel Munteanu
  2008-08-12  6:46   ` [PATCH 1/5] kmemtrace: Core implementation Pekka Enberg
  0 siblings, 2 replies; 32+ messages in thread
From: Eduard - Gabriel Munteanu @ 2008-08-10 17:14 UTC (permalink / raw)
  To: penberg
  Cc: mathieu.desnoyers, cl, linux-mm, linux-kernel, rdunlap, mpm,
	rostedt, tglx

kmemtrace provides tracing for slab allocator functions, such as kmalloc,
kfree, kmem_cache_alloc, kmem_cache_free etc.. Collected data is then fed
to the userspace application in order to analyse allocation hotspots,
internal fragmentation and so on, making it possible to see how well an
allocator performs, as well as debug and profile kernel code.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 Documentation/kernel-parameters.txt |   10 +
 MAINTAINERS                         |    6 +
 include/linux/kmemtrace.h           |   85 +++++++++
 init/main.c                         |    2 +
 lib/Kconfig.debug                   |   28 +++
 mm/Makefile                         |    2 +-
 mm/kmemtrace.c                      |  335 +++++++++++++++++++++++++++++++++++
 7 files changed, 467 insertions(+), 1 deletions(-)
 create mode 100644 include/linux/kmemtrace.h
 create mode 100644 mm/kmemtrace.c

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index b52f47d..446a257 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -49,6 +49,7 @@ parameter is applicable:
 	ISAPNP	ISA PnP code is enabled.
 	ISDN	Appropriate ISDN support is enabled.
 	JOY	Appropriate joystick support is enabled.
+	KMEMTRACE kmemtrace is enabled.
 	LIBATA  Libata driver is enabled
 	LP	Printer support is enabled.
 	LOOP	Loopback device support is enabled.
@@ -941,6 +942,15 @@ and is between 256 and 4096 characters. It is defined in the file
 			use the HighMem zone if it exists, and the Normal
 			zone if it does not.
 
+	kmemtrace.enable=	[KNL,KMEMTRACE] Format: { yes | no }
+				Controls whether kmemtrace is enabled
+				at boot-time.
+
+	kmemtrace.subbufs=n	[KNL,KMEMTRACE] Overrides the number of
+			subbufs kmemtrace's relay channel has. Set this
+			higher than default (KMEMTRACE_N_SUBBUFS in code) if
+			you experience buffer overruns.
+
 	movablecore=nn[KMG]	[KNL,X86-32,IA-64,PPC,X86-64] This parameter
 			is similar to kernelcore except it specifies the
 			amount of memory used for migratable allocations.
diff --git a/MAINTAINERS b/MAINTAINERS
index 56a2f67..e967bc2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2425,6 +2425,12 @@ M:	jason.wessel@windriver.com
 L:	kgdb-bugreport@lists.sourceforge.net
 S:	Maintained
 
+KMEMTRACE
+P:	Eduard - Gabriel Munteanu
+M:	eduard.munteanu@linux360.ro
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+
 KPROBES
 P:	Ananth N Mavinakayanahalli
 M:	ananth@in.ibm.com
diff --git a/include/linux/kmemtrace.h b/include/linux/kmemtrace.h
new file mode 100644
index 0000000..2c33201
--- /dev/null
+++ b/include/linux/kmemtrace.h
@@ -0,0 +1,85 @@
+/*
+ * Copyright (C) 2008 Eduard - Gabriel Munteanu
+ *
+ * This file is released under GPL version 2.
+ */
+
+#ifndef _LINUX_KMEMTRACE_H
+#define _LINUX_KMEMTRACE_H
+
+#ifdef __KERNEL__
+
+#include <linux/types.h>
+#include <linux/marker.h>
+
+enum kmemtrace_type_id {
+	KMEMTRACE_TYPE_KMALLOC = 0,	/* kmalloc() or kfree(). */
+	KMEMTRACE_TYPE_CACHE,		/* kmem_cache_*(). */
+	KMEMTRACE_TYPE_PAGES,		/* __get_free_pages() and friends. */
+};
+
+#ifdef CONFIG_KMEMTRACE
+
+extern void kmemtrace_init(void);
+
+static inline void kmemtrace_mark_alloc_node(enum kmemtrace_type_id type_id,
+					     unsigned long call_site,
+					     const void *ptr,
+					     size_t bytes_req,
+					     size_t bytes_alloc,
+					     gfp_t gfp_flags,
+					     int node)
+{
+	trace_mark(kmemtrace_alloc, "type_id %d call_site %lu ptr %lu "
+		   "bytes_req %lu bytes_alloc %lu gfp_flags %lu node %d",
+		   type_id, call_site, (unsigned long) ptr,
+		   bytes_req, bytes_alloc, (unsigned long) gfp_flags, node);
+}
+
+static inline void kmemtrace_mark_free(enum kmemtrace_type_id type_id,
+				       unsigned long call_site,
+				       const void *ptr)
+{
+	trace_mark(kmemtrace_free, "type_id %d call_site %lu ptr %lu",
+		   type_id, call_site, (unsigned long) ptr);
+}
+
+#else /* CONFIG_KMEMTRACE */
+
+static inline void kmemtrace_init(void)
+{
+}
+
+static inline void kmemtrace_mark_alloc_node(enum kmemtrace_type_id type_id,
+					     unsigned long call_site,
+					     const void *ptr,
+					     size_t bytes_req,
+					     size_t bytes_alloc,
+					     gfp_t gfp_flags,
+					     int node)
+{
+}
+
+static inline void kmemtrace_mark_free(enum kmemtrace_type_id type_id,
+				       unsigned long call_site,
+				       const void *ptr)
+{
+}
+
+#endif /* CONFIG_KMEMTRACE */
+
+static inline void kmemtrace_mark_alloc(enum kmemtrace_type_id type_id,
+					unsigned long call_site,
+					const void *ptr,
+					size_t bytes_req,
+					size_t bytes_alloc,
+					gfp_t gfp_flags)
+{
+	kmemtrace_mark_alloc_node(type_id, call_site, ptr,
+				  bytes_req, bytes_alloc, gfp_flags, -1);
+}
+
+#endif /* __KERNEL__ */
+
+#endif /* _LINUX_KMEMTRACE_H */
+
diff --git a/init/main.c b/init/main.c
index 057f364..c00659c 100644
--- a/init/main.c
+++ b/init/main.c
@@ -66,6 +66,7 @@
 #include <asm/setup.h>
 #include <asm/sections.h>
 #include <asm/cacheflush.h>
+#include <linux/kmemtrace.h>
 
 #ifdef CONFIG_X86_LOCAL_APIC
 #include <asm/smp.h>
@@ -641,6 +642,7 @@ asmlinkage void __init start_kernel(void)
 	enable_debug_pagealloc();
 	cpu_hotplug_init();
 	kmem_cache_init();
+	kmemtrace_init();
 	debug_objects_mem_init();
 	idr_init_cache();
 	setup_per_cpu_pageset();
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index d2099f4..0ade2ae 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -674,6 +674,34 @@ config FIREWIRE_OHCI_REMOTE_DMA
 
 	  If unsure, say N.
 
+config KMEMTRACE
+	bool "Kernel memory tracer (kmemtrace)"
+	depends on RELAY && DEBUG_FS && MARKERS
+	help
+	  kmemtrace provides tracing for slab allocator functions, such as
+	  kmalloc, kfree, kmem_cache_alloc, kmem_cache_free etc.. Collected
+	  data is then fed to the userspace application in order to analyse
+	  allocation hotspots, internal fragmentation and so on, making it
+	  possible to see how well an allocator performs, as well as debug
+	  and profile kernel code.
+
+	  This requires an userspace application to use. See
+	  Documentation/vm/kmemtrace.txt for more information.
+
+	  Saying Y will make the kernel somewhat larger and slower. However,
+	  if you disable kmemtrace at run-time or boot-time, the performance
+	  impact is minimal (depending on the arch the kernel is built for).
+
+	  If unsure, say N.
+
+config KMEMTRACE_DEFAULT_ENABLED
+	bool "Enabled by default at boot"
+	depends on KMEMTRACE
+	help
+	  Say Y here to enable kmemtrace at boot-time by default. Whatever
+	  the choice, the behavior can be overridden by a kernel parameter,
+	  as described in documentation.
+
 source "samples/Kconfig"
 
 source "lib/Kconfig.kgdb"
diff --git a/mm/Makefile b/mm/Makefile
index 18c143b..d88a3bc 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -33,4 +33,4 @@ obj-$(CONFIG_MIGRATION) += migrate.o
 obj-$(CONFIG_SMP) += allocpercpu.o
 obj-$(CONFIG_QUICKLIST) += quicklist.o
 obj-$(CONFIG_CGROUP_MEM_RES_CTLR) += memcontrol.o
-
+obj-$(CONFIG_KMEMTRACE) += kmemtrace.o
diff --git a/mm/kmemtrace.c b/mm/kmemtrace.c
new file mode 100644
index 0000000..83ad1cc
--- /dev/null
+++ b/mm/kmemtrace.c
@@ -0,0 +1,335 @@
+/*
+ * Copyright (C) 2008 Pekka Enberg, Eduard - Gabriel Munteanu
+ *
+ * This file is released under GPL version 2.
+ */
+
+#include <linux/string.h>
+#include <linux/debugfs.h>
+#include <linux/relay.h>
+#include <linux/module.h>
+#include <linux/marker.h>
+#include <linux/gfp.h>
+#include <linux/kmemtrace.h>
+
+#define KMEMTRACE_SUBBUF_SIZE		524288
+#define KMEMTRACE_DEF_N_SUBBUFS		20
+
+static struct rchan *kmemtrace_chan;
+static u32 kmemtrace_buf_overruns;
+
+static unsigned int kmemtrace_n_subbufs;
+#ifdef CONFIG_KMEMTRACE_DEFAULT_ENABLED
+static unsigned int kmemtrace_enabled = 1;
+#else
+static unsigned int kmemtrace_enabled = 0;
+#endif
+
+/*
+ * The sequence number is used for reordering kmemtrace packets
+ * in userspace, since they are logged as per-CPU data.
+ *
+ * atomic_t should always be a 32-bit signed integer. Wraparound is not
+ * likely to occur, but userspace can deal with it by expecting a certain
+ * sequence number in the next packet that will be read.
+ */
+static atomic_t kmemtrace_seq_num;
+
+#define KMEMTRACE_ABI_VERSION		1
+
+static u32 kmemtrace_abi_version __read_mostly = KMEMTRACE_ABI_VERSION;
+
+enum kmemtrace_event_id {
+	KMEMTRACE_EVENT_ALLOC = 0,
+	KMEMTRACE_EVENT_FREE,
+};
+
+struct kmemtrace_event {
+	u8		event_id;
+	u8		type_id;
+	u16		event_size;
+	s32		seq_num;
+	u64		call_site;
+	u64		ptr;
+} __attribute__ ((__packed__));
+
+struct kmemtrace_stats_alloc {
+	u64		bytes_req;
+	u64		bytes_alloc;
+	u32		gfp_flags;
+	s32		numa_node;
+} __attribute__ ((__packed__));
+
+static void kmemtrace_probe_alloc(void *probe_data, void *call_data,
+				  const char *format, va_list *args)
+{
+	unsigned long flags;
+	struct kmemtrace_event *ev;
+	struct kmemtrace_stats_alloc *stats;
+	void *buf;
+
+	local_irq_save(flags);
+
+	buf = relay_reserve(kmemtrace_chan,
+			    sizeof(struct kmemtrace_event) +
+			    sizeof(struct kmemtrace_stats_alloc));
+	if (!buf)
+		goto failed;
+
+	/*
+	 * Don't convert this to use structure initializers,
+	 * C99 does not guarantee the rvalues evaluation order.
+	 */
+
+	ev = buf;
+	ev->event_id = KMEMTRACE_EVENT_ALLOC;
+	ev->type_id = va_arg(*args, int);
+	ev->event_size = sizeof(struct kmemtrace_event) +
+			 sizeof(struct kmemtrace_stats_alloc);
+	ev->seq_num = atomic_add_return(1, &kmemtrace_seq_num);
+	ev->call_site = va_arg(*args, unsigned long);
+	ev->ptr = va_arg(*args, unsigned long);
+
+	stats = buf + sizeof(struct kmemtrace_event);
+	stats->bytes_req = va_arg(*args, unsigned long);
+	stats->bytes_alloc = va_arg(*args, unsigned long);
+	stats->gfp_flags = va_arg(*args, unsigned long);
+	stats->numa_node = va_arg(*args, int);
+
+failed:
+	local_irq_restore(flags);
+}
+
+static void kmemtrace_probe_free(void *probe_data, void *call_data,
+				 const char *format, va_list *args)
+{
+	unsigned long flags;
+	struct kmemtrace_event *ev;
+
+	local_irq_save(flags);
+
+	ev = relay_reserve(kmemtrace_chan, sizeof(struct kmemtrace_event));
+	if (!ev)
+		goto failed;
+
+	/*
+	 * Don't convert this to use structure initializers,
+	 * C99 does not guarantee the rvalues evaluation order.
+	 */
+	ev->event_id = KMEMTRACE_EVENT_FREE;
+	ev->type_id = va_arg(*args, int);
+	ev->event_size = sizeof(struct kmemtrace_event);
+	ev->seq_num = atomic_add_return(1, &kmemtrace_seq_num);
+	ev->call_site = va_arg(*args, unsigned long);
+	ev->ptr = va_arg(*args, unsigned long);
+
+failed:
+	local_irq_restore(flags);
+}
+
+static struct dentry *
+kmemtrace_create_buf_file(const char *filename, struct dentry *parent,
+			  int mode, struct rchan_buf *buf, int *is_global)
+{
+	return debugfs_create_file(filename, mode, parent, buf,
+				   &relay_file_operations);
+}
+
+static int kmemtrace_remove_buf_file(struct dentry *dentry)
+{
+	debugfs_remove(dentry);
+
+	return 0;
+}
+
+static int kmemtrace_subbuf_start(struct rchan_buf *buf,
+				  void *subbuf,
+				  void *prev_subbuf,
+				  size_t prev_padding)
+{
+	if (relay_buf_full(buf)) {
+		/*
+		 * We know it's not SMP-safe, but neither
+		 * debugfs_create_u32() is.
+		 */
+		kmemtrace_buf_overruns++;
+		return 0;
+	}
+
+	return 1;
+}
+
+static struct rchan_callbacks relay_callbacks = {
+	.create_buf_file = kmemtrace_create_buf_file,
+	.remove_buf_file = kmemtrace_remove_buf_file,
+	.subbuf_start = kmemtrace_subbuf_start,
+};
+
+static struct dentry *kmemtrace_dir;
+static struct dentry *kmemtrace_overruns_dentry;
+static struct dentry *kmemtrace_abi_version_dentry;
+
+static struct dentry *kmemtrace_enabled_dentry;
+
+static int kmemtrace_start_probes(void)
+{
+	int err;
+
+	err = marker_probe_register("kmemtrace_alloc", "type_id %d "
+				    "call_site %lu ptr %lu "
+				    "bytes_req %lu bytes_alloc %lu "
+				    "gfp_flags %lu node %d",
+				    kmemtrace_probe_alloc, NULL);
+	if (err)
+		return err;
+	err = marker_probe_register("kmemtrace_free", "type_id %d "
+				    "call_site %lu ptr %lu",
+				    kmemtrace_probe_free, NULL);
+
+	return err;
+}
+
+static void kmemtrace_stop_probes(void)
+{
+	marker_probe_unregister("kmemtrace_alloc",
+				kmemtrace_probe_alloc, NULL);
+	marker_probe_unregister("kmemtrace_free",
+				kmemtrace_probe_free, NULL);
+}
+
+static int kmemtrace_enabled_get(void *data, u64 *val)
+{
+	*val = *((int *) data);
+
+	return 0;
+}
+
+static int kmemtrace_enabled_set(void *data, u64 val)
+{
+	u64 old_val = kmemtrace_enabled;
+
+	*((int *) data) = !!val;
+
+	if (old_val == val)
+		return 0;
+	if (val)
+		kmemtrace_start_probes();
+	else
+		kmemtrace_stop_probes();
+
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(kmemtrace_enabled_fops,
+			kmemtrace_enabled_get,
+			kmemtrace_enabled_set, "%llu\n");
+
+static void kmemtrace_cleanup(void)
+{
+	if (kmemtrace_enabled_dentry)
+		debugfs_remove(kmemtrace_enabled_dentry);
+
+	kmemtrace_stop_probes();
+
+	if (kmemtrace_abi_version_dentry)
+		debugfs_remove(kmemtrace_abi_version_dentry);
+	if (kmemtrace_overruns_dentry)
+		debugfs_remove(kmemtrace_overruns_dentry);
+
+	relay_close(kmemtrace_chan);
+	kmemtrace_chan = NULL;
+
+	if (kmemtrace_dir)
+		debugfs_remove(kmemtrace_dir);
+}
+
+static int __init kmemtrace_setup_late(void)
+{
+	if (!kmemtrace_chan)
+		goto failed;
+
+	kmemtrace_dir = debugfs_create_dir("kmemtrace", NULL);
+	if (!kmemtrace_dir)
+		goto cleanup;
+
+	kmemtrace_abi_version_dentry =
+		debugfs_create_u32("abi_version", S_IRUSR,
+				   kmemtrace_dir, &kmemtrace_abi_version);
+	kmemtrace_overruns_dentry =
+		debugfs_create_u32("total_overruns", S_IRUSR,
+				   kmemtrace_dir, &kmemtrace_buf_overruns);
+	if (!kmemtrace_overruns_dentry || !kmemtrace_abi_version_dentry)
+		goto cleanup;
+
+	kmemtrace_enabled_dentry =
+		debugfs_create_file("enabled", S_IRUSR | S_IWUSR,
+				    kmemtrace_dir, &kmemtrace_enabled,
+				    &kmemtrace_enabled_fops);
+	if (!kmemtrace_enabled_dentry)
+		goto cleanup;
+
+	if (relay_late_setup_files(kmemtrace_chan, "cpu", kmemtrace_dir))
+		goto cleanup;
+
+	printk(KERN_INFO "kmemtrace: fully up.\n");
+
+	return 0;
+
+cleanup:
+	kmemtrace_cleanup();
+failed:
+	return 1;
+}
+late_initcall(kmemtrace_setup_late);
+
+static int __init kmemtrace_set_boot_enabled(char *str)
+{
+	if (!str)
+		return -EINVAL;
+
+	if (!strcmp(str, "yes"))
+		kmemtrace_enabled = 1;
+	else if (!strcmp(str, "no"))
+		kmemtrace_enabled = 0;
+	else
+		return -EINVAL;
+
+	return 0;
+}
+early_param("kmemtrace.enable", kmemtrace_set_boot_enabled);
+
+static int __init kmemtrace_set_subbufs(char *str)
+{
+	get_option(&str, &kmemtrace_n_subbufs);
+	return 0;
+}
+early_param("kmemtrace.subbufs", kmemtrace_set_subbufs);
+
+void kmemtrace_init(void)
+{
+	if (!kmemtrace_enabled)
+		return;
+
+	if (!kmemtrace_n_subbufs)
+		kmemtrace_n_subbufs = KMEMTRACE_DEF_N_SUBBUFS;
+
+	kmemtrace_chan = relay_open(NULL, NULL, KMEMTRACE_SUBBUF_SIZE,
+				    kmemtrace_n_subbufs, &relay_callbacks,
+				    NULL);
+	if (unlikely(!kmemtrace_chan)) {
+		printk(KERN_ERR "kmemtrace: could not open relay channel.\n");
+		return;
+	}
+
+	if (unlikely(kmemtrace_start_probes()))
+		goto probe_fail;
+
+	printk(KERN_INFO "kmemtrace: early init successful.\n");
+
+	return;
+
+probe_fail:
+	printk(KERN_ERR "kmemtrace: could not register marker probes!\n");
+	kmemtrace_cleanup();
+}
+
-- 
1.5.6.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 2/5] kmemtrace: Additional documentation.
  2008-08-10 17:14 ` [PATCH 1/5] kmemtrace: Core implementation Eduard - Gabriel Munteanu
@ 2008-08-10 17:14   ` Eduard - Gabriel Munteanu
  2008-08-10 17:14     ` [PATCH 3/5] kmemtrace: SLAB hooks Eduard - Gabriel Munteanu
                       ` (2 more replies)
  2008-08-12  6:46   ` [PATCH 1/5] kmemtrace: Core implementation Pekka Enberg
  1 sibling, 3 replies; 32+ messages in thread
From: Eduard - Gabriel Munteanu @ 2008-08-10 17:14 UTC (permalink / raw)
  To: penberg
  Cc: mathieu.desnoyers, cl, linux-mm, linux-kernel, rdunlap, mpm,
	rostedt, tglx

Documented kmemtrace's ABI, purpose and design. Also includes a short
usage guide, FAQ, as well as a link to the userspace application's Git
repository, which is currently hosted at repo.or.cz.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 Documentation/ABI/testing/debugfs-kmemtrace |   71 +++++++++++++++
 Documentation/vm/kmemtrace.txt              |  126 +++++++++++++++++++++++++++
 2 files changed, 197 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ABI/testing/debugfs-kmemtrace
 create mode 100644 Documentation/vm/kmemtrace.txt

diff --git a/Documentation/ABI/testing/debugfs-kmemtrace b/Documentation/ABI/testing/debugfs-kmemtrace
new file mode 100644
index 0000000..a5ff9a6
--- /dev/null
+++ b/Documentation/ABI/testing/debugfs-kmemtrace
@@ -0,0 +1,71 @@
+What:		/sys/kernel/debug/kmemtrace/
+Date:		July 2008
+Contact:	Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
+Description:
+
+In kmemtrace-enabled kernels, the following files are created:
+
+/sys/kernel/debug/kmemtrace/
+	cpu<n>		(0400)	Per-CPU tracing data, see below. (binary)
+	total_overruns	(0400)	Total number of bytes which were dropped from
+				cpu<n> files because of full buffer condition,
+				non-binary. (text)
+	abi_version	(0400)	Kernel's kmemtrace ABI version. (text)
+
+Each per-CPU file should be read according to the relay interface. That is,
+the reader should set affinity to that specific CPU and, as currently done by
+the userspace application (though there are other methods), use poll() with
+an infinite timeout before every read(). Otherwise, erroneous data may be
+read. The binary data has the following _core_ format:
+
+	Event ID	(1 byte)	Unsigned integer, one of:
+		0 - represents an allocation (KMEMTRACE_EVENT_ALLOC)
+		1 - represents a freeing of previously allocated memory
+		    (KMEMTRACE_EVENT_FREE)
+	Type ID		(1 byte)	Unsigned integer, one of:
+		0 - this is a kmalloc() / kfree()
+		1 - this is a kmem_cache_alloc() / kmem_cache_free()
+		2 - this is a __get_free_pages() et al.
+	Event size	(2 bytes)	Unsigned integer representing the
+					size of this event. Used to extend
+					kmemtrace. Discard the bytes you
+					don't know about.
+	Sequence number	(4 bytes)	Signed integer used to reorder data
+					logged on SMP machines. Wraparound
+					must be taken into account, although
+					it is unlikely.
+	Caller address	(8 bytes)	Return address to the caller.
+	Pointer to mem	(8 bytes)	Pointer to target memory area. Can be
+					NULL, but not all such calls might be
+					recorded.
+
+In case of KMEMTRACE_EVENT_ALLOC events, the next fields follow:
+
+	Requested bytes	(8 bytes)	Total number of requested bytes,
+					unsigned, must not be zero.
+	Allocated bytes (8 bytes)	Total number of actually allocated
+					bytes, unsigned, must not be lower
+					than requested bytes.
+	Requested flags	(4 bytes)	GFP flags supplied by the caller.
+	Target CPU	(4 bytes)	Signed integer, valid for event id 1.
+					If equal to -1, target CPU is the same
+					as origin CPU, but the reverse might
+					not be true.
+
+The data is made available in the same endianness the machine has.
+
+Other event ids and type ids may be defined and added. Other fields may be
+added by increasing event size, but see below for details.
+Every modification to the ABI, including new id definitions, are followed
+by bumping the ABI version by one.
+
+Adding new data to the packet (features) is done at the end of the mandatory
+data:
+	Feature size	(2 byte)
+	Feature ID	(1 byte)
+	Feature data	(Feature size - 4 bytes)
+
+
+Users:
+	kmemtrace-user - git://repo.or.cz/kmemtrace-user.git
+
diff --git a/Documentation/vm/kmemtrace.txt b/Documentation/vm/kmemtrace.txt
new file mode 100644
index 0000000..75360b1
--- /dev/null
+++ b/Documentation/vm/kmemtrace.txt
@@ -0,0 +1,126 @@
+			kmemtrace - Kernel Memory Tracer
+
+			  by Eduard - Gabriel Munteanu
+			     <eduard.munteanu@linux360.ro>
+
+I. Introduction
+===============
+
+kmemtrace helps kernel developers figure out two things:
+1) how different allocators (SLAB, SLUB etc.) perform
+2) how kernel code allocates memory and how much
+
+To do this, we trace every allocation and export information to the userspace
+through the relay interface. We export things such as the number of requested
+bytes, the number of bytes actually allocated (i.e. including internal
+fragmentation), whether this is a slab allocation or a plain kmalloc() and so
+on.
+
+The actual analysis is performed by a userspace tool (see section III for
+details on where to get it from). It logs the data exported by the kernel,
+processes it and (as of writing this) can provide the following information:
+- the total amount of memory allocated and fragmentation per call-site
+- the amount of memory allocated and fragmentation per allocation
+- total memory allocated and fragmentation in the collected dataset
+- number of cross-CPU allocation and frees (makes sense in NUMA environments)
+
+Moreover, it can potentially find inconsistent and erroneous behavior in
+kernel code, such as using slab free functions on kmalloc'ed memory or
+allocating less memory than requested (but not truly failed allocations).
+
+kmemtrace also makes provisions for tracing on some arch and analysing the
+data on another.
+
+II. Design and goals
+====================
+
+kmemtrace was designed to handle rather large amounts of data. Thus, it uses
+the relay interface to export whatever is logged to userspace, which then
+stores it. Analysis and reporting is done asynchronously, that is, after the
+data is collected and stored. By design, it allows one to log and analyse
+on different machines and different arches.
+
+As of writing this, the ABI is not considered stable, though it might not
+change much. However, no guarantees are made about compatibility yet. When
+deemed stable, the ABI should still allow easy extension while maintaining
+backward compatibility. This is described further in Documentation/ABI.
+
+Summary of design goals:
+	- allow logging and analysis to be done across different machines
+	- be fast and anticipate usage in high-load environments (*)
+	- be reasonably extensible
+	- make it possible for GNU/Linux distributions to have kmemtrace
+	included in their repositories
+
+(*) - one of the reasons Pekka Enberg's original userspace data analysis
+    tool's code was rewritten from Perl to C (although this is more than a
+    simple conversion)
+
+
+III. Quick usage guide
+======================
+
+1) Get a kernel that supports kmemtrace and build it accordingly (i.e. enable
+CONFIG_KMEMTRACE and CONFIG_DEFAULT_ENABLED).
+
+2) Get the userspace tool and build it:
+$ git-clone git://repo.or.cz/kmemtrace-user.git		# current repository
+$ cd kmemtrace-user/
+$ ./autogen.sh
+$ ./configure
+$ make
+
+3) Boot the kmemtrace-enabled kernel if you haven't, preferably in the
+'single' runlevel (so that relay buffers don't fill up easily), and run
+kmemtrace:
+# '$' does not mean user, but root here.
+$ mount -t debugfs none /sys/kernel/debug
+$ mount -t proc none /proc
+$ cd path/to/kmemtrace-user/
+$ ./kmemtraced
+Wait a bit, then stop it with CTRL+C.
+$ cat /sys/kernel/debug/kmemtrace/total_overruns	# Check if we didn't
+							# overrun, should
+							# be zero.
+$ (Optionally) [Run kmemtrace_check separately on each cpu[0-9]*.out file to
+		check its correctness]
+$ ./kmemtrace-report
+
+Now you should have a nice and short summary of how the allocator performs.
+
+IV. FAQ and known issues
+========================
+
+Q: 'cat /sys/kernel/debug/kmemtrace/total_overruns' is non-zero, how do I fix
+this? Should I worry?
+A: If it's non-zero, this affects kmemtrace's accuracy, depending on how
+large the number is. You can fix it by supplying a higher
+'kmemtrace.subbufs=N' kernel parameter.
+---
+
+Q: kmemtrace_check reports errors, how do I fix this? Should I worry?
+A: This is a bug and should be reported. It can occur for a variety of
+reasons:
+	- possible bugs in relay code
+	- possible misuse of relay by kmemtrace
+	- timestamps being collected unorderly
+Or you may fix it yourself and send us a patch.
+---
+
+Q: kmemtrace_report shows many errors, how do I fix this? Should I worry?
+A: This is a known issue and I'm working on it. These might be true errors
+in kernel code, which may have inconsistent behavior (e.g. allocating memory
+with kmem_cache_alloc() and freeing it with kfree()). Pekka Enberg pointed
+out this behavior may work with SLAB, but may fail with other allocators.
+
+It may also be due to lack of tracing in some unusual allocator functions.
+
+We don't want bug reports regarding this issue yet.
+---
+
+V. See also
+===========
+
+Documentation/kernel-parameters.txt
+Documentation/ABI/testing/debugfs-kmemtrace
+
-- 
1.5.6.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 3/5] kmemtrace: SLAB hooks.
  2008-08-10 17:14   ` [PATCH 2/5] kmemtrace: Additional documentation Eduard - Gabriel Munteanu
@ 2008-08-10 17:14     ` Eduard - Gabriel Munteanu
  2008-08-10 17:14       ` [PATCH 4/5] kmemtrace: SLUB hooks Eduard - Gabriel Munteanu
  2008-08-12  6:46       ` [PATCH 3/5] kmemtrace: SLAB hooks Pekka Enberg
  2008-08-12  6:46     ` [PATCH 2/5] kmemtrace: Additional documentation Pekka Enberg
  2008-08-18 19:57     ` Randy Dunlap
  2 siblings, 2 replies; 32+ messages in thread
From: Eduard - Gabriel Munteanu @ 2008-08-10 17:14 UTC (permalink / raw)
  To: penberg
  Cc: mathieu.desnoyers, cl, linux-mm, linux-kernel, rdunlap, mpm,
	rostedt, tglx

This adds hooks for the SLAB allocator, to allow tracing with kmemtrace.

We also convert some inline functions to __always_inline to make sure
_RET_IP_, which expands to __builtin_return_address(0), always works
as expected.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 include/linux/slab_def.h |   68 ++++++++++++++++++++++++++++++++++++++------
 mm/slab.c                |   71 +++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 123 insertions(+), 16 deletions(-)

diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 39c3a5e..7555ce9 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -14,6 +14,7 @@
 #include <asm/page.h>		/* kmalloc_sizes.h needs PAGE_SIZE */
 #include <asm/cache.h>		/* kmalloc_sizes.h needs L1_CACHE_BYTES */
 #include <linux/compiler.h>
+#include <linux/kmemtrace.h>
 
 /* Size description struct for general caches. */
 struct cache_sizes {
@@ -28,8 +29,26 @@ extern struct cache_sizes malloc_sizes[];
 void *kmem_cache_alloc(struct kmem_cache *, gfp_t);
 void *__kmalloc(size_t size, gfp_t flags);
 
-static inline void *kmalloc(size_t size, gfp_t flags)
+#ifdef CONFIG_KMEMTRACE
+extern void *kmem_cache_alloc_notrace(struct kmem_cache *cachep, gfp_t flags);
+extern size_t slab_buffer_size(struct kmem_cache *cachep);
+#else
+static __always_inline void *
+kmem_cache_alloc_notrace(struct kmem_cache *cachep, gfp_t flags)
 {
+	return kmem_cache_alloc(cachep, flags);
+}
+static inline size_t slab_buffer_size(struct kmem_cache *cachep)
+{
+	return 0;
+}
+#endif
+
+static __always_inline void *kmalloc(size_t size, gfp_t flags)
+{
+	struct kmem_cache *cachep;
+	void *ret;
+
 	if (__builtin_constant_p(size)) {
 		int i = 0;
 
@@ -50,10 +69,17 @@ static inline void *kmalloc(size_t size, gfp_t flags)
 found:
 #ifdef CONFIG_ZONE_DMA
 		if (flags & GFP_DMA)
-			return kmem_cache_alloc(malloc_sizes[i].cs_dmacachep,
-						flags);
+			cachep = malloc_sizes[i].cs_dmacachep;
+		else
 #endif
-		return kmem_cache_alloc(malloc_sizes[i].cs_cachep, flags);
+			cachep = malloc_sizes[i].cs_cachep;
+
+		ret = kmem_cache_alloc_notrace(cachep, flags);
+
+		kmemtrace_mark_alloc(KMEMTRACE_TYPE_KMALLOC, _THIS_IP_, ret,
+				     size, slab_buffer_size(cachep), flags);
+
+		return ret;
 	}
 	return __kmalloc(size, flags);
 }
@@ -62,8 +88,25 @@ found:
 extern void *__kmalloc_node(size_t size, gfp_t flags, int node);
 extern void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
 
-static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
+#ifdef CONFIG_KMEMTRACE
+extern void *kmem_cache_alloc_node_notrace(struct kmem_cache *cachep,
+					   gfp_t flags,
+					   int nodeid);
+#else
+static __always_inline void *
+kmem_cache_alloc_node_notrace(struct kmem_cache *cachep,
+			      gfp_t flags,
+			      int nodeid)
+{
+	return kmem_cache_alloc_node(cachep, flags, nodeid);
+}
+#endif
+
+static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
+	struct kmem_cache *cachep;
+	void *ret;
+
 	if (__builtin_constant_p(size)) {
 		int i = 0;
 
@@ -84,11 +127,18 @@ static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 found:
 #ifdef CONFIG_ZONE_DMA
 		if (flags & GFP_DMA)
-			return kmem_cache_alloc_node(malloc_sizes[i].cs_dmacachep,
-						flags, node);
+			cachep = malloc_sizes[i].cs_dmacachep;
+		else
 #endif
-		return kmem_cache_alloc_node(malloc_sizes[i].cs_cachep,
-						flags, node);
+			cachep = malloc_sizes[i].cs_cachep;
+
+		ret = kmem_cache_alloc_node_notrace(cachep, flags, node);
+
+		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC, _THIS_IP_,
+					  ret, size, slab_buffer_size(cachep),
+					  flags, node);
+
+		return ret;
 	}
 	return __kmalloc_node(size, flags, node);
 }
diff --git a/mm/slab.c b/mm/slab.c
index 046607f..1496962 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -111,6 +111,7 @@
 #include	<linux/rtmutex.h>
 #include	<linux/reciprocal_div.h>
 #include	<linux/debugobjects.h>
+#include	<linux/kmemtrace.h>
 
 #include	<asm/cacheflush.h>
 #include	<asm/tlbflush.h>
@@ -567,6 +568,14 @@ static void **dbg_userword(struct kmem_cache *cachep, void *objp)
 
 #endif
 
+#ifdef CONFIG_KMEMTRACE
+size_t slab_buffer_size(struct kmem_cache *cachep)
+{
+	return cachep->buffer_size;
+}
+EXPORT_SYMBOL(slab_buffer_size);
+#endif
+
 /*
  * Do not go above this order unless 0 objects fit into the slab.
  */
@@ -3621,10 +3630,23 @@ static inline void __cache_free(struct kmem_cache *cachep, void *objp)
  */
 void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
 {
-	return __cache_alloc(cachep, flags, __builtin_return_address(0));
+	void *ret = __cache_alloc(cachep, flags, __builtin_return_address(0));
+
+	kmemtrace_mark_alloc(KMEMTRACE_TYPE_CACHE, _RET_IP_, ret,
+			     obj_size(cachep), cachep->buffer_size, flags);
+
+	return ret;
 }
 EXPORT_SYMBOL(kmem_cache_alloc);
 
+#ifdef CONFIG_KMEMTRACE
+void *kmem_cache_alloc_notrace(struct kmem_cache *cachep, gfp_t flags)
+{
+	return __cache_alloc(cachep, flags, __builtin_return_address(0));
+}
+EXPORT_SYMBOL(kmem_cache_alloc_notrace);
+#endif
+
 /**
  * kmem_ptr_validate - check if an untrusted pointer might be a slab entry.
  * @cachep: the cache we're checking against
@@ -3669,23 +3691,47 @@ out:
 #ifdef CONFIG_NUMA
 void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid)
 {
-	return __cache_alloc_node(cachep, flags, nodeid,
-			__builtin_return_address(0));
+	void *ret = __cache_alloc_node(cachep, flags, nodeid,
+				       __builtin_return_address(0));
+
+	kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_CACHE, _RET_IP_, ret,
+				  obj_size(cachep), cachep->buffer_size,
+				  flags, nodeid);
+
+	return ret;
 }
 EXPORT_SYMBOL(kmem_cache_alloc_node);
 
+#ifdef CONFIG_KMEMTRACE
+void *kmem_cache_alloc_node_notrace(struct kmem_cache *cachep,
+				    gfp_t flags,
+				    int nodeid)
+{
+	return __cache_alloc_node(cachep, flags, nodeid,
+				  __builtin_return_address(0));
+}
+EXPORT_SYMBOL(kmem_cache_alloc_node_notrace);
+#endif
+
 static __always_inline void *
 __do_kmalloc_node(size_t size, gfp_t flags, int node, void *caller)
 {
 	struct kmem_cache *cachep;
+	void *ret;
 
 	cachep = kmem_find_general_cachep(size, flags);
 	if (unlikely(ZERO_OR_NULL_PTR(cachep)))
 		return cachep;
-	return kmem_cache_alloc_node(cachep, flags, node);
+	ret = kmem_cache_alloc_node_notrace(cachep, flags, node);
+
+	kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC,
+				  (unsigned long) caller, ret,
+				  size, cachep->buffer_size, flags, node);
+
+	return ret;
 }
 
-#ifdef CONFIG_DEBUG_SLAB
+#if defined(CONFIG_DEBUG_SLAB) || defined(CONFIG_KMEMTRACE)
 void *__kmalloc_node(size_t size, gfp_t flags, int node)
 {
 	return __do_kmalloc_node(size, flags, node,
@@ -3718,6 +3764,7 @@ static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
 					  void *caller)
 {
 	struct kmem_cache *cachep;
+	void *ret;
 
 	/* If you want to save a few bytes .text space: replace
 	 * __ with kmem_.
@@ -3727,11 +3774,17 @@ static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
 	cachep = __find_general_cachep(size, flags);
 	if (unlikely(ZERO_OR_NULL_PTR(cachep)))
 		return cachep;
-	return __cache_alloc(cachep, flags, caller);
+	ret = __cache_alloc(cachep, flags, caller);
+
+	kmemtrace_mark_alloc(KMEMTRACE_TYPE_KMALLOC,
+			     (unsigned long) caller, ret,
+			     size, cachep->buffer_size, flags);
+
+	return ret;
 }
 
 
-#ifdef CONFIG_DEBUG_SLAB
+#if defined(CONFIG_DEBUG_SLAB) || defined(CONFIG_KMEMTRACE)
 void *__kmalloc(size_t size, gfp_t flags)
 {
 	return __do_kmalloc(size, flags, __builtin_return_address(0));
@@ -3770,6 +3823,8 @@ void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 		debug_check_no_obj_freed(objp, obj_size(cachep));
 	__cache_free(cachep, objp);
 	local_irq_restore(flags);
+
+	kmemtrace_mark_free(KMEMTRACE_TYPE_CACHE, _RET_IP_, objp);
 }
 EXPORT_SYMBOL(kmem_cache_free);
 
@@ -3796,6 +3851,8 @@ void kfree(const void *objp)
 	debug_check_no_obj_freed(objp, obj_size(c));
 	__cache_free(c, (void *)objp);
 	local_irq_restore(flags);
+
+	kmemtrace_mark_free(KMEMTRACE_TYPE_KMALLOC, _RET_IP_, objp);
 }
 EXPORT_SYMBOL(kfree);
 
-- 
1.5.6.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-10 17:14     ` [PATCH 3/5] kmemtrace: SLAB hooks Eduard - Gabriel Munteanu
@ 2008-08-10 17:14       ` Eduard - Gabriel Munteanu
  2008-08-10 17:14         ` [PATCH 5/5] kmemtrace: SLOB hooks Eduard - Gabriel Munteanu
  2008-08-11 14:04         ` [PATCH 4/5] kmemtrace: SLUB hooks Christoph Lameter
  2008-08-12  6:46       ` [PATCH 3/5] kmemtrace: SLAB hooks Pekka Enberg
  1 sibling, 2 replies; 32+ messages in thread
From: Eduard - Gabriel Munteanu @ 2008-08-10 17:14 UTC (permalink / raw)
  To: penberg
  Cc: mathieu.desnoyers, cl, linux-mm, linux-kernel, rdunlap, mpm,
	rostedt, tglx

This adds hooks for the SLUB allocator, to allow tracing with kmemtrace.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 include/linux/slub_def.h |   53 ++++++++++++++++++++++++++++++++++--
 mm/slub.c                |   66 +++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 110 insertions(+), 9 deletions(-)

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index d117ea2..d77012a 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -10,6 +10,7 @@
 #include <linux/gfp.h>
 #include <linux/workqueue.h>
 #include <linux/kobject.h>
+#include <linux/kmemtrace.h>
 
 enum stat_item {
 	ALLOC_FASTPATH,		/* Allocation from cpu slab */
@@ -203,13 +204,31 @@ static __always_inline struct kmem_cache *kmalloc_slab(size_t size)
 void *kmem_cache_alloc(struct kmem_cache *, gfp_t);
 void *__kmalloc(size_t size, gfp_t flags);
 
+#ifdef CONFIG_KMEMTRACE
+extern void *kmem_cache_alloc_notrace(struct kmem_cache *s, gfp_t gfpflags);
+#else
+static __always_inline void *
+kmem_cache_alloc_notrace(struct kmem_cache *s, gfp_t gfpflags)
+{
+	return kmem_cache_alloc(s, gfpflags);
+}
+#endif
+
 static __always_inline void *kmalloc_large(size_t size, gfp_t flags)
 {
-	return (void *)__get_free_pages(flags | __GFP_COMP, get_order(size));
+	unsigned int order = get_order(size);
+	void *ret = (void *) __get_free_pages(flags, order);
+
+	kmemtrace_mark_alloc(KMEMTRACE_TYPE_KMALLOC, _THIS_IP_, ret,
+			     size, PAGE_SIZE << order, flags);
+
+	return ret;
 }
 
 static __always_inline void *kmalloc(size_t size, gfp_t flags)
 {
+	void *ret;
+
 	if (__builtin_constant_p(size)) {
 		if (size > PAGE_SIZE)
 			return kmalloc_large(size, flags);
@@ -220,7 +239,13 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
 			if (!s)
 				return ZERO_SIZE_PTR;
 
-			return kmem_cache_alloc(s, flags);
+			ret = kmem_cache_alloc_notrace(s, flags);
+
+			kmemtrace_mark_alloc(KMEMTRACE_TYPE_KMALLOC,
+					     _THIS_IP_, ret,
+					     size, s->size, flags);
+
+			return ret;
 		}
 	}
 	return __kmalloc(size, flags);
@@ -230,8 +255,24 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
 void *__kmalloc_node(size_t size, gfp_t flags, int node);
 void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
 
+#ifdef CONFIG_KMEMTRACE
+extern void *kmem_cache_alloc_node_notrace(struct kmem_cache *s,
+					   gfp_t gfpflags,
+					   int node);
+#else
+static __always_inline void *
+kmem_cache_alloc_node_notrace(struct kmem_cache *s,
+			      gfp_t gfpflags,
+			      int node)
+{
+	return kmem_cache_alloc_node(s, gfpflags, node);
+}
+#endif
+
 static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
+	void *ret;
+
 	if (__builtin_constant_p(size) &&
 		size <= PAGE_SIZE && !(flags & SLUB_DMA)) {
 			struct kmem_cache *s = kmalloc_slab(size);
@@ -239,7 +280,13 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 		if (!s)
 			return ZERO_SIZE_PTR;
 
-		return kmem_cache_alloc_node(s, flags, node);
+		ret = kmem_cache_alloc_node_notrace(s, flags, node);
+
+		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC,
+					  _THIS_IP_, ret,
+					  size, s->size, flags, node);
+
+		return ret;
 	}
 	return __kmalloc_node(size, flags, node);
 }
diff --git a/mm/slub.c b/mm/slub.c
index 315c392..940145f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -23,6 +23,7 @@
 #include <linux/kallsyms.h>
 #include <linux/memory.h>
 #include <linux/math64.h>
+#include <linux/kmemtrace.h>
 
 /*
  * Lock order:
@@ -1652,18 +1653,47 @@ static __always_inline void *slab_alloc(struct kmem_cache *s,
 
 void *kmem_cache_alloc(struct kmem_cache *s, gfp_t gfpflags)
 {
-	return slab_alloc(s, gfpflags, -1, __builtin_return_address(0));
+	void *ret = slab_alloc(s, gfpflags, -1, __builtin_return_address(0));
+
+	kmemtrace_mark_alloc(KMEMTRACE_TYPE_CACHE, _RET_IP_, ret,
+			     s->objsize, s->size, gfpflags);
+
+	return ret;
 }
 EXPORT_SYMBOL(kmem_cache_alloc);
 
+#ifdef CONFIG_KMEMTRACE
+void *kmem_cache_alloc_notrace(struct kmem_cache *s, gfp_t gfpflags)
+{
+	return slab_alloc(s, gfpflags, -1, __builtin_return_address(0));
+}
+EXPORT_SYMBOL(kmem_cache_alloc_notrace);
+#endif
+
 #ifdef CONFIG_NUMA
 void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags, int node)
 {
-	return slab_alloc(s, gfpflags, node, __builtin_return_address(0));
+	void *ret = slab_alloc(s, gfpflags, node,
+			       __builtin_return_address(0));
+
+	kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_CACHE, _RET_IP_, ret,
+				  s->objsize, s->size, gfpflags, node);
+
+	return ret;
 }
 EXPORT_SYMBOL(kmem_cache_alloc_node);
 #endif
 
+#ifdef CONFIG_KMEMTRACE
+void *kmem_cache_alloc_node_notrace(struct kmem_cache *s,
+				    gfp_t gfpflags,
+				    int node)
+{
+	return slab_alloc(s, gfpflags, node, __builtin_return_address(0));
+}
+EXPORT_SYMBOL(kmem_cache_alloc_node_notrace);
+#endif
+
 /*
  * Slow patch handling. This may still be called frequently since objects
  * have a longer lifetime than the cpu slabs in most processing loads.
@@ -1771,6 +1801,8 @@ void kmem_cache_free(struct kmem_cache *s, void *x)
 	page = virt_to_head_page(x);
 
 	slab_free(s, page, x, __builtin_return_address(0));
+
+	kmemtrace_mark_free(KMEMTRACE_TYPE_CACHE, _RET_IP_, x);
 }
 EXPORT_SYMBOL(kmem_cache_free);
 
@@ -2676,6 +2708,7 @@ static struct kmem_cache *get_slab(size_t size, gfp_t flags)
 void *__kmalloc(size_t size, gfp_t flags)
 {
 	struct kmem_cache *s;
+	void *ret;
 
 	if (unlikely(size > PAGE_SIZE))
 		return kmalloc_large(size, flags);
@@ -2685,7 +2718,12 @@ void *__kmalloc(size_t size, gfp_t flags)
 	if (unlikely(ZERO_OR_NULL_PTR(s)))
 		return s;
 
-	return slab_alloc(s, flags, -1, __builtin_return_address(0));
+	ret = slab_alloc(s, flags, -1, __builtin_return_address(0));
+
+	kmemtrace_mark_alloc(KMEMTRACE_TYPE_KMALLOC, _RET_IP_, ret,
+			     size, s->size, flags);
+
+	return ret;
 }
 EXPORT_SYMBOL(__kmalloc);
 
@@ -2704,16 +2742,30 @@ static void *kmalloc_large_node(size_t size, gfp_t flags, int node)
 void *__kmalloc_node(size_t size, gfp_t flags, int node)
 {
 	struct kmem_cache *s;
+	void *ret;
 
-	if (unlikely(size > PAGE_SIZE))
-		return kmalloc_large_node(size, flags, node);
+	if (unlikely(size > PAGE_SIZE)) {
+		ret = kmalloc_large_node(size, flags, node);
+
+		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC,
+					  _RET_IP_, ret,
+					  size, PAGE_SIZE << get_order(size),
+					  flags, node);
+
+		return ret;
+	}
 
 	s = get_slab(size, flags);
 
 	if (unlikely(ZERO_OR_NULL_PTR(s)))
 		return s;
 
-	return slab_alloc(s, flags, node, __builtin_return_address(0));
+	ret = slab_alloc(s, flags, node, __builtin_return_address(0));
+
+	kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC, _RET_IP_, ret,
+				  size, s->size, flags, node);
+
+	return ret;
 }
 EXPORT_SYMBOL(__kmalloc_node);
 #endif
@@ -2771,6 +2823,8 @@ void kfree(const void *x)
 		return;
 	}
 	slab_free(page->slab, page, object, __builtin_return_address(0));
+
+	kmemtrace_mark_free(KMEMTRACE_TYPE_KMALLOC, _RET_IP_, x);
 }
 EXPORT_SYMBOL(kfree);
 
-- 
1.5.6.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 5/5] kmemtrace: SLOB hooks.
  2008-08-10 17:14       ` [PATCH 4/5] kmemtrace: SLUB hooks Eduard - Gabriel Munteanu
@ 2008-08-10 17:14         ` Eduard - Gabriel Munteanu
  2008-08-10 17:48           ` Pekka Enberg
  2008-08-11 14:04         ` [PATCH 4/5] kmemtrace: SLUB hooks Christoph Lameter
  1 sibling, 1 reply; 32+ messages in thread
From: Eduard - Gabriel Munteanu @ 2008-08-10 17:14 UTC (permalink / raw)
  To: penberg
  Cc: mathieu.desnoyers, cl, linux-mm, linux-kernel, rdunlap, mpm,
	rostedt, tglx

This adds hooks for the SLOB allocator, to allow tracing with kmemtrace.

We also convert some inline functions to __always_inline to make sure
_RET_IP_, which expands to __builtin_return_address(0), always works
as expected.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 include/linux/slob_def.h |    9 +++++----
 mm/slob.c                |   37 +++++++++++++++++++++++++++++++------
 2 files changed, 36 insertions(+), 10 deletions(-)

diff --git a/include/linux/slob_def.h b/include/linux/slob_def.h
index 59a3fa4..0ec00b3 100644
--- a/include/linux/slob_def.h
+++ b/include/linux/slob_def.h
@@ -3,14 +3,15 @@
 
 void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
 
-static inline void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
+static __always_inline void *kmem_cache_alloc(struct kmem_cache *cachep,
+					      gfp_t flags)
 {
 	return kmem_cache_alloc_node(cachep, flags, -1);
 }
 
 void *__kmalloc_node(size_t size, gfp_t flags, int node);
 
-static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
+static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
 	return __kmalloc_node(size, flags, node);
 }
@@ -23,12 +24,12 @@ static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
  * kmalloc is the normal method of allocating memory
  * in the kernel.
  */
-static inline void *kmalloc(size_t size, gfp_t flags)
+static __always_inline void *kmalloc(size_t size, gfp_t flags)
 {
 	return __kmalloc_node(size, flags, -1);
 }
 
-static inline void *__kmalloc(size_t size, gfp_t flags)
+static __always_inline void *__kmalloc(size_t size, gfp_t flags)
 {
 	return kmalloc(size, flags);
 }
diff --git a/mm/slob.c b/mm/slob.c
index a3ad667..23375ed 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -65,6 +65,7 @@
 #include <linux/module.h>
 #include <linux/rcupdate.h>
 #include <linux/list.h>
+#include <linux/kmemtrace.h>
 #include <asm/atomic.h>
 
 /*
@@ -463,27 +464,38 @@ void *__kmalloc_node(size_t size, gfp_t gfp, int node)
 {
 	unsigned int *m;
 	int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
+	void *ret;
 
 	if (size < PAGE_SIZE - align) {
 		if (!size)
 			return ZERO_SIZE_PTR;
 
 		m = slob_alloc(size + align, gfp, align, node);
+
 		if (!m)
 			return NULL;
 		*m = size;
-		return (void *)m + align;
+		ret = (void *)m + align;
+
+		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC,
+					  _RET_IP_, ret,
+					  size, size + align, gfp, node);
 	} else {
-		void *ret;
+		unsigned int order = get_order(size);
 
-		ret = slob_new_page(gfp | __GFP_COMP, get_order(size), node);
+		ret = slob_new_page(gfp | __GFP_COMP, order, node);
 		if (ret) {
 			struct page *page;
 			page = virt_to_page(ret);
 			page->private = size;
 		}
-		return ret;
+
+		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC,
+					  _RET_IP_, ret,
+					  size, PAGE_SIZE << order, gfp, node);
 	}
+
+	return ret;
 }
 EXPORT_SYMBOL(__kmalloc_node);
 
@@ -501,6 +513,8 @@ void kfree(const void *block)
 		slob_free(m, *m + align);
 	} else
 		put_page(&sp->page);
+
+	kmemtrace_mark_free(KMEMTRACE_TYPE_KMALLOC, _RET_IP_, block);
 }
 EXPORT_SYMBOL(kfree);
 
@@ -569,10 +583,19 @@ void *kmem_cache_alloc_node(struct kmem_cache *c, gfp_t flags, int node)
 {
 	void *b;
 
-	if (c->size < PAGE_SIZE)
+	if (c->size < PAGE_SIZE) {
 		b = slob_alloc(c->size, flags, c->align, node);
-	else
+		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_CACHE,
+					  _RET_IP_, b, c->size,
+					  SLOB_UNITS(c->size) * SLOB_UNIT,
+					  flags, node);
+	} else {
 		b = slob_new_page(flags, get_order(c->size), node);
+		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_CACHE,
+					  _RET_IP_, b, c->size,
+					  PAGE_SIZE << get_order(c->size),
+					  flags, node);
+	}
 
 	if (c->ctor)
 		c->ctor(c, b);
@@ -608,6 +631,8 @@ void kmem_cache_free(struct kmem_cache *c, void *b)
 	} else {
 		__kmem_cache_free(b, c->size);
 	}
+
+	kmemtrace_mark_free(KMEMTRACE_TYPE_CACHE, _RET_IP_, b);
 }
 EXPORT_SYMBOL(kmem_cache_free);
 
-- 
1.5.6.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 5/5] kmemtrace: SLOB hooks.
  2008-08-10 17:14         ` [PATCH 5/5] kmemtrace: SLOB hooks Eduard - Gabriel Munteanu
@ 2008-08-10 17:48           ` Pekka Enberg
  2008-08-10 23:18             ` Matt Mackall
  0 siblings, 1 reply; 32+ messages in thread
From: Pekka Enberg @ 2008-08-10 17:48 UTC (permalink / raw)
  To: Eduard - Gabriel Munteanu
  Cc: mathieu.desnoyers, cl, linux-mm, linux-kernel, rdunlap, mpm,
	rostedt, tglx

On Sun, Aug 10, 2008 at 8:14 PM, Eduard - Gabriel Munteanu
<eduard.munteanu@linux360.ro> wrote:
> This adds hooks for the SLOB allocator, to allow tracing with kmemtrace.
>
> We also convert some inline functions to __always_inline to make sure
> _RET_IP_, which expands to __builtin_return_address(0), always works
> as expected.
>
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>

I think Matt acked this already but as you dropped the tags, I'll ask
once more before I merge this.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 5/5] kmemtrace: SLOB hooks.
  2008-08-10 17:48           ` Pekka Enberg
@ 2008-08-10 23:18             ` Matt Mackall
  2008-08-12  6:46               ` Pekka Enberg
  0 siblings, 1 reply; 32+ messages in thread
From: Matt Mackall @ 2008-08-10 23:18 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Eduard - Gabriel Munteanu, mathieu.desnoyers, cl, linux-mm,
	linux-kernel, rdunlap, rostedt, tglx

On Sun, 2008-08-10 at 20:48 +0300, Pekka Enberg wrote:
> On Sun, Aug 10, 2008 at 8:14 PM, Eduard - Gabriel Munteanu
> <eduard.munteanu@linux360.ro> wrote:
> > This adds hooks for the SLOB allocator, to allow tracing with kmemtrace.
> >
> > We also convert some inline functions to __always_inline to make sure
> > _RET_IP_, which expands to __builtin_return_address(0), always works
> > as expected.
> >
> > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> 
> I think Matt acked this already but as you dropped the tags, I'll ask
> once more before I merge this.

Yeah, that's fine.

Acked-by: Matt Mackall <mpm@selenic.com>

-- 
Mathematics is the supreme nostalgia of our time.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-10 17:14       ` [PATCH 4/5] kmemtrace: SLUB hooks Eduard - Gabriel Munteanu
  2008-08-10 17:14         ` [PATCH 5/5] kmemtrace: SLOB hooks Eduard - Gabriel Munteanu
@ 2008-08-11 14:04         ` Christoph Lameter
  2008-08-11 14:09           ` Pekka Enberg
  1 sibling, 1 reply; 32+ messages in thread
From: Christoph Lameter @ 2008-08-11 14:04 UTC (permalink / raw)
  To: Eduard - Gabriel Munteanu
  Cc: penberg, mathieu.desnoyers, linux-mm, linux-kernel, rdunlap, mpm,
	rostedt, tglx

Eduard - Gabriel Munteanu wrote:



>  static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
>  {
> +	void *ret;
> +
>  	if (__builtin_constant_p(size) &&
>  		size <= PAGE_SIZE && !(flags & SLUB_DMA)) {
>  			struct kmem_cache *s = kmalloc_slab(size);
> @@ -239,7 +280,13 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
>  		if (!s)
>  			return ZERO_SIZE_PTR;
>  
> -		return kmem_cache_alloc_node(s, flags, node);
> +		ret = kmem_cache_alloc_node_notrace(s, flags, node);
> +
> +		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC,
> +					  _THIS_IP_, ret,
> +					  size, s->size, flags, node);
> +
> +		return ret;

You could simplify the stuff in slub.h if you would fall back to the uninlined
functions in the case that kmemtrace is enabled. IMHO adding additional inline
code here does grow these function to a size where inlining is not useful anymore.


> diff --git a/mm/slub.c b/mm/slub.c
> index 315c392..940145f 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -23,6 +23,7 @@
>  #include <linux/kallsyms.h>
>  #include <linux/memory.h>
>  #include <linux/math64.h>
> +#include <linux/kmemtrace.h>
>  
>  /*
>   * Lock order:
> @@ -1652,18 +1653,47 @@ static __always_inline void *slab_alloc(struct kmem_cache *s,
>  
>  void *kmem_cache_alloc(struct kmem_cache *s, gfp_t gfpflags)
>  {
> -	return slab_alloc(s, gfpflags, -1, __builtin_return_address(0));
> +	void *ret = slab_alloc(s, gfpflags, -1, __builtin_return_address(0));
> +
> +	kmemtrace_mark_alloc(KMEMTRACE_TYPE_CACHE, _RET_IP_, ret,
> +			     s->objsize, s->size, gfpflags);
> +
> +	return ret;
>  }

_RET_IP == __builtin_return_address(0) right? Put that into a local variable?
At least we need consistent usage within one function. Maybe convert
__builtin_return_address(0) to _RET_IP_ within slub?

>  EXPORT_SYMBOL(kmem_cache_alloc);
>  
> +#ifdef CONFIG_KMEMTRACE
> +void *kmem_cache_alloc_notrace(struct kmem_cache *s, gfp_t gfpflags)
> +{
> +	return slab_alloc(s, gfpflags, -1, __builtin_return_address(0));
> +}
> +EXPORT_SYMBOL(kmem_cache_alloc_notrace);
> +#endif
> +
>  #ifdef CONFIG_NUMA
>  void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags, int node)
>  {
> -	return slab_alloc(s, gfpflags, node, __builtin_return_address(0));
> +	void *ret = slab_alloc(s, gfpflags, node,
> +			       __builtin_return_address(0));
> +
> +	kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_CACHE, _RET_IP_, ret,
> +				  s->objsize, s->size, gfpflags, node);
> +
> +	return ret;

Same here.

>  }
>  EXPORT_SYMBOL(kmem_cache_alloc_node);
>  #endif
>  
> +#ifdef CONFIG_KMEMTRACE
> +void *kmem_cache_alloc_node_notrace(struct kmem_cache *s,
> +				    gfp_t gfpflags,
> +				    int node)
> +{
> +	return slab_alloc(s, gfpflags, node, __builtin_return_address(0));
> +}
> +EXPORT_SYMBOL(kmem_cache_alloc_node_notrace);
> +#endif
> +
>  /*
>   * Slow patch handling. This may still be called frequently since objects
>   * have a longer lifetime than the cpu slabs in most processing loads.
> @@ -1771,6 +1801,8 @@ void kmem_cache_free(struct kmem_cache *s, void *x)
>  	page = virt_to_head_page(x);
>  
>  	slab_free(s, page, x, __builtin_return_address(0));
> +
> +	kmemtrace_mark_free(KMEMTRACE_TYPE_CACHE, _RET_IP_, x);
>  }
>  EXPORT_SYMBOL(kmem_cache_free);

And again.

>  
> @@ -2676,6 +2708,7 @@ static struct kmem_cache *get_slab(size_t size, gfp_t flags)
>  void *__kmalloc(size_t size, gfp_t flags)
>  {
>  	struct kmem_cache *s;
> +	void *ret;
>  
>  	if (unlikely(size > PAGE_SIZE))
>  		return kmalloc_large(size, flags);
> @@ -2685,7 +2718,12 @@ void *__kmalloc(size_t size, gfp_t flags)
>  	if (unlikely(ZERO_OR_NULL_PTR(s)))
>  		return s;
>  
> -	return slab_alloc(s, flags, -1, __builtin_return_address(0));
> +	ret = slab_alloc(s, flags, -1, __builtin_return_address(0));
> +
> +	kmemtrace_mark_alloc(KMEMTRACE_TYPE_KMALLOC, _RET_IP_, ret,
> +			     size, s->size, flags);
> +
> +	return ret;
>  }
>  EXPORT_SYMBOL(__kmalloc);
>  

And again.

;
>  #endif
> @@ -2771,6 +2823,8 @@ void kfree(const void *x)
>  		return;
>  	}
>  	slab_free(page->slab, page, object, __builtin_return_address(0));
> +
> +	kmemtrace_mark_free(KMEMTRACE_TYPE_KMALLOC, _RET_IP_, x);

And another one.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:04         ` [PATCH 4/5] kmemtrace: SLUB hooks Christoph Lameter
@ 2008-08-11 14:09           ` Pekka Enberg
  2008-08-11 14:13             ` Christoph Lameter
  2008-08-12 15:25             ` Eduard - Gabriel Munteanu
  0 siblings, 2 replies; 32+ messages in thread
From: Pekka Enberg @ 2008-08-11 14:09 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eduard - Gabriel Munteanu, mathieu.desnoyers, linux-mm,
	linux-kernel, rdunlap, mpm, rostedt, tglx

On Mon, 2008-08-11 at 09:04 -0500, Christoph Lameter wrote:
> Eduard - Gabriel Munteanu wrote:
> 
> 
> 
> >  static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
> >  {
> > +	void *ret;
> > +
> >  	if (__builtin_constant_p(size) &&
> >  		size <= PAGE_SIZE && !(flags & SLUB_DMA)) {
> >  			struct kmem_cache *s = kmalloc_slab(size);
> > @@ -239,7 +280,13 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
> >  		if (!s)
> >  			return ZERO_SIZE_PTR;
> >  
> > -		return kmem_cache_alloc_node(s, flags, node);
> > +		ret = kmem_cache_alloc_node_notrace(s, flags, node);
> > +
> > +		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC,
> > +					  _THIS_IP_, ret,
> > +					  size, s->size, flags, node);
> > +
> > +		return ret;
> 
> You could simplify the stuff in slub.h if you would fall back to the uninlined
> functions in the case that kmemtrace is enabled. IMHO adding additional inline
> code here does grow these function to a size where inlining is not useful anymore.

So, if CONFIG_KMEMTRACE is enabled, make the inlined version go away
completely? I'm okay with that though I wonder if that means we now take
a performance hit when CONFIG_KMEMTRACE is enabled but tracing is
disabled at run-time...

> > diff --git a/mm/slub.c b/mm/slub.c
> > index 315c392..940145f 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -23,6 +23,7 @@
> >  #include <linux/kallsyms.h>
> >  #include <linux/memory.h>
> >  #include <linux/math64.h>
> > +#include <linux/kmemtrace.h>
> >  
> >  /*
> >   * Lock order:
> > @@ -1652,18 +1653,47 @@ static __always_inline void *slab_alloc(struct kmem_cache *s,
> >  
> >  void *kmem_cache_alloc(struct kmem_cache *s, gfp_t gfpflags)
> >  {
> > -	return slab_alloc(s, gfpflags, -1, __builtin_return_address(0));
> > +	void *ret = slab_alloc(s, gfpflags, -1, __builtin_return_address(0));
> > +
> > +	kmemtrace_mark_alloc(KMEMTRACE_TYPE_CACHE, _RET_IP_, ret,
> > +			     s->objsize, s->size, gfpflags);
> > +
> > +	return ret;
> >  }
> 
> _RET_IP == __builtin_return_address(0) right? Put that into a local variable?
> At least we need consistent usage within one function. Maybe convert
> __builtin_return_address(0) to _RET_IP_ within slub?

I think we should just convert SLUB to use _RET_IP_ everywhere. Eduard,
care to make a patch and send it and rebase this on top of that?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:09           ` Pekka Enberg
@ 2008-08-11 14:13             ` Christoph Lameter
  2008-08-11 14:16               ` Pekka Enberg
  2008-08-11 14:30               ` Steven Rostedt
  2008-08-12 15:25             ` Eduard - Gabriel Munteanu
  1 sibling, 2 replies; 32+ messages in thread
From: Christoph Lameter @ 2008-08-11 14:13 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Eduard - Gabriel Munteanu, mathieu.desnoyers, linux-mm,
	linux-kernel, rdunlap, mpm, rostedt, tglx

Pekka Enberg wrote:
> On Mon, 2008-08-11 at 09:04 -0500, Christoph Lameter wrote:
>> Eduard - Gabriel Munteanu wrote:
>>
>>
>>
>>>  static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
>>>  {
>>> +	void *ret;
>>> +
>>>  	if (__builtin_constant_p(size) &&
>>>  		size <= PAGE_SIZE && !(flags & SLUB_DMA)) {
>>>  			struct kmem_cache *s = kmalloc_slab(size);
>>> @@ -239,7 +280,13 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
>>>  		if (!s)
>>>  			return ZERO_SIZE_PTR;
>>>  
>>> -		return kmem_cache_alloc_node(s, flags, node);
>>> +		ret = kmem_cache_alloc_node_notrace(s, flags, node);
>>> +
>>> +		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC,
>>> +					  _THIS_IP_, ret,
>>> +					  size, s->size, flags, node);
>>> +
>>> +		return ret;
>> You could simplify the stuff in slub.h if you would fall back to the uninlined
>> functions in the case that kmemtrace is enabled. IMHO adding additional inline
>> code here does grow these function to a size where inlining is not useful anymore.
> 
> So, if CONFIG_KMEMTRACE is enabled, make the inlined version go away
> completely? I'm okay with that though I wonder if that means we now take
> a performance hit when CONFIG_KMEMTRACE is enabled but tracing is
> disabled at run-time...

We already take a performance hit because of the additional function calls.

With the above approach the kernel binary will grow significantly because you
are now inserting an additional function call at all call sites.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:13             ` Christoph Lameter
@ 2008-08-11 14:16               ` Pekka Enberg
  2008-08-11 14:21                 ` Christoph Lameter
  2008-08-11 14:30               ` Steven Rostedt
  1 sibling, 1 reply; 32+ messages in thread
From: Pekka Enberg @ 2008-08-11 14:16 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eduard - Gabriel Munteanu, mathieu.desnoyers, linux-mm,
	linux-kernel, rdunlap, mpm, rostedt, tglx

On Mon, 2008-08-11 at 09:13 -0500, Christoph Lameter wrote:
> Pekka Enberg wrote:
> > On Mon, 2008-08-11 at 09:04 -0500, Christoph Lameter wrote:
> >> Eduard - Gabriel Munteanu wrote:
> >>
> >>
> >>
> >>>  static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
> >>>  {
> >>> +	void *ret;
> >>> +
> >>>  	if (__builtin_constant_p(size) &&
> >>>  		size <= PAGE_SIZE && !(flags & SLUB_DMA)) {
> >>>  			struct kmem_cache *s = kmalloc_slab(size);
> >>> @@ -239,7 +280,13 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
> >>>  		if (!s)
> >>>  			return ZERO_SIZE_PTR;
> >>>  
> >>> -		return kmem_cache_alloc_node(s, flags, node);
> >>> +		ret = kmem_cache_alloc_node_notrace(s, flags, node);
> >>> +
> >>> +		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC,
> >>> +					  _THIS_IP_, ret,
> >>> +					  size, s->size, flags, node);
> >>> +
> >>> +		return ret;
> >> You could simplify the stuff in slub.h if you would fall back to the uninlined
> >> functions in the case that kmemtrace is enabled. IMHO adding additional inline
> >> code here does grow these function to a size where inlining is not useful anymore.
> > 
> > So, if CONFIG_KMEMTRACE is enabled, make the inlined version go away
> > completely? I'm okay with that though I wonder if that means we now take
> > a performance hit when CONFIG_KMEMTRACE is enabled but tracing is
> > disabled at run-time...
> 
> We already take a performance hit because of the additional function calls.
> 
> With the above approach the kernel binary will grow significantly because you
> are now inserting an additional function call at all call sites.

The function call is supposed to go away when we convert kmemtrace to
use Mathieu's markers but I suppose even then we have a problem with
inlining?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:16               ` Pekka Enberg
@ 2008-08-11 14:21                 ` Christoph Lameter
  2008-08-11 14:22                   ` Pekka Enberg
  2008-08-11 14:36                   ` Steven Rostedt
  0 siblings, 2 replies; 32+ messages in thread
From: Christoph Lameter @ 2008-08-11 14:21 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Eduard - Gabriel Munteanu, mathieu.desnoyers, linux-mm,
	linux-kernel, rdunlap, mpm, rostedt, tglx

Pekka Enberg wrote:

> The function call is supposed to go away when we convert kmemtrace to
> use Mathieu's markers but I suppose even then we have a problem with
> inlining?

The function calls are overwritten with NOPs? Or how does that work?


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:21                 ` Christoph Lameter
@ 2008-08-11 14:22                   ` Pekka Enberg
  2008-08-12 15:29                     ` Eduard - Gabriel Munteanu
  2008-08-11 14:36                   ` Steven Rostedt
  1 sibling, 1 reply; 32+ messages in thread
From: Pekka Enberg @ 2008-08-11 14:22 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eduard - Gabriel Munteanu, mathieu.desnoyers, linux-mm,
	linux-kernel, rdunlap, mpm, rostedt, tglx

On Mon, 2008-08-11 at 09:21 -0500, Christoph Lameter wrote:
> Pekka Enberg wrote:
> 
> > The function call is supposed to go away when we convert kmemtrace to
> > use Mathieu's markers but I suppose even then we have a problem with
> > inlining?
> 
> The function calls are overwritten with NOPs? Or how does that work?

I have no idea. Mathieu, Eduard?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:13             ` Christoph Lameter
  2008-08-11 14:16               ` Pekka Enberg
@ 2008-08-11 14:30               ` Steven Rostedt
  2008-08-11 14:37                 ` Christoph Lameter
  1 sibling, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2008-08-11 14:30 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, Eduard - Gabriel Munteanu, mathieu.desnoyers,
	linux-mm, linux-kernel, rdunlap, mpm, tglx


On Mon, 11 Aug 2008, Christoph Lameter wrote:

> Pekka Enberg wrote:
> > On Mon, 2008-08-11 at 09:04 -0500, Christoph Lameter wrote:
> >> Eduard - Gabriel Munteanu wrote:
> >>
> >>
> >>
> >>>  static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
> >>>  {
> >>> +	void *ret;
> >>> +
> >>>  	if (__builtin_constant_p(size) &&
> >>>  		size <= PAGE_SIZE && !(flags & SLUB_DMA)) {
> >>>  			struct kmem_cache *s = kmalloc_slab(size);
> >>> @@ -239,7 +280,13 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
> >>>  		if (!s)
> >>>  			return ZERO_SIZE_PTR;
> >>>  
> >>> -		return kmem_cache_alloc_node(s, flags, node);
> >>> +		ret = kmem_cache_alloc_node_notrace(s, flags, node);
> >>> +
> >>> +		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC,
> >>> +					  _THIS_IP_, ret,
> >>> +					  size, s->size, flags, node);
> >>> +
> >>> +		return ret;
> >> You could simplify the stuff in slub.h if you would fall back to the uninlined
> >> functions in the case that kmemtrace is enabled. IMHO adding additional inline
> >> code here does grow these function to a size where inlining is not useful anymore.
> > 
> > So, if CONFIG_KMEMTRACE is enabled, make the inlined version go away
> > completely? I'm okay with that though I wonder if that means we now take
> > a performance hit when CONFIG_KMEMTRACE is enabled but tracing is
> > disabled at run-time...
> 
> We already take a performance hit because of the additional function calls.
> 
> With the above approach the kernel binary will grow significantly because you
> are now inserting an additional function call at all call sites.
> 

The kmemtrace_mark_alloc_node itself is an inline function, which calls 
another inline function "trace_mark" which is designed to test a 
read_mostly variable, and will do an "unlikely" jmp if the variable is 
set (which it is when tracing is enabled), to the actual function call.

There should be no extra function calls when this is configured on but 
tracing disabled. We try very hard to keep the speed of the tracer as 
close to a non tracing kernel as possible when tracing is disabled.

-- Steve

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:21                 ` Christoph Lameter
  2008-08-11 14:22                   ` Pekka Enberg
@ 2008-08-11 14:36                   ` Steven Rostedt
  2008-08-11 18:28                     ` Mathieu Desnoyers
  1 sibling, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2008-08-11 14:36 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, Eduard - Gabriel Munteanu, mathieu.desnoyers,
	linux-mm, linux-kernel, rdunlap, mpm, tglx

On Mon, 11 Aug 2008, Christoph Lameter wrote:

> Pekka Enberg wrote:
> 
> > The function call is supposed to go away when we convert kmemtrace to
> > use Mathieu's markers but I suppose even then we have a problem with
> > inlining?
> 
> The function calls are overwritten with NOPs? Or how does that work?

I believe in the latest version they are just a variable test. But when 
Mathieu's immediate code makes it in (which it is in linux-tip), we will 
be overwriting the conditionals with nops (Mathieu, correct me if I'm 
wrong here).

But the calls themselves are done in the unlikely branch. This is 
important, as Mathieu stated in previous thread. The reason is that all 
the stack setup for the function call is also in the unlikely branch, and 
the normal fast path does not take a hit for the function call setup.

-- Steve

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:30               ` Steven Rostedt
@ 2008-08-11 14:37                 ` Christoph Lameter
  2008-08-11 15:34                   ` Frank Ch. Eigler
  2008-08-11 18:29                   ` Mathieu Desnoyers
  0 siblings, 2 replies; 32+ messages in thread
From: Christoph Lameter @ 2008-08-11 14:37 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Pekka Enberg, Eduard - Gabriel Munteanu, mathieu.desnoyers,
	linux-mm, linux-kernel, rdunlap, mpm, tglx

Steven Rostedt wrote:

> The kmemtrace_mark_alloc_node itself is an inline function, which calls 
> another inline function "trace_mark" which is designed to test a 
> read_mostly variable, and will do an "unlikely" jmp if the variable is 
> set (which it is when tracing is enabled), to the actual function call.
> 
> There should be no extra function calls when this is configured on but 
> tracing disabled. We try very hard to keep the speed of the tracer as 
> close to a non tracing kernel as possible when tracing is disabled.

Makes sense. But then we have even more code bloat because of the tests that
are inserted in all call sites of kmalloc.



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:37                 ` Christoph Lameter
@ 2008-08-11 15:34                   ` Frank Ch. Eigler
  2008-08-11 15:48                     ` Christoph Lameter
  2008-08-11 18:29                   ` Mathieu Desnoyers
  1 sibling, 1 reply; 32+ messages in thread
From: Frank Ch. Eigler @ 2008-08-11 15:34 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Steven Rostedt, Pekka Enberg, Eduard - Gabriel Munteanu,
	mathieu.desnoyers, linux-mm, linux-kernel, rdunlap, mpm, tglx

Christoph Lameter <cl@linux-foundation.org> writes:

> [...]
>> There should be no extra function calls when this is configured on but 
>> tracing disabled. We try very hard to keep the speed of the tracer as 
>> close to a non tracing kernel as possible when tracing is disabled.
>
> Makes sense. But then we have even more code bloat because of the
> tests that are inserted in all call sites of kmalloc.

Are you talking about the tests that implement checking whether a
marker is active or not?  Those checks are already efficient, and will
get more so with the "immediate values" optimization in or near the
tree.

- FChE

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 15:34                   ` Frank Ch. Eigler
@ 2008-08-11 15:48                     ` Christoph Lameter
  2008-08-11 15:54                       ` Steven Rostedt
  2008-08-11 15:57                       ` Frank Ch. Eigler
  0 siblings, 2 replies; 32+ messages in thread
From: Christoph Lameter @ 2008-08-11 15:48 UTC (permalink / raw)
  To: Frank Ch. Eigler
  Cc: Steven Rostedt, Pekka Enberg, Eduard - Gabriel Munteanu,
	mathieu.desnoyers, linux-mm, linux-kernel, rdunlap, mpm, tglx

Frank Ch. Eigler wrote:
> Christoph Lameter <cl@linux-foundation.org> writes:
> 
>> [...]
>>> There should be no extra function calls when this is configured on but 
>>> tracing disabled. We try very hard to keep the speed of the tracer as 
>>> close to a non tracing kernel as possible when tracing is disabled.
>> Makes sense. But then we have even more code bloat because of the
>> tests that are inserted in all call sites of kmalloc.
> 
> Are you talking about the tests that implement checking whether a
> marker is active or not?  Those checks are already efficient, and will
> get more so with the "immediate values" optimization in or near the
> tree.

AFAICT: Each test also adds an out of line call to the tracing facility.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 15:48                     ` Christoph Lameter
@ 2008-08-11 15:54                       ` Steven Rostedt
  2008-08-11 15:57                       ` Frank Ch. Eigler
  1 sibling, 0 replies; 32+ messages in thread
From: Steven Rostedt @ 2008-08-11 15:54 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Frank Ch. Eigler, Pekka Enberg, Eduard - Gabriel Munteanu,
	mathieu.desnoyers, linux-mm, linux-kernel, rdunlap, mpm, tglx

On Mon, 11 Aug 2008, Christoph Lameter wrote:

> Frank Ch. Eigler wrote:
> > Christoph Lameter <cl@linux-foundation.org> writes:
> > 
> >> [...]
> >>> There should be no extra function calls when this is configured on but 
> >>> tracing disabled. We try very hard to keep the speed of the tracer as 
> >>> close to a non tracing kernel as possible when tracing is disabled.
> >> Makes sense. But then we have even more code bloat because of the
> >> tests that are inserted in all call sites of kmalloc.
> > 
> > Are you talking about the tests that implement checking whether a
> > marker is active or not?  Those checks are already efficient, and will
> > get more so with the "immediate values" optimization in or near the
> > tree.
> 
> AFAICT: Each test also adds an out of line call to the tracing facility.
> 

Frank,

Christoph is correct. He's not bringing up the issue of efficiency, but 
the issue of bloat.

The marker code will be added to everyplace that calls kmalloc. Which can 
be quite a lot.  I'd be interested in seeing the size of the .text section 
with and without this patch added and makers configure in.

-- Steve

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 15:48                     ` Christoph Lameter
  2008-08-11 15:54                       ` Steven Rostedt
@ 2008-08-11 15:57                       ` Frank Ch. Eigler
  1 sibling, 0 replies; 32+ messages in thread
From: Frank Ch. Eigler @ 2008-08-11 15:57 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Steven Rostedt, Pekka Enberg, Eduard - Gabriel Munteanu,
	mathieu.desnoyers, linux-mm, linux-kernel, rdunlap, mpm, tglx

Hi -

On Mon, Aug 11, 2008 at 10:48:29AM -0500, Christoph Lameter wrote:
> [...]
> AFAICT: Each test also adds an out of line call to the tracing facility.

Yes, but that call is normally placed out of the cache-hot path with unlikely().

- FChE

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:36                   ` Steven Rostedt
@ 2008-08-11 18:28                     ` Mathieu Desnoyers
  0 siblings, 0 replies; 32+ messages in thread
From: Mathieu Desnoyers @ 2008-08-11 18:28 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Christoph Lameter, Pekka Enberg, Eduard - Gabriel Munteanu,
	linux-mm, linux-kernel, rdunlap, mpm, tglx

* Steven Rostedt (rostedt@goodmis.org) wrote:
> 
> On Mon, 11 Aug 2008, Christoph Lameter wrote:
> 
> > Pekka Enberg wrote:
> > 
> > > The function call is supposed to go away when we convert kmemtrace to
> > > use Mathieu's markers but I suppose even then we have a problem with
> > > inlining?
> > 
> > The function calls are overwritten with NOPs? Or how does that work?
> 
> I believe in the latest version they are just a variable test. But when 
> Mathieu's immediate code makes it in (which it is in linux-tip), we will 
> be overwriting the conditionals with nops (Mathieu, correct me if I'm 
> wrong here).
> 

The current immediate values in tip does a load immediate, test, branch,
which removes the cost of the memory load. We will try to get gcc
support to be able to declare patchable static jump sites, which could
be patched with NOPs when disabled. But that will probably not happen
"now".

Mathieu

> But the calls themselves are done in the unlikely branch. This is 
> important, as Mathieu stated in previous thread. The reason is that all 
> the stack setup for the function call is also in the unlikely branch, and 
> the normal fast path does not take a hit for the function call setup.
> 
> -- Steve
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:37                 ` Christoph Lameter
  2008-08-11 15:34                   ` Frank Ch. Eigler
@ 2008-08-11 18:29                   ` Mathieu Desnoyers
  1 sibling, 0 replies; 32+ messages in thread
From: Mathieu Desnoyers @ 2008-08-11 18:29 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Steven Rostedt, Pekka Enberg, Eduard - Gabriel Munteanu,
	linux-mm, linux-kernel, rdunlap, mpm, tglx

* Christoph Lameter (cl@linux-foundation.org) wrote:
> Steven Rostedt wrote:
> 
> > The kmemtrace_mark_alloc_node itself is an inline function, which calls 
> > another inline function "trace_mark" which is designed to test a 
> > read_mostly variable, and will do an "unlikely" jmp if the variable is 
> > set (which it is when tracing is enabled), to the actual function call.
> > 
> > There should be no extra function calls when this is configured on but 
> > tracing disabled. We try very hard to keep the speed of the tracer as 
> > close to a non tracing kernel as possible when tracing is disabled.
> 
> Makes sense. But then we have even more code bloat because of the tests that
> are inserted in all call sites of kmalloc.
> 

The long-term goal is to turn the tests into NOPs, but only once we get
gcc support.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/5] kmemtrace: Core implementation.
  2008-08-10 17:14 ` [PATCH 1/5] kmemtrace: Core implementation Eduard - Gabriel Munteanu
  2008-08-10 17:14   ` [PATCH 2/5] kmemtrace: Additional documentation Eduard - Gabriel Munteanu
@ 2008-08-12  6:46   ` Pekka Enberg
  1 sibling, 0 replies; 32+ messages in thread
From: Pekka Enberg @ 2008-08-12  6:46 UTC (permalink / raw)
  To: Eduard - Gabriel Munteanu
  Cc: mathieu.desnoyers, cl, linux-mm, linux-kernel, rdunlap, mpm,
	rostedt, tglx

Eduard - Gabriel Munteanu wrote:
> kmemtrace provides tracing for slab allocator functions, such as kmalloc,
> kfree, kmem_cache_alloc, kmem_cache_free etc.. Collected data is then fed
> to the userspace application in order to analyse allocation hotspots,
> internal fragmentation and so on, making it possible to see how well an
> allocator performs, as well as debug and profile kernel code.
> 
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>

Applied, thanks!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/5] kmemtrace: Additional documentation.
  2008-08-10 17:14   ` [PATCH 2/5] kmemtrace: Additional documentation Eduard - Gabriel Munteanu
  2008-08-10 17:14     ` [PATCH 3/5] kmemtrace: SLAB hooks Eduard - Gabriel Munteanu
@ 2008-08-12  6:46     ` Pekka Enberg
  2008-08-18 19:57     ` Randy Dunlap
  2 siblings, 0 replies; 32+ messages in thread
From: Pekka Enberg @ 2008-08-12  6:46 UTC (permalink / raw)
  To: Eduard - Gabriel Munteanu
  Cc: mathieu.desnoyers, cl, linux-mm, linux-kernel, rdunlap, mpm,
	rostedt, tglx

Eduard - Gabriel Munteanu wrote:
> Documented kmemtrace's ABI, purpose and design. Also includes a short
> usage guide, FAQ, as well as a link to the userspace application's Git
> repository, which is currently hosted at repo.or.cz.
> 
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>

Applied, thanks!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/5] kmemtrace: SLAB hooks.
  2008-08-10 17:14     ` [PATCH 3/5] kmemtrace: SLAB hooks Eduard - Gabriel Munteanu
  2008-08-10 17:14       ` [PATCH 4/5] kmemtrace: SLUB hooks Eduard - Gabriel Munteanu
@ 2008-08-12  6:46       ` Pekka Enberg
  1 sibling, 0 replies; 32+ messages in thread
From: Pekka Enberg @ 2008-08-12  6:46 UTC (permalink / raw)
  To: Eduard - Gabriel Munteanu
  Cc: mathieu.desnoyers, cl, linux-mm, linux-kernel, rdunlap, mpm,
	rostedt, tglx

Eduard - Gabriel Munteanu wrote:
> This adds hooks for the SLAB allocator, to allow tracing with kmemtrace.
> 
> We also convert some inline functions to __always_inline to make sure
> _RET_IP_, which expands to __builtin_return_address(0), always works
> as expected.
> 
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>

Applied, thanks!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 5/5] kmemtrace: SLOB hooks.
  2008-08-10 23:18             ` Matt Mackall
@ 2008-08-12  6:46               ` Pekka Enberg
  0 siblings, 0 replies; 32+ messages in thread
From: Pekka Enberg @ 2008-08-12  6:46 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Eduard - Gabriel Munteanu, mathieu.desnoyers, cl, linux-mm,
	linux-kernel, rdunlap, rostedt, tglx

Matt Mackall wrote:
> On Sun, 2008-08-10 at 20:48 +0300, Pekka Enberg wrote:
>> On Sun, Aug 10, 2008 at 8:14 PM, Eduard - Gabriel Munteanu
>> <eduard.munteanu@linux360.ro> wrote:
>>> This adds hooks for the SLOB allocator, to allow tracing with kmemtrace.
>>>
>>> We also convert some inline functions to __always_inline to make sure
>>> _RET_IP_, which expands to __builtin_return_address(0), always works
>>> as expected.
>>>
>>> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
>> I think Matt acked this already but as you dropped the tags, I'll ask
>> once more before I merge this.
> 
> Yeah, that's fine.
> 
> Acked-by: Matt Mackall <mpm@selenic.com>

Applied, thanks!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:09           ` Pekka Enberg
  2008-08-11 14:13             ` Christoph Lameter
@ 2008-08-12 15:25             ` Eduard - Gabriel Munteanu
  1 sibling, 0 replies; 32+ messages in thread
From: Eduard - Gabriel Munteanu @ 2008-08-12 15:25 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Christoph Lameter, mathieu.desnoyers, linux-mm, linux-kernel,
	rdunlap, mpm, rostedt, tglx

On Mon, Aug 11, 2008 at 05:09:34PM +0300, Pekka Enberg wrote:
> On Mon, 2008-08-11 at 09:04 -0500, Christoph Lameter wrote:
> > Eduard - Gabriel Munteanu wrote:
> > 
> > 
> > 
> > >  static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
> > >  {
> > > +	void *ret;
> > > +
> > >  	if (__builtin_constant_p(size) &&
> > >  		size <= PAGE_SIZE && !(flags & SLUB_DMA)) {
> > >  			struct kmem_cache *s = kmalloc_slab(size);
> > > @@ -239,7 +280,13 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
> > >  		if (!s)
> > >  			return ZERO_SIZE_PTR;
> > >  
> > > -		return kmem_cache_alloc_node(s, flags, node);
> > > +		ret = kmem_cache_alloc_node_notrace(s, flags, node);
> > > +
> > > +		kmemtrace_mark_alloc_node(KMEMTRACE_TYPE_KMALLOC,
> > > +					  _THIS_IP_, ret,
> > > +					  size, s->size, flags, node);
> > > +
> > > +		return ret;
> > 
> > You could simplify the stuff in slub.h if you would fall back to the uninlined
> > functions in the case that kmemtrace is enabled. IMHO adding additional inline
> > code here does grow these function to a size where inlining is not useful anymore.
> 
> So, if CONFIG_KMEMTRACE is enabled, make the inlined version go away
> completely? I'm okay with that though I wonder if that means we now take
> a performance hit when CONFIG_KMEMTRACE is enabled but tracing is
> disabled at run-time...

Oh, good. I'm also thinking to add a macro that expands to simple inline when
CONFIG_KMEMTRACE is disabled and to __always_inline otherwise.

> > > +	kmemtrace_mark_alloc(KMEMTRACE_TYPE_CACHE, _RET_IP_, ret,
> > > +			     s->objsize, s->size, gfpflags);
> > > +
> > > +	return ret;
> > >  }
> > 
> > _RET_IP == __builtin_return_address(0) right? Put that into a local variable?
> > At least we need consistent usage within one function. Maybe convert
> > __builtin_return_address(0) to _RET_IP_ within slub?
> 
> I think we should just convert SLUB to use _RET_IP_ everywhere. Eduard,
> care to make a patch and send it and rebase this on top of that?

Sure. Will get back soon.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-11 14:22                   ` Pekka Enberg
@ 2008-08-12 15:29                     ` Eduard - Gabriel Munteanu
  2008-08-12 15:43                       ` Mathieu Desnoyers
  2008-08-13  2:09                       ` Matt Mackall
  0 siblings, 2 replies; 32+ messages in thread
From: Eduard - Gabriel Munteanu @ 2008-08-12 15:29 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Christoph Lameter, mathieu.desnoyers, linux-mm, linux-kernel,
	rdunlap, mpm, rostedt, tglx

On Mon, Aug 11, 2008 at 05:22:37PM +0300, Pekka Enberg wrote:
> On Mon, 2008-08-11 at 09:21 -0500, Christoph Lameter wrote:
> > Pekka Enberg wrote:
> > 
> > > The function call is supposed to go away when we convert kmemtrace to
> > > use Mathieu's markers but I suppose even then we have a problem with
> > > inlining?
> > 
> > The function calls are overwritten with NOPs? Or how does that work?
> 
> I have no idea. Mathieu, Eduard?

Yes, the code is patched at runtime. But AFAIK markers already provide
this stuff (called "immediate values"). Mathieu's tracepoints also do
it. But it's not available on all arches. x86 and x86-64 work as far as
I remember.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-12 15:29                     ` Eduard - Gabriel Munteanu
@ 2008-08-12 15:43                       ` Mathieu Desnoyers
  2008-08-13  2:09                       ` Matt Mackall
  1 sibling, 0 replies; 32+ messages in thread
From: Mathieu Desnoyers @ 2008-08-12 15:43 UTC (permalink / raw)
  To: Eduard - Gabriel Munteanu
  Cc: Pekka Enberg, Christoph Lameter, linux-mm, linux-kernel, rdunlap,
	mpm, rostedt, tglx

* Eduard - Gabriel Munteanu (eduard.munteanu@linux360.ro) wrote:
> On Mon, Aug 11, 2008 at 05:22:37PM +0300, Pekka Enberg wrote:
> > On Mon, 2008-08-11 at 09:21 -0500, Christoph Lameter wrote:
> > > Pekka Enberg wrote:
> > > 
> > > > The function call is supposed to go away when we convert kmemtrace to
> > > > use Mathieu's markers but I suppose even then we have a problem with
> > > > inlining?
> > > 
> > > The function calls are overwritten with NOPs? Or how does that work?
> > 
> > I have no idea. Mathieu, Eduard?
> 
> Yes, the code is patched at runtime. But AFAIK markers already provide
> this stuff (called "immediate values"). Mathieu's tracepoints also do
> it. But it's not available on all arches. x86 and x86-64 work as far as
> I remember.
> 

The markers present in mainline kernel does not use immediate values.
However, immediate values in tip does implement a load
immediate/test/branch for x86, x86_64 and powerpc. I also have support
for sparc64 in my lttng tree.

Mathieu


-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] kmemtrace: SLUB hooks.
  2008-08-12 15:29                     ` Eduard - Gabriel Munteanu
  2008-08-12 15:43                       ` Mathieu Desnoyers
@ 2008-08-13  2:09                       ` Matt Mackall
  1 sibling, 0 replies; 32+ messages in thread
From: Matt Mackall @ 2008-08-13  2:09 UTC (permalink / raw)
  To: Eduard - Gabriel Munteanu
  Cc: Pekka Enberg, Christoph Lameter, mathieu.desnoyers, linux-mm,
	linux-kernel, rdunlap, rostedt, tglx

On Tue, 2008-08-12 at 18:29 +0300, Eduard - Gabriel Munteanu wrote:
> On Mon, Aug 11, 2008 at 05:22:37PM +0300, Pekka Enberg wrote:
> > On Mon, 2008-08-11 at 09:21 -0500, Christoph Lameter wrote:
> > > Pekka Enberg wrote:
> > > 
> > > > The function call is supposed to go away when we convert kmemtrace to
> > > > use Mathieu's markers but I suppose even then we have a problem with
> > > > inlining?
> > > 
> > > The function calls are overwritten with NOPs? Or how does that work?
> > 
> > I have no idea. Mathieu, Eduard?
> 
> Yes, the code is patched at runtime. But AFAIK markers already provide
> this stuff (called "immediate values"). Mathieu's tracepoints also do
> it. But it's not available on all arches. x86 and x86-64 work as far as
> I remember.

Did we ever see size(1) numbers for kernels with and without this
support? I'm still a bit worried about adding branches to such a popular
inline. Simply multiplying the branch test by the number of locations is
pretty substantial, never mind the unlikely part of the branch.

-- 
Mathematics is the supreme nostalgia of our time.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/5] kmemtrace: Additional documentation.
  2008-08-10 17:14   ` [PATCH 2/5] kmemtrace: Additional documentation Eduard - Gabriel Munteanu
  2008-08-10 17:14     ` [PATCH 3/5] kmemtrace: SLAB hooks Eduard - Gabriel Munteanu
  2008-08-12  6:46     ` [PATCH 2/5] kmemtrace: Additional documentation Pekka Enberg
@ 2008-08-18 19:57     ` Randy Dunlap
  2 siblings, 0 replies; 32+ messages in thread
From: Randy Dunlap @ 2008-08-18 19:57 UTC (permalink / raw)
  To: Eduard - Gabriel Munteanu
  Cc: penberg, mathieu.desnoyers, cl, linux-mm, linux-kernel, rdunlap,
	mpm, rostedt, tglx

On Sun, 10 Aug 2008 20:14:04 +0300 Eduard - Gabriel Munteanu wrote:

> Documented kmemtrace's ABI, purpose and design. Also includes a short
> usage guide, FAQ, as well as a link to the userspace application's Git
> repository, which is currently hosted at repo.or.cz.
> 
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> ---
>  Documentation/ABI/testing/debugfs-kmemtrace |   71 +++++++++++++++
>  Documentation/vm/kmemtrace.txt              |  126 +++++++++++++++++++++++++++
>  2 files changed, 197 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/ABI/testing/debugfs-kmemtrace
>  create mode 100644 Documentation/vm/kmemtrace.txt
> 
> diff --git a/Documentation/ABI/testing/debugfs-kmemtrace b/Documentation/ABI/testing/debugfs-kmemtrace
> new file mode 100644
> index 0000000..a5ff9a6
> --- /dev/null
> +++ b/Documentation/ABI/testing/debugfs-kmemtrace
> @@ -0,0 +1,71 @@
> +What:		/sys/kernel/debug/kmemtrace/
> +Date:		July 2008
> +Contact:	Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> +Description:
> +
> +In kmemtrace-enabled kernels, the following files are created:
> +
> +/sys/kernel/debug/kmemtrace/
> +	cpu<n>		(0400)	Per-CPU tracing data, see below. (binary)
> +	total_overruns	(0400)	Total number of bytes which were dropped from
> +				cpu<n> files because of full buffer condition,
> +				non-binary. (text)
> +	abi_version	(0400)	Kernel's kmemtrace ABI version. (text)
> +
> +Each per-CPU file should be read according to the relay interface. That is,
> +the reader should set affinity to that specific CPU and, as currently done by
> +the userspace application (though there are other methods), use poll() with
> +an infinite timeout before every read(). Otherwise, erroneous data may be
> +read. The binary data has the following _core_ format:
> +
> +	Event ID	(1 byte)	Unsigned integer, one of:
> +		0 - represents an allocation (KMEMTRACE_EVENT_ALLOC)
> +		1 - represents a freeing of previously allocated memory
> +		    (KMEMTRACE_EVENT_FREE)
> +	Type ID		(1 byte)	Unsigned integer, one of:
> +		0 - this is a kmalloc() / kfree()
> +		1 - this is a kmem_cache_alloc() / kmem_cache_free()
> +		2 - this is a __get_free_pages() et al.
> +	Event size	(2 bytes)	Unsigned integer representing the
> +					size of this event. Used to extend
> +					kmemtrace. Discard the bytes you
> +					don't know about.
> +	Sequence number	(4 bytes)	Signed integer used to reorder data
> +					logged on SMP machines. Wraparound
> +					must be taken into account, although
> +					it is unlikely.
> +	Caller address	(8 bytes)	Return address to the caller.
> +	Pointer to mem	(8 bytes)	Pointer to target memory area. Can be
> +					NULL, but not all such calls might be
> +					recorded.
> +
> +In case of KMEMTRACE_EVENT_ALLOC events, the next fields follow:
> +
> +	Requested bytes	(8 bytes)	Total number of requested bytes,
> +					unsigned, must not be zero.
> +	Allocated bytes (8 bytes)	Total number of actually allocated
> +					bytes, unsigned, must not be lower
> +					than requested bytes.
> +	Requested flags	(4 bytes)	GFP flags supplied by the caller.
> +	Target CPU	(4 bytes)	Signed integer, valid for event id 1.
> +					If equal to -1, target CPU is the same
> +					as origin CPU, but the reverse might
> +					not be true.
> +
> +The data is made available in the same endianness the machine has.
> +
> +Other event ids and type ids may be defined and added. Other fields may be
> +added by increasing event size, but see below for details.
> +Every modification to the ABI, including new id definitions, are followed
> +by bumping the ABI version by one.
> +
> +Adding new data to the packet (features) is done at the end of the mandatory
> +data:
> +	Feature size	(2 byte)
> +	Feature ID	(1 byte)
> +	Feature data	(Feature size - 4 bytes)

Why is this "- 4 bytes"?  Is there an implied alignment byte somewhere in the
features "struct"?  How about making it explicit?


> +
> +
> +Users:
> +	kmemtrace-user - git://repo.or.cz/kmemtrace-user.git
> +
> diff --git a/Documentation/vm/kmemtrace.txt b/Documentation/vm/kmemtrace.txt
> new file mode 100644
> index 0000000..75360b1
> --- /dev/null
> +++ b/Documentation/vm/kmemtrace.txt
> @@ -0,0 +1,126 @@
> +			kmemtrace - Kernel Memory Tracer
> +
> +			  by Eduard - Gabriel Munteanu
> +			     <eduard.munteanu@linux360.ro>
> +
> +I. Introduction
> +===============
> +
> +kmemtrace helps kernel developers figure out two things:
> +1) how different allocators (SLAB, SLUB etc.) perform
> +2) how kernel code allocates memory and how much
> +
> +To do this, we trace every allocation and export information to the userspace
> +through the relay interface. We export things such as the number of requested
> +bytes, the number of bytes actually allocated (i.e. including internal
> +fragmentation), whether this is a slab allocation or a plain kmalloc() and so
> +on.
> +
> +The actual analysis is performed by a userspace tool (see section III for
> +details on where to get it from). It logs the data exported by the kernel,
> +processes it and (as of writing this) can provide the following information:
> +- the total amount of memory allocated and fragmentation per call-site
> +- the amount of memory allocated and fragmentation per allocation
> +- total memory allocated and fragmentation in the collected dataset
> +- number of cross-CPU allocation and frees (makes sense in NUMA environments)
> +
> +Moreover, it can potentially find inconsistent and erroneous behavior in
> +kernel code, such as using slab free functions on kmalloc'ed memory or
> +allocating less memory than requested (but not truly failed allocations).
> +
> +kmemtrace also makes provisions for tracing on some arch and analysing the
> +data on another.
> +
> +II. Design and goals
> +====================
> +
> +kmemtrace was designed to handle rather large amounts of data. Thus, it uses
> +the relay interface to export whatever is logged to userspace, which then
> +stores it. Analysis and reporting is done asynchronously, that is, after the
> +data is collected and stored. By design, it allows one to log and analyse
> +on different machines and different arches.
> +
> +As of writing this, the ABI is not considered stable, though it might not
> +change much. However, no guarantees are made about compatibility yet. When
> +deemed stable, the ABI should still allow easy extension while maintaining
> +backward compatibility. This is described further in Documentation/ABI.
> +
> +Summary of design goals:
> +	- allow logging and analysis to be done across different machines
> +	- be fast and anticipate usage in high-load environments (*)
> +	- be reasonably extensible
> +	- make it possible for GNU/Linux distributions to have kmemtrace
> +	included in their repositories
> +
> +(*) - one of the reasons Pekka Enberg's original userspace data analysis
> +    tool's code was rewritten from Perl to C (although this is more than a
> +    simple conversion)
> +
> +
> +III. Quick usage guide
> +======================
> +
> +1) Get a kernel that supports kmemtrace and build it accordingly (i.e. enable
> +CONFIG_KMEMTRACE and CONFIG_DEFAULT_ENABLED).

                        CONFIG_KMEMTRACE_DEFAULT_ENABLED

> +
> +2) Get the userspace tool and build it:

...


---
~Randy
Linux Plumbers Conference, 17-19 September 2008, Portland, Oregon USA
http://linuxplumbersconf.org/

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2008-08-18 19:57 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-08-10 17:14 [PATCH 0/5] kmemtrace Eduard - Gabriel Munteanu
2008-08-10 17:14 ` [PATCH 1/5] kmemtrace: Core implementation Eduard - Gabriel Munteanu
2008-08-10 17:14   ` [PATCH 2/5] kmemtrace: Additional documentation Eduard - Gabriel Munteanu
2008-08-10 17:14     ` [PATCH 3/5] kmemtrace: SLAB hooks Eduard - Gabriel Munteanu
2008-08-10 17:14       ` [PATCH 4/5] kmemtrace: SLUB hooks Eduard - Gabriel Munteanu
2008-08-10 17:14         ` [PATCH 5/5] kmemtrace: SLOB hooks Eduard - Gabriel Munteanu
2008-08-10 17:48           ` Pekka Enberg
2008-08-10 23:18             ` Matt Mackall
2008-08-12  6:46               ` Pekka Enberg
2008-08-11 14:04         ` [PATCH 4/5] kmemtrace: SLUB hooks Christoph Lameter
2008-08-11 14:09           ` Pekka Enberg
2008-08-11 14:13             ` Christoph Lameter
2008-08-11 14:16               ` Pekka Enberg
2008-08-11 14:21                 ` Christoph Lameter
2008-08-11 14:22                   ` Pekka Enberg
2008-08-12 15:29                     ` Eduard - Gabriel Munteanu
2008-08-12 15:43                       ` Mathieu Desnoyers
2008-08-13  2:09                       ` Matt Mackall
2008-08-11 14:36                   ` Steven Rostedt
2008-08-11 18:28                     ` Mathieu Desnoyers
2008-08-11 14:30               ` Steven Rostedt
2008-08-11 14:37                 ` Christoph Lameter
2008-08-11 15:34                   ` Frank Ch. Eigler
2008-08-11 15:48                     ` Christoph Lameter
2008-08-11 15:54                       ` Steven Rostedt
2008-08-11 15:57                       ` Frank Ch. Eigler
2008-08-11 18:29                   ` Mathieu Desnoyers
2008-08-12 15:25             ` Eduard - Gabriel Munteanu
2008-08-12  6:46       ` [PATCH 3/5] kmemtrace: SLAB hooks Pekka Enberg
2008-08-12  6:46     ` [PATCH 2/5] kmemtrace: Additional documentation Pekka Enberg
2008-08-18 19:57     ` Randy Dunlap
2008-08-12  6:46   ` [PATCH 1/5] kmemtrace: Core implementation Pekka Enberg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox