linux-mm.kvack.org archive mirror
* [PATCH v2 0/2] KSM: Replace jhash2 with xxhash
@ 2017-09-21 23:18 Timofey Titovets
  2017-09-21 23:18 ` [PATCH v2 1/2] xxHash: create arch dependent 32/64-bit xxhash() Timofey Titovets
  2017-09-21 23:18 ` [PATCH v2 2/2] KSM: Replace jhash2 with xxhash Timofey Titovets
  0 siblings, 2 replies; 6+ messages in thread
From: Timofey Titovets @ 2017-09-21 23:18 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, Timofey Titovets

KSM uses jhash2 for hashing pages;
in 4.14, xxHash was merged into the mainline kernel.

xxHash is much faster than jhash2 on large inputs (32 bytes and up).

xxHash comes in two versions, one producing a 32-bit hash
and one producing a 64-bit hash.

The 64-bit version is faster than the 32-bit one on 64-bit arches.

So let's get the best of both worlds:
create an arch-dependent xxhash() function that uses
the fastest algorithm for the current arch.
That is the first patch.

Performance numbers and the KSM update can be found in the second patch.

Changelog:
  v1 -> v2:
    - Move xxhash() to xxhash.h/c and separate patches

Timofey Titovets (2):
  xxHash: create arch dependent 32/64-bit xxhash()
  KSM: Replace jhash2 with xxhash

 include/linux/xxhash.h | 24 ++++++++++++++++++++++++
 lib/xxhash.c           | 10 ++++++++++
 mm/Kconfig             |  1 +
 mm/ksm.c               | 14 +++++++-------
 4 files changed, 42 insertions(+), 7 deletions(-)

--
2.14.1


* [PATCH v2 1/2] xxHash: create arch dependent 32/64-bit xxhash()
  2017-09-21 23:18 [PATCH v2 0/2] KSM: Replace jhash2 with xxhash Timofey Titovets
@ 2017-09-21 23:18 ` Timofey Titovets
  2017-09-25 14:59   ` Matthew Wilcox
  2017-09-21 23:18 ` [PATCH v2 2/2] KSM: Replace jhash2 with xxhash Timofey Titovets
  1 sibling, 1 reply; 6+ messages in thread
From: Timofey Titovets @ 2017-09-21 23:18 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, Timofey Titovets

xxh32() - fast on both 32-bit and 64-bit platforms
xxh64() - fast only on 64-bit platforms

Create xxhash(), which picks the fastest version
at compile time.

Add an xxhash_t type that matches the hash size.

As the result depends on the CPU word size,
the main purpose of this function is in-memory hashing.

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Linux-kernel <linux-kernel@vger.kernel.org>
---
 include/linux/xxhash.h | 24 ++++++++++++++++++++++++
 lib/xxhash.c           | 10 ++++++++++
 2 files changed, 34 insertions(+)

diff --git a/include/linux/xxhash.h b/include/linux/xxhash.h
index 9e1f42cb57e9..195a0ae10e9b 100644
--- a/include/linux/xxhash.h
+++ b/include/linux/xxhash.h
@@ -76,6 +76,7 @@
 #define XXHASH_H

 #include <linux/types.h>
+#include <linux/bitops.h> /* BITS_PER_LONG */

 /*-****************************
  * Simple Hash Functions
@@ -107,6 +108,29 @@ uint32_t xxh32(const void *input, size_t length, uint32_t seed);
  */
 uint64_t xxh64(const void *input, size_t length, uint64_t seed);

+#if BITS_PER_LONG == 64
+typedef	u64	xxhash_t;
+#else
+typedef	u32	xxhash_t;
+#endif
+
+/**
+ * xxhash() - calculate 32/64-bit hash based on cpu word size
+ *
+ * @input:  The data to hash.
+ * @length: The length of the data to hash.
+ * @seed:   The seed can be used to alter the result predictably.
+ *
+ * This function always works as xxh32() on 32-bit systems
+ * and as xxh64() on 64-bit systems.
+ * Because the result depends on the cpu word size,
+ * the main purpose of this function is in-memory hashing.
+ *
+ * Return:  32/64-bit hash of the data.
+ */
+
+xxhash_t xxhash(const void *input, size_t length, uint64_t seed);
+
 /*-****************************
  * Streaming Hash Functions
  *****************************/
diff --git a/lib/xxhash.c b/lib/xxhash.c
index aa61e2a3802f..7dd1105fcc30 100644
--- a/lib/xxhash.c
+++ b/lib/xxhash.c
@@ -236,6 +236,16 @@ uint64_t xxh64(const void *input, const size_t len, const uint64_t seed)
 }
 EXPORT_SYMBOL(xxh64);

+xxhash_t xxhash(const void *input, size_t length, uint64_t seed)
+{
+#if BITS_PER_LONG == 64
+	return xxh64(input, length, seed);
+#else
+	return xxh32(input, length, seed);
+#endif
+}
+EXPORT_SYMBOL(xxhash);
+
 /*-**************************************************
  * Advanced Hash Functions
  ***************************************************/
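
For a caller that does not need the hash to be stable across arches,
usage then looks like this (an illustrative sketch, not part of the
patch; it mirrors what patch 2 does in KSM):

#include <linux/mm.h>
#include <linux/xxhash.h>

/* checksum one page; the result width follows the arch word size */
static xxhash_t page_checksum(const void *addr)
{
	return xxhash(addr, PAGE_SIZE, 0);
}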
--
2.14.1


* [PATCH v2 2/2] KSM: Replace jhash2 with xxhash
  2017-09-21 23:18 [PATCH v2 0/2] KSM: Replace jhash2 with xxhash Timofey Titovets
  2017-09-21 23:18 ` [PATCH v2 1/2] xxHash: create arch dependent 32/64-bit xxhash() Timofey Titovets
@ 2017-09-21 23:18 ` Timofey Titovets
  1 sibling, 0 replies; 6+ messages in thread
From: Timofey Titovets @ 2017-09-21 23:18 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, Timofey Titovets

jhash2 is used to checksum in-memory pages
in order to detect whether a page has changed.

xxHash is much faster than jhash2; some test results:
  x86_64 host:
    CPU: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
    PAGE_SIZE: 4096, loop count: 1048576
    jhash2:   0xacbc7a5b            time: 1907 ms,  th:  2251.9 MiB/s
    xxhash32: 0x570da981            time: 739 ms,   th:  5809.4 MiB/s
    xxhash64: 0xa1fa032ab85bbb62    time: 371 ms,   th: 11556.6 MiB/s

    CPU: Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz
    PAGE_SIZE: 4096, loop count: 1048576
    jhash2:   0xe680b382            time: 3722 ms,  th:  1153.9 MiB/s
    xxhash32: 0x56d00be4            time: 1183 ms,  th:  3629.1 MiB/s
    xxhash64: 0x8c194cff29cc4dee    time: 725 ms,   th:  5918.0 MiB/s

xxhash64 on x86_32 runs at roughly the same speed as jhash2.
xxhash32 on x86_32 runs at roughly the same speed as on x86_64.
jhash2 is faster than xxHash on inputs smaller than 32 bytes.
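
The harness behind these numbers was not posted with the patch; as a
rough sketch, a kernel-side loop of the following shape could produce
them (the function name and the report format here are illustrative
assumptions; the xxh32()/xxh64() variants are analogous):

#include <linux/jhash.h>
#include <linux/ktime.h>
#include <linux/mm.h>
#include <linux/printk.h>

/* Hash one page 'loops' times and report the elapsed time. */
static void bench_jhash2(const void *page, unsigned long loops)
{
	ktime_t start = ktime_get();
	u32 hash = 0;
	unsigned long i;

	for (i = 0; i < loops; i++)
		hash ^= jhash2(page, PAGE_SIZE / 4, 17);

	pr_info("jhash2:   0x%x time: %lld ms\n",
		hash, ktime_ms_delta(ktime_get(), start));
}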

So use xxhash(), which picks the appropriate hash version
for the target arch.

I did some benchmarks (ksmd CPU load as reported by htop):
  CPU: Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz
  ksm: sleep_millisecs = 1
    jhash2:   ~18%
    xxhash64: ~11%
  ksm: sleep_millisecs = 20 (the default)
    jhash2:   ~4.7%
    xxhash64: ~3.3%

  - 11 / 18   ~= 0.6 -> ~40% less CPU time
  - 3.3 / 4.7 ~= 0.7 -> ~30% less CPU time
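
Such a workload can be reproduced with a simple MERGEABLE mapping; a
hypothetical userspace load generator (assuming CONFIG_KSM is enabled
and /sys/kernel/mm/ksm/run is set to 1):

#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t len = 512UL << 20;	/* 512 MiB of identical pages */
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return 1;
	memset(p, 0x5a, len);			/* make every page identical */
	if (madvise(p, len, MADV_MERGEABLE))	/* hand the range to KSM */
		return 1;
	pause();	/* keep it mapped while ksmd scans; watch htop */
	return 0;
}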

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
---
 mm/Kconfig |  1 +
 mm/ksm.c   | 14 +++++++-------
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index 9c4bdddd80c2..252ab266ac23 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -305,6 +305,7 @@ config MMU_NOTIFIER
 config KSM
 	bool "Enable KSM for page merging"
 	depends on MMU
+	select XXHASH
 	help
 	  Enable Kernel Samepage Merging: KSM periodically scans those areas
 	  of an application's address space that an app has advised may be
diff --git a/mm/ksm.c b/mm/ksm.c
index 15dd7415f7b3..6527fe21aaa3 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -25,7 +25,7 @@
 #include <linux/pagemap.h>
 #include <linux/rmap.h>
 #include <linux/spinlock.h>
-#include <linux/jhash.h>
+#include <linux/xxhash.h>
 #include <linux/delay.h>
 #include <linux/kthread.h>
 #include <linux/wait.h>
@@ -186,7 +186,7 @@ struct rmap_item {
 	};
 	struct mm_struct *mm;
 	unsigned long address;		/* + low bits used for flags below */
-	unsigned int oldchecksum;	/* when unstable */
+	xxhash_t oldchecksum;		/* when unstable */
 	union {
 		struct rb_node node;	/* when node of unstable tree */
 		struct {		/* when listed from stable tree */
@@ -255,7 +255,7 @@ static unsigned int ksm_thread_pages_to_scan = 100;
 static unsigned int ksm_thread_sleep_millisecs = 20;

 /* Checksum of an empty (zeroed) page */
-static unsigned int zero_checksum __read_mostly;
+static xxhash_t zero_checksum __read_mostly;

 /* Whether to merge empty (zeroed) pages with actual zero pages */
 static bool ksm_use_zero_pages __read_mostly;
@@ -982,11 +982,11 @@ static int unmerge_and_remove_all_rmap_items(void)
 }
 #endif /* CONFIG_SYSFS */

-static u32 calc_checksum(struct page *page)
+static xxhash_t calc_checksum(struct page *page)
 {
-	u32 checksum;
+	xxhash_t checksum;
 	void *addr = kmap_atomic(page);
-	checksum = jhash2(addr, PAGE_SIZE / 4, 17);
+	checksum = xxhash(addr, PAGE_SIZE, 0);
 	kunmap_atomic(addr);
 	return checksum;
 }
@@ -1994,7 +1994,7 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
 	struct page *tree_page = NULL;
 	struct stable_node *stable_node;
 	struct page *kpage;
-	unsigned int checksum;
+	xxhash_t checksum;
 	int err;
 	bool max_page_sharing_bypass = false;

--
2.14.1


* Re: [PATCH v2 1/2] xxHash: create arch dependent 32/64-bit xxhash()
  2017-09-21 23:18 ` [PATCH v2 1/2] xxHash: create arch dependent 32/64-bit xxhash() Timofey Titovets
@ 2017-09-25 14:59   ` Matthew Wilcox
  2017-09-25 16:17     ` Timofey Titovets
  0 siblings, 1 reply; 6+ messages in thread
From: Matthew Wilcox @ 2017-09-25 14:59 UTC (permalink / raw)
  To: Timofey Titovets; +Cc: linux-mm, linux-kernel

On Fri, Sep 22, 2017 at 02:18:17AM +0300, Timofey Titovets wrote:
> diff --git a/include/linux/xxhash.h b/include/linux/xxhash.h
> index 9e1f42cb57e9..195a0ae10e9b 100644
> --- a/include/linux/xxhash.h
> +++ b/include/linux/xxhash.h
> @@ -76,6 +76,7 @@
>  #define XXHASH_H
> 
>  #include <linux/types.h>
> +#include <linux/bitops.h> /* BITS_PER_LONG */
> 
>  /*-****************************
>   * Simple Hash Functions

Huh?  linux/types.h already brings in BITS_PER_LONG.  Look:

linux/types.h
  uapi/linux/types.h
  uapi/asm/types.h
  uapi/asm-generic/types.h
  uapi/asm-generic/int-ll64.h
  asm/bitsperlong.h
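
(A quick way to check, as a hypothetical test not from the thread: a
kernel object that includes only <linux/types.h> still sees the macro.)

#include <linux/types.h>

/* compiles without including bitops.h directly */
unsigned long bits_per_long_probe = BITS_PER_LONG;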

> @@ -107,6 +108,29 @@ uint32_t xxh32(const void *input, size_t length, uint32_t seed);
>   */
>  uint64_t xxh64(const void *input, size_t length, uint64_t seed);
> 
> +#if BITS_PER_LONG == 64
> +typedef	u64	xxhash_t;
> +#else
> +typedef	u32	xxhash_t;
> +#endif

This is a funny way to spell 'unsigned long' ...

> +/**
> + * xxhash() - calculate 32/64-bit hash based on cpu word size
> + *
> + * @input:  The data to hash.
> + * @length: The length of the data to hash.
> + * @seed:   The seed can be used to alter the result predictably.
> + *
> + * This function always works as xxh32() on 32-bit systems
> + * and as xxh64() on 64-bit systems.
> + * Because the result depends on the cpu word size,
> + * the main purpose of this function is in-memory hashing.
> + *
> + * Return:  32/64-bit hash of the data.
> + */
> +

> +xxhash_t xxhash(const void *input, size_t length, uint64_t seed)
> +{
> +#if BITS_PER_LONG == 64
> +	return xxh64(input, length, seed);
> +#else
> +	return xxh32(input, length, seed);
> +#endif
> +}

Let's move that to the header file and make it a static inline.  That way
it doesn't need to be an EXPORT_SYMBOL.

Also, I think the kerneldoc could do with a bit of work.  Try this:

/**
 * xxhash() - calculate wordsize hash of the input with a given seed
 * @input:  The data to hash.
 * @length: The length of the data to hash.
 * @seed:   The seed can be used to alter the result predictably.
 *
 * If the hash does not need to be comparable between machines with
 * different word sizes, this function will call whichever of xxh32()
 * or xxh64() is faster.
 *
 * Return:  wordsize hash of the data.
 */
static inline
unsigned long xxhash(const void *input, size_t length, unsigned long seed)
{
#if BITS_PER_LONG == 64
	return xxh64(input, length, seed);
#else
	return xxh32(input, length, seed);
#endif
}


* Re: [PATCH v2 1/2] xxHash: create arch dependent 32/64-bit xxhash()
  2017-09-25 14:59   ` Matthew Wilcox
@ 2017-09-25 16:17     ` Timofey Titovets
  0 siblings, 0 replies; 6+ messages in thread
From: Timofey Titovets @ 2017-09-25 16:17 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm, Linux Kernel

2017-09-25 17:59 GMT+03:00 Matthew Wilcox <willy@infradead.org>:
> On Fri, Sep 22, 2017 at 02:18:17AM +0300, Timofey Titovets wrote:
>> diff --git a/include/linux/xxhash.h b/include/linux/xxhash.h
>> index 9e1f42cb57e9..195a0ae10e9b 100644
>> --- a/include/linux/xxhash.h
>> +++ b/include/linux/xxhash.h
>> @@ -76,6 +76,7 @@
>>  #define XXHASH_H
>>
>>  #include <linux/types.h>
>> +#include <linux/bitops.h> /* BITS_PER_LONG */
>>
>>  /*-****************************
>>   * Simple Hash Functions
>
> Huh?  linux/types.h already brings in BITS_PER_LONG.  Look:
>
> linux/types.h
>   uapi/linux/types.h
>   uapi/asm/types.h
>   uapi/asm-generic/types.h
>   uapi/asm-generic/int-ll64.h
>   asm/bitsperlong.h

Will fix that, thanks.

>> @@ -107,6 +108,29 @@ uint32_t xxh32(const void *input, size_t length, uint32_t seed);
>>   */
>>  uint64_t xxh64(const void *input, size_t length, uint64_t seed);
>>
>> +#if BITS_PER_LONG == 64
>> +typedef      u64     xxhash_t;
>> +#else
>> +typedef      u32     xxhash_t;
>> +#endif
>
> This is a funny way to spell 'unsigned long' ...

I just wanted strict and obvious types for in-memory hashing,
and it looks nicer to my eye (IMHO).
But I will of course replace it with 'unsigned long' and drop
xxhash_t completely, since you find it unacceptable.

>> +/**
>> + * xxhash() - calculate 32/64-bit hash based on cpu word size
>> + *
>> + * @input:  The data to hash.
>> + * @length: The length of the data to hash.
>> + * @seed:   The seed can be used to alter the result predictably.
>> + *
>> + * This function always works as xxh32() on 32-bit systems
>> + * and as xxh64() on 64-bit systems.
>> + * Because the result depends on the cpu word size,
>> + * the main purpose of this function is in-memory hashing.
>> + *
>> + * Return:  32/64-bit hash of the data.
>> + */
>> +
>
>> +xxhash_t xxhash(const void *input, size_t length, uint64_t seed)
>> +{
>> +#if BITS_PER_LONG == 64
>> +     return xxh64(input, length, seed);
>> +#else
>> +     return xxh32(input, length, seed);
>> +#endif
>> +}
>
> Let's move that to the header file and make it a static inline.  That way
> it doesn't need to be an EXPORT_SYMBOL.

Agreed, thanks.

> Also, I think the kerneldoc could do with a bit of work.  Try this:
>
> /**
>  * xxhash() - calculate wordsize hash of the input with a given seed
>  * @input:  The data to hash.
>  * @length: The length of the data to hash.
>  * @seed:   The seed can be used to alter the result predictably.
>  *
>  * If the hash does not need to be comparable between machines with
>  * different word sizes, this function will call whichever of xxh32()
>  * or xxh64() is faster.
>  *
>  * Return:  wordsize hash of the data.
>  */

Will replace it with your version, thanks.

> static inline
> unsigned long xxhash(const void *input, size_t length, unsigned long seed)
> {
> #if BITS_PER_LONG == 64
>         return xxh64(input, length, seed);
> #else
>         return xxh32(input, length, seed);
> #endif
> }


-- 
Have a nice day,
Timofey.


* [PATCH v2 0/2] KSM: Replace jhash2 with xxhash
@ 2017-09-22  9:52 Timofey Titovets
  0 siblings, 0 replies; 6+ messages in thread
From: Timofey Titovets @ 2017-09-22  9:52 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, borntraeger, kvm, Timofey Titovets

(Resent with kvm@vger.kernel.org added to Cc.)

KSM uses jhash2 for hashing pages;
in 4.14, xxHash was merged into the mainline kernel.

xxHash is much faster than jhash2 on large inputs (32 bytes and up).

xxHash comes in two versions, one producing a 32-bit hash
and one producing a 64-bit hash.

The 64-bit version is faster than the 32-bit one on 64-bit arches.

So let's get the best of both worlds:
create an arch-dependent xxhash() function that uses
the fastest algorithm for the current arch.
That is the first patch.

Performance numbers and the KSM update can be found in the second patch.

Changelog:
  v1 -> v2:
    - Move xxhash() to xxhash.h/c and separate patches

Timofey Titovets (2):
  xxHash: create arch dependent 32/64-bit xxhash()
  KSM: Replace jhash2 with xxhash

 include/linux/xxhash.h | 24 ++++++++++++++++++++++++
 lib/xxhash.c           | 10 ++++++++++
 mm/Kconfig             |  1 +
 mm/ksm.c               | 14 +++++++-------
 4 files changed, 42 insertions(+), 7 deletions(-)

--
2.14.1


Thread overview: 6+ messages
2017-09-21 23:18 [PATCH v2 0/2] KSM: Replace jhash2 with xxhash Timofey Titovets
2017-09-21 23:18 ` [PATCH v2 1/2] xxHash: create arch dependent 32/64-bit xxhash() Timofey Titovets
2017-09-25 14:59   ` Matthew Wilcox
2017-09-25 16:17     ` Timofey Titovets
2017-09-21 23:18 ` [PATCH v2 2/2] KSM: Replace jhash2 with xxhash Timofey Titovets
2017-09-22  9:52 [PATCH v2 0/2] " Timofey Titovets
