linux-mm.kvack.org archive mirror
From: sioh Lee <solee@os.korea.ac.kr>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: akpm@linux-foundation.org, mingo@kernel.org,
	zhongjiang@huawei.com, minchan@kernel.org,
	arvind.yadav.cs@gmail.com, imbrenda@linux.vnet.ibm.com,
	kirill.shutemov@linux.intel.com, linux-mm@kvack.org,
	hxy@os.korea.ac.kr, oslab@os.korea.ac.kr
Subject: Re: [PATCH] mm/ksm : Checksum calculation function change (jhash2 -> crc32)
Date: Wed, 9 Aug 2017 22:17:31 +0900
Message-ID: <df5c8e04-280b-c0eb-2820-eff2dce67582@os.korea.ac.kr>
In-Reply-To: <20170803132350.GI21775@redhat.com>

Hello.
Here are the results of the experiments.
I ran two workloads: a kernel build (CPU-intensive) and the iozone benchmark (I/O-intensive).
For the kernel build, four VMs compile the kernel at the same time.
The iozone test was run the same way, with four VMs in parallel.


The values measured in the experiment are:
1. CoW count, 2. Checksum computation time, 3. pages_unshared, 4. pages_sharing, 5. (pages_unshared / pages_sharing).
Each workload was run twice and the results were averaged.
Checksum computation time, pages_unshared, and pages_sharing were recorded every second,
and the recorded values were averaged at the end of each run.
The CoW count was incremented each time a CoW occurred on a shared page.
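
The once-per-second sampling described above could be reproduced roughly as follows. This is only a minimal sketch, not the script used for these measurements: it assumes the standard KSM sysfs counters under /sys/kernel/mm/ksm, and the function names (read_counter, sample_ksm) are made up for illustration. Checksum time and CoW count need in-kernel instrumentation and are not covered here.

```python
import time
from pathlib import Path

# Standard KSM sysfs directory; pass another path to test against fake files.
KSM_DIR = Path("/sys/kernel/mm/ksm")


def read_counter(name, ksm_dir=KSM_DIR):
    """Read one integer KSM counter, e.g. pages_sharing."""
    return int((Path(ksm_dir) / name).read_text().strip())


def sample_ksm(duration_s, interval_s=1.0, ksm_dir=KSM_DIR, sleep=time.sleep):
    """Sample pages_sharing and pages_unshared once per interval.

    Returns the average of the recorded samples for each counter.
    """
    names = ("pages_sharing", "pages_unshared")
    totals = dict.fromkeys(names, 0)
    samples = max(1, int(duration_s / interval_s))
    for _ in range(samples):
        for name in names:
            totals[name] += read_counter(name, ksm_dir)
        sleep(interval_s)
    return {name: totals[name] / samples for name in names}
```

Calling sample_ksm(600) on the compute node would, under these assumptions, return the two averages for a ten-minute run.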

Experiment environment

Test platform : OpenStack cloud platform (Newton release)
Experiment node : OpenStack-based cloud compute node (CPU: Xeon E5-2650 v3, 2.3 GHz, 10 cores; memory: 64 GB)
VM : (2 vCPUs, 4 GB RAM, 20 GB disk) * 4
Workload : kernel compile (kernel 4.47), iozone (read, write, random read and write for 2 GB)
KSM setup : sleep_millisecs = 200 ms, pages_to_scan = 1600
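
For anyone who wants to reproduce the KSM setup, the two tunables can be written through sysfs. The sketch below assumes the usual /sys/kernel/mm/ksm files (sleep_millisecs, pages_to_scan, run) and root privileges; configure_ksm is a hypothetical helper name, not something used in the experiment itself.

```python
from pathlib import Path


def configure_ksm(sleep_millisecs=200, pages_to_scan=1600,
                  ksm_dir="/sys/kernel/mm/ksm"):
    """Write the KSM tunables and start ksmd (needs root on a real system)."""
    base = Path(ksm_dir)
    # Time ksmd sleeps between scan batches, in milliseconds.
    (base / "sleep_millisecs").write_text(f"{sleep_millisecs}\n")
    # Number of pages scanned per batch.
    (base / "pages_to_scan").write_text(f"{pages_to_scan}\n")
    # Writing 1 to "run" starts the ksmd scanner.
    (base / "run").write_text("1\n")
```

The ksm_dir parameter exists only so the helper can be pointed at a scratch directory when trying it out without root.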

The experimental results are as follows. (All values are truncated to the second decimal place)

kernel build

crc32
CoW count    Checksum time (ns)    pages_sharing    pages_unshared    unshared/sharing
44036.5      903.58                951660.82        265401.54         0.27

jhash2
CoW count    Checksum time (ns)    pages_sharing    pages_unshared    unshared/sharing
46114        4203.33               949578.19        266564.98         0.28

Change relative to jhash2 (I: increase, D: decrease)
CoW count    Checksum time (ns)    pages_sharing    pages_unshared    unshared/sharing
4.5% D       78.5% D               0.2% I           0.4% D            0.64% D

For the kernel build workload, crc32 compared to jhash2 reduced the CoW count by 4.5%, increased pages_sharing by 0.2%,
reduced pages_unshared by 0.4%, reduced checksum computation time by 78.5%, and reduced (pages_unshared / pages_sharing) by 0.64%.
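
The percentages above are plain relative changes against the jhash2 baseline. A small sketch of that arithmetic, using the kernel build averages from the tables (the function name relative_change and the dictionary keys are illustrative only):

```python
def relative_change(new, baseline):
    """Return (percent, direction) of `new` relative to `baseline`.

    direction is "I" for an increase and "D" for a decrease.
    """
    pct = (new - baseline) / baseline * 100.0
    return abs(pct), ("I" if pct >= 0 else "D")


# Kernel build averages copied from the tables above.
crc32 = {"cow": 44036.5, "checksum_ns": 903.58,
         "pages_sharing": 951660.82, "pages_unshared": 265401.54}
jhash2 = {"cow": 46114, "checksum_ns": 4203.33,
          "pages_sharing": 949578.19, "pages_unshared": 266564.98}

changes = {key: relative_change(crc32[key], jhash2[key]) for key in crc32}
for key, (pct, direction) in changes.items():
    print(f"{key}: {pct:.2f}% {direction}")
```

The same function applies to the iozone numbers by swapping in the values from those tables.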

iozone

crc32
CoW count    Checksum time (ns)    pages_sharing    pages_unshared    unshared/sharing
4288702.5    1139.31               1441299.78       117746.22         0.14

jhash2
CoW count    Checksum time (ns)    pages_sharing    pages_unshared    unshared/sharing
4229174      4980.21               1446143.41       116153.12         0.13

Change relative to jhash2 (I: increase, D: decrease)
CoW count    Checksum time (ns)    pages_sharing    pages_unshared    unshared/sharing
1.4% I       77.1% D               0.33% D          1.37% I           1.89% I

For the iozone workload, crc32 compared to jhash2 increased the CoW count by 1.4%, reduced pages_sharing by 0.33%,
increased pages_unshared by 1.37%, reduced checksum computation time by 77.1%, and increased (pages_unshared / pages_sharing) by 1.89%.


In summary, for the CPU-intensive workload crc32 is better than jhash2 on every metric measured: checksum computation time drops by 78.5% and the CoW count drops by 4.5%.
For the I/O-intensive workload, the CoW count increases by only 1.4% while the checksum computation time drops by 77.1%.


 


On 2017-08-03 at 10:23 PM, Andrea Arcangeli wrote:
> On Thu, Aug 03, 2017 at 02:26:27PM +0900, sioh Lee wrote:
>> Thank you very much for reading and responding to my commit.
>> I understand the problem with crc32 you describe.
>> I will investigate. As the first step, I will try to compare the number of CoWs with jhash2 and crc32. And I will send you the experiment results.
> Also the number of KSM merges and ideally in a non simple workload. If
> the hash triggers false positives it's not just that there will be
> more CoWs, but the unstable tree will get more unstable and its
> ability to find equality will decrease. This is why I don't like to
> weaken the hash with a crc and I'd rather prefer to keep a real hash
> there (doesn't need to be a crypto one, but it'd be even better if it
> was).
>
> The hash isn't used to find equality, it's only used to find which
> pages are updated frequently (and if an app overwrites the same value
> over and over, not even a crypto hash would be capable to detect it).
>
> There were attempts to replace the hashing with a dirty bit set in
> hardware in the pagetable in fact, that would be the ideal way, but
> it's quite more complicated that way.
>
> Thanks,
> Andrea
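
The role of the checksum described in the quoted text (detecting pages that change between scans, not proving equality) can be illustrated with a small user-space sketch. This is not the kernel code: RmapItem and page_is_stable are hypothetical names, and zlib.crc32 merely stands in for whichever checksum function KSM uses.

```python
import zlib


class RmapItem:
    """Hypothetical per-page tracking record holding the last checksum seen."""

    def __init__(self):
        self.oldchecksum = None


def page_is_stable(item, page):
    """Return True if the page checksum is unchanged since the previous scan.

    A changed checksum marks the page as volatile for this pass, so it is
    skipped. An unchanged checksum only makes the page a merge candidate;
    equality with another page must still be verified by a full comparison.
    """
    checksum = zlib.crc32(page)
    if item.oldchecksum != checksum:
        item.oldchecksum = checksum  # remember it for the next scan
        return False
    return True
```

As the quoted text notes, a page that is repeatedly overwritten with the same contents looks stable under this check regardless of how strong the checksum is.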


