From: sioh Lee <solee@os.korea.ac.kr>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: akpm@linux-foundation.org, mingo@kernel.org,
zhongjiang@huawei.com, minchan@kernel.org,
arvind.yadav.cs@gmail.com, imbrenda@linux.vnet.ibm.com,
kirill.shutemov@linux.intel.com, linux-mm@kvack.org,
hxy@os.korea.ac.kr, oslab@os.korea.ac.kr
Subject: Re: [PATCH] mm/ksm : Checksum calculation function change (jhash2 -> crc32)
Date: Wed, 9 Aug 2017 22:17:31 +0900
Message-ID: <df5c8e04-280b-c0eb-2820-eff2dce67582@os.korea.ac.kr>
In-Reply-To: <20170803132350.GI21775@redhat.com>
Hello.
Here are the results of the experiments.
I ran the experiment with two workloads.
The first is a kernel build (CPU-intensive) and the second is the iozone benchmark (I/O-intensive).
In the kernel build experiment, four VMs compile the kernel at the same time.
The iozone experiment was run the same way.
The values measured in the experiment are:
1. CoW count, 2. checksum computation time, 3. pages_unshared, 4. pages_sharing, 5. the ratio (pages_unshared / pages_sharing).
Each workload was run twice and the results of the two runs were averaged.
Checksum computation time, pages_unshared, and pages_sharing were recorded every second,
and the recorded values were averaged after the experiment ended.
A CoW was counted every time one occurred on a shared page.
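
The once-per-second sampling of pages_unshared and pages_sharing described above could look roughly like the sketch below. It is only an illustration, not the script used for these measurements: the helper names (read_counter, sample_averages) are made up here, and it assumes the standard KSM sysfs counters under /sys/kernel/mm/ksm. The CoW count and the checksum computation time are not exposed there and need kernel-side instrumentation, which this sketch does not cover.

```python
import time
from pathlib import Path

KSM_DIR = Path("/sys/kernel/mm/ksm")  # standard KSM sysfs directory


def read_counter(name, base=KSM_DIR):
    """Read one integer KSM counter, e.g. pages_sharing."""
    return int((Path(base) / name).read_text().strip())


def sample_averages(names, samples, interval_s=1.0, base=KSM_DIR, sleep=time.sleep):
    """Read each counter `samples` times, `interval_s` apart, and return the mean of each."""
    totals = {name: 0 for name in names}
    for i in range(samples):
        for name in names:
            totals[name] += read_counter(name, base)
        if i + 1 < samples:
            sleep(interval_s)  # wait before taking the next sample
    return {name: total / samples for name, total in totals.items()}


# Hypothetical usage on a host with KSM enabled (60 samples, one per second):
#   avg = sample_averages(["pages_sharing", "pages_unshared"], samples=60)
#   print(avg, avg["pages_unshared"] / avg["pages_sharing"])
```

The `base` and `sleep` parameters exist only so the function can be exercised against a scratch directory without waiting.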
Experiment environment
test platform: OpenStack cloud platform (Newton release)
experiment node: OpenStack-based cloud compute node (CPU: Xeon E5-2650 v3, 2.3 GHz, 10 cores; memory: 64 GB)
VM: 4 x (2 vCPUs, 4 GB RAM, 20 GB disk)
workload: kernel compile (kernel 4.47), iozone (read, write, random read and write for 2GB)
KSM setup: sleep_millisecs = 200 ms, pages_to_scan = 1600
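
For anyone who wants to reproduce the KSM setup, the two tunables above can be written through the KSM sysfs interface. The following is a minimal sketch, not the script used for this experiment: configure_ksm is a name invented here, writing to /sys/kernel/mm/ksm requires root, and writing 1 to "run" (to start ksmd) is an assumption that is not part of the settings listed above.

```python
from pathlib import Path

KSM_DIR = Path("/sys/kernel/mm/ksm")  # standard KSM sysfs directory


def configure_ksm(sleep_millisecs=200, pages_to_scan=1600, base=KSM_DIR):
    """Write the KSM tunables from the experiment setup, then start ksmd."""
    base = Path(base)
    # How long ksmd sleeps between scan batches, in milliseconds.
    (base / "sleep_millisecs").write_text(f"{sleep_millisecs}\n")
    # How many pages ksmd scans in each batch.
    (base / "pages_to_scan").write_text(f"{pages_to_scan}\n")
    # Assumption: 1 starts ksmd; the setup above does not mention this file.
    (base / "run").write_text("1\n")


# Hypothetical usage (as root):
#   configure_ksm()
```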
The experimental results are as follows. (All values are truncated to the second decimal place)
kernel build

          CoW count   Checksum time (ns)   pages_sharing   pages_unshared   unshared/sharing
crc32       44036.5               903.58       951660.82        265401.54               0.27
jhash2        46114              4203.33       949578.19        266564.98               0.28
change       4.5% D              78.5% D          0.2% I           0.4% D            0.64% D

The "change" row is the percentage change of crc32 compared to jhash2 (I: increase, D: decrease).
For the kernel build workload, compared to jhash2, crc32 reduced the CoW count by 4.5%, increased pages_sharing by 0.2%,
reduced pages_unshared by 0.4%, reduced checksum computation time by 78.5%, and reduced (pages_unshared / pages_sharing) by 0.64%.
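
The percentages for the kernel build can be rechecked from the averaged values with a few lines of Python. This is just a sketch of the arithmetic; the helper names (truncate, change_vs_baseline) are invented for illustration, and jhash2 is taken as the baseline.

```python
import math


def truncate(value, decimals=1):
    """Truncate (not round) a value to the given number of decimal places."""
    factor = 10 ** decimals
    return math.trunc(value * factor) / factor


def change_vs_baseline(new, baseline):
    """Return (absolute percentage change, 'I' for increase or 'D' for decrease)."""
    pct = (new - baseline) / baseline * 100.0
    return abs(pct), ("I" if pct > 0 else "D")


# Kernel build averages: (crc32, jhash2)
kernel_build = {
    "CoW count": (44036.5, 46114),
    "Checksum time (ns)": (903.58, 4203.33),
    "pages_sharing": (951660.82, 949578.19),
    "pages_unshared": (265401.54, 266564.98),
}

for name, (crc32_value, jhash2_value) in kernel_build.items():
    pct, direction = change_vs_baseline(crc32_value, jhash2_value)
    print(f"{name}: {truncate(pct)}% {direction}")
```

Truncated to one decimal place, the CoW count and checksum time come out as 4.5% D and 78.5% D.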
iozone

          CoW count   Checksum time (ns)   pages_sharing   pages_unshared   unshared/sharing
crc32     4288702.5              1139.31      1441299.78        117746.22               0.14
jhash2      4229174              4980.21      1446143.41        116153.12               0.13
change       1.4% I              77.1% D         0.33% D          1.37% I            1.89% I

The "change" row is the percentage change of crc32 compared to jhash2 (I: increase, D: decrease).
For the iozone workload, compared to jhash2, crc32 increased the CoW count by 1.4%, reduced pages_sharing by 0.33%,
increased pages_unshared by 1.37%, reduced checksum computation time by 77.1%, and increased (pages_unshared / pages_sharing) by 1.89%.
In summary, for the CPU-intensive workload (kernel build), crc32 improves on jhash2 in every measured value, and checksum computation time drops by 78.5%.
For the I/O-intensive workload (iozone), the CoW count increases by only 1.4%, while checksum computation time drops by 77.1%.
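
To put the checksum times in perspective, here is a rough upper-bound estimate of the checksum cost of one ksmd scan batch. It is a sketch under two assumptions that are not stated in the measurements: that the reported time is per checksummed page, and that every one of the pages_to_scan = 1600 pages in a batch is checksummed (in practice KSM skips the checksum for some pages, so the real cost is lower). The function name is made up for illustration.

```python
PAGES_TO_SCAN = 1600  # pages per ksmd batch, from the KSM setup


def batch_checksum_ms(ns_per_page, pages=PAGES_TO_SCAN):
    """Upper bound on checksum time per scan batch, in milliseconds."""
    return ns_per_page * pages / 1e6  # 1 ms = 1e6 ns


# Average checksum times measured in the kernel build experiment.
for name, ns_per_page in (("crc32", 903.58), ("jhash2", 4203.33)):
    print(f"{name}: at most {batch_checksum_ms(ns_per_page):.2f} ms per batch")
```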
On 2017-08-03 at 10:23 PM, Andrea Arcangeli wrote:
> On Thu, Aug 03, 2017 at 02:26:27PM +0900, sioh Lee wrote:
>> Thank you very much for reading and responding to my commit.
>> I understand the problem with crc32 you describe.
>> I will investigate. As the first step, I will try to compare the number of CoWs with jhash2 and crc32. And I will send you the experiment results.
> Also the number of KSM merges and ideally in a non simple workload. If
> the hash triggers false positives it's not just that there will be
> more CoWs, but the unstable tree will get more unstable and its
> ability to find equality will decrease. This is why I don't like to
> weaken the hash with a crc and I'd rather prefer to keep a real hash
> there (doesn't need to be a crypto one, but it'd be even better if it
> was).
>
> The hash isn't used to find equality, it's only used to find which
> pages are updated frequently (and if an app overwrites the same value
> over and over, not even a crypto hash would be capable to detect it).
>
> There were attempts to replace the hashing with a dirty bit set in
> hardware in the pagetable in fact, that would be the ideal way, but
> it's quite more complicated that way.
>
> Thanks,
> Andrea
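
The role of the checksum described in the quoted text (detecting pages that change between scans, not deciding whether two pages are equal) can be sketched in a few lines. This is a user-space illustration only, not the kernel code: PageTracker and its method are hypothetical names, and zlib.crc32 stands in for whatever checksum function the kernel uses.

```python
import zlib


class PageTracker:
    """Remember the last checksum seen for each page to spot frequently updated pages."""

    def __init__(self):
        self.old_checksum = {}  # page id -> checksum from the previous scan

    def unchanged_since_last_scan(self, page_id, data):
        """Return True only if the page content hashes the same as on the last scan."""
        checksum = zlib.crc32(data)
        if self.old_checksum.get(page_id) != checksum:
            # Content changed (or first visit): remember the new value and
            # treat the page as volatile for this scan.
            self.old_checksum[page_id] = checksum
            return False
        # Same checksum twice in a row: candidate for merging. Equality with
        # another page must still be verified by a full content comparison.
        return True


tracker = PageTracker()
page = bytes(4096)
print(tracker.unchanged_since_last_scan(1, page))  # first visit
print(tracker.unchanged_since_last_scan(1, page))  # same content again
```

As the quoted text notes, a page that is rewritten with the same value between scans looks unchanged to any checksum, however strong.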