linux-mm.kvack.org archive mirror
* Strange finding about kernel samepage merging
@ 2012-02-06 22:44 Jidong Xiao
  2012-02-07  3:35 ` Michael Roth
  0 siblings, 1 reply; 4+ messages in thread
From: Jidong Xiao @ 2012-02-06 22:44 UTC (permalink / raw)
  To: linux-mm; +Cc: virtualization

Hi,

This is a very strange thing I have seen in the Linux kernel. I wrote
a simple program; all it does is load a file into memory. The program
runs in a virtual machine with linux-kvm as the hypervisor. I enabled
KSM at the hypervisor level. The host runs openSUSE 11.4 and the guest
OS is Fedora 14. The strange thing is that whenever I run the following
simple program, the number exported by /sys/kernel/mm/ksm/pages_sharing
increases dramatically. No matter what file I load, the corresponding
pages always get merged.

Here is the simple program:

[root@fedora14 kernel]# cat testmkv.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>     /* sleep() */

/* Load a whole file into a page-aligned buffer.
 * Returns the file size, or a negative value on error. */
int ae_load_file_to_memory(const char *filename, char **result)
{
       long size = 0;
       void *buf = NULL;
       FILE *f = fopen(filename, "rb");

       *result = NULL;
       if (f == NULL)
       {
               return -1; // -1 means file opening fail
       }
       fseek(f, 0, SEEK_END);
       size = ftell(f);
       fseek(f, 0, SEEK_SET);
       // posix_memalign() takes a void ** and returns non-zero on failure
       if (size < 0 || posix_memalign(&buf, 4096, (size_t)size + 1) != 0)
       {
               fclose(f);
               return -3; // -3 means allocation fail
       }
//        buf = malloc(size+1);
       if ((size_t)size != fread(buf, sizeof(char), (size_t)size, f))
       {
               free(buf);
               fclose(f);
               return -2; // -2 means file reading fail
       }
       fclose(f);
       *result = buf;
       (*result)[size] = 0;
       return (int)size;
}

int main()
{
       char *content;
       int size;
       size = ae_load_file_to_memory("test.mkv", &content);
       if (size < 0)
       {
               puts("Error loading file");
               return 1;
       }
       sleep(150); // keep the buffer alive while watching pages_sharing
       free(content);
       return 0;

}

Here is my observation, before I run the program:

jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
14539
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
14539
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
14540
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
14540
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
14540
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
14540
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
14540
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
14540

After I run the program (during the sleep period and after the
program exits):

jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
25526
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
32368
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
35066
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
38010
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
40410
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
43012
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
45562
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
47866
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
50072
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
52314
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
54010
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
54486
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
54655
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
54969
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
54969
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
54969
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
54968
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
54968
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
54968
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
54968
jxiao@yosemite:~> cat /sys/kernel/mm/ksm/pages_sharing
54968

The increase roughly equals the number of pages in the file I loaded,
i.e. test.mkv (file size 158M). I just cannot understand what would
share pages with test.mkv. The file is unique. Moreover, I replaced
test.mkv with many other files, including some Windows-specific files
such as *.exe files, and I still saw the same result. How could that
happen?

If you need more information, just let me know. Thank you.

Regards

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: Strange finding about kernel samepage merging
  2012-02-06 22:44 Strange finding about kernel samepage merging Jidong Xiao
@ 2012-02-07  3:35 ` Michael Roth
  2012-02-07  4:13   ` Jidong Xiao
  0 siblings, 1 reply; 4+ messages in thread
From: Michael Roth @ 2012-02-07  3:35 UTC (permalink / raw)
  To: Jidong Xiao; +Cc: linux-mm, virtualization

My guess is you end up with 2 copies of each page on the guest: the copy in
the guest's page cache, and the copy in the buffer you allocated. From the
perspective of the host this all looks like anonymous memory, so ksm merges
the pages.


* Re: Strange finding about kernel samepage merging
  2012-02-07  3:35 ` Michael Roth
@ 2012-02-07  4:13   ` Jidong Xiao
  2012-02-07  5:46     ` fluxion
  0 siblings, 1 reply; 4+ messages in thread
From: Jidong Xiao @ 2012-02-07  4:13 UTC (permalink / raw)
  To: Michael Roth; +Cc: linux-mm, virtualization

On Mon, Feb 6, 2012 at 10:35 PM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> My guess is you end up with 2 copies of each page on the guest: the copy in
> the guest's page cache, and the copy in the buffer you allocated. From the
> perspective of the host this all looks like anonymous memory, so ksm merges
> the pages.

Yes, the result definitely shows that there are two copies. But I don't
understand why there would be two copies. Does allocating memory in a
guest OS always create two copies of the same memory?

An interesting thing is that if I replace posix_memalign() with
malloc() (see the commented line in the original program), there is
only one copy, i.e., no merging happens. However, I need page-aligned
memory, which is why I use posix_memalign().

Regards
Jidong


* Re: Strange finding about kernel samepage merging
  2012-02-07  4:13   ` Jidong Xiao
@ 2012-02-07  5:46     ` fluxion
  0 siblings, 0 replies; 4+ messages in thread
From: fluxion @ 2012-02-07  5:46 UTC (permalink / raw)
  To: Jidong Xiao; +Cc: virtualization, linux-mm

On Feb 6, 2012 10:14 PM, "Jidong Xiao" <jidong.xiao@gmail.com> wrote:
>
> On Mon, Feb 6, 2012 at 10:35 PM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> > My guess is you end up with 2 copies of each page on the guest: the copy in
> > the guest's page cache, and the copy in the buffer you allocated. From the
> > perspective of the host this all looks like anonymous memory, so ksm merges
> > the pages.
>
> Yes, the result definitely shows that there two copies. But I don't
> understand why there would be two copies. So whenever you allocate
> memory in a guest OS, you will always create two copies of the same
> memory?

Well, not just guests, hosts as well. Most operating systems will, by
default, cache the data read from disks in memory to speed up subsequent
access. In your case you're also creating a copy by allocating a second
buffer and storing the data there as well.

Ksm only merges anonymous pages, not disk/page cache, but since your
guest's pagecache looks like anonymous memory to the host, ksm is able to
merge the dupes.

>
> An interesting thing is, if I replace the posix_memalign() function
> with the malloc() function (See the original program, the commented
> line.) there would be only one copy, i.e., no merging happens,
> however, since I need to have some page-aligned memory, that's why I
> use posix_memalign().

Yup, ksm compares whole pages, so if your buffer isn't page aligned the
data sits at a different offset within each page, and no page of the
buffer is byte-identical to the copy in the guest's page cache. ksm is
then unable to merge them.

>
> Regards
> Jidong
>
