From: Timofey Titovets <nefelim4ag@gmail.com>
To: Oleksandr Natalenko <oleksandr@redhat.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Vlastimil Babka <vbabka@suse.cz>, Michal Hocko <mhocko@suse.com>,
Matthew Wilcox <willy@infradead.org>,
Pavel Tatashin <pasha.tatashin@soleen.com>,
Aaron Tomlin <atomlin@redhat.com>,
linux-mm@kvack.org
Subject: Re: [PATCH RFC 0/4] mm/ksm: add option to automerge VMAs
Date: Mon, 13 May 2019 14:48:29 +0300
Message-ID: <CAGqmi744Vef7iF0tuBO3uBtXbNCKYxBV_c-T_Eg3LKPY0rKcWA@mail.gmail.com>
In-Reply-To: <20190513113314.lddxv4kv5ajjldae@butterfly.localdomain>
On Mon, 13 May 2019 at 14:33, Oleksandr Natalenko <oleksandr@redhat.com> wrote:
>
> Hi.
>
> On Mon, May 13, 2019 at 01:38:43PM +0300, Kirill Tkhai wrote:
> > On 10.05.2019 10:21, Oleksandr Natalenko wrote:
> > > By default, KSM works only on memory that is marked by madvise(). And the
> > > only way to get around that is to either:
> > >
> > > * use LD_PRELOAD; or
> > > * patch the kernel with something like UKSM or PKSM.
> > >
> > > > Instead, let's implement a so-called "always" mode, which allows marking
> > > VMAs as mergeable on do_anonymous_page() call automatically.
> > >
> > > The submission introduces a new sysctl knob as well as kernel cmdline option
> > > to control which mode to use. The default mode is to maintain old
> > > (madvise-based) behaviour.
> > >
> > > Due to security concerns, this submission also introduces VM_UNMERGEABLE
> > > vmaflag for apps to explicitly opt out of automerging. Because of adding
> > > a new vmaflag, the whole work is available for 64-bit architectures only.
> > > >
> > > > This patchset is based on Timofey's earlier submission [1], but it doesn't
> > > use dedicated kthread to walk through the list of tasks/VMAs.
> > >
> > > For my laptop it saves up to 300 MiB of RAM for usual workflow (browser,
> > > terminal, player, chats etc). Timofey's submission also mentions
> > > containerised workload that benefits from automerging too.
> >
> > This whole approach looks complicated to me, and I'm not sure the shown profit
> > for desktop is big enough to introduce contradictory vma flags, a boot option
> > and a more complex page fault handler. Also, 32/64bit defines do not look good
> > to me. I had tried something like this on my laptop some time ago, and
> > the result was bad even in absolute (not in memory percentage) terms.
> > Isn't the LD_PRELOAD trick enough for desktop? Your workload is the same all
> > the time, so you may statically insert the correct preload into /etc/profile
> > and replace your mmap forever.
> >
> > Speaking about containers, something like this may make sense, I think.
> > The probability that several containers have the same pages is higher
> > than that desktop applications have the same pages; also LD_PRELOAD for
> > containers is not applicable.
>
> Yes, I get your point. But the intention is to avoid another hacky trick
> (LD_PRELOAD), thus *something* should *preferably* be done on the
> kernel level instead.
>
> > But 1) this could be done for trusted containers only (are there similar
> > issues with KSM like with hardware side-channel attacks?!);
>
> Regarding side-channel attacks, yes, I think so. Was it the openssl guys
> who complained about it?
>
> > 2) the most
> > shared data for containers in my experience is file cache, which is not
> > supported by KSM.
> >
> > There are good results at link [1], but it's difficult to analyze
> > them without knowing what happens inside the tests.
> >
> > Some of the tests have a "VM" prefix. What is the reason the hypervisor
> > doesn't mark its VMAs as mergeable? Can't this be fixed in the hypervisor?
> > What is the generic reason that VMAs are not marked in all the tests?
>
> Timofey, could you please address this?
That's just a description of the machine, only to show the difference in
deduplication for an application in a small VM versus a real big server,
i.e. KSM is enabled inside the VM for containers, not in the hypervisor.
> Also, just for the sake of another piece of stats here:
>
> $ echo "$(cat /sys/kernel/mm/ksm/pages_sharing) * 4 / 1024" | bc
> 526
IIRC, to calculate the saving you must use (pages_shared - pages_sharing)
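For reference, a small shell sketch for reading both counters. As far as I
can tell, Documentation/admin-guide/mm/ksm.rst describes pages_shared as
"how many shared pages are being used" and pages_sharing as "how many more
sites are sharing them i.e. how much saved", so the sketch reports
pages_sharing converted to MiB as the estimate and prints both raw counters
so the numbers can be cross-checked. The function name ksm_saved_mib and its
optional directory argument are made up for illustration; the page size
comes from getconf instead of assuming 4 KiB.

```shell
#!/bin/sh
# Print the raw KSM counters and an estimate of saved memory in MiB.
# Assumption: the optional first argument (directory holding the counters)
# is only there to make the function easy to test; by default it reads
# the real sysfs location.
ksm_saved_mib() {
    dir="${1:-/sys/kernel/mm/ksm}"
    page_size="$(getconf PAGESIZE)"        # bytes per page, commonly 4096
    shared="$(cat "$dir/pages_shared")"    # deduplicated pages actually kept
    sharing="$(cat "$dir/pages_sharing")"  # additional mappings of those pages
    echo "pages_shared=$shared pages_sharing=$sharing"
    # Integer arithmetic: the result is rounded down to whole MiB.
    echo "saved_mib=$(( sharing * page_size / 1024 / 1024 ))"
}
```

Calling ksm_saved_mib with no argument reads /sys/kernel/mm/ksm, which is
only present on kernels built with CONFIG_KSM.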
> > In case there is a fundamental problem with calling madvise, can't we
> > just implement an easier workaround like a new write-only file:
> >
> > #echo $task > /sys/kernel/mm/ksm/force_madvise
> >
> > which will mark all anon VMAs as mergeable for a passed task's mm?
> >
> > A small userspace daemon may write mergeable tasks there from time to time.
> >
> > Then we won't need to introduce additional vm flags and to change
> > anon pagefault handler, and the changes will be small and only
> > related to mm/ksm.c, and good enough for both 32 and 64 bit machines.
>
> Yup, looks appealing. Two concerns, though:
>
> 1) we are falling back to scanning through the list of tasks (I guess
> this is what we wanted to avoid, although this time it happens in the
> userspace);
>
> 2) what kinds of opt-out should we maintain? Like, what if force_madvise
> is called, but the task doesn't want some VMAs to be merged? This will
> require a new flag anyway, it seems. And should there be another
> write-only file to unmerge everything forcibly for a specific task?
>
> Thanks.
>
> P.S. Cc'ing Pavel properly this time.
>
> --
> Best regards,
> Oleksandr Natalenko (post-factum)
> Senior Software Maintenance Engineer
--
Have a nice day,
Timofey.