From: Stefan Roesch <shr@devkernel.io>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: kernel-team@fb.com, linux-mm@kvack.org, riel@surriel.com,
mhocko@suse.com, david@redhat.com,
linux-kselftest@vger.kernel.org, linux-doc@vger.kernel.org,
akpm@linux-foundation.org
Subject: Re: [RFC PATCH v2 00/19] mm: process/cgroup ksm support
Date: Tue, 21 Feb 2023 09:59:59 -0800 [thread overview]
Message-ID: <qvqwttzevqtx.fsf@dev0134.prn3.facebook.com> (raw)
In-Reply-To: <Y/TtBc9DAkUKRHnV@cmpxchg.org>
Johannes Weiner <hannes@cmpxchg.org> writes:
> Hi Stefan,
>
> On Fri, Feb 10, 2023 at 01:50:04PM -0800, Stefan Roesch wrote:
>> So far KSM can only be enabled by calling madvise for memory regions. What is
>> required to enable KSM for more workloads is to enable / disable it at the
>> process / cgroup level.
>>
>> Use case:
>> The madvise call is not available in the programming language. An example for
>> this are programs with forked workloads using a garbage collected language without
>> pointers. In such a language madvise cannot be made available.
>>
>> In addition the addresses of objects get moved around as they are garbage
>> collected. KSM sharing needs to be enabled "from the outside" for these type of
>> workloads.
>
> It would be good to expand on the argument that Rik made about the
> interpreter being used for things were there are no merging
> opportunities, and the KSM scanning overhead isn't amortized.
>
> There is a fundamental mismatch in scopes. madvise() is a
> workload-local decision, whereas sizable sharing opportunities may or
> may not exist across multiple workloads. Only a higher-level entity
> like a job scheduler can know for certain whether it's running one or
> more instances of a job. That job scheduler in turn doesn't have the
> necessary knowledge of the workload's internals to make targeted and
> well-timed advise calls with, say, process_madvise().
>
> This also applies to the security concerns brought up in previous
> threads. An individual workload doesn't know what else is running on
> the machine, so it needs to be highly conservative about what it can
> give up for system-wide merging. However, if the system is dedicated
> to running multiple jobs within the same security domain, it's the job
> scheduler that knows that sharing isn't a problem, and even desirable.
>
> So I think this series makes sense, but it would be good to expand a
> bit on the reasoning and address the security aspect in the cover/doc.
>
These are good points Johannes, I'll elaborate on them with the next
version of the patch.
>> Stefan Roesch (19):
>> mm: add new flag to enable ksm per process
>> mm: add flag to __ksm_enter
>> mm: add flag to __ksm_exit call
>> mm: invoke madvise for all vmas in scan_get_next_rmap_item
>> mm: support disabling of ksm for a process
>> mm: add new prctl option to get and set ksm for a process
>
> The implementation looks sound to me as well.
>
> I think it would be a bit easier to review if you folded these ^^^
> patches, the tools patch below, and the prctl selftests, all into one
> single commit. It's one logical change. This way the new flags and
> helper functions can be reviewed against the new users and callsites
> without having to jump back and forth between emails.
>
I'll fold them in the next version.
>> mm: split off pages_volatile function
>> mm: expose general_profit metric
>> docs: document general_profit sysfs knob
>> mm: calculate ksm process profit metric
>> mm: add ksm_merge_type() function
>> mm: expose ksm process profit metric in ksm_stat
>> mm: expose ksm merge type in ksm_stat
>> docs: document new procfs ksm knobs
>
> Same with the new knobs/stats and their documentation.
>
I'll fold them in the next version.
> Logical splitting is easier to follow than geographical splitting.
>
> Thanks!
prev parent reply other threads:[~2023-02-21 18:03 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-10 21:50 Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 01/19] mm: add new flag to enable ksm per process Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 02/19] mm: add flag to __ksm_enter Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 03/19] mm: add flag to __ksm_exit call Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 04/19] mm: invoke madvise for all vmas in scan_get_next_rmap_item Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 05/19] mm: support disabling of ksm for a process Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 06/19] mm: add new prctl option to get and set " Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 07/19] mm: split off pages_volatile function Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 08/19] mm: expose general_profit metric Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 09/19] docs: document general_profit sysfs knob Stefan Roesch
2023-02-11 8:17 ` Bagas Sanjaya
2023-02-15 23:00 ` Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 10/19] mm: calculate ksm process profit metric Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 11/19] mm: add ksm_merge_type() function Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 12/19] mm: expose ksm process profit metric in ksm_stat Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 13/19] mm: expose ksm merge type " Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 14/19] docs: document new procfs ksm knobs Stefan Roesch
2023-02-11 8:18 ` Bagas Sanjaya
2023-02-10 21:50 ` [RFC PATCH v2 15/19] tools: add new prctl flags to prctl in tools dir Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 16/19] selftests/vm: add KSM prctl merge test Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 17/19] selftests/vm: add KSM get merge type test Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 18/19] selftests/vm: add KSM fork test Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 19/19] selftests/vm: add two functions for debugging merge outcome Stefan Roesch
2023-02-10 23:23 ` [RFC PATCH v2 00/19] mm: process/cgroup ksm support Matthew Wilcox
2023-02-11 2:41 ` Rik van Riel
2023-02-21 16:10 ` Johannes Weiner
2023-02-21 17:59 ` Stefan Roesch [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=qvqwttzevqtx.fsf@dev0134.prn3.facebook.com \
--to=shr@devkernel.io \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=kernel-team@fb.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=riel@surriel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox