Re: [RFC PATCH v2 00/19] mm: process/cgroup ksm support

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Stefan Roesch <shr@devkernel.io>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: kernel-team@fb.com, linux-mm@kvack.org, riel@surriel.com,
	mhocko@suse.com, david@redhat.com,
	linux-kselftest@vger.kernel.org, linux-doc@vger.kernel.org,
	akpm@linux-foundation.org
Subject: Re: [RFC PATCH v2 00/19] mm: process/cgroup ksm support
Date: Tue, 21 Feb 2023 09:59:59 -0800	[thread overview]
Message-ID: <qvqwttzevqtx.fsf@dev0134.prn3.facebook.com> (raw)
In-Reply-To: <Y/TtBc9DAkUKRHnV@cmpxchg.org>


Johannes Weiner <hannes@cmpxchg.org> writes:

> Hi Stefan,
>
> On Fri, Feb 10, 2023 at 01:50:04PM -0800, Stefan Roesch wrote:
>> So far KSM can only be enabled by calling madvise for memory regions. What is
>> required to enable KSM for more workloads is to enable / disable it at the
>> process / cgroup level.
>>
>> Use case:
>> The madvise call is not available in the programming language. An example for
>> this are programs with forked workloads using a garbage collected language without
>> pointers. In such a language madvise cannot be made available.
>>
>> In addition the addresses of objects get moved around as they are garbage
>> collected. KSM sharing needs to be enabled "from the outside" for these type of
>> workloads.
>
> It would be good to expand on the argument that Rik made about the
> interpreter being used for things were there are no merging
> opportunities, and the KSM scanning overhead isn't amortized.
>
> There is a fundamental mismatch in scopes. madvise() is a
> workload-local decision, whereas sizable sharing opportunities may or
> may not exist across multiple workloads. Only a higher-level entity
> like a job scheduler can know for certain whether it's running one or
> more instances of a job. That job scheduler in turn doesn't have the
> necessary knowledge of the workload's internals to make targeted and
> well-timed advise calls with, say, process_madvise().
>
> This also applies to the security concerns brought up in previous
> threads. An individual workload doesn't know what else is running on
> the machine, so it needs to be highly conservative about what it can
> give up for system-wide merging. However, if the system is dedicated
> to running multiple jobs within the same security domain, it's the job
> scheduler that knows that sharing isn't a problem, and even desirable.
>
> So I think this series makes sense, but it would be good to expand a
> bit on the reasoning and address the security aspect in the cover/doc.
>

These are good points Johannes, I'll elaborate on them with the next
version of the patch.

>> Stefan Roesch (19):
>>   mm: add new flag to enable ksm per process
>>   mm: add flag to __ksm_enter
>>   mm: add flag to __ksm_exit call
>>   mm: invoke madvise for all vmas in scan_get_next_rmap_item
>>   mm: support disabling of ksm for a process
>>   mm: add new prctl option to get and set ksm for a process
>
> The implementation looks sound to me as well.
>
> I think it would be a bit easier to review if you folded these ^^^
> patches, the tools patch below, and the prctl selftests, all into one
> single commit. It's one logical change. This way the new flags and
> helper functions can be reviewed against the new users and callsites
> without having to jump back and forth between emails.
>

I'll fold them in the next version.

>>   mm: split off pages_volatile function
>>   mm: expose general_profit metric
>>   docs: document general_profit sysfs knob
>>   mm: calculate ksm process profit metric
>>   mm: add ksm_merge_type() function
>>   mm: expose ksm process profit metric in ksm_stat
>>   mm: expose ksm merge type in ksm_stat
>>   docs: document new procfs ksm knobs
>
> Same with the new knobs/stats and their documentation.
>

I'll fold them in the next version.

> Logical splitting is easier to follow than geographical splitting.
>
> Thanks!

     prev parent reply	other threads:[~2023-02-21 18:03 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-10 21:50 Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 01/19] mm: add new flag to enable ksm per process Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 02/19] mm: add flag to __ksm_enter Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 03/19] mm: add flag to __ksm_exit call Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 04/19] mm: invoke madvise for all vmas in scan_get_next_rmap_item Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 05/19] mm: support disabling of ksm for a process Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 06/19] mm: add new prctl option to get and set " Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 07/19] mm: split off pages_volatile function Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 08/19] mm: expose general_profit metric Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 09/19] docs: document general_profit sysfs knob Stefan Roesch
2023-02-11  8:17   ` Bagas Sanjaya
2023-02-15 23:00     ` Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 10/19] mm: calculate ksm process profit metric Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 11/19] mm: add ksm_merge_type() function Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 12/19] mm: expose ksm process profit metric in ksm_stat Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 13/19] mm: expose ksm merge type " Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 14/19] docs: document new procfs ksm knobs Stefan Roesch
2023-02-11  8:18   ` Bagas Sanjaya
2023-02-10 21:50 ` [RFC PATCH v2 15/19] tools: add new prctl flags to prctl in tools dir Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 16/19] selftests/vm: add KSM prctl merge test Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 17/19] selftests/vm: add KSM get merge type test Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 18/19] selftests/vm: add KSM fork test Stefan Roesch
2023-02-10 21:50 ` [RFC PATCH v2 19/19] selftests/vm: add two functions for debugging merge outcome Stefan Roesch
2023-02-10 23:23 ` [RFC PATCH v2 00/19] mm: process/cgroup ksm support Matthew Wilcox
2023-02-11  2:41   ` Rik van Riel
2023-02-21 16:10 ` Johannes Weiner
2023-02-21 17:59   ` Stefan Roesch [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=qvqwttzevqtx.fsf@dev0134.prn3.facebook.com \
    --to=shr@devkernel.io \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=riel@surriel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox