Date: Tue, 21 Feb 2023 11:10:45 -0500
From: Johannes Weiner
To: Stefan Roesch
Cc: kernel-team@fb.com, linux-mm@kvack.org, riel@surriel.com, mhocko@suse.com,
	david@redhat.com, linux-kselftest@vger.kernel.org,
	linux-doc@vger.kernel.org, akpm@linux-foundation.org
Subject: Re: [RFC PATCH v2 00/19] mm: process/cgroup ksm support
References: <20230210215023.2740545-1-shr@devkernel.io>
In-Reply-To: <20230210215023.2740545-1-shr@devkernel.io>

Hi Stefan,

On Fri, Feb 10, 2023 at 01:50:04PM -0800, Stefan Roesch wrote:
> So far KSM can only be enabled by calling madvise for memory regions. What is
> required to enable KSM for more workloads is to enable / disable it at the
> process / cgroup level.
>
> Use case:
> The madvise call is not available in the programming language.
> An example for this are programs with forked workloads using a garbage
> collected language without pointers. In such a language madvise cannot be
> made available.
>
> In addition the addresses of objects get moved around as they are garbage
> collected. KSM sharing needs to be enabled "from the outside" for these type of
> workloads.

It would be good to expand on the argument that Rik made about the
interpreter being used for things where there are no merging
opportunities, and the KSM scanning overhead isn't amortized.

There is a fundamental mismatch in scopes. madvise() is a
workload-local decision, whereas sizable sharing opportunities may or
may not exist across multiple workloads. Only a higher-level entity
like a job scheduler can know for certain whether it's running one or
more instances of a job. That job scheduler in turn doesn't have the
necessary knowledge of the workload's internals to make targeted and
well-timed advise calls with, say, process_madvise().

This also applies to the security concerns brought up in previous
threads. An individual workload doesn't know what else is running on
the machine, so it needs to be highly conservative about what it can
give up for system-wide merging. However, if the system is dedicated
to running multiple jobs within the same security domain, it's the job
scheduler that knows that sharing isn't a problem, and even desirable.

So I think this series makes sense, but it would be good to expand a
bit on the reasoning and address the security aspect in the cover/doc.

> Stefan Roesch (19):
>   mm: add new flag to enable ksm per process
>   mm: add flag to __ksm_enter
>   mm: add flag to __ksm_exit call
>   mm: invoke madvise for all vmas in scan_get_next_rmap_item
>   mm: support disabling of ksm for a process
>   mm: add new prctl option to get and set ksm for a process

The implementation looks sound to me as well.
I think it would be a bit easier to review if you folded these ^^^
patches, the tools patch below, and the prctl selftests, all into one
single commit. It's one logical change. This way the new flags and
helper functions can be reviewed against the new users and callsites
without having to jump back and forth between emails.

>   mm: split off pages_volatile function
>   mm: expose general_profit metric
>   docs: document general_profit sysfs knob
>   mm: calculate ksm process profit metric
>   mm: add ksm_merge_type() function
>   mm: expose ksm process profit metric in ksm_stat
>   mm: expose ksm merge type in ksm_stat
>   docs: document new procfs ksm knobs

Same with the new knobs/stats and their documentation.

Logical splitting is easier to follow than geographical splitting.

Thanks!