Date: Sun, 2 Feb 2025 18:54:52 -0800 (PST)
From: David Rientjes
To: Sourav Panda
cc: lsf-pc@lists.linux-foundation.org, Linux-MM, Pavel Tatashin, Yu Zhao, shr@devkernel.io
Subject: Re: [LSF/MM/BPF TOPIC] KSM Enhancements: Selective KSM
Message-ID: <54a56cd9-dc34-e0d1-045f-04a372b3c3a1@google.com>

On Fri, 31 Jan 2025, Sourav Panda wrote:

> Hi,
>
> KSM is a powerful tool for deduplicating memory, reducing usage by merging
> identical pages across processes. However, certain interface and
> implementation aspects prevent its deployment in our use case, where
> security and efficiency (CPU overhead due to background scanning) are of
> greater importance.
>
> We propose Selective KSM, a mechanism to control when the merging takes
> place and what pages can be merged together. We do this by partitioning the
> merge space by security domain and carrying out the merging as part of a
> synchronous syscall. Doing so, we ensure sensitive content is not merged
> with non-sensitive content.
>

Thanks for proposing this, Sourav, it sounds like a useful topic to
discuss.

Regarding the above, is this analogous to doing a synchronous
MADV_COLLAPSE in process context instead of relying on khugepaged as the
sole mechanism for that collapse? In your case, it's userspace doing a
merge in process context without relying on ksmd.

Is s/Selective/Userspace/ the way to think about it?

Does this require a fully cooperative guest for it to work properly?

> Our overall goal is to optimize the memory utilization in a virtualized
> environment, wherein there exists significant duplication across guest
> instances (e.g., kernel). By giving the operator a better ability to group
> pages by security and similarity, Selective KSM improves security and
> efficiency.
>
> Other than virtualized environments, we also want Selective KSM to work
> well in containerized environments.
>
> An example API could look like this (alternatively, we can do it through
> sysfs without adding syscalls):
>
> // This feature shall be gated by a Kconfig option: "CONFIG_SELECTIVE_KSM"
>
> // Create a unique identifier known to userland.
> char *ksm_name = "some_name";
>
> // ksm_open() creates and opens a new, or opens an existing, ksm partition
> // obj.
> // flags is a bit mask to determine if the merging is sync, etc.
> //   KSM_SYNC:  Carry out synchronous merging (no background scanning).
> //   KSM_CREAT: Creates a KSM partition obj if it does not exist.
> //   KSM_EXCL:  If KSM partition obj with name already exists and
> //              KSM_CREAT is also specified, return err.
> // mode is used to handle permissions:
> //   O_RDONLY, O_WRONLY, O_RDWR, S_IRUSR, S_IWUSR, S_IXUSR
> // On success, returns a file descriptor (a nonnegative integer) and
> // creates the sysfs path:
> //   /sys/kernel/mm/ksm/partition//
> // On failure, it returns -1 and sets errno to indicate the error.
> int ksm_fd = ksm_open(ksm_name, flags, mode);
>
> // Destroy the name. The named object will be removed only after all open
> // references are closed. On success, ksm_unlink() returns 0.
> // On failure, it returns -1 and sets errno to indicate the error.
> ksm_unlink(ksm_name);
>
> // Trigger merge. Only valid if KSM_SYNC is set during ksm_open().
> ksm_merge(ksm_fd, pid, addr, size);
>
> // Trigger unmerge. Only valid if KSM_SYNC is set during ksm_open().
> ksm_unmerge(ksm_fd, pid, addr, size);
>
> With regards,
> Sourav Panda
>
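
To make the MADV_COLLAPSE comparison above concrete, here is a minimal
sketch using only the existing madvise(2) interface. It contrasts the
asynchronous model (MADV_MERGEABLE, where the range is only marked and ksmd
does the work later) with the synchronous model (MADV_COLLAPSE, where the
work is attempted in the caller's context before the call returns). It does
not implement the proposed ksm_*() calls, which do not exist today. The
helper names and REGION_SIZE are hypothetical and chosen only for
illustration; either madvise() call may fail with -1 on kernels built
without KSM or THP support.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Fallbacks for older libc headers; values match the Linux UAPI. */
#ifndef MADV_MERGEABLE
#define MADV_MERGEABLE 12
#endif
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/* Hypothetical size for illustration: 4 MiB always covers at least one
 * naturally aligned 2 MiB extent, whatever address mmap() returns. */
#define REGION_SIZE ((size_t)4 << 20)

/* Map a private anonymous region and write to it so pages are populated.
 * Returns NULL if the mapping fails. */
void *map_region(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    memset(p, 0x5a, len);
    return p;
}

/* Asynchronous model: only marks the range as a merge candidate.
 * Any merging happens later, if and when ksmd scans the range. */
int mark_mergeable(void *addr, size_t len)
{
    return madvise(addr, len, MADV_MERGEABLE);
}

/* Synchronous model: the collapse is attempted in the calling context
 * before madvise() returns, without waiting for khugepaged. */
int collapse_now(void *addr, size_t len)
{
    return madvise(addr, len, MADV_COLLAPSE);
}
```

A caller would map a region with map_region(REGION_SIZE), invoke one of the
two helpers, check errno on a -1 return, and munmap() the region when done.
The proposed ksm_merge() would presumably sit on the collapse_now() side of
this contrast, though that is my reading of the proposal rather than
something it states.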