From: Sourav Panda
Date: Fri, 31 Jan 2025 18:15:10 -0800
Subject: [LSF/MM/BPF TOPIC] KSM Enhancements: Selective KSM
To: lsf-pc@lists.linux-foundation.org, Linux-MM, Pavel Tatashin, Yu Zhao,
    shr@devkernel.io, David Rientjes

Hi,

KSM is a powerful tool for deduplicating memory, reducing usage by merging
identical pages across processes. However, certain aspects of its interface
and implementation prevent its deployment in our use case, where security
and efficiency (CPU overhead due to background scanning) are of greater
importance.


We propose Selective KSM, a mechanism to control when merging takes place
and which pages can be merged together. We do this by partitioning the
merge space by security domain and carrying out the merging as part of a
synchronous syscall. This ensures that sensitive content is not merged
with non-sensitive content.
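
As a rough illustration of the partitioning rule, here is a minimal
userspace sketch, not kernel code. The struct model_page type, the integer
partition ids, and merge_within_partitions() are hypothetical names
introduced only for this example, and the quadratic memcmp() scan merely
stands in for KSM's real tree-based lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u  /* illustrative; the real page size is arch-dependent */

/* Hypothetical model of a page tagged with its security partition. */
struct model_page {
    int partition;              /* security-domain id (assumption) */
    const unsigned char *data;  /* MODEL_PAGE_SIZE bytes of content */
    int merged_into;            /* index of the canonical page, or -1 */
};

/*
 * Deduplicate identical pages, but only within the same partition.
 * Returns the number of pages that were merged into an earlier page.
 */
static size_t merge_within_partitions(struct model_page *pages, size_t n)
{
    size_t merged = 0;

    assert(pages != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        pages[i].merged_into = -1;
        for (size_t j = 0; j < i; j++) {
            if (pages[j].merged_into != -1)
                continue;  /* compare against canonical pages only */
            if (pages[j].partition != pages[i].partition)
                continue;  /* never merge across security domains */
            if (memcmp(pages[j].data, pages[i].data, MODEL_PAGE_SIZE) == 0) {
                pages[i].merged_into = (int)j;
                merged++;
                break;
            }
        }
    }
    return merged;
}
```

In this model, two identical pages in partition 1 collapse into one, while
an identical page in partition 2 is left alone.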


Our overall goal is to optimize memory utilization in a virtualized
environment, where there is significant duplication across guest
instances (e.g., the kernel). By giving the operator better control over
grouping pages by security and similarity, Selective KSM improves both
security and efficiency. Beyond virtualized environments, we also want
Selective KSM to work well in containerized environments.


An example API could look like this (alternatively, we could do it through
sysfs without adding syscalls):


// This feature shall be gated by a Kconfig option: "CONFIG_SELECTIVE_KSM"


// Create a unique identifier known to userland.
const char *ksm_name = "some_name";

// ksm_open() creates and opens a new, or opens an existing, KSM partition object.
// flags is a bit mask that determines whether merging is synchronous, etc.:
//   KSM_SYNC:  Carry out synchronous merging (no background scanning).
//   KSM_CREAT: Create a KSM partition object if it does not exist.
//   KSM_EXCL:  If a KSM partition object with this name already exists and
//              KSM_CREAT is also specified, return an error.
// mode is used to handle permissions:
//   O_RDONLY, O_WRONLY, O_RDWR, S_IRUSR, S_IWUSR, S_IXUSR
// On success, returns a file descriptor (a nonnegative integer) and creates the
// sysfs path:
//   /sys/kernel/mm/ksm/partition/<ksm_name>/
// On failure, it returns -1 and sets errno to indicate the error.
int ksm_fd = ksm_open(ksm_name, flags, mode);


// Destroy the name. The named object will be removed only after all open
// references are closed. On success, ksm_unlink() returns 0.
// On failure, it returns -1 and sets errno to indicate the error.
ksm_unlink(ksm_name);


// Trigger merge. Only valid if KSM_SYNC was set during ksm_open().
ksm_merge(ksm_fd, pid, addr, size);


// Trigger unmerge. Only valid if KSM_SYNC was set during ksm_open().
ksm_unmerge(ksm_fd, pid, addr, size);
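
To make the intended name and lifetime rules more concrete, here is a small
userspace model of the open/unlink behaviour described in the comments
above. It is only a sketch under assumptions: none of these syscalls exist
today, the flag bit values are made up, the model_* names and the fixed-size
table are hypothetical, and the errno choices (EEXIST, ENOENT) are borrowed
from shm_open(3), since the proposal only says "return an error".

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical flag values; the proposal names the flags but not their bits. */
#define KSM_SYNC  0x1
#define KSM_CREAT 0x2
#define KSM_EXCL  0x4

#define MODEL_SLOTS    8
#define MODEL_NAME_LEN 32

/* One modeled partition object. */
struct model_partition {
    char name[MODEL_NAME_LEN];
    int in_use;  /* slot holds a live object */
    int linked;  /* name is still visible to model_ksm_open() */
    int refs;    /* number of open references */
};

static struct model_partition model_table[MODEL_SLOTS];

/* Returns a slot index standing in for the fd, or -1 with errno set. */
static int model_ksm_open(const char *name, int flags)
{
    int free_slot = -1;

    if (name == NULL || name[0] == '\0' || strlen(name) >= MODEL_NAME_LEN) {
        errno = EINVAL;
        return -1;
    }
    for (int i = 0; i < MODEL_SLOTS; i++) {
        struct model_partition *p = &model_table[i];

        if (!p->in_use) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        if (p->linked && strcmp(p->name, name) == 0) {
            if ((flags & KSM_CREAT) && (flags & KSM_EXCL)) {
                errno = EEXIST;  /* exclusive create of an existing name */
                return -1;
            }
            p->refs++;
            return i;
        }
    }
    if (!(flags & KSM_CREAT)) {
        errno = ENOENT;
        return -1;
    }
    if (free_slot < 0) {
        errno = ENOSPC;
        return -1;
    }
    strcpy(model_table[free_slot].name, name);
    model_table[free_slot].in_use = 1;
    model_table[free_slot].linked = 1;
    model_table[free_slot].refs = 1;
    return free_slot;
}

/* Removes the name; the object itself goes away once refs drops to zero. */
static int model_ksm_unlink(const char *name)
{
    for (int i = 0; name != NULL && i < MODEL_SLOTS; i++) {
        struct model_partition *p = &model_table[i];

        if (p->in_use && p->linked && strcmp(p->name, name) == 0) {
            p->linked = 0;
            if (p->refs == 0)
                p->in_use = 0;
            return 0;
        }
    }
    errno = ENOENT;
    return -1;
}

/* Drops one reference, as close(fd) would on the real descriptor. */
static int model_ksm_close(int fd)
{
    if (fd < 0 || fd >= MODEL_SLOTS || !model_table[fd].in_use ||
        model_table[fd].refs == 0) {
        errno = EBADF;
        return -1;
    }
    assert(model_table[fd].refs > 0);
    if (--model_table[fd].refs == 0 && !model_table[fd].linked)
        model_table[fd].in_use = 0;
    return 0;
}
```

In this model, unlinking a partition that is still open hides the name from
later opens right away, but the object survives until the last reference is
closed.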


With regards,
Sourav Panda

