From: Bijan Tabatabai
To: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, btabatabai@wisc.edu
Cc: akpm@linux-foundation.org, viro@zeniv.linux.org.uk, brauner@kernel.org, mingo@redhat.com
Subject: [RFC PATCH 0/4] Add support for File Based Memory Management
Date: Fri, 22 Nov 2024 14:38:26 -0600
Message-Id: <20241122203830.2381905-1-btabatabai@wisc.edu>

This patch set implements file based memory management (FBMM) [1], a
research project from the University of Wisconsin-Madison in which a
process's memory can be transparently managed by memory managers that
are written as filesystems. When using FBMM, instead of using the
traditional anonymous memory path, a process's memory is managed by
mapping files from a memory management filesystem (MFS) into its
address space. The MFS implements the memory management related
callback functions provided by the VFS to provide the desired memory
management functionality. After presenting this work at a conference, a
handful of people asked if we were going to upstream the work, so we
decided to see if the Linux community would be interested in this
functionality as well.

This work is inspired by the increase in heterogeneity in memory
hardware, such as from Optane and CXL. This heterogeneity is leading to
a lot of research involving extending Linux's memory management
subsystem. However, the monolithic design of the memory management
subsystem makes it difficult to extend, and this difficulty grows as
the complexity of the subsystem increases. Others in the research
community have identified this problem as well [2,3]. We believe the
kernel would benefit from some sort of extension interface to more
easily prototype and implement memory management behaviors for a world
with more diverse memory hierarchies.
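
As a rough userspace analogy for the mechanism described in the opening
paragraph, the sketch below contrasts an ordinary anonymous mapping with
a mapping backed by a file on a chosen filesystem. FBMM performs this
substitution transparently inside the kernel; this is not its
implementation, only an illustration of the idea. The function names
map_anonymous() and map_file_backed() and the temporary file name are
hypothetical, and the directory may be any mounted filesystem.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Traditional path: memory with no backing file. */
void *map_anonymous(size_t len)
{
	return mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/*
 * File-based path: memory backed by an unlinked temporary file created
 * under `dir`. Returns MAP_FAILED on any error.
 */
void *map_file_backed(const char *dir, size_t len)
{
	char path[PATH_MAX];
	void *p;
	int fd, n;

	n = snprintf(path, sizeof(path), "%s/fbmm-sketch-XXXXXX", dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return MAP_FAILED;

	fd = mkstemp(path);
	if (fd < 0)
		return MAP_FAILED;
	unlink(path);		/* the name is no longer needed */

	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return MAP_FAILED;
	}

	/*
	 * MAP_SHARED so that writes land in the file's pages instead of
	 * in private anonymous copies.
	 */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping keeps the file alive */
	return p;
}
```

Both functions return memory that the caller uses in the same way; the
difference is which kernel code is responsible for the pages behind it.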

Filesystems are a natural extension mechanism for memory management
because they already exist and memory mapping files into processes
already works. Also, precedent exists for writing memory managers as
filesystems in the kernel with HugeTLBFS.

While FBMM is most easily used for research and prototyping, I have also
received feedback from people who work in industry that it would be
useful for them as well. One person I talked to mentioned that they
have made several changes to the memory management system in their
branch that are not upstreamed, and it would be convenient to
modularize those changes to avoid the headaches of rebasing when
upgrading the kernel version.

To use FBMM, one would perform the following steps:
1) Mount the MFS(s) they want to use
2) Enable FBMM by writing 1 to /sys/kernel/mm/fbmm/state
3) Set the MFS an application should allocate its memory from by
   writing the desired MFS's mount directory to
   /proc/<pid>/fbmm_mnt_dir, where <pid> is the PID of the target
   process.

To have a process use an MFS for the entirety of its execution, one
could use a wrapper program that writes /proc/self/fbmm_mnt_dir and
then calls exec for the target process. We have created such a wrapper,
which can be found at [4]. The dynamic linker (ld.so) could also be
extended to do this, using an environment variable similar to
LD_PRELOAD.

The first patch in this series adds the core of FBMM, allowing a user
to set the MFS an application should allocate its anonymous memory
from, transparently to the application.

The second patch adds helper functions for common MM functionality that
may be useful to MFS implementors for supporting swapping and handling
fork/copy on write. Because fork is complicated, this patch adds a
callback function to the super_operations struct to allow an MFS to
decide its fork behavior (e.g. to do a deep copy of memory on fork
instead of copy on write), and adds logic to the dup_mmap function to
handle FBMM files.
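
The usage steps and the wrapper idea described earlier in this letter
can be sketched in userspace C as follows. This is a minimal sketch, not
the wrapper at [4]: the helper names write_str(), fbmm_enable() and
fbmm_exec() are hypothetical, the two paths are taken from the usage
steps, and the required permissions on those files are an assumption
since the letter does not state them.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Paths taken from the usage steps in this letter. */
#define FBMM_STATE_PATH   "/sys/kernel/mm/fbmm/state"
#define FBMM_MNT_DIR_PATH "/proc/self/fbmm_mnt_dir"

/*
 * Write the string `val` to the existing file `path` with one write().
 * Returns 0 on success and -1 on failure, with errno set.
 */
int write_str(const char *path, const char *val)
{
	size_t len = strlen(val);
	ssize_t n;
	int fd, saved;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	n = write(fd, val, len);
	if (n < 0 || (size_t)n != len) {
		/* Report a short write as an I/O error. */
		saved = (n < 0) ? errno : EIO;
		close(fd);
		errno = saved;
		return -1;
	}
	return close(fd);
}

/* Step 2: enable FBMM (assumed to need sufficient privileges). */
int fbmm_enable(void)
{
	return write_str(FBMM_STATE_PATH, "1");
}

/*
 * Step 3 followed by exec: select the MFS mounted at `mnt_dir` for the
 * calling process, then replace the process image with argv[0].
 * Returns -1 on failure; does not return on success.
 */
int fbmm_exec(const char *mnt_dir, char *const argv[])
{
	if (write_str(FBMM_MNT_DIR_PATH, mnt_dir) < 0)
		return -1;
	execvp(argv[0], argv);
	return -1;	/* execvp() only returns on error */
}
```

A wrapper's main() would check that it received a mount directory and a
command, then call fbmm_exec(argv[1], &argv[2]) and report errno if it
returns. Step 1 (mounting the MFS) and step 2 are assumed to have been
done beforehand by an administrator.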
The third patch exports some kernel functions that are needed to implement an MFS to allow for MFSs to be written as kernel modules. The fourth and final patch in this series provides a sample implementation of a simple MFS, and is not actually intended to be upstreamed. [1] https://www.usenix.org/conference/atc24/presentation/tabatabai [2] https://www.usenix.org/conference/atc24/presentation/jalalian [3] https://www.usenix.org/conference/atc24/presentation/cao [4] https://github.com/multifacet/fbmm-workspace/blob/main/bmks/fbmm_wrapper.c Bijan Tabatabai (4): mm: Add support for File Based Memory Management fbmm: Add helper functions for FBMM MM Filesystems mm: Export functions for writing MM Filesystems Add base implementation of an MFS BasicMFS/Kconfig | 3 + BasicMFS/Makefile | 8 + BasicMFS/basic.c | 717 ++++++++++++++++++++++++++++++++ BasicMFS/basic.h | 29 ++ arch/x86/include/asm/tlbflush.h | 2 - arch/x86/mm/tlb.c | 1 + fs/Kconfig | 7 + fs/Makefile | 1 + fs/exec.c | 2 + fs/file_based_mm.c | 663 +++++++++++++++++++++++++++++ fs/proc/base.c | 4 + include/linux/file_based_mm.h | 99 +++++ include/linux/fs.h | 1 + include/linux/mm.h | 10 + include/linux/sched.h | 4 + kernel/exit.c | 3 + kernel/fork.c | 57 ++- mm/Makefile | 1 + mm/fbmm_helpers.c | 372 +++++++++++++++++ mm/filemap.c | 2 + mm/gup.c | 1 + mm/internal.h | 13 + mm/memory.c | 3 + mm/mmap.c | 44 +- mm/pgtable-generic.c | 1 + mm/rmap.c | 2 + mm/vmscan.c | 14 +- 27 files changed, 2040 insertions(+), 24 deletions(-) create mode 100644 BasicMFS/Kconfig create mode 100644 BasicMFS/Makefile create mode 100644 BasicMFS/basic.c create mode 100644 BasicMFS/basic.h create mode 100644 fs/file_based_mm.c create mode 100644 include/linux/file_based_mm.h create mode 100644 mm/fbmm_helpers.c -- 2.34.1