From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 07 Feb 2025 06:25:59 +0000
In-Reply-To: (message from Amit Shah on Thu, 06 Feb 2025 12:07:58 +0100)
Subject: Re: [RFC PATCH 00/39] 1G page support for guest_memfd
From: Ackerley Tng
To: Amit Shah
Cc: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk,
 jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com,
 fvdl@google.com, jthoughton@google.com, seanjc@google.com,
 pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com,
 jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev,
 mike.kravetz@oracle.com, erdemaktas@google.com, vannapurve@google.com,
 qperret@google.com, jhubbard@nvidia.com, willy@infradead.org,
 shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com,
 kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org,
 richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com,
 ajones@ventanamicro.com, vkuznets@redhat.com,
 maciej.wieczor-retman@intel.com, pgonda@google.com,
 oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kvm@vger.kernel.org, linux-kselftest@vger.kernel.org,
 linux-fsdevel@kvack.org

Amit Shah writes:

>> Thanks all your help and comments during the guest_memfd upstream calls,
>> and thanks for the help from AMD.
>>
>> Extending mmap() support from Fuad with 1G page support introduces more
>> states that made it more complicated (at least for me).
>>
>> I'm modeling the states in python so I can iterate more quickly. I also
>> have usage flows (e.g. allocate, guest_use, host_use,
>> transient_folio_get, close, transient_folio_put) as test cases.
>>
>> I'm almost done with the model and my next steps are to write up a state
>> machine (like Fuad's [5]) and share that.

Thanks everyone for all the comments at the 2025-02-06 guest_memfd
upstream call!
Here are the

+ Slides: https://lpc.events/event/18/contributions/1764/attachments/1409/3704/guest-memfd-1g-page-support-2025-02-06.pdf
+ State diagram: https://lpc.events/event/18/contributions/1764/attachments/1409/3702/guest-memfd-state-diagram-split-merge-2025-02-06.drawio.svg
+ For those interested in editing the state diagram using draw.io: https://lpc.events/event/18/contributions/1764/attachments/1409/3703/guest-memfd-state-diagram-split-merge-2025-02-06.drawio.xml

>> I'd be happy to share the python model too but I have to work through
>> some internal open-sourcing processes first, so if you think this will
>> be useful, let me know!
>
> No problem. Yes, I'm interested in this - it'll be helpful!

I've started working through the internal processes and will update here
when I'm done!

> The other thing of note is that while we have the kernel patches, a
> userspace to drive them and exercise them is currently missing.

In this and future patch series, I'll have selftests that will exercise
any new functionality.

>> Then, I'll code it all up in a new revision of this series (target:
>> March 2025), which will be accompanied by source code on GitHub.
>>
>> I'm happy to collaborate more closely, let me know if you have ideas for
>> collaboration!
>
> Thank you. I think currently the bigger problem we have is allocation
> of hugepages -- which is also blocking a lot of the follow-on work.
> Vishal briefly mentioned isolating pages from Linux entirely last time
> - that's also what I'm interested in to figure out if we can completely
> bypass the allocation problem by not allocating struct pages for
> non-host use pages entirely. The guest_memfs/KHO/kexec/live-update
> patches also take this approach on AWS (for how their VMs are launched).
> If we work with those patches together, allocation of 1G hugepages is
> simplified. I'd like to discuss more on these themes to see if this is
> an approach that helps as well.
>
>
>               Amit

Vishal is still very interested in this and will probably be looking into
it while I push ahead assuming that KVM continues to use struct pages.
This was also brought up at the guest_memfd upstream call on 2025-02-06:
people were interested in it and think that it will simplify refcounting
for merging and splitting.

I'll push ahead assuming that we use hugetlb as the source of 1G pages,
and assuming that KVM continues to use struct pages to describe guest
private memory.

The series will still be useful as an interim solution/prototype even if
other allocators are preferred and get merged. :)
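
For readers wondering what "usage flows as test cases" against a state
model could look like, here is a purely illustrative sketch. Only the step
names (allocate, guest_use, host_use, transient_folio_get,
transient_folio_put, close) come from the quoted text above. The class,
its fields, and every transition rule are hypothetical assumptions made
for this example; this is not the Python model mentioned in the thread,
and it does not claim to describe what the kernel does.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Owner(Enum):
    UNALLOCATED = auto()
    GUEST = auto()  # assumed: page is private to the guest
    HOST = auto()   # assumed: page is usable by the host


@dataclass
class HugePageModel:
    """Toy model of a single huge page. All rules are assumptions."""
    owner: Owner = Owner.UNALLOCATED
    split: bool = False       # assumed: True while tracked as small pages
    transient_refs: int = 0   # short-lived references held by other users
    closed: bool = False      # the backing file has been closed

    def _require_allocated(self):
        if self.owner is Owner.UNALLOCATED:
            raise RuntimeError("page is not allocated")

    def _try_merge(self):
        # Assumed rule: merge back only when guest-owned and unreferenced.
        if self.owner is Owner.GUEST and self.transient_refs == 0:
            self.split = False

    def _try_free(self):
        # Assumed rule: free only after close and once all refs are dropped.
        if self.closed and self.transient_refs == 0:
            self.owner = Owner.UNALLOCATED
            self.split = False

    def allocate(self):
        if self.owner is not Owner.UNALLOCATED:
            raise RuntimeError("page is already allocated")
        self.owner = Owner.GUEST

    def host_use(self):
        self._require_allocated()
        self.owner = Owner.HOST
        self.split = True  # assumed: host use forces a split

    def guest_use(self):
        self._require_allocated()
        self.owner = Owner.GUEST
        self._try_merge()

    def transient_folio_get(self):
        self._require_allocated()
        self.transient_refs += 1

    def transient_folio_put(self):
        if self.transient_refs == 0:
            raise RuntimeError("unbalanced transient_folio_put")
        self.transient_refs -= 1
        self._try_merge()
        self._try_free()

    def close(self):
        self.closed = True
        self._try_free()


def run_flow(steps):
    """Apply a usage flow, given as a list of step names, to a fresh page."""
    page = HugePageModel()
    for step in steps:
        getattr(page, step)()
    return page
```

A flow such as run_flow(["allocate", "host_use", "transient_folio_get",
"guest_use"]) leaves the toy page split until the matching
transient_folio_put, which is the kind of interaction a model like this
makes cheap to explore before writing kernel code.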