From: Pasha Tatashin
Date: Wed, 22 Feb 2023 08:43:19 -0500
Subject: Re: [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
To: Gavin Shan
Cc: lsf-pc@lists.linux-foundation.org, linux-mm

On Mon, Feb 20, 2023 at 6:51 PM Gavin Shan wrote:
>
> Hi Pasha,
>
> On 2/21/23 3:31 AM, Pasha Tatashin wrote:
> >
> > As a part of an ongoing work of replacing some containerized work load
> > with virtual machines within Google, I have worked on making the
> > memory translations faster.
> >
> > I would like to propose the following topic for this year's LSF/MM/BPF:
> >
> > Discuss a set of techniques that can improve the guest performance,
> > memory footprint overhead, observability, and manageability of virtual
> > machines by hypervirtualizing the guest memory to the extreme. The end
> > goal is to allow very lightweight virtual machines to be closer in
> > performance to the containers.
> >
> > The following items are going to be discussed in this topic:
> > - Reducing the cost of SLAT page table translations.
> > - Reducing the memory footprint overhead.
> > - Reducing the memory management overhead.
> > - Increasing the observability of guest memory.
> >
>
> It's all about to understand the problem and possible solution or directions.
>
> I googled for 'SLAT' and direct me to x86's EPT. ARM64 has similar thing called
> stage-2 page table. The usual way to reduce page table translation cost is to map
> the contiguous memory through PUD/PMD. I'm not sure if there are other solutions
> we're heading for?
>
> Guest's memory is usually backed up by virtual memory area (VMA), which is either
> a anonymous or hugetlb region. As I understand, the page fault handling is excessive
> to populate the requested memory. I'm not sure if reducing the memory management
> overhead is to get it faster, or something else? :)

Hi Gavin,

In a non-virtualized environment, converting a VA to a PA means loading
one entry from each level of the page table, so translating an address
in a 4K page takes 4 or 5 loads, depending on the page table type used.

In a virtualized environment, however, the number of loads needed to
convert a guest VA to a host PA is not the sum of the guest page table
levels and the SLAT page table levels. With n guest levels and m SLAT
levels it is:

    n*m + n + m

This is because each guest page table level is addressed by a guest PA,
which must itself be converted to a host PA through the SLAT page table
before the guest entry can be loaded. With 4-level guest and 4-level
SLAT page tables, that is 24 loads instead of 4.

One way to minimize the number of loads is for the guest to use huge
pages, for example, 1-Gbyte pages. However, this normally wastes a lot
of memory. The idea is that we can use guest physical memory in a
virtual way: create 1-Gbyte pages that are only partially backed by
host memory, yet still improve access performance through fewer TLB
misses and faster translations through the guest + SLAT page tables.

I would like to discuss how this can be achieved.

Thanks,
Pasha

>
> Thanks,
> Gavin
>
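
P.S. The n*m + n + m count from the reply can be checked with a small
sketch. This is only an illustration, not code from any kernel or
hypervisor: the function name and the level counts are assumptions, and
it models a full TLB miss with no paging-structure caches.

```python
def nested_walk_loads(guest_levels: int, slat_levels: int) -> int:
    """Loads for one guest VA -> host PA translation on a full TLB miss.

    Hypothetical helper: n = guest_levels, m = slat_levels.
    """
    n, m = guest_levels, slat_levels
    # Each guest page table entry sits at a guest PA, so reading it costs
    # m SLAT loads plus the load of the entry itself: n * (m + 1).
    per_guest_level = n * (m + 1)
    # The final guest PA of the data page needs one more SLAT walk: m.
    final_walk = m
    return per_guest_level + final_walk  # == n*m + n + m


if __name__ == "__main__":
    # 4K pages, 4-level guest and 4-level SLAT tables.
    print(nested_walk_loads(4, 4))
    # Assumption: a 1-Gbyte guest page ends the guest walk after 2 levels,
    # while the SLAT side still uses 4 levels.
    print(nested_walk_loads(2, 4))
```

Shortening the guest walk removes whole SLAT walks as well as guest
loads, which is why guest huge pages help more than the guest level
count alone would suggest.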