From: Pasha Tatashin
Date: Fri, 9 Jun 2023 11:17:00 -0400
Subject: Re: [LSF/MM/BPF TOPIC] HGM for hugetlbfs
To: Zi Yan
Cc: Mike Kravetz, Yang Shi, David Hildenbrand, David Rientjes,
 Yosry Ahmed, James Houghton, Naoya Horiguchi, Miaohe Lin,
 lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org, Peter Xu,
 Michal Hocko, Matthew Wilcox, Axel Rasmussen, Jiaqi Yan
In-Reply-To: <6B42EC7F-7EB6-45E0-AF4D-F4F0FA7A012E@nvidia.com>

> > Hate to even bring this up, but there are complaints today about 'allocation
> > time' of 1GB pages from the hugetlb pool. This 'allocation time' is actually
> > the time it takes to clear/zero 1G of memory. Only reason I mention is
> > using something like CMA to allocate 1G pages (at fault time) may add
> > unacceptable latency.
>
> One solution I had in mind is that you could zero these 1GB pages at free
> time in a worker thread, so that you do not pay the penalty at page allocation
> time. But it would not work if the allocation comes right after a page is
> freed.

In addition, there were several proposals to speed up zeroing of huge pages:

1. x86 specific: Cannon Matthews proposed the "clear 1G pages with streaming
   stores on x86" change:
   https://lore.kernel.org/linux-mm/20200307010353.172991-1-cannonmatthews@google.com
   This speeds up setting up 1G pages by roughly 4 times.

2. x86 specific: Kirill and Andi proposed a similar change even earlier:
   https://lore.kernel.org/all/1345470757-12005-1-git-send-email-kirill.shutemov@linux.intel.com

3. Arch generic: ktask
   https://lwn.net/Articles/770826
   This allows zeroing HugeTLB pages in parallel.

4. VM specific:
   https://lwn.net/Articles/931933/
   This allows lazily zeroing 1G pages in the guest.

I looked through proposal (1) and did not see any major pushback. I do not
see why movnti can't be used specifically for gigantic pages.

Pasha
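
For readers unfamiliar with streaming stores, here is a minimal userspace
sketch of the idea behind (1). It is not the code from that patch: the helper
name clear_region_nt() is hypothetical, and it assumes the buffer is 8-byte
aligned and its length is a multiple of 8. On x86-64 the _mm_stream_si64()
intrinsic is the compiler-level form of movnti; on other architectures the
sketch simply falls back to memset().

```c
#include <stddef.h>
#include <string.h>
#if defined(__x86_64__)
#include <emmintrin.h>	/* _mm_stream_si64(), _mm_sfence() */
#endif

/*
 * Zero `len` bytes at `dst` with non-temporal stores, which write around
 * the cache instead of pulling the destination lines into it.
 *
 * Hypothetical helper for illustration only.
 * Assumes: dst is 8-byte aligned, len is a multiple of 8.
 */
static void clear_region_nt(void *dst, size_t len)
{
#if defined(__x86_64__)
	long long *p = dst;
	size_t n = len / sizeof(*p);
	size_t i;

	for (i = 0; i < n; i++)
		_mm_stream_si64(&p[i], 0);	/* one movnti per 8 bytes */

	/* Streaming stores are weakly ordered; fence before anyone reads. */
	_mm_sfence();
#else
	memset(dst, 0, len);	/* portable fallback, no cache bypass */
#endif
}
```

The point of bypassing the cache is that a 1G page is far larger than any
cache level, so ordinary stores would mostly evict useful data while zeroing.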