Date: Mon, 9 Jan 2023 11:37:46 +0300
From: "Kirill A. Shutemov"
To: Yin Fengwei
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, jack@suse.cz, hughd@google.com, kirill.shutemov@linux.intel.com, mhocko@suse.com, ak@linux.intel.com, aarcange@redhat.com, npiggin@gmail.com, mgorman@techsingularity.net, willy@infradead.org, rppt@kernel.org, dave.hansen@intel.com, ying.huang@intel.com, tim.c.chen@intel.com
Subject: Re: [RFC PATCH 0/4] Multiple consecutive page for anonymous mapping
Message-ID: <20230109083746.dzpsdk5mxoxvym6j@box.shutemov.name>
References: <20230109072232.2398464-1-fengwei.yin@intel.com>
In-Reply-To: <20230109072232.2398464-1-fengwei.yin@intel.com>

On Mon, Jan 09, 2023 at 03:22:28PM +0800, Yin Fengwei wrote:
> In a nutshell: 4k is too small and 2M is too big. We started
> asking ourselves whether there was something in the middle that
> we could do. This series shows what that middle ground might
> look like. It provides some of the benefits of THP while
> eliminating some of the downsides.
>
> This series uses "multiple consecutive pages" (mcpages) of
> between 8K and 2M of base pages for anonymous user space mappings.
> This will lead to less internal fragmentation versus 2M mappings
> and thus less memory consumption and wasted CPU time zeroing
> memory which will never be used.
>
> In the implementation, we allocate high order page with order of
> mcpage (e.g., order 2 for 16KB mcpage). This makes sure the
> physical contiguous memory is used and benefit sequential memory
> access latency.
>
> Then split the high order page. By doing this, the sub-page of
> mcpage is just 4K normal page. The current kernel page
> management is applied to "mc" pages without any changes. Batching
> page faults is allowed with mcpage and reduce page faults number.
>
> There are costs with mcpage. Besides no TLB benefit THP brings, it
> increases memory consumption and latency of allocation page
> comparing to 4K base page.
>
> This series is the first step of mcpage. The future work can be
> enable mcpage for more components like page cache, swapping etc.
> Finally, most pages in system will be allocated/free/reclaimed
> with mcpage order.

It isn't worth adding a new path in page fault handling. We need to make
existing mechanisms more flexible.

I think it has to be done on top of folios:

1. Convert anonymous memory to folios. Only order-9 (HPAGE_PMD_ORDER)
   and order-0 at first.
2. Remove the assumption of THP being order-9.
3. Start allocating THPs