From: Yu Zhao
Date: Thu, 27 Jun 2024 15:19:55 -0600
Subject: Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize
To: Catalin Marinas
Cc: Nanyong Sun, will@kernel.org, mike.kravetz@oracle.com, muchun.song@linux.dev, akpm@linux-foundation.org, anshuman.khandual@arm.com, willy@infradead.org, wangkefeng.wang@huawei.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20240113094436.2506396-1-sunnanyong@huawei.com>

On Wed, Feb 7, 2024 at 5:44 AM Catalin Marinas wrote:
>
> On Sat, Jan 27, 2024 at 01:04:15PM +0800, Nanyong Sun wrote:
> > On 2024/1/26 2:06, Catalin Marinas wrote:
> > > On Sat, Jan 13, 2024 at 05:44:33PM +0800, Nanyong Sun wrote:
> > > > HVO was previously disabled on arm64 [1] due to the lack of necessary
> > > > BBM(break-before-make) logic when changing page tables.
> > > > This set of patches fix this by adding necessary BBM sequence when
> > > > changing page table, and supporting vmemmap page fault handling to
> > > > fixup kernel address translation fault if vmemmap is concurrently accessed.
> [...]
> > > How often is this code path called? I wonder whether a stop_machine()
> > > approach would be simpler.
> > As long as allocating or releasing hugetlb is called. We cannot limit users
> > to only allocate or release hugetlb
> > when booting or not running any workload on all other cpus, so if use
> > stop_machine(), it will be triggered
> > 8 times every 2M and 4096 times every 1G, which is probably too expensive.
>
> I'm hoping this can be batched somehow and not do a stop_machine() (or
> 8) for every 2MB huge page.

Theoretically, all hugeTLB vmemmap operations from a single user
request can be done in one batch. This would require the preallocation
of the new copy of vmemmap so that the old copy can be replaced with
one BBM.

> Just to make sure I understand - is the goal to be able to free struct
> pages corresponding to hugetlbfs pages?

Correct, if you are referring to the pages holding struct page[].

> Can we not leave the vmemmap in
> place and just release that memory to the page allocator?

We cannot, since the goal is to reuse those pages for something else,
i.e., reduce the metadata overhead for hugeTLB.

> The physical
> RAM for those struct pages isn't going anywhere

This is not the case.
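
For reference, the "8 times every 2M and 4096 times every 1G" figures
quoted above appear to follow from the number of base pages of vmemmap
that describe one huge page. A minimal sketch of that arithmetic,
assuming 4 KiB base pages and a 64-byte struct page (neither value is
stated in this thread, and vmemmap_pages() is a hypothetical helper,
not a kernel function):

```c
/*
 * Assumptions, not taken from the thread: 4 KiB base pages and a
 * 64-byte struct page. Other configurations give different counts.
 */
#define BASE_PAGE_SIZE   4096UL
#define STRUCT_PAGE_SIZE 64UL

/*
 * Hypothetical helper: number of base pages of vmemmap needed to hold
 * the struct page[] array that describes one huge page of the given size.
 */
static unsigned long vmemmap_pages(unsigned long huge_page_size)
{
	/* One struct page per base page covered by the huge page. */
	unsigned long nr_struct_pages = huge_page_size / BASE_PAGE_SIZE;
	unsigned long vmemmap_bytes = nr_struct_pages * STRUCT_PAGE_SIZE;

	return vmemmap_bytes / BASE_PAGE_SIZE;
}
```

Under these assumptions, vmemmap_pages(2UL << 20) is 8 and
vmemmap_pages(1UL << 30) is 4096, which matches the per-huge-page
counts quoted above.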