Date: Wed, 10 Apr 2024 14:50:33 -0700
From: Andrew Morton
To: Lance Yang
Cc: ryan.roberts@arm.com, david@redhat.com, 21cnbao@gmail.com, mhocko@suse.com, fengwei.yin@intel.com, zokeefe@google.com, shy828301@gmail.com, xiehuan09@gmail.com, wangkefeng.wang@huawei.com, songmuchun@bytedance.com, peterx@redhat.com, minchan@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 0/2] mm/madvise: enhance lazyfreeing with mTHP in madvise_free
Message-Id: <20240410145033.5cdb8a41f3a6894a62191f42@linux-foundation.org>
In-Reply-To: <20240408042437.10951-1-ioworker0@gmail.com>
References: <20240408042437.10951-1-ioworker0@gmail.com>

On Mon, 8 Apr 2024 12:24:35 +0800 Lance Yang wrote:

> Hi All,
>
> This patchset adds
> support for lazyfreeing multi-size THP (mTHP) without
> needing to first split the large folio via split_folio(). However, we
> still need to split a large folio that is not fully mapped within the
> target range.
>
> If a large folio is locked or shared, or if we fail to split it, we just
> leave it in place and advance to the next PTE in the range. But note that
> the behavior is changed; previously, any failure of this sort would cause
> the entire operation to give up. As large folios become more common,
> sticking to the old way could result in wasted opportunities.
>
> Performance Testing
> ===================
>
> On an Intel I5 CPU, lazyfreeing a 1GiB VMA backed by PTE-mapped folios of
> the same size results in the following runtimes for madvise(MADV_FREE)
> in seconds (shorter is better):
>
> Folio Size |   Old    |   New    | Change
> ------------------------------------------
>       4KiB | 0.590251 | 0.590259 |     0%
>      16KiB | 2.990447 | 0.185655 |   -94%
>      32KiB | 2.547831 | 0.104870 |   -95%
>      64KiB | 2.457796 | 0.052812 |   -97%
>     128KiB | 2.281034 | 0.032777 |   -99%
>     256KiB | 2.230387 | 0.017496 |   -99%
>     512KiB | 2.189106 | 0.010781 |   -99%
>    1024KiB | 2.183949 | 0.007753 |   -99%
>    2048KiB | 0.002799 | 0.002804 |     0%

That looks nice, but punting work to another thread can slightly increase
overall system load and can mess up utilization accounting by
attributing work to threads which didn't initiate that work.

And there's a corner-case risk where the thread running madvise() has
realtime policy (SCHED_RR/SCHED_FIFO) on a single-CPU system,
preventing any other threads from executing, so that freeing is deferred
indefinitely, resulting in memory squeezes or even OOM conditions.

It would be good if the changelog(s) were to show some consideration of
such matters and some demonstration that the benefits exceed the risks
and costs.
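
For anyone wanting to time the madvise(MADV_FREE) call discussed above, here is a minimal userspace sketch. It is not the test program behind the quoted numbers, which is not shown in this thread. The helper name time_madv_free is made up for illustration, Linux and Python 3.8 or later are assumed, and the sketch does not control which folio size backs the mapping (that depends on the system's THP configuration), so it will not reproduce the per-size rows of the table by itself.

```python
import mmap
import time


def time_madv_free(length):
    """Time madvise(MADV_FREE) over a populated private anonymous mapping.

    Hypothetical helper for illustration only. Returns the elapsed time in
    seconds, or None if MADV_FREE is not available on this system.
    """
    if not hasattr(mmap, "MADV_FREE"):
        return None

    # MADV_FREE applies to private anonymous memory, so map it that way.
    buf = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        # Write one byte per page so the range is populated before timing.
        for off in range(0, length, mmap.PAGESIZE):
            buf[off] = 1

        start = time.perf_counter()
        try:
            buf.madvise(mmap.MADV_FREE)  # the call being measured
        except OSError:
            return None  # kernel or sandbox rejected the advice
        return time.perf_counter() - start
    finally:
        buf.close()
```

Calling time_madv_free(1 << 30) corresponds to the 1GiB VMA case in the quoted test. Only the madvise() call itself is inside the timed region; mapping and populating the range are excluded.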