From: Barry Song <21cnbao@gmail.com>
Date: Mon, 11 Mar 2024 18:01:14 +0800
Subject: Re: [PATCH v2 1/1] mm/madvise: enhance lazyfreeing with mTHP in madvise_free
To: Ryan Roberts
Cc: David Hildenbrand, Lance Yang, Vishal Moola, akpm@linux-foundation.org,
 zokeefe@google.com, shy828301@gmail.com, mhocko@suse.com,
 fengwei.yin@intel.com, xiehuan09@gmail.com, wangkefeng.wang@huawei.com,
 songmuchun@bytedance.com, peterx@redhat.com, minchan@kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <37bc1a30-7613-4404-b123-c351e36fc800@arm.com>
References: <20240307061425.21013-1-ioworker0@gmail.com>
 <03458c20-5544-411b-9b8d-b4600a9b802f@arm.com>
 <501c9f77-1459-467a-8619-78e86b46d300@arm.com>
 <8f84c7d6-982a-4933-a7a7-3f640df64991@redhat.com>
 <60dc7309-cb38-45e3-b2c0-ff0119202a12@arm.com>
 <37bc1a30-7613-4404-b123-c351e36fc800@arm.com>

On Mon, Mar 11, 2024 at 5:55 PM Ryan Roberts wrote:
>
> [...]
>
> >>>>> we don't want reclamation overhead later. and we want memories immediately
> >>>>> available to others.
> >>>>
> >>>> But by that logic, you also don't want to leave the large folio partially mapped
> >>>> all the way until the last subpage is CoWed. Surely you would want to reclaim it
> >>>> when you reach partial map status?
> >>>
> >>> To some extent, I agree. But then we will have too many copies. The last
> >>> subpage is small, and a safe place to copy instead.
> >>>
> >>> We actually had to tune userspace to decrease partial map as too much
> >>> partial map both unfolded CONT-PTE and wasted too much memory. if a
> >>> vma had too much partial map, we disabled mTHP on this VMA.
> >>
> >> I actually had a whacky idea around introducing selectable page size ABI
> >> per-process that might help here. I know Android is doing work to make the
> >> system 16K page compatible. You could run most of the system processes with 16K
> >> ABI on top of 4K kernel. Then those processes don't even have the ability to
> >> madvise/munmap/mprotect/mremap anything less than 16K alignment so that acts as
> >> an anti-fragmentation mechanism while allowing non-16K capable processes to run
> >> side-by-side. Just a passing thought...
> >
> > Right, this project faces a challenge in supporting legacy
> > 4KiB-aligned applications.
> > but I don't find it will be an issue to run 16KiB-aligned applications
> > on a kernel whose
> > page size is 4KiB.
>
> Yes, agreed that a 16K-aligned (or 64K-aligned) app will work without issue on
> 4K kernel, but it will also use getpagesize() and know what the page size is.
> I'm suggesting you could actually run these apps on a 4K kernel but with a 16K
> ABI and potentially get close to the native 16K performance out of them. It's
> just a thought though - I don't have any data that actually shows this is better
> than just running on a 4K kernel with a 4K ABI, and using 16K or 64K mTHP
> opportunistically.

I fully agree with this. My Ubuntu filesystem can run on 4KiB, 16KiB and
64KiB base page sizes because its ELF files are 64KiB-aligned. So I would
expect new Android apps/middleware to move to a 64KiB ABI, even though
Android might want to change the base page size to 16KiB instead. I believe
this is the case.

Thanks
Barry
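
A minimal sketch of the two alignment arguments made in this thread, written as a toy model rather than real kernel or libc behaviour. Both function names are hypothetical. The first assumes that a binary loads on a given base page size when its segment alignment is a multiple of that page size; the second assumes a per-process page-size ABI would simply reject ranges that are not aligned to the ABI page size (real syscalls round the length up instead, so this is a simplification).

```python
# Toy model only: these helpers are illustrative, not kernel or libc APIs.
KIB = 1024


def segment_loadable(p_align: int, base_page_size: int) -> bool:
    """Assume a segment aligned to p_align can be mapped when p_align is a
    multiple of the kernel's base page size."""
    return p_align % base_page_size == 0


def range_allowed(addr: int, length: int, abi_page_size: int) -> bool:
    """Assume a range-based call (madvise/munmap/mprotect/mremap) is accepted
    only if both the start and the length are multiples of the ABI page size."""
    return addr % abi_page_size == 0 and length % abi_page_size == 0


if __name__ == "__main__":
    # 64KiB-aligned binaries against 4KiB, 16KiB and 64KiB base pages.
    for base in (4 * KIB, 16 * KIB, 64 * KIB):
        print(f"{base // KIB}KiB base page:", segment_loadable(64 * KIB, base))

    # A 4KiB-sized range at a 4KiB-aligned address, under a 4K and a 16K ABI.
    for abi in (4 * KIB, 16 * KIB):
        print(f"{abi // KIB}K ABI:", range_allowed(0x1000, 0x1000, abi))
```

In this model a 16K ABI refuses the 4KiB-granular range that a 4K ABI accepts, which is the anti-fragmentation effect described above, while 64KiB-aligned binaries pass the load check for all three base page sizes.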