From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20240308085653.124180-1-21cnbao@gmail.com> <4392e407-b9cf-4785-a926-3eb143708260@redhat.com>
In-Reply-To: <4392e407-b9cf-4785-a926-3eb143708260@redhat.com>
From: Barry Song <21cnbao@gmail.com>
Date: Fri, 8 Mar 2024 22:07:52 +1300
Subject: Re: [PATCH] mm: prohibit the last subpage from reusing the entire large folio
To: David Hildenbrand
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, minchan@kernel.org,
 fengwei.yin@intel.com, linux-kernel@vger.kernel.org, mhocko@suse.com,
 peterx@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com,
 songmuchun@bytedance.com, wangkefeng.wang@huawei.com, xiehuan09@gmail.com,
 zokeefe@google.com, chrisl@kernel.org, yuzhao@google.com, Barry Song,
 Lance Yang
Content-Type: text/plain; charset="UTF-8"
Sender: owner-linux-mm@kvack.org

On Fri, Mar 8, 2024 at 10:03 PM David Hildenbrand wrote:
>
> On 08.03.24 09:56, Barry Song wrote:
> > From: Barry Song
> >
> > In a Copy-on-Write (CoW) scenario, the last subpage will reuse the entire
> > large folio, resulting in the waste of (nr_pages - 1) pages. This wasted
> > memory remains allocated until it is either unmapped or memory
> > reclamation occurs.
> >
> > The following small program can serve as evidence of this behavior
> >
> > main()
> > {
> > #define SIZE 1024 * 1024 * 1024UL
> >         void *p = malloc(SIZE);
> >         memset(p, 0x11, SIZE);
> >         if (fork() == 0)
> >                 _exit(0);
> >         memset(p, 0x12, SIZE);
> >         printf("done\n");
> >         while(1);
> > }
> >
> > For example, using a 1024KiB mTHP by:
> > echo always > /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/enabled
> >
> > (1) w/o the patch, it takes 2GiB,
> >
> > Before running the test program,
> > / # free -m
> >               total        used        free      shared  buff/cache   available
> > Mem:           5754          84        5692           0          17        5669
> > Swap:             0           0           0
> >
> > / # /a.out &
> > / # done
> >
> > After running the test program,
> > / # free -m
> >               total        used        free      shared  buff/cache   available
> > Mem:           5754        2149        3627           0          19        3605
> > Swap:             0           0           0
> >
> > (2) w/ the patch, it takes 1GiB only,
> >
> > Before running the test program,
> > / # free -m
> >               total        used        free      shared  buff/cache   available
> > Mem:           5754          89        5687           0          17        5664
> > Swap:             0           0           0
> >
> > / # /a.out &
> > / # done
> >
> > After running the test program,
> > / # free -m
> >               total        used        free      shared  buff/cache   available
> > Mem:           5754        1122        4655           0          17        4632
> > Swap:             0           0           0
> >
> > This patch migrates the last subpage to a small folio and immediately
> > returns the large folio to the system. It benefits both memory availability
> > and anti-fragmentation.
> >
> > Cc: David Hildenbrand
> > Cc: Ryan Roberts
> > Cc: Lance Yang
> > Signed-off-by: Barry Song
> > ---
> >  mm/memory.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index e17669d4f72f..0200bfc15f94 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3523,6 +3523,14 @@ static bool wp_can_reuse_anon_folio(struct folio *folio,
> >  		folio_unlock(folio);
> >  		return false;
> >  	}
> > +	/*
> > +	 * If the last subpage reuses the entire large folio, it would
> > +	 * result in a waste of (nr_pages - 1) pages
> > +	 */
> > +	if (folio_ref_count(folio) == 1 && folio_test_large(folio)) {
> > +		folio_unlock(folio);
> > +		return false;
> > +	}
> >  	/*
> >  	 * Ok, we've got the only folio reference from our mapping
> >  	 * and the folio is locked, it's dark out, and we're wearing
> >
> Why not simply:
>
> diff --git a/mm/memory.c b/mm/memory.c
> index e17669d4f72f7..46d286bd450c6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3498,6 +3498,10 @@ static vm_fault_t wp_page_shared(struct vm_fault
> *vmf, struct folio *folio)
>   static bool wp_can_reuse_anon_folio(struct folio *folio,
>                                       struct vm_area_struct *vma)
>   {
> +
> +       if (folio_test_large(folio))
> +               return false;
> +
>          /*
>           * We have to verify under folio lock: these early checks are
>           * just an optimization to avoid locking the folio and freeing
>
> We could only possibly succeed if we are the last one mapping a PTE
> either way. No we simply give up right away for the time being.

nice !

>
> --
> Cheers,
>
> David / dhildenb
>
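
For a rough sense of scale, here is a userspace sketch that models the
simplified check and the resulting waste. It is not kernel code:
struct toy_folio and the toy_* helpers are hypothetical names made up for
illustration, and the 4 KiB base page size is an assumption (the thread
only gives the 1024 KiB mTHP size and the 1 GiB buffer).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct folio: only what the check inspects. */
struct toy_folio {
	int refcount;      /* models folio_ref_count() */
	size_t nr_pages;   /* 1 = small folio, >1 = large folio */
};

/* Toy version of the simplified rule: a large folio is never reused. */
static bool toy_can_reuse(const struct toy_folio *f)
{
	if (f->nr_pages > 1)
		return false;
	return f->refcount == 1;
}

/* Pages left allocated but unused if the last subpage kept a whole folio. */
static size_t toy_wasted_pages(const struct toy_folio *f)
{
	assert(f->nr_pages >= 1);
	return f->nr_pages - 1;
}

/* Total waste in MiB for a buffer of buf_bytes backed by such folios. */
static size_t toy_wasted_mib(size_t buf_bytes, size_t folio_bytes,
			     size_t page_bytes)
{
	assert(page_bytes > 0 && folio_bytes % page_bytes == 0);
	struct toy_folio f = {
		.refcount = 1,
		.nr_pages = folio_bytes / page_bytes,
	};
	size_t folios = buf_bytes / folio_bytes;

	return folios * toy_wasted_pages(&f) * page_bytes / (1024 * 1024);
}
```

Under these assumptions,
toy_wasted_mib(1024UL * 1024 * 1024, 1024 * 1024, 4096) gives 1020, which
is in the same range as the gap between the two "used" figures after the
test program runs (2149 MiB without the patch, 1122 MiB with it).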