From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 276A9D277D0 for ; Sat, 10 Jan 2026 05:48:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 60E256B0088; Sat, 10 Jan 2026 00:48:55 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 5BB7C6B0089; Sat, 10 Jan 2026 00:48:55 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 472106B008A; Sat, 10 Jan 2026 00:48:55 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 345B46B0088 for ; Sat, 10 Jan 2026 00:48:55 -0500 (EST) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id CAFE455AF8 for ; Sat, 10 Jan 2026 05:48:54 +0000 (UTC) X-FDA: 84314975388.10.914B3A8 Received: from mail-wm1-f50.google.com (mail-wm1-f50.google.com [209.85.128.50]) by imf09.hostedemail.com (Postfix) with ESMTP id B9587140003 for ; Sat, 10 Jan 2026 05:48:52 +0000 (UTC) Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=mgycGVuz; spf=pass (imf09.hostedemail.com: domain of jiaqiyan@google.com designates 209.85.128.50 as permitted sender) smtp.mailfrom=jiaqiyan@google.com; dmarc=pass (policy=reject) header.from=google.com; arc=pass ("google.com:s=arc-20240605:i=1") ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1768024132; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=TbNBS6+582THKiphuJq71Lq4d9pP5yCfw3WH36b5sQw=; b=Mba/4S+Ts1208dyEyS+iY4M1FYADIwl6AgEjAS7/gductVcAFbp9OdZ5cHgrpJhfYEyBZn To/wY9ZfTQ1QbBKqDstqNO57Yy+eJWThw0352g6JgQECMKjFEsBGN6OFYqmlxADGEeUp7M DaYIQZx9talEMCsiS2z8+AL5fQHFwaU= ARC-Authentication-Results: i=2; imf09.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=mgycGVuz; spf=pass (imf09.hostedemail.com: domain of jiaqiyan@google.com designates 209.85.128.50 as permitted sender) smtp.mailfrom=jiaqiyan@google.com; dmarc=pass (policy=reject) header.from=google.com; arc=pass ("google.com:s=arc-20240605:i=1") ARC-Seal: i=2; s=arc-20220608; d=hostedemail.com; t=1768024132; a=rsa-sha256; cv=pass; b=48jgi8mBUgPL3LeqtXIu4jMI6uD6K5sia6fW3+R6O7if2Bcvxu3252Fk98Riecop8dLXXW 36TXM6NUW2msvzTb5/LNufBRYWiS0GFxj22fpkDO1b2BI/x2XM4BcBMur0eEjL3v2k2Au+ 9Yuwgc05lT72Fod+u0MHd1DIVPn6S0g= Received: by mail-wm1-f50.google.com with SMTP id 5b1f17b1804b1-4779a4fb9bfso29025e9.0 for ; Fri, 09 Jan 2026 21:48:52 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1768024131; cv=none; d=google.com; s=arc-20240605; b=CfquWpklykAfGiLPDRNPFouWr25xf5vLU1qybTVI9s4p+RC2Iat+TlSMuaV//GTG1b 9/wbtXQCLtK8YOte5R6vCW2oRSAZk3YidX1IgwduA+jyMr9/Bn7yVcpvLLQXFFvqm2uo IGNJv2aq2vC9/T5KzGJpDUickGzMujRffkeH5FwGtSQaMMuOZdYesKCf5EbQZKvMMRs0 THES49NPm1TSgRKL6Cx+34ShT3c8ex8JHepMkkfSLDuAJtOrcIB35hS4Lf0BcFT8eGCZ ldkR2SkN9hyyJMNbsx5yBF5+2ZtviYSe03RcYXC5vN20QI1FLrmDJwUuMIy+wel91dxh 1ixw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=TbNBS6+582THKiphuJq71Lq4d9pP5yCfw3WH36b5sQw=; fh=3W9AZdv1TAqM/tz+A96ovMIUQCou85FcjIf8a8O8I0w=; b=Y/JuFZ1xKlBJ1aIFX2YbBbhb+UvSqVkMisW2OiCqqu1H9K/WQSeKAdvgra6yGpfuWu giF3E2KlivjE6gDLNkuRVe7w624MWCJH1AasJ9OMUKFau6wScfemBzSUMqfn7t610Iea aPrG9NovsLw/EcoeshOM5dGxkQr0NjswnSEDeUoW8vox+Dq/Z5Nx/EIZDguso3ttSlEm DIoaCadlXoPHNzl4Jclqg0dwweFfX3WW6Gca+1vI4ivmz0FkNfpy7YJNkNQ3+w/ilY6H YF3hTTKLLc/5SJCZhqU7VKS0sV+MdSUPdutekmBlFdqj72o46VtaXfpFwQKTNtT6eCaa Ijsg==; darn=kvack.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1768024131; x=1768628931; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=TbNBS6+582THKiphuJq71Lq4d9pP5yCfw3WH36b5sQw=; b=mgycGVuzYi28NGDmTnVEai2OFte/+gryEij6kfZ7I2uDvKINYCgt7kut6sJ4RkGYtz 7LraY5z0ULcfWLfT8s/XWqixp6Digs7DNR9BuNXz05O0P+TQKSBOzWk+S1yvfSlr/bHv MoSkrkr54LTlI6PRr4ZKLqQGC/Ajjm40OKoFQoH3k+q2sWyd0lbFBvizMd4Rnhdtgpyr MMomPsUWFd5Nz2XVJkVpOa/1Z11GDyqefqhrKS6zKUSla7F4SLFI9gn/rIEV58t7t5IF vq9TpTtrdF7SoolIfJYgOFd3SYzQ66G0h/A2JGdhLSbjg8P/KhACZeQ8FfP662lGScNI cGbw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1768024131; x=1768628931; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=TbNBS6+582THKiphuJq71Lq4d9pP5yCfw3WH36b5sQw=; b=wCsufukiTPfyo1P9G25EvYcfvuQYd+maC0UFoMWCinAPYnNhsXO8wP63Z3qdN3w42E b0UkUFxeN7jMX9JHEHeNvXsu2mYx9sV8IsB6o8rywcRp3XbpXQtkdNOIaXaPV2Dnqk8g lNHJLWyQdWACAp7YbK8UIhL6EOdILdRsRebxfvItW088+RAVqZUMeTWSIXLkqpKYW/cz sboPpDhoHEfxsnBhqArEzVU+1PmonhoCGLVV2GG9ISQQJYYmYYNoKbj9iVRnFZAXPHwQ l+DAelsP4ZJGpmFpdUdUJ3V3xCuCC4n66s983V6olyxo7favPlu2b7SwE4B6m4SS3UNT 79lw== X-Forwarded-Encrypted: i=1; AJvYcCVng7UYGE0yOY/U3nQVcN1JVCI/yomRaghg8EjtbbuBfs+yJNBoV2UVH6jpCo0bCGFz2Acw9aZgBg==@kvack.org X-Gm-Message-State: AOJu0Ywgqj++XH9uEld0NTd0DBFQ01GamBBlzhCYvMoqvBY8b+UtdohD /zXCZz5y0mn977TPrC7Zw9PVX800Zarxw7rU9eqA1453UTX0f0FbgOuHMsjRpeMNPKRxibklv1I pZkHJ+YGG1MQ9DTPncAEENAbf+IaZ3WbPRyR9mJZA X-Gm-Gg: AY/fxX6vbtZkdNXUdbm8h6egvweZAgqyz/1+AbucDAYkx9GFMLJa53/fKP5K57b2wMM KAlTYovIax5J9qm8RTkmB2Wx8XNsQg5gvt5oZZdI+Dog5yKWMY0nDiW9I1uL3YF4Z3NiLUecXua IdbV5U0jGlwp2PFjucSLLHMxBfOLRR/bNhABkOwripuDx2pRHU44e53ib5qyhsZGAXEEGXxY1PD XDBfg8leRjYT4PFYVNwe1vj/kwNW37rtQx6Ey+QRToD9nEhm0qvYUg1SshyvUKh9HHJ/+SAhWUA jVn72EdVhKpFdd/ARoXZubVRufVo X-Received: by 2002:a05:600c:6a93:b0:477:76ea:ba7a with SMTP id 5b1f17b1804b1-47ec5b18753mr233785e9.3.1768024130914; Fri, 09 Jan 2026 21:48:50 -0800 (PST) MIME-Version: 1.0 References: <20251219183346.3627510-1-jiaqiyan@google.com> <20251219183346.3627510-3-jiaqiyan@google.com> In-Reply-To: From: Jiaqi Yan Date: Fri, 9 Jan 2026 21:48:39 -0800 X-Gm-Features: AZwV_Qi18lgv65yORn7Pvp_oZbAw-fLVKU2au5YVXPT1OwuP166-f_JTqoWbsPw Message-ID: Subject: Re: [PATCH v2 2/3] mm/page_alloc: only free healthy pages in high-order HWPoison folio To: Harry Yoo Cc: jackmanb@google.com, hannes@cmpxchg.org, linmiaohe@huawei.com, ziy@nvidia.com, willy@infradead.org, nao.horiguchi@gmail.com, david@redhat.com, lorenzo.stoakes@oracle.com, william.roche@oracle.com, tony.luck@intel.com, wangkefeng.wang@huawei.com, jane.chu@oracle.com, akpm@linux-foundation.org, osalvador@suse.de, muchun.song@linux.dev, rientjes@google.com, duenwen@google.com, jthoughton@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org, surenb@google.com, mhocko@suse.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Stat-Signature: dqmzbegkgf7ehnwi1tttrk7dokmhhwky X-Rspam-User: X-Rspamd-Queue-Id: B9587140003 X-Rspamd-Server: rspam08 X-HE-Tag: 1768024132-427412 X-HE-Meta: U2FsdGVkX19aHHhuHIDESFz5kRiQHbTgrJ0XH9ecswQozH5NKnjhCJwej5ac/ev84JWOACXTq8PSJN8gpbbOYFW9OHaeMRImbehUw9IgD/8lWM8QZYb2cx/XgfvGG/3LD63dbS0nXmoN38NhtbOlF7RuoG0GiuZb9p4Vr44syv0M0/r+1urgJxsS3wI6BaSzomyNEMNgCSy3ElNWAE3i3qBJ7BBVQB64ozY4+G9HJw2M2oa+BGldcdOma8RU5qp8hqxZ1S6EaEg2oiELOtcJr/apgIBT4o9Ry+PGI/LuDQsOQe0qzrZH6xjq+t8NSd1NGhEYQWWhPE1C07jn7Z+9uMnNV+L7O+fTSplvwxz4UEfhqlHeX0sirJonziq3hWe7pafkcSsCI4V419r3qm6vrtBhmZw9p8/ddInb41iNwoigGgHVSsgrvmHoF5Hl9j/z/+2IBFxQh9QHIQdjGiU/aLTBBx+oKDPAiQJfQ1D62+4wMbSbqaPLWm2ZQsrq2H7zQl7GaRDsrN0OzqNlZkb1z3Bm+maiYTf6b9El/v0EO4lbsV2CRNdAmJXyH6j7zdSjRE3DrAe5ouwzUlGx+HwPaSG+XfiFZzrj/JkAR3+w3u05/2lTw3YsRv+xu3OHIildGKwgI30RzY69ouQ/rLKXUUuf1WoyW0vfFYH8O9wD0A+zoN7mjajg65KfiBtE7gjLq6pREm7aRBE+ryPc5jrQS6eb9m+UP1sDVt9lQzrk9g9J8g0bqE3g72lErAFY+jvdgoy545BlOcpVe1pHbU4odWo7a6xqHeBUInVjYr/tPymUHiNRIgOvQzXSvkILjpg2E6sItfW/LKvKqJcClzdsPPf430ohUR3BQDWiw2lqKQ8/rvV4v1J8lMM4MlRLXohh/MmUZrr+0geX02BYmn21bwEI2EDSe7Tcflig3WTUW8v1umYxWLP0QzKODOq7hFtNEZqUii5hhK8A7xr/ICE 8PubSuyx aGLn2zhUENs9mKqgkmkddvznv7yNs0bFqblkOCgspjNgNB9tz65Vk0MVIR8XyuWrgKFOBhWF1s7gw5hwiTNJeplZOdRwHF9RSC8kwbz2MwI8P+1cNExBmO09PgKWVbBto2PJIeqjsAyvC9TPSwoye4D3zgKJ5DsDgqf+iXlxcUgcosQrsLnJL5MkgRrTe0iH4q03LMuaLHDeFtFF1b9nsg74fdkM+S5wBrZBBAXepY8gFUEOk0pL9KTF527aJgSUYecWoWohET6pqqK/6Cstwfm/4LTAzKVrJIToN6gkDUbHyGhcD+yiFzKnyBTkV4qPSTzVT3rxHciuJsZWSORCgyeHTGCOqJJWp2lt7lYRHHa3lFMda9bngkuWUprk9hwT9qddy X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Jan 5, 2026 at 11:35=E2=80=AFPM Harry Yoo wr= ote: > > On Mon, Jan 05, 2026 at 07:40:00PM -0800, Jiaqi Yan wrote: > > On Tue, Dec 30, 2025 at 8:37=E2=80=AFPM Harry Yoo wrote: > > > On Tue, Dec 30, 2025 at 04:19:50PM -0800, Jiaqi Yan wrote: > > > > On Sun, Dec 28, 2025 at 5:15=E2=80=AFPM Harry Yoo wrote: > > > > > On Fri, Dec 26, 2025 at 05:50:59PM -0800, Jiaqi Yan wrote: > > > > > > On Mon, Dec 22, 2025 at 9:14=E2=80=AFPM Harry Yoo wrote: > > > > > > > On Fri, Dec 19, 2025 at 06:33:45PM +0000, Jiaqi Yan wrote: > > > > > In general, I think they're not designed to handle cases where > > > > > the allocation order and the free order differ (unless we split > > > > > metadata like __split_unmapped_folio() does). > > > > > > > > I believe the proper way to fix this is to do something similar to > > > > pgalloc_tag_split(), used by __split_unmapped_folio(). > > > > > > Yeah, that will work. > > > > > > The difference is that pgalloc_tag_split(), split_page_owner(), and > > > split_page_memcg() only support splitting the whole page into many (>= 1) > > > smaller pieces, but we're trying to create only one smaller page from= the > > > original page. > > > > > > > When we split a > > > > new block from the original folio, we create a compound page from t= he > > > > block (just the way prep_compound_page_to_free does) and link the > > > > alloc tag of the original head page to the head of the new compound > > > > page. > > > > > > > > Something like copy_alloc_tag (to be added in v3) below to demo > > > > my idea, assuming tag=3Dpgalloc_tag_get(original head page): > > > > Actually I had another idea for page metadata like page alloc tag and > > page owner - how about we do the special handling AFTER > > free_pages_prepare()? > > Sounds good to me. > > > Looking at free_pages_prepare, it actually breaks down compound page > > (if it is compound, using free_tail_page_prepare), adjust page flags, > > and deals page metadata: > > > > for (i =3D 1; i < (1 << order); i++) { > > if (compound) > > bad +=3D free_tail_page_prepare(page, page + i); > > ... > > page->flags.f &=3D ~PAGE_FLAGS_CHECK_AT_PREP; > > reset_page_owner(page, order); > > page_table_check_free(page, order); > > pgalloc_tag_sub(page, 1 << order); > > > > Even the original compound page has some HWPoison pages, these > > operations should be done in the same way when no HWPoison page exist. > > and we're not losing hwpoison flag after free_pages_prepare()... > that actually sounds fine. > > > The compound structure will be gone after free, the alloc tag will be > > reduced after free, the page owner will be reset after free. > > Right. > > > So instead of splitting these stuff into temporary compound pages, why > > not getting rid of them in the first place with free_pages_prepare()? > > > > This is unique to the freeing path, so we don't need to mimic > > everything __split_unmapped_folio() does. > > Not 100% sure how it'd work in the memdesc world, > but for today, right. > > > At high level, applying the idea to __free_pages_ok (or > > __free_frozen_pages) looks like this: > > > > +static bool compound_has_hwpoisoned(struct page *page, unsigned int or= der) > > { > > + if (order =3D=3D 0 || !PageCompound(page)) > > + return false; > > + > > + return folio_test_has_hwpoisoned(page_folio(page)); > > +} > > > > static void __free_pages_ok(struct page *page, unsigned int order, > > { > > unsigned long pfn =3D page_to_pfn(page); > > struct zone *zone =3D page_zone(page); > > + bool should_fhh =3D compound_has_hwpoisoned(page, order); > > > > - if (free_pages_prepare(page, order)) > > - free_one_page(zone, page, pfn, order, fpi_flags); > > + if (!free_pages_prepare(page, order)) > > + return; > > + > > + if (should_fhh) { > > + free_has_hwpoisoned(page, order, fpi_flags); > > + return; > > + } > > + > > + free_one_page(zone, page, pfn, order, fpi_flags); > > } > > > > Under this idea, free_has_hwpoisoned() doesn't need > > prepare_compound_page_to_free() anymore. It can just directly call > > free_one_page() when it selects the right order. > > Right. > > > The only imperfect part is compound_has_hwpoisoned(), that we need to > > make a call to whether free_has_hwpoisoned() or not ealier than > > free_pages_prepare(). > > Right. > > > free_pages_prepare() clears PAGE_FLAGS_SECOND > > flags on the first tail page of a compound, which clears > > PG_has_hwpoisoned. On the other hand, we can't easily exclude > > PG_has_hwpoisoned from PAGE_FLAGS_SECOND. Because PG_has_hwpoisoned =3D= =3D > > PG_active, free_page_is_bad() will confuse and complaint that the > > first tail page is still active. > > Adding a comment like this wouldn't hurt: ack. > > /* > * Should be called before decomposing compound page in > * free_pages_prepare(). > */ > bool should_fhh =3D compound_has_hwpoisoned(page, order); > > > I have implemented a working prototype and tested just fine. I can > > send it out as v3 after clean it up. > > > > > > +static void prep_compound_page_to_free(struct page *head, unsigned= int order, > > > > + unsigned long flags, struct > > > > alloc_tag *tag) > > > > +{ > > > > + head->flags.f =3D flags & (~PAGE_FLAGS_CHECK_AT_FREE); > > > > + head->mapping =3D NULL; > > > > + head->private =3D 0; > > > > + > > > > + clear_compound_head(head); > > > > + if (order) > > > > + prep_compound_page(head, order); > > > > + > > > > + copy_alloc_tag(head, tag); > > > > > > Do we need a similar thing for memcg? Probably not, since it should h= ave > > > > Yes, I think we don't need to do anything for memgc, it is already > > uncharged in __folio_put(). > > > > > been uncharged before freeing (as long as we're not using it for kmem= pages) > > > > You mean the folio was charged to kernel memory, right? From my > > reading of uncharge_folio(), it covers the kmem case. > > Yes but I think not all kmem pages go through uncharge_folio() given > free_pages_prepare() handles uncharging kmem pages. I see. Then doing free_has_hwpoisoned() *after* free_pages_prepare() also address this concern, right? > > > > Perhaps a comment on why we don't need to split memcg will be enough. > > > > > > Do we need a similar thing for page_owner? I think yes, although it w= on't > > > crash or cause a warning, it will be inconsistent if we split page_ow= ner in > > > split_page()/__split_unmapped_folio()/etc but not in > > > prep_compound_page_to_free(). > > > > > > > +} > > > > > > > > I tested now the WARNING from include/linux/alloc_tag.h:164 is gone > > > > for both 2M and 1G pages. BTW we also need to copy_alloc_tag() for > > > > HWPoison pages before pgalloc_tag_sub(). > > > > > > > > > > > > + total_freed +=3D nr_pages; > > > > > > > > + curr =3D PageHWPoison(next) ? next + 1 : next= ; > > > > > > > > + } > > > > > > > > + > > > > > > > > + pr_info("Excluded %lu hwpoison pages from folio\n", t= otal_hwp); > > > > > > > > + pr_info("Freed %#lx pages from folio\n", total_freed)= ; > > > > > > > > +} > > > > > > > > + > > > > > > > > void free_frozen_pages(struct page *page, unsigned int ord= er) > > > > > > > > { > > > > > > > > + struct folio *folio =3D page_folio(page); > > > > > > > > + > > > > > > > > + if (order > 0 && unlikely(folio_test_has_hwpoisoned(f= olio))) { > > > > > > > > + folio_clear_has_hwpoisoned(folio); > > > > > > > > + free_has_hwpoison_pages(page, order); > > > > > > > > + return; > > > > > > > > + } > > > > > > > > + > > > > > > > > > > > > > > It feels like it's a bit random place to do has_hwpoisoned ch= eck. > > > > > > > Can we move this to free_pages_prepare() where we have some > > > > > > > sanity checks (and also order-0 hwpoison page handling)? > > > > > > > > > > > > While free_pages_prepare() seems to be a better place to do the > > > > > > has_hwpoisoned check, it is not a good place to do > > > > > > free_has_hwpoison_pages(). > > > > > > > > > > Why is it not a good place for free_has_hwpoison_pages()? > > > > > > > > > > Callers of free_pages_prepare() are supposed to avoid freeing it = back to > > > > > the buddy or using the page when it returns false. > > > > > > > > What I mean is, callers of free_pages_prepare() wouldn't know from = the > > > > single false return value whether 1) they should completely bail ou= t > > > > or 2) they should retry with free_has_hwpoison_pages. > > > > > > I was thinking that once free_pages_prepare() returns false, it alrea= dy > > > has done all the work to isolate HWPoison pages and freed healthy por= tions, > > > so the caller doesn't have to do anything and can bail out completely= . > > > > Does this change the meaning of free_pages_prepare()'s return value? > > My first take is it returns if the preparation+check works are good. > > > > I don't know if doing the freeing work in free_pages_prepare() is a goo= d > > idea. > > As long as it's documented in a comment, why not. > > It doesn't make callers' code less readable. The semantic is the same; > callers should free the page only if free_pages_prepare() returns true. > > They don't have to care why they should not free it. > (either it should not be freed back to buddy because it's bad page, > or it had hwpoison page(s) and healthy portions are already freed) > > > But if it is fine, we can actually hide the ugly > > compound_has_hwpoisoned() entirely inside free_pages_prepare(): > > - true means preparation + check work are all good but caller needs to > > free prepared+checked pages itself > > - false means one of the two: > > - some check failed and it is impossible to safely free pages for cal= lers > > - this is a compound page with some HWPoison pages, and healthy > > pages are already freed safely > > Yes, something like this could be documented in free_pages_prepare(). > > > Doesn't seem very clean with a boolean return type... > > The return type could be an enum type, but that wouldn't affect > the caller's behavior (bailing out) anyway? > > No strong opinion from me on the type. I just noticed free_pages_prepare() does have an external caller compaction_free(), so I want to be cautious about adding new freeing behavoir to it. Maybe I can wrap compound_has_hwpoisoned(), free_pages_prepare(), and free_has_hwpoisoned() into free_pages_prepare_has_hwpoisoned() and make it private to page_alloc.c. Let me put this together in V3. I also want to correct my previous latency measurement. Both the compared group and baseline take less time (2ms vs 0.066ms), but it is not ~2x but ~30x. > > -- > Cheers, > Harry / Hyeonggon