From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <486c5773-c7fa-4e19-b429-90823ed2f7d5@redhat.com>
 <6dee6cd9-c67f-480f-b728-21c3a7b72004@redhat.com>
In-Reply-To: <6dee6cd9-c67f-480f-b728-21c3a7b72004@redhat.com>
From: Juan Yescas <jyescas@google.com>
Date: Tue, 9 Sep 2025 13:07:17 -0700
Subject: Re: [RFC PATCH] mm/page_alloc: Add PCP list for THP CMA
To: David Hildenbrand <david@redhat.com>
Cc: akash.tyagi@mediatek.com, Andrew Morton,
 angelogioacchino.delregno@collabora.com, hannes@cmpxchg.org,
 Brendan Jackman, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-mediatek@lists.infradead.org,
 Linux Memory Management List, matthias.bgg@gmail.com, Michal Hocko,
 Suren Baghdasaryan, Vlastimil Babka, wsd_upstream@mediatek.com, Zi Yan,
 Kalesh Singh, "T.J. Mercier", Isaac Manjarres
Sender: owner-linux-mm@kvack.org
Content-Type: multipart/alternative; boundary="000000000000b22515063e63dad7"

--000000000000b22515063e63dad7
Content-Type: text/plain; charset="UTF-8"

On Tue, Aug 5, 2025 at 2:08 PM David Hildenbrand <david@redhat.com> wrote:

> On 05.08.25 18:57, Juan Yescas wrote:
> > On Tue, Aug 5, 2025 at 2:58 AM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 05.08.25 03:22, Juan Yescas wrote:
> >>> On Mon, Aug 4, 2025 at 11:50 AM David Hildenbrand <david@redhat.com> wrote:
> >>>>
> >>>> On 04.08.25 20:20, Juan Yescas wrote:
> >>>>> Hi David/Zi,
> >>>>>
> >>>>> Is there any reason why the MIGRATE_CMA pages are not in the PCP
> >>>>> lists?
> >>>>>
> >>>>> There are many devices that need fast allocation of MIGRATE_CMA
> >>>>> pages, and they have to get them from the buddy allocator, which
> >>>>> is a bit slower in comparison to the PCP lists.
> >>>>>
> >>>>> We also have cases where the MIGRATE_CMA memory requirements are
> >>>>> big. For example, GPUs need MIGRATE_CMA memory in the range of
> >>>>> 30MiB to 500MiB. These cases would benefit if we have THPs for
> >>>>> CMAs.
> >>>>>
> >>>>> Could we add support for MIGRATE_CMA pages on the PCP and THP
> >>>>> lists?
> >>>>
> >>>> Remember how CMA memory is used:
> >>>>
> >>>> The owner allocates it through cma_alloc() and friends, where the CMA
> >>>> allocator will try allocating *specific physical memory regions* using
> >>>> alloc_contig_range(). It doesn't just go ahead and pick a random CMA
> >>>> page from the buddy (or PCP) lists. Doesn't work (just imagine having
> >>>> different CMA areas etc).
> >>>>
> >>>> Anybody else is free to use CMA pages for MOVABLE allocations. So we
> >>>> treat them as being MOVABLE on the PCP.
> >>>>
> >>>> Having a separate CMA PCP list doesn't solve or speed up anything,
> >>>> really.
> >>>>
> >>>
> >>> Thanks David for the quick overview.
> >>>
> >>>> I still have no clue what this patch here tried to solve: it doesn't
> >>>> make any sense.
> >>>>
> >>>
> >>> The story started with this out-of-tree patch that is part of Android.
> >>>
> >>> https://lore.kernel.org/lkml/cover.1604282969.git.cgoldswo@codeaurora.org/T/#u
> >>>
> >>> This patch introduced the __GFP_CMA flag that allocates pages from
> >>> MIGRATE_MOVABLE or MIGRATE_CMA. As a result, the MIGRATE_MOVABLE
> >>> pages in the PCP lists were consumed pretty fast. To solve this
> >>> issue, the PCP MIGRATE_CMA list was added. This list is initialized
> >>> by rmqueue_bulk() when it is empty. That's how we end up with the
> >>> PCP MIGRATE_CMA list in Android. In addition to this, the THP list
> >>> for MIGRATE_MOVABLE was allowed to contain MIGRATE_CMA pages. This
> >>> is causing THP MIGRATE_CMA pages to be used for THP MIGRATE_MOVABLE,
> >>> making later allocations from THP MIGRATE_CMA fail.
> >>
> >> Okay, so this patch here really is not suitable for the upstream kernel
> >> as is. It's purely targeted at the OOT Android patch.
> >>
> > Right, it is a temporary solution for the pinned MIGRATE_CMA pages.
> >
> >>>
> >>> These workarounds are mainly because we need to solve this issue
> >>> upstream:
> >>>
> >>> - When devices reserve big blocks of MIGRATE_CMA pages, the
> >>> underutilized MIGRATE_CMA can fall back to MIGRATE_MOVABLE and these
> >>> pages can be pinned, so if we require MIGRATE_CMA pages, the
> >>> allocations might fail.
> >>>
> >>> I remember that you presented the problem at LPC. Were you able to
> >>> make some progress on that?
> >>
> >> There is the problem of CMA pages getting allocated by someone for a
> >> MOVABLE allocation, to then short-term pin it for DMA. Long-term
> >> pinnings are disallowed (we just recently fixed a bug where we
> >> accidentally allowed it).
> >>
> > Nice, it is great the issue got caught and fixed upstream :)
> >
> >> One concern is that a steady stream of short-term pinnings can turn such
> >> pages unmovable. We discussed ideas on how to handle that, but there is
> >> no solution upstream yet.
> >
> > Are there any plans to continue the discussion? Is it on the priority
> > list?
>
> Ohh, it's somewheeeeeere on the todo list :)
>
> Do you (or one of your colleagues) have capacity to work on that? One
> idea was to flag folios as "pending on migration" and disallow any
> further short-term pins until migration is done. IIRC, there were other
> ideas (e.g., isolated pageblock).
>
> > Maybe
> > a topic we can discuss at LPC Japan?
>
> Sounds good, feel free to propose this as a topic. I will send out the
> announcement of the MM MC probably next week.
>

The topic has been proposed in the Kernel Memory Management track.

https://lpc.events/event/19/abstracts/2397/

Thanks
Juan

> --
> Cheers,
>
> David / dhildenb
>



--000000000000b22515063e63dad7--