From: Barry Song <21cnbao@gmail.com>
Date: Mon, 17 Jun 2024 19:55:21 +0800
Subject: Re: [PATCH] mm/page_alloc: skip THP-sized PCP list when allocating non-CMA THP-sized page
To: Baolin Wang
Cc: yangge1116, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, liuzixing@hygon.cn, Johannes Weiner, Vlastimil Babka, Zi Yan
References: <1717492460-19457-1-git-send-email-yangge1116@126.com> <82d31425-86d7-16fa-d09b-fcb203de0986@126.com> <7087d0af-93d8-4d49-94f4-dc846a4e2b98@linux.alibaba.com> <6dc8df31-eb01-4382-8467-c5510f75531e@linux.alibaba.com>
In-Reply-To: <6dc8df31-eb01-4382-8467-c5510f75531e@linux.alibaba.com>

On Mon, Jun 17, 2024 at 7:36 PM Baolin Wang wrote:
>
> On 2024/6/17 18:43, Barry Song wrote:
> > On Thu, Jun 6, 2024 at 3:07 PM Baolin Wang wrote:
> >>
> >> On 2024/6/4 20:36, yangge1116 wrote:
> >>>
> >>> On 2024/6/4 8:01 PM, Baolin Wang wrote:
> >>>> Cc Johannes, Zi and Vlastimil.
> >>>>
> >>>> On 2024/6/4 17:14, yangge1116@126.com wrote:
> >>>>> From: yangge
> >>>>>
> >>>>> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
> >>>>> THP-sized allocations") no longer differentiates the migration type
> >>>>> of pages in THP-sized PCP list, it's possible to get a CMA page from
> >>>>> the list, in some cases, it's not acceptable, for example, allocating
> >>>>> a non-CMA page with PF_MEMALLOC_PIN flag returns a CMA page.
> >>>>>
> >>>>> The patch forbids allocating non-CMA THP-sized page from THP-sized
> >>>>> PCP list to avoid the issue above.
> >>>>>
> >>>>> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for
> >>>>> THP-sized allocations")
> >>>>> Signed-off-by: yangge
> >>>>> ---
> >>>>>  mm/page_alloc.c | 10 ++++++++++
> >>>>>  1 file changed, 10 insertions(+)
> >>>>>
> >>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >>>>> index 2e22ce5..0bdf471 100644
> >>>>> --- a/mm/page_alloc.c
> >>>>> +++ b/mm/page_alloc.c
> >>>>> @@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone *preferred_zone,
> >>>>>          WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
> >>>>>          if (likely(pcp_allowed_order(order))) {
> >>>>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >>>>> +        if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
> >>>>> +                        order != HPAGE_PMD_ORDER) {
> >>>>
> >>>> Seems you will also miss the non-CMA THP from the PCP, so I wonder if
> >>>> we can add a migratetype comparison in __rmqueue_pcplist(), and if
> >>>> it's not suitable, then fallback to buddy?
> >>>
> >>> Yes, we may miss some non-CMA THPs in the PCP. But, if add a migratetype
> >>> comparison in __rmqueue_pcplist(), we may need to compare many times
> >>> because of pcp batch.
> >>
> >> I mean we can only compare once, focusing on CMA pages.
> >>
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index 3734fe7e67c0..960a3b5744d8 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -2973,6 +2973,11 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
> >>                  }
> >>
> >>                  page = list_first_entry(list, struct page, pcp_list);
> >> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >> +                if (order == HPAGE_PMD_ORDER && !is_migrate_movable(migratetype) &&
> >> +                    is_migrate_cma(get_pageblock_migratetype(page)))
> >> +                        return NULL;
> >> +#endif
> >
> > This doesn't seem ideal either. It's possible that the PCP still has many
> > non-CMA folios, but due to bad luck, the first entry is "always" CMA.
> > In this case, allocations with is_migrate_movable(migratetype) == false
> > will always lose the chance to use the PCP. It also appears to incur a
> > PCP spin lock/unlock.
>
> Yes, just some ideas to mitigate the issue...
>
> >
> > I don't see an ideal solution unless we bring back the CMA PCP :-)
>
> Tend to agree, and adding a CMA PCP seems the overhead can be acceptable?

Yes, probably.

Hi Ge,

Could we printk the size of struct per_cpu_pages before and after adding 1
to NR_PCP_LISTS? Does it grow the struct by one cacheline?

struct per_cpu_pages {
        spinlock_t lock;        /* Protects lists field */
        int count;              /* number of pages in the list */
        int high;               /* high watermark, emptying needed */
        int high_min;           /* min high watermark */
        int high_max;           /* max high watermark */
        int batch;              /* chunk size for buddy add/remove */
        u8 flags;               /* protected by pcp->lock */
        u8 alloc_factor;        /* batch scaling factor during allocate */
#ifdef CONFIG_NUMA
        u8 expire;              /* When 0, remote pagesets are drained */
#endif
        short free_count;       /* consecutive free count */

        /* Lists of pages, one per migrate type stored on the pcp-lists */
        struct list_head lists[NR_PCP_LISTS];
} ____cacheline_aligned_in_smp;