From: Barry Song <21cnbao@gmail.com>
Date: Wed, 19 Jun 2024 20:20:40 +1200
Subject: Re: [PATCH] mm/page_alloc: skip THP-sized PCP list when allocating non-CMA THP-sized page
To: Ge Yang
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, baolin.wang@linux.alibaba.com, liuzixing@hygon.cn

On Wed, Jun 19, 2024 at 5:35 PM Ge Yang wrote:
>
> On 2024/6/18 15:51, yangge1116 wrote:
> >
> > On 2024/6/18 2:58 PM, Barry Song wrote:
> >> On Tue, Jun 18, 2024 at 6:56 PM yangge1116 wrote:
> >>>
> >>> On 2024/6/18 12:10 PM, Barry Song wrote:
> >>>> On Tue, Jun 18, 2024 at 3:32 PM yangge1116 wrote:
> >>>>>
> >>>>> On 2024/6/18 9:55 AM, Barry Song wrote:
> >>>>>> On Tue, Jun 18, 2024 at 9:36 AM yangge1116 wrote:
> >>>>>>>
> >>>>>>> On 2024/6/17 8:47 PM, yangge1116 wrote:
> >>>>>>>>
> >>>>>>>> On 2024/6/17 6:26 PM, Barry Song wrote:
> >>>>>>>>> On Tue, Jun 4, 2024 at 9:15 PM wrote:
> >>>>>>>>>>
> >>>>>>>>>> From: yangge
> >>>>>>>>>>
> >>>>>>>>>> Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP
> >>>>>>>>>> list for THP-sized allocations") no longer differentiates the
> >>>>>>>>>> migration type of pages in the THP-sized PCP list, it is
> >>>>>>>>>> possible to get a CMA page from that list. In some cases this
> >>>>>>>>>> is not acceptable; for example, allocating a non-CMA page
> >>>>>>>>>> under the PF_MEMALLOC_PIN flag can return a CMA page.
> >>>>>>>>>>
> >>>>>>>>>> The patch forbids allocating a non-CMA THP-sized page from the
> >>>>>>>>>> THP-sized PCP list to avoid the issue above.
> >>>>>>>>>
> >>>>>>>>> Could you please describe the impact on users in the commit log?
> >>>>>>>>
> >>>>>>>> If a large amount of CMA memory is configured in the system (for
> >>>>>>>> example, CMA memory accounts for 50% of system memory), starting
> >>>>>>>> a virtual machine with device passthrough will get stuck.
> >>>>>>>>
> >>>>>>>> While starting the virtual machine,
> >>>>>>>> pin_user_pages_remote(..., FOLL_LONGTERM, ...) is called to pin
> >>>>>>>> memory. If a page is in a CMA area, pin_user_pages_remote() will
> >>>>>>>> migrate it from the CMA area to a non-CMA area because of the
> >>>>>>>> FOLL_LONGTERM flag. If non-movable allocation requests return
> >>>>>>>> CMA memory, pin_user_pages_remote() will enter an endless loop.
> >>>>>>>>
> >>>>>>>> backtrace:
> >>>>>>>> pin_user_pages_remote
> >>>>>>>> ----__gup_longterm_locked // endless loop in this function
> >>>>>>>> --------__get_user_pages_locked
> >>>>>>>> --------check_and_migrate_movable_pages // check always fails, keeps migrating
> >>>>>>>> ------------migrate_longterm_unpinnable_pages
> >>>>>>>> ----------------alloc_migration_target // non-movable allocation
> >>>>>>>>
> >>>>>>>>> Is it possible that some CMA memory might be used by
> >>>>>>>>> non-movable allocation requests?
> >>>>>>>>
> >>>>>>>> Yes.
> >>>>>>>>
> >>>>>>>>> If so, will CMA somehow become unable to migrate, causing
> >>>>>>>>> cma_alloc() to fail?
> >>>>>>>>
> >>>>>>>> No, it will cause endless loops in __gup_longterm_locked(). If
> >>>>>>>> non-movable allocation requests return CMA memory,
> >>>>>>>> migrate_longterm_unpinnable_pages() will migrate a CMA page to
> >>>>>>>> another CMA page, which is useless and causes endless loops in
> >>>>>>>> __gup_longterm_locked().
> >>>>>>
> >>>>>> This is only one perspective. We also need to consider the impact
> >>>>>> on CMA itself. For example, when CMA is borrowed by THP, and we
> >>>>>> need to reclaim it through cma_alloc() or dma_alloc_coherent(),
> >>>>>> we must move those pages out to ensure CMA's users can retrieve
> >>>>>> that contiguous memory.
> >>>>>>
> >>>>>> Currently, CMA's memory is occupied by non-movable pages, meaning
> >>>>>> we can't relocate them. As a result, cma_alloc() is more likely
> >>>>>> to fail.
> >>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list
> >>>>>>>>>> for THP-sized allocations")
> >>>>>>>>>> Signed-off-by: yangge
> >>>>>>>>>> ---
> >>>>>>>>>>  mm/page_alloc.c | 10 ++++++++++
> >>>>>>>>>>  1 file changed, 10 insertions(+)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >>>>>>>>>> index 2e22ce5..0bdf471 100644
> >>>>>>>>>> --- a/mm/page_alloc.c
> >>>>>>>>>> +++ b/mm/page_alloc.c
> >>>>>>>>>> @@ -2987,10 +2987,20 @@ struct page *rmqueue(struct zone *preferred_zone,
> >>>>>>>>>>         WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
> >>>>>>>>>>
> >>>>>>>>>>         if (likely(pcp_allowed_order(order))) {
> >>>>>>>>>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >>>>>>>>>> +               if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA ||
> >>>>>>>>>> +                               order != HPAGE_PMD_ORDER) {
> >>>>>>>>>> +                       page = rmqueue_pcplist(preferred_zone, zone, order,
> >>>>>>>>>> +                                       migratetype, alloc_flags);
> >>>>>>>>>> +                       if (likely(page))
> >>>>>>>>>> +                               goto out;
> >>>>>>>>>> +               }
> >>>>>>>>>
> >>>>>>>>> This seems not ideal, because non-CMA THP gets no chance to
> >>>>>>>>> use the PCP list. But it still seems better than causing CMA
> >>>>>>>>> allocations to fail.
> >>>>>>>>>
> >>>>>>>>> Is there a possible approach to avoid adding CMA THP into the
> >>>>>>>>> PCP list in the first place? Otherwise, we might need a
> >>>>>>>>> separate PCP list for CMA.
> >>>>>>>>>
> >>>>>>>
> >>>>>>> The vast majority of THP-sized allocations are GFP_MOVABLE, so
> >>>>>>> avoiding adding CMA THP into the PCP list may incur a slight
> >>>>>>> performance penalty.
> >>>>>>
> >>>>>> But the majority of movable pages aren't CMA, right?
> >>>>>>
> >>>>>> Do we have an estimate for adding back a CMA THP PCP list? Will
> >>>>>> per_cpu_pages grow by a new cacheline, which the original THP
> >>>>>> change intended to avoid by having only one PCP list [1]?
> >>>>>>
> >>>>>> [1] https://patchwork.kernel.org/project/linux-mm/patch/20220624125423.6126-3-mgorman@techsingularity.net/
> >>>>>
> >>>>> The size of struct per_cpu_pages is 256 bytes in the current code,
> >>>>> which contains commit 5d0a661d808f ("mm/page_alloc: use only one
> >>>>> PCP list for THP-sized allocations"):
> >>>>>
> >>>>> crash> struct per_cpu_pages
> >>>>> struct per_cpu_pages {
> >>>>>     spinlock_t lock;
> >>>>>     int count;
> >>>>>     int high;
> >>>>>     int high_min;
> >>>>>     int high_max;
> >>>>>     int batch;
> >>>>>     u8 flags;
> >>>>>     u8 alloc_factor;
> >>>>>     u8 expire;
> >>>>>     short free_count;
> >>>>>     struct list_head lists[13];
> >>>>> }
> >>>>> SIZE: 256
> >>>>>
> >>>>> After reverting commit 5d0a661d808f ("mm/page_alloc: use only one
> >>>>> PCP list for THP-sized allocations"), the size of struct
> >>>>> per_cpu_pages is 272 bytes:
> >>>>>
> >>>>> crash> struct per_cpu_pages
> >>>>> struct per_cpu_pages {
> >>>>>     spinlock_t lock;
> >>>>>     int count;
> >>>>>     int high;
> >>>>>     int high_min;
> >>>>>     int high_max;
> >>>>>     int batch;
> >>>>>     u8 flags;
> >>>>>     u8 alloc_factor;
> >>>>>     u8 expire;
> >>>>>     short free_count;
> >>>>>     struct list_head lists[15];
> >>>>> }
> >>>>> SIZE: 272
> >>>>>
> >>>>> It seems commit 5d0a661d808f ("mm/page_alloc: use only one PCP
> >>>>> list for THP-sized allocations") saves one cacheline.
> >>>>
> >>>> The proposal is not to revert the patch but to add one CMA PCP
> >>>> list, so it would be "struct list_head lists[14]"; in this case,
> >>>> is the size still 256?
> >>>
> >>> Yes, the size is still 256. If we add one PCP list, we will have 2
> >>> PCP lists for THP. One PCP list is used by MIGRATE_UNMOVABLE, and
> >>> the other PCP list is used by MIGRATE_MOVABLE and
> >>> MIGRATE_RECLAIMABLE. Is that right?
> >>
> >> I am not quite sure about MIGRATE_RECLAIMABLE, as we want CMA to be
> >> used only by movable allocations. So it might be:
> >> MOVABLE and NON-MOVABLE.
> >
> > One PCP list is used by UNMOVABLE pages, and the other PCP list is
> > used by MOVABLE pages; that seems feasible. The UNMOVABLE PCP list
> > contains MIGRATE_UNMOVABLE and MIGRATE_RECLAIMABLE pages, and the
> > MOVABLE PCP list contains MIGRATE_MOVABLE pages.
> >
> > Is the following modification feasible?
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> -#define NR_PCP_THP 1
> +#define NR_PCP_THP 2
> +#define PCP_THP_MOVABLE 0
> +#define PCP_THP_UNMOVABLE 1
> #else
> #define NR_PCP_THP 0
> #endif
>
> static inline unsigned int order_to_pindex(int migratetype, int order)
> {
> +       int pcp_type = migratetype;
> +
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>         if (order > PAGE_ALLOC_COSTLY_ORDER) {
>                 VM_BUG_ON(order != HPAGE_PMD_ORDER);
> -               return NR_LOWORDER_PCP_LISTS;
> +
> +               if (migratetype != MIGRATE_MOVABLE)
> +                       pcp_type = PCP_THP_UNMOVABLE;
> +               else
> +                       pcp_type = PCP_THP_MOVABLE;
> +
> +               return NR_LOWORDER_PCP_LISTS + pcp_type;
>         }
> #else
>         VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
> #endif
>
> -       return (MIGRATE_PCPTYPES * order) + migratetype;
> +       return (MIGRATE_PCPTYPES * order) + pcp_type;
> }
>

A minimum change might be the following; then you can drop most of the
new code.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 120a317d0938..cfe1e0625e38 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -588,6 +588,7 @@ static void bad_page(struct page *page, const char *reason)

 static inline unsigned int order_to_pindex(int migratetype, int order)
 {
+       bool __maybe_unused movable;
 #ifdef CONFIG_CMA
        /*
         * We shouldn't get here for MIGRATE_CMA if those pages don't
@@ -600,7 +601,8 @@ static inline unsigned int order_to_pindex(int migratetype, int order)
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
        if (order > PAGE_ALLOC_COSTLY_ORDER) {
                VM_BUG_ON(order != pageblock_order);
-               return NR_LOWORDER_PCP_LISTS;
+               movable = migratetype == MIGRATE_MOVABLE;
+               return NR_LOWORDER_PCP_LISTS + movable;
        }
 #else
        VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);

>
> > @@ -521,7 +529,7 @@ static inline int pindex_to_order(unsigned int pindex)
> >         int order = pindex / MIGRATE_PCPTYPES;
> >
> > #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > -       if (pindex == NR_LOWORDER_PCP_LISTS)
> > +       if (order > PAGE_ALLOC_COSTLY_ORDER)
> >                 order = HPAGE_PMD_ORDER;
> > #else
> >         VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
> >
>
> >>>>>>> Commit 1d91df85f399 takes a similar filtering approach, and I
> >>>>>>> mainly referred to it.
> >>>>>>>
> >>>>>>>>>> +#else
> >>>>>>>>>>                 page = rmqueue_pcplist(preferred_zone, zone, order,
> >>>>>>>>>>                                 migratetype, alloc_flags);
> >>>>>>>>>>                 if (likely(page))
> >>>>>>>>>>                         goto out;
> >>>>>>>>>> +#endif
> >>>>>>>>>>         }
> >>>>>>>>>>
> >>>>>>>>>>         page = rmqueue_buddy(preferred_zone, zone, order, alloc_flags,
> >>>>>>>>>> --
> >>>>>>>>>> 2.7.4
> >>>>>>>>>
> >>>>>>>>> Thanks
> >>>>>>>>> Barry
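
For what it's worth, below is a quick standalone userspace sketch of the
index math under the two-THP-list scheme; it is only an illustration, not
kernel code. The constants (HPAGE_PMD_ORDER == 9, MIGRATE_PCPTYPES == 3,
PAGE_ALLOC_COSTLY_ORDER == 3) are assumptions mirroring a common x86-64
config, and the two helpers are simplified copies that follow the
minimum-change diff above (movable THP takes the extra list).

/* Standalone sketch, not kernel code: round-trips every (migratetype,
 * order) pair through the proposed order_to_pindex()/pindex_to_order(). */
#include <assert.h>
#include <stdio.h>

#define PAGE_ALLOC_COSTLY_ORDER 3
#define MIGRATE_PCPTYPES        3  /* UNMOVABLE, MOVABLE, RECLAIMABLE */
#define MIGRATE_UNMOVABLE       0
#define MIGRATE_MOVABLE         1
#define HPAGE_PMD_ORDER         9  /* 2MiB THP with 4KiB base pages */
#define NR_LOWORDER_PCP_LISTS   (MIGRATE_PCPTYPES * (PAGE_ALLOC_COSTLY_ORDER + 1))
#define NR_PCP_THP              2  /* proposed: non-movable + movable THP lists */

static unsigned int order_to_pindex(int migratetype, int order)
{
        if (order > PAGE_ALLOC_COSTLY_ORDER) {
                /* Movable THP gets the extra list (pindex 13 here);
                 * everything else shares the first THP list (12). */
                int movable = (migratetype == MIGRATE_MOVABLE);
                return NR_LOWORDER_PCP_LISTS + movable;
        }
        return (MIGRATE_PCPTYPES * order) + migratetype;
}

static int pindex_to_order(unsigned int pindex)
{
        int order = pindex / MIGRATE_PCPTYPES;

        /* Both THP pindexes (12 and 13) divide down to 4, which is
         * > PAGE_ALLOC_COSTLY_ORDER, so one comparison covers both. */
        if (order > PAGE_ALLOC_COSTLY_ORDER)
                order = HPAGE_PMD_ORDER;
        return order;
}

int main(void)
{
        for (int mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
                /* Low-order pairs map into the first 12 lists. */
                for (int order = 0; order <= PAGE_ALLOC_COSTLY_ORDER; order++)
                        assert(pindex_to_order(order_to_pindex(mt, order)) == order);
                /* THP-sized entries land past the low-order lists. */
                unsigned int p = order_to_pindex(mt, HPAGE_PMD_ORDER);
                assert(p < NR_LOWORDER_PCP_LISTS + NR_PCP_THP);
                assert(pindex_to_order(p) == HPAGE_PMD_ORDER);
        }
        printf("non-movable THP -> pindex %u, movable THP -> pindex %u\n",
               order_to_pindex(MIGRATE_UNMOVABLE, HPAGE_PMD_ORDER),
               order_to_pindex(MIGRATE_MOVABLE, HPAGE_PMD_ORDER));
        return 0;
}

It also shows why pindex_to_order() needs the "order >
PAGE_ALLOC_COSTLY_ORDER" comparison: with two THP lists, the old
"pindex == NR_LOWORDER_PCP_LISTS" check would miss the second list.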