From: Yang Shi
Date: Thu, 24 Oct 2024 13:00:43 -0700
Subject: Re: [PATCH hotfix 2/2] mm/thp: fix deferred split unqueue naming and locking
To: Hugh Dickins
Cc: Andrew Morton, Usama Arif, Wei Yang, "Kirill A. Shutemov",
 Matthew Wilcox, David Hildenbrand, Johannes Weiner, Baolin Wang,
 Barry Song, Kefeng Wang, Ryan Roberts, Nhat Pham, Zi Yan, Chris Li,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
In-Reply-To: <7dc6b280-cd87-acd1-1124-e512e3d2217d@google.com>
References: <760237a3-69d6-9197-432d-0306d52c048a@google.com>
 <7dc6b280-cd87-acd1-1124-e512e3d2217d@google.com>
On Wed, Oct 23, 2024 at 9:13 PM Hugh Dickins wrote:
>
> Recent changes are putting more pressure on THP deferred split queues:
> under load revealing long-standing races, causing list_del corruptions,
> "Bad page state"s and worse (I keep BUGs in both of those, so usually
> don't get to see how badly they end up without). The relevant recent
> changes being 6.8's mTHP, 6.10's mTHP swapout, and 6.12's mTHP swapin,
> improved swap allocation, and underused THP splitting.
>
> Before fixing locking: rename misleading folio_undo_large_rmappable(),
> which does not undo large_rmappable, to folio_unqueue_deferred_split(),
> which is what it does. But that and its out-of-line __callee are mm
> internals of very limited usability: add comment and WARN_ON_ONCEs to
> check usage; and return a bool to say if a deferred split was unqueued,
> which can then be used in WARN_ON_ONCEs around safety checks (sparing
> callers the arcane conditionals in __folio_unqueue_deferred_split()).
>
> Swapout: mem_cgroup_swapout() has been resetting folio->memcg_data 0
> without checking and unqueueing a THP folio from deferred split list;
> which is unfortunate, since the split_queue_lock depends on the memcg
> (when memcg is enabled); so swapout has been unqueueing such THPs later,
> when freeing the folio, using the pgdat's lock instead: potentially
> corrupting the memcg's list. __remove_mapping() has frozen refcount to
> 0 here, so no problem with calling folio_unqueue_deferred_split() before
> resetting memcg_data.
>
> That goes back to 5.4 commit 87eaceb3faa5 ("mm: thp: make deferred split
> shrinker memcg aware"): which included a check on swapcache before adding
> to deferred queue (which can now be removed), but no check on deferred
> queue before adding THP to swapcache (maybe other circumstances prevented
> it at that time, but not now).

If I remember correctly, before mTHP swapout a THP could only be added to
the deferred list once it had no PMD map, so shrink_page_list() did check
the THP's compound_mapcount (now called _entire_mapcount) before adding it
to the swap cache. Now the code just checks whether the large folio is on
the deferred list or not. (Rough sketch of that old check below.)

> Memcg-v1 move (deprecated): mem_cgroup_move_account() has been changing
> folio->memcg_data without checking and unqueueing a THP folio from the
> deferred list, sometimes corrupting "from" memcg's list, like swapout.
> Refcount is non-zero here, so folio_unqueue_deferred_split() can only be
> used in a WARN_ON_ONCE to validate the fix, which must be done earlier:
> mem_cgroup_move_charge_pte_range() first try to split the THP (splitting
> of course unqueues), or skip it if that fails. Not ideal, but moving
> charge has been requested, and khugepaged should repair the THP later:
> nobody wants new custom unqueueing code just for this deprecated case.
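(Referring to my note above: if I recall the pre-mTHP-swapout reclaim path
correctly, the old check in shrink_page_list() was roughly as below --
paraphrased from memory rather than quoted from the 5.4 tree, so take the
exact calls with a grain of salt.)

        if (PageTransHuge(page)) {
                /* cannot split this THP (e.g. pinned): keep it */
                if (!can_split_huge_page(page, NULL))
                        goto activate_locked;
                /*
                 * No PMD mapping left (compound_mapcount() == 0, what is
                 * now _entire_mapcount): split before add_to_swap(), so a
                 * folio sitting on the deferred split queue was not added
                 * to the swap cache as a THP.
                 */
                if (!compound_mapcount(page) &&
                    split_huge_page_to_list(page, page_list))
                        goto activate_locked;
        }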
>
> The 87eaceb3faa5 commit did have the code to move from one deferred list
> to another (but was not conscious of its unsafety while refcount non-0);
> but that was removed by 5.6 commit fac0516b5534 ("mm: thp: don't need
> care deferred split queue in memcg charge move path"), which argued that
> the existence of a PMD mapping guarantees that the THP cannot be on a
> deferred list. I'm not sure if that was true at the time (swapcache
> remapped?), but it's clearly not true since 6.12 commit dafff3f4c850
> ("mm: split underused THPs").

Same reason as above.

>
> [Note in passing: mem_cgroup_move_charge_pte_range() just skips mTHPs,
> large but not PMD-mapped: that's safe, but perhaps not intended: it's
> arguable whether the deprecated feature should be updated to work
> better with the new feature; but certainly not in this patch.]
>
> Backport to 6.11 should be straightforward. Earlier backports must take
> care that other _deferred_list fixes and dependencies are included. It
> is unclear whether these fixes are realistically needed before 6.12.
>
> Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")
> Signed-off-by: Hugh Dickins
> Cc:
> ---
>  mm/huge_memory.c   | 35 +++++++++++++++++++++--------------
>  mm/internal.h      | 10 +++++-----
>  mm/memcontrol-v1.c | 25 +++++++++++++++++++++++++
>  mm/memcontrol.c    |  8 +++++---
>  mm/migrate.c       |  4 ++--
>  mm/page_alloc.c    |  4 +++-
>  mm/swap.c          |  4 ++--
>  mm/vmscan.c        |  4 ++--
>  8 files changed, 65 insertions(+), 29 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index a1d345f1680c..dc7d5bb76495 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3588,10 +3588,27 @@ int split_folio_to_list(struct folio *folio, struct list_head *list)
>          return split_huge_page_to_list_to_order(&folio->page, list, ret);
>  }
>
> -void __folio_undo_large_rmappable(struct folio *folio)
> +/*
> + * __folio_unqueue_deferred_split() is not to be called directly:
> + * the folio_unqueue_deferred_split() inline wrapper in mm/internal.h
> + * limits its calls to those folios which may have a _deferred_list for
> + * queueing THP splits, and that list is (racily observed to be) non-empty.
> + *
> + * It is unsafe to call folio_unqueue_deferred_split() until folio refcount is
> + * zero: because even when split_queue_lock is held, a non-empty _deferred_list
> + * might be in use on deferred_split_scan()'s unlocked on-stack list.
> + *
> + * If memory cgroups are enabled, split_queue_lock is in the mem_cgroup: it is
> + * therefore important to unqueue deferred split before changing folio memcg.
> + */
> +bool __folio_unqueue_deferred_split(struct folio *folio)
>  {
>          struct deferred_split *ds_queue;
>          unsigned long flags;
> +        bool unqueued = false;
> +
> +        WARN_ON_ONCE(folio_ref_count(folio));
> +        WARN_ON_ONCE(!mem_cgroup_disabled() && !folio_memcg(folio));
>
>          ds_queue = get_deferred_split_queue(folio);
>          spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> @@ -3603,8 +3620,11 @@ void __folio_undo_large_rmappable(struct folio *folio)
>                                          MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
>                  }
>                  list_del_init(&folio->_deferred_list);
> +                unqueued = true;
>          }
>          spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
> +
> +        return unqueued;        /* useful for debug warnings */
>  }
>
>  /* partially_mapped=false won't clear PG_partially_mapped folio flag */
> @@ -3626,19 +3646,6 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
>          if (!partially_mapped && !split_underused_thp)
>                  return;
>
> -        /*
> -         * The try_to_unmap() in page reclaim path might reach here too,
> -         * this may cause a race condition to corrupt deferred split queue.
> -         * And, if page reclaim is already handling the same folio, it is
> -         * unnecessary to handle it again in shrinker.
> -         *
> -         * Check the swapcache flag to determine if the folio is being
> -         * handled by page reclaim since THP swap would add the folio into
> -         * swap cache before calling try_to_unmap().
> -         */
> -        if (folio_test_swapcache(folio))
> -                return;
> -
>          spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>          if (partially_mapped) {
>                  if (!folio_test_partially_mapped(folio)) {
> diff --git a/mm/internal.h b/mm/internal.h
> index 93083bbeeefa..16c1f3cd599e 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -639,11 +639,11 @@ static inline void folio_set_order(struct folio *folio, unsigned int order)
>  #endif
>  }
>
> -void __folio_undo_large_rmappable(struct folio *folio);
> -static inline void folio_undo_large_rmappable(struct folio *folio)
> +bool __folio_unqueue_deferred_split(struct folio *folio);
> +static inline bool folio_unqueue_deferred_split(struct folio *folio)
>  {
>          if (folio_order(folio) <= 1 || !folio_test_large_rmappable(folio))
> -                return;
> +                return false;
>
>          /*
>           * At this point, there is no one trying to add the folio to
> @@ -651,9 +651,9 @@ static inline void folio_undo_large_rmappable(struct folio *folio)
>           * to check without acquiring the split_queue_lock.
>           */
>          if (data_race(list_empty(&folio->_deferred_list)))
> -                return;
> +                return false;
>
> -        __folio_undo_large_rmappable(folio);
> +        return __folio_unqueue_deferred_split(folio);
>  }
>
>  static inline struct folio *page_rmappable_folio(struct page *page)
> diff --git a/mm/memcontrol-v1.c b/mm/memcontrol-v1.c
> index 81d8819f13cd..f8744f5630bb 100644
> --- a/mm/memcontrol-v1.c
> +++ b/mm/memcontrol-v1.c
> @@ -848,6 +848,8 @@ static int mem_cgroup_move_account(struct folio *folio,
>          css_get(&to->css);
>          css_put(&from->css);
>
> +        /* Warning should never happen, so don't worry about refcount non-0 */
> +        WARN_ON_ONCE(folio_unqueue_deferred_split(folio));
>          folio->memcg_data = (unsigned long)to;
>
>          __folio_memcg_unlock(from);
> @@ -1217,7 +1219,9 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
>          enum mc_target_type target_type;
>          union mc_target target;
>          struct folio *folio;
> +        bool tried_split_before = false;
>
> +retry_pmd:
>          ptl = pmd_trans_huge_lock(pmd, vma);
>          if (ptl) {
>                  if (mc.precharge < HPAGE_PMD_NR) {
> @@ -1227,6 +1231,27 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
>                  target_type = get_mctgt_type_thp(vma, addr, *pmd, &target);
>                  if (target_type == MC_TARGET_PAGE) {
>                          folio = target.folio;
> +                        /*
> +                         * Deferred split queue locking depends on memcg,
> +                         * and unqueue is unsafe unless folio refcount is 0:
> +                         * split or skip if on the queue? first try to split.
> +                         */
> +                        if (!list_empty(&folio->_deferred_list)) {
> +                                spin_unlock(ptl);
> +                                if (!tried_split_before)
> +                                        split_folio(folio);
> +                                folio_unlock(folio);
> +                                folio_put(folio);
> +                                if (tried_split_before)
> +                                        return 0;
> +                                tried_split_before = true;
> +                                goto retry_pmd;
> +                        }
> +                        /*
> +                         * So long as that pmd lock is held, the folio cannot
> +                         * be racily added to the _deferred_list, because
> +                         * __folio_remove_rmap() will find !partially_mapped.
> +                         */
>                          if (folio_isolate_lru(folio)) {
>                                  if (!mem_cgroup_move_account(folio, true,
>                                                               mc.from, mc.to)) {
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2703227cce88..06df2af97415 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4629,9 +4629,6 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
>          struct obj_cgroup *objcg;
>
>          VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
> -        VM_BUG_ON_FOLIO(folio_order(folio) > 1 &&
> -                        !folio_test_hugetlb(folio) &&
> -                        !list_empty(&folio->_deferred_list), folio);
>
>          /*
>           * Nobody should be changing or seriously looking at
> @@ -4678,6 +4675,7 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
>                  ug->nr_memory += nr_pages;
>          ug->pgpgout++;
>
> +        WARN_ON_ONCE(folio_unqueue_deferred_split(folio));
>          folio->memcg_data = 0;
>  }
>
> @@ -4789,6 +4787,9 @@ void mem_cgroup_migrate(struct folio *old, struct folio *new)
>
>          /* Transfer the charge and the css ref */
>          commit_charge(new, memcg);
> +
> +        /* Warning should never happen, so don't worry about refcount non-0 */
> +        WARN_ON_ONCE(folio_unqueue_deferred_split(old));
>          old->memcg_data = 0;
>  }
>
> @@ -4975,6 +4976,7 @@ void mem_cgroup_swapout(struct folio *folio, swp_entry_t entry)
>          VM_BUG_ON_FOLIO(oldid, folio);
>          mod_memcg_state(swap_memcg, MEMCG_SWAP, nr_entries);
>
> +        folio_unqueue_deferred_split(folio);
>          folio->memcg_data = 0;
>
>          if (!mem_cgroup_is_root(memcg))
> diff --git a/mm/migrate.c b/mm/migrate.c
> index df91248755e4..691f25ee2489 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -489,7 +489,7 @@ static int __folio_migrate_mapping(struct address_space *mapping,
>              folio_test_large_rmappable(folio)) {
>                  if (!folio_ref_freeze(folio, expected_count))
>                          return -EAGAIN;
> -                folio_undo_large_rmappable(folio);
> +                folio_unqueue_deferred_split(folio);
>                  folio_ref_unfreeze(folio, expected_count);
>          }
>
> @@ -514,7 +514,7 @@ static int __folio_migrate_mapping(struct address_space *mapping,
>          }
>
>          /* Take off deferred split queue while frozen and memcg set */
> -        folio_undo_large_rmappable(folio);
> +        folio_unqueue_deferred_split(folio);
>
>          /*
>           * Now we know that no one else is looking at the folio:
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4b21a368b4e2..57f64b5d0004 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2681,7 +2681,9 @@ void free_unref_folios(struct folio_batch *folios)
>                  unsigned long pfn = folio_pfn(folio);
>                  unsigned int order = folio_order(folio);
>
> -                folio_undo_large_rmappable(folio);
> +                if (mem_cgroup_disabled())
> +                        folio_unqueue_deferred_split(folio);

This looks confusing. It looks like all callsites of free_unref_folios()
have folio_unqueue_deferred_split() and the memcg uncharge called before
it. If there is any problem, the memcg uncharge should catch it. Did I
miss something?
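To spell out the ordering I mean, the free_it path of shrink_folio_list()
with this patch applied (see the vmscan.c hunk further below) does roughly:

        folio_unqueue_deferred_split(folio);
        if (folio_batch_add(&free_folios, folio) == 0) {
                /* uncharge_folio() now warns if the folio is still queued */
                mem_cgroup_uncharge_folios(&free_folios);
                try_to_unmap_flush();
                free_unref_folios(&free_folios);
        }

So by the time free_unref_folios() runs, the unqueue and the memcg uncharge
(with its warning) have already happened.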
> +
>                  if (!free_pages_prepare(&folio->page, order))
>                          continue;
>                  /*
> diff --git a/mm/swap.c b/mm/swap.c
> index 835bdf324b76..b8e3259ea2c4 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -121,7 +121,7 @@ void __folio_put(struct folio *folio)
>          }
>
>          page_cache_release(folio);
> -        folio_undo_large_rmappable(folio);
> +        folio_unqueue_deferred_split(folio);
>          mem_cgroup_uncharge(folio);
>          free_unref_page(&folio->page, folio_order(folio));
>  }
> @@ -988,7 +988,7 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
>                          free_huge_folio(folio);
>                          continue;
>                  }
> -                folio_undo_large_rmappable(folio);
> +                folio_unqueue_deferred_split(folio);
>                  __page_cache_release(folio, &lruvec, &flags);
>
>                  if (j != i)
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index eb4e8440c507..635d45745b73 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1475,7 +1475,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>                   */
>                  nr_reclaimed += nr_pages;
>
> -                folio_undo_large_rmappable(folio);
> +                folio_unqueue_deferred_split(folio);
>                  if (folio_batch_add(&free_folios, folio) == 0) {
>                          mem_cgroup_uncharge_folios(&free_folios);
>                          try_to_unmap_flush();
>                          free_unref_folios(&free_folios);
> @@ -1863,7 +1863,7 @@ static unsigned int move_folios_to_lru(struct lruvec *lruvec,
>                  if (unlikely(folio_put_testzero(folio))) {
>                          __folio_clear_lru_flags(folio);
>
> -                        folio_undo_large_rmappable(folio);
> +                        folio_unqueue_deferred_split(folio);
>                          if (folio_batch_add(&free_folios, folio) == 0) {
>                                  spin_unlock_irq(&lruvec->lru_lock);
>                                  mem_cgroup_uncharge_folios(&free_folios);
> --
> 2.35.3