From: Yang Shi
Date: Wed, 20 Dec 2023 15:31:58 -0800
Subject: Re: [PATCH] mm: memcg: fix split queue list crash when large folio migration
To: Baolin Wang
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, mhocko@kernel.org,
    roman.gushchin@linux.dev, shakeelb@google.com, muchun.song@linux.dev,
    nphamcs@gmail.com, david@redhat.com, ying.huang@intel.com, ziy@nvidia.com,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
In-Reply-To: <61273e5e9b490682388377c20f52d19de4a80460.1703054559.git.baolin.wang@linux.alibaba.com>

On Tue, Dec 19, 2023 at 10:52 PM Baolin Wang wrote:
>
> When running autonuma with enabling multi-size THP, I encountered the following
> kernel crash issue:
>
> [ 134.290216] list_del corruption. prev->next should be fffff9ad42e1c490,
> but was dead000000000100. (prev=fffff9ad42399890)
> [ 134.290877] kernel BUG at lib/list_debug.c:62!
> [ 134.291052] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 134.291210] CPU: 56 PID: 8037 Comm: numa01 Kdump: loaded Tainted:
> G E 6.7.0-rc4+ #20
> [ 134.291649] RIP: 0010:__list_del_entry_valid_or_report+0x97/0xb0
> ......
> [ 134.294252] Call Trace:
> [ 134.294362]
> [ 134.294440] ? die+0x33/0x90
> [ 134.294561] ? do_trap+0xe0/0x110
> ......
> [ 134.295681] ? __list_del_entry_valid_or_report+0x97/0xb0
> [ 134.295842] folio_undo_large_rmappable+0x99/0x100
> [ 134.296003] destroy_large_folio+0x68/0x70
> [ 134.296172] migrate_folio_move+0x12e/0x260
> [ 134.296264] ? __pfx_remove_migration_pte+0x10/0x10
> [ 134.296389] migrate_pages_batch+0x495/0x6b0
> [ 134.296523] migrate_pages+0x1d0/0x500
> [ 134.296646] ? __pfx_alloc_misplaced_dst_folio+0x10/0x10
> [ 134.296799] migrate_misplaced_folio+0x12d/0x2b0
> [ 134.296953] do_numa_page+0x1f4/0x570
> [ 134.297121] __handle_mm_fault+0x2b0/0x6c0
> [ 134.297254] handle_mm_fault+0x107/0x270
> [ 134.300897] do_user_addr_fault+0x167/0x680
> [ 134.304561] exc_page_fault+0x65/0x140
> [ 134.307919] asm_exc_page_fault+0x22/0x30
>
> The reason for the crash is that, the commit 85ce2c517ade ("memcontrol: only
> transfer the memcg data for migration") removed the charging and uncharging
> operations of the migration folios and cleared the memcg data of the old folio.
>
> During the subsequent release process of the old large folio in destroy_large_folio(),
> if the large folio needs to be removed from the split queue, an incorrect split
> queue can be obtained (which is pgdat->deferred_split_queue) because the old
> folio's memcg is NULL now. This can lead to list operations being performed
> under the wrong split queue lock protection, resulting in a list crash as above.
>
> After the migration, the old folio is going to be freed, so we can remove it
> from the split queue in mem_cgroup_migrate() a bit earlier before clearing the
> memcg data to avoid getting incorrect split queue.

Nice catch!
The fix looks good to me.

Reviewed-by: Yang Shi

>
> Fixes: 85ce2c517ade ("memcontrol: only transfer the memcg data for migration")
> Signed-off-by: Baolin Wang
> ---
>  mm/huge_memory.c |  2 +-
>  mm/memcontrol.c  | 11 +++++++++++
>  2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 6be1a380a298..c50dc2e1483f 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3124,7 +3124,7 @@ void folio_undo_large_rmappable(struct folio *folio)
>         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>         if (!list_empty(&folio->_deferred_list)) {
>                 ds_queue->split_queue_len--;
> -               list_del(&folio->_deferred_list);
> +               list_del_init(&folio->_deferred_list);
>         }
>         spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>  }
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ae8c62c7aa53..e66e0811cccc 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -7575,6 +7575,17 @@ void mem_cgroup_migrate(struct folio *old, struct folio *new)
>
>         /* Transfer the charge and the css ref */
>         commit_charge(new, memcg);
> +       /*
> +        * If the old folio a large folio and is in the split queue, it needs
> +        * to be removed from the split queue now, in case getting an incorrect
> +        * split queue in destroy_large_folio() after the memcg of the old folio
> +        * is cleared.
> +        *
> +        * In addition, the old folio is about to be freed after migration, so
> +        * removing from the split queue a bit earlier seems reasonable.
> +        */
> +       if (folio_test_large(old) && folio_test_large_rmappable(old))
> +               folio_undo_large_rmappable(old);
>         old->memcg_data = 0;
> }
>
> --
> 2.39.3
>