From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C59FBC2BD09 for ; Thu, 4 Jul 2024 03:21:38 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 36B436B0083; Wed, 3 Jul 2024 23:21:38 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 31B1C6B0085; Wed, 3 Jul 2024 23:21:38 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1E32F6B0088; Wed, 3 Jul 2024 23:21:38 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id EF69C6B0083 for ; Wed, 3 Jul 2024 23:21:37 -0400 (EDT) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 91155A21EE for ; Thu, 4 Jul 2024 03:21:37 +0000 (UTC) X-FDA: 82300620234.30.5A18706 Received: from mail-yb1-f180.google.com (mail-yb1-f180.google.com [209.85.219.180]) by imf24.hostedemail.com (Postfix) with ESMTP id ABD3E180017 for ; Thu, 4 Jul 2024 03:21:35 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=xPwzw+yU; spf=pass (imf24.hostedemail.com: domain of hughd@google.com designates 209.85.219.180 as permitted sender) smtp.mailfrom=hughd@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1720063270; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=gtybbuyYj0RpGRJub+zYJT/1C0NwLxoj7Nw9OPQ82Cc=; b=0WfBQYCYkzXwGJHex899QtmZQCXhQOvicNFk/3eWC8ThcTY0L3u0J35xLNk+RYpMXh3j/C kvVGqgINKigRoeQVL2I/DATMEB9GVfrtiVoKgh2zWrKkjZZcQzs45CAZpQ35NNEJW9FVlw ZLJdzMDBnE8rejsopCrFWEYzH4gy0dg= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1720063270; a=rsa-sha256; cv=none; b=BPrOMcjeT+ih4wjOmtU//dz+jLF4a6dhO8nLd9VuEIYyfH0G1D16C+weJjAm9cO22P5q/P DuxbTm67+rdFOlCSHQVjKC2jD6kYV+kKkwlXU6PDzbx4QwOyyBs6NODDjaHI+xADP5sOUI /NosidUTLEAvGI2veFEdjUmH2DC2GVY= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=xPwzw+yU; spf=pass (imf24.hostedemail.com: domain of hughd@google.com designates 209.85.219.180 as permitted sender) smtp.mailfrom=hughd@google.com; dmarc=pass (policy=reject) header.from=google.com Received: by mail-yb1-f180.google.com with SMTP id 3f1490d57ef6-e035f4e3473so154089276.3 for ; Wed, 03 Jul 2024 20:21:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1720063294; x=1720668094; darn=kvack.org; h=mime-version:references:message-id:in-reply-to:subject:cc:to:from :date:from:to:cc:subject:date:message-id:reply-to; bh=gtybbuyYj0RpGRJub+zYJT/1C0NwLxoj7Nw9OPQ82Cc=; b=xPwzw+yUMQSxkg/CbEas05NnXtGrgEBhajNfGm9GhnkwHtK6Cye48wr0A+E4tnV/6I qa0/y7fBLZhyBIbEuYDhPBD1SdbJwXvlzbg98ZH7FXiZbHPnPR75OZ0FBo/zFCSnxP3a xiU5HKL8+dt4N8g0fJk9qS0kPWHFO3B0DTUrBPp3zx2LeoajxPPNHh8PL4QvWzG5VpHe bu3WNkEXtHAYLTGolC0zNEGbGlaPBja6LxkO22HWoGL8fgf9+nMJ4Ldi+PkA49OSbmMx gA2dXl25vjghsNbhOHyxVuhGWr0ae5boj9QCr+uRarYzHhxUwp0rWW46MK/Eks59DNtZ 5lPA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720063294; x=1720668094; h=mime-version:references:message-id:in-reply-to:subject:cc:to:from :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=gtybbuyYj0RpGRJub+zYJT/1C0NwLxoj7Nw9OPQ82Cc=; b=LBlysDCTiMy2V01J13/4FoYvobfHAJ0rlQvG0uEs2HL8SeAyvzoTMnmxqtysV0zkJe YpUJfK9vTlky/0WclfUn1M+MNy8+uQvP4lpfE10YWH+zbLzaa7Efn6BAJXc3261L8E8S Sw1ihN7QldK/QHSDQJcknXpQeatUJW04klZdFysW2PyP3Ier3skO6JS2bd0ZNd9Zs6DM 7TRJ2GaCwj2I9fZWERQeqvhYxJOWsN5v4GTSumOzY+V1khmV//Loshvl7ABmsTiKWEuR 5LphT+OUIZlJCnRLNpWxiTjuj/+Bdh/oKd68WCkD+0w+b2wIqXTg8wYmfYHxDhffBq0j s/mg== X-Forwarded-Encrypted: i=1; AJvYcCUVt5E3QrKHfIl3WHL5UX+LvAb/Z71x2KKUHcg6EGxLLCcYl55ADeyx/Qj6V1DZWgDlsD7K37H5aec2zuVo20CYCjo= X-Gm-Message-State: AOJu0YyALsvCq+ILP9OuuWK6r53FL4fWR16gV5l+kDSNVrOrqhH9Tuur YuXw9g0QnW4m1j8641OmIV3bU0dLFRUk3fSG3bbA/ufG5g4Om70xAU7yhIQRbg== X-Google-Smtp-Source: AGHT+IHnJ1NbLRUDL+3UZk9Ox3VgOwK38XIyefROX7DwVkkKJ4mohoG+Hsg/LH5OIf0SW21rIhtijQ== X-Received: by 2002:a25:874e:0:b0:e03:6442:e6a8 with SMTP id 3f1490d57ef6-e03c1907dd0mr419063276.5.1720063294483; Wed, 03 Jul 2024 20:21:34 -0700 (PDT) Received: from darker.attlocal.net (172-10-233-147.lightspeed.sntcca.sbcglobal.net. [172.10.233.147]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e03a45f2e66sm681417276.50.2024.07.03.20.21.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 Jul 2024 20:21:33 -0700 (PDT) Date: Wed, 3 Jul 2024 20:21:22 -0700 (PDT) From: Hugh Dickins To: Andrew Morton cc: Hugh Dickins , Baolin Wang , Nhat Pham , Yang Shi , Zi Yan , Barry Song , Kefeng Wang , David Hildenbrand , Matthew Wilcox , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: Re: [PATCH hotfix] mm: fix crashes from deferred split racing folio migration In-Reply-To: <20240703193536.78bce768a9330da3a361ca8a@linux-foundation.org> Message-ID: <825653a7-a4d4-89f2-278f-4b18f8f8da5d@google.com> References: <29c83d1a-11ca-b6c9-f92e-6ccb322af510@google.com> <20240703193536.78bce768a9330da3a361ca8a@linux-foundation.org> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Stat-Signature: 3woopoc87o4amu7aa7nuojphy61qi5r1 X-Rspamd-Queue-Id: ABD3E180017 X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1720063295-888957 X-HE-Meta: U2FsdGVkX19rhnRk7JUKBdnIiiS/esIRnugYDVG2nj5S819ppL5+fc/r8WPrZwB9engi2bKeMbTZBiAI+Bh9vVX4j1fShROWt/pGz8MmHLV9rEep4OeWG1a6DAkydwArA1GdcioNEa1p8TniV9NSQVXfb6a20wT8nzfwm394n2ZybpM/YnnsKhorQRfhYmmJrmlnt1rVyuB2f3QWaTI2rhJYnl6LVTsLw7jZnSR8zkCpxNLFgOV5uDegIlgxnOITi7TRbhy6L+xsQRz/GdAWhqj+05jdo2/ugg+Q3i7jrBuAMJ3jCx2bW7ZUHRtZ7JcpqUxnfiYq73A5ToNvrLqMLh6HglZS2u+PmKCTAy3Ar8vu7Q3ozKR/bOstcuwdHI7rApUim1zbAxE+dt1FEiYaoZQVlm2e9yNozUZ29y19QrlyIPWAU3As7dO1VsO38QmkgjVesBIT5dR4ET8V7KnxlEE374DKxdXgx87GUXrlOgcyN6e9hmT2vejuc7TvSIYrQk0UQI0Hs0CCoLfK9zN9IXnsOsA7moG7ZByqSLc7MIQAs2T5i2PuGq9JcoFl6csdQH2WeTqVZXg4eWL/ehj2gc25dPWqDtPYuzMn1T2U/0fReK30R7Eu9u6CtPbI4c9+32yw401g1PFrCCJHrWPVrO/Yuxn50WQjKWMFzdurWoRDYXUSRE1rrgQOdqpIjfA4b6uIk7Y4ABtD1cZunRmCGk8Ucc+qlCFCFSRTanEjLh6e+z7/s61r28AxsKl7epl3+g3cGWJjngiSZdySMjPTTQq+JXcwax2x4gdoKPuSpFFnFziVNgRT8zDZxCS96Pcq5oVc+1zcbJ6Y2xT0fvNG8BJxXkcqAz/JvHAryv6rn7ziMNC6OZABiODkdIZReZb5GQc698Ro+iRH+YQTiRLeyAZGtj+3LUIfSeyUyy71ZAM++rGHpmEv5Q7z99vPJJUiJlcc2UUz98W/GGnQsg1 Q1VJIym5 Ocp/UsyWcDXLnb64DGP6Km1EiwAE0n1EEw5SKA0zycGC1e/yaofL+s72KkurDu02gqUhH2in9PRxMz7cCKn1AUHR7UFWo3Mo9fSRaXsrfNeHZe1C9iwj1A7CFcO68ehYSMA6tzGG4+O8XW/oBqwxb+ipZWV7Bvk7WSEJ06y8MPQQ4f+SsYOHYMcbMFjNUp9erWebteQG1sQI1j/liNW2GXXREIDfre7sr4UNulI5/vv6Vg49thfqIumGh1Tv8v+gDq4DTXnBAdixx5ifqAhx3jqG4/O0HJ/bUFqcBDB3DZpTQebbhjZYg776tbVPHAYKmZmlfTVcip2g+6PIEQLMeaf7jT12S9hnmktzAyE9xAQsFzmuGuNndd5b0cWuuCPy6zf2gA5P+n/2SDVjtZLpdNwBDtg6S9RvauT+L X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, 3 Jul 2024, Andrew Morton wrote: > On Tue, 2 Jul 2024 00:40:55 -0700 (PDT) Hugh Dickins wrote: > > > Even on 6.10-rc6, I've been seeing elusive "Bad page state"s (often on > > flags when freeing, yet the flags shown are not bad: PG_locked had been > > set and cleared??), and VM_BUG_ON_PAGE(page_ref_count(page) == 0)s from > > deferred_split_scan()'s folio_put(), and a variety of other BUG and WARN > > symptoms implying double free by deferred split and large folio migration. > > > > 6.7 commit 9bcef5973e31 ("mm: memcg: fix split queue list crash when large > > folio migration") was right to fix the memcg-dependent locking broken in > > 85ce2c517ade ("memcontrol: only transfer the memcg data for migration"), > > but missed a subtlety of deferred_split_scan(): it moves folios to its own > > local list to work on them without split_queue_lock, during which time > > folio->_deferred_list is not empty, but even the "right" lock does nothing > > to secure the folio and the list it is on. > > > > Fortunately, deferred_split_scan() is careful to use folio_try_get(): so > > folio_migrate_mapping() can avoid the race by folio_undo_large_rmappable() > > while the old folio's reference count is temporarily frozen to 0 - adding > > such a freeze in the !mapping case too (originally, folio lock and > > unmapping and no swap cache left an anon folio unreachable, so no freezing > > was needed there: but the deferred split queue offers a way to reach it). > > There's a conflict when applying Kefeng's "mm: refactor > folio_undo_large_rmappable()" > (https://lkml.kernel.org/r/20240521130315.46072-1-wangkefeng.wang@huawei.com) > on top of this hotfix. Yes, anticipated in my "below the --- line" comments: sorry for giving you this nuisance. And perhaps a conflict with another one of Kefeng's, which deletes a hunk in mm/migrate.c just above where I add a hunk: and that's indeed how it should end up, hunk deleted by Kefeng, hunk added by me. > > --- mm/memcontrol.c~mm-refactor-folio_undo_large_rmappable > +++ mm/memcontrol.c > @@ -7832,8 +7832,7 @@ void mem_cgroup_migrate(struct folio *ol > * In addition, the old folio is about to be freed after migration, so > * removing from the split queue a bit earlier seems reasonable. > */ > - if (folio_test_large(old) && folio_test_large_rmappable(old)) > - folio_undo_large_rmappable(old); > + folio_undo_large_rmappable(old); > old->memcg_data = 0; > } > > I'm resolving this by simply dropping the above hunk. So Kefeng's > patch is now as below. Please check. Checked, and that is correct, thank you Andrew. Correct, but not quite complete: because I'm sure that if Kefeng had written his patch after mine, he would have made the equivalent change in mm/migrate.c: --- a/mm/migrate.c +++ b/mm/migrate.c @@ -443,8 +443,7 @@ int folio_migrate_mapping(struct address_space *mapping, } /* Take off deferred split queue while frozen and memcg set */ - if (folio_test_large(folio) && folio_test_large_rmappable(folio)) - folio_undo_large_rmappable(folio); + folio_undo_large_rmappable(folio); /* * Now we know that no one else is looking at the folio: But there's no harm done if you push out a tree without that additional mod: we can add it as a fixup afterwards, it's no more than a cleanup. (I'm on the lookout for an mm.git update, hope to give it a try when it appears.) Thanks, Hugh