From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 69D4AC2BD09 for ; Sat, 6 Jul 2024 21:29:17 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AC1F16B0083; Sat, 6 Jul 2024 17:29:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A4B1B6B0088; Sat, 6 Jul 2024 17:29:16 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8C4916B0089; Sat, 6 Jul 2024 17:29:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 6B7B96B0083 for ; Sat, 6 Jul 2024 17:29:16 -0400 (EDT) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id DCCB14098E for ; Sat, 6 Jul 2024 21:29:15 +0000 (UTC) X-FDA: 82310618670.29.F662082 Received: from mail-oa1-f54.google.com (mail-oa1-f54.google.com [209.85.160.54]) by imf29.hostedemail.com (Postfix) with ESMTP id 2082B12000D for ; Sat, 6 Jul 2024 21:29:13 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=BFGWlGJM; spf=pass (imf29.hostedemail.com: domain of hughd@google.com designates 209.85.160.54 as permitted sender) smtp.mailfrom=hughd@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1720301332; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=ue8gY35brEYBVWt8FyxwSeAIH2C+YwAP5qKojzB01cI=; b=NoUotkzQszjrGLmo4ITkcvZaG/UnoKXE8xbYBRm4KsvYAZuC9IfidkIGd1PmDzi8dIvT/V lqRA4N/Lv5cfUFoJ3I7G7u6vXyrd6NAPVVKP/04eRGPmy0ylV7Wep3NwD/gXz0LTqdSGB/ z7ExR+QdQjvV4B5O8MOVMExeDijBxJY= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=BFGWlGJM; spf=pass (imf29.hostedemail.com: domain of hughd@google.com designates 209.85.160.54 as permitted sender) smtp.mailfrom=hughd@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1720301332; a=rsa-sha256; cv=none; b=1qRBc0+yo5mH2rJ5UxngC9v2VOX+APBCophI4oFxO+N2RgQGNHkmBXvbDu98KcFXNfpvNi wYzRBz3qRLJ1rvuN0quvM0eG+nKVwZ6qIaJPFsetkdGWRP6blqvXGieWbehDlWYP0n2GZ4 Uxul2WNeR+d1SKXXL/mXIgAO4PY6ZkI= Received: by mail-oa1-f54.google.com with SMTP id 586e51a60fabf-24c9f892aeaso1299153fac.2 for ; Sat, 06 Jul 2024 14:29:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1720301353; x=1720906153; darn=kvack.org; h=mime-version:references:message-id:in-reply-to:subject:cc:to:from :date:from:to:cc:subject:date:message-id:reply-to; bh=ue8gY35brEYBVWt8FyxwSeAIH2C+YwAP5qKojzB01cI=; b=BFGWlGJML1HHFKfeXG6eAqTvZMMZ3Hu087T8SmmoN4tT1MiInWlR70A/x72nhhjqyh HEmYhreRZN/bbwaHhqHq+GLHQnHKqaFMORgCAT0WP/gzzAb6ZnGnZCA7acxaTm8ST2Rs +cUTnvewZ5dQDObH0Mi94/u+1ufZYIs70SonOt74mxJWc932XHsTmWzKeTpm9h6g+5yR UyTJKJg0Un6MQgeSEF6xl9dQqhKOqfRuaueGjain+ijq8ohQy+k2lVKh+Kw6Rf9S+VdX CNDOyHNEJGe81q5binCnnT7GoXbWmA0N9G3rsuvPRtYZbhkqwXi/0+8n6pxZD1vlpn7e Nq9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720301353; x=1720906153; h=mime-version:references:message-id:in-reply-to:subject:cc:to:from :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ue8gY35brEYBVWt8FyxwSeAIH2C+YwAP5qKojzB01cI=; b=EMw5eHqLw8RQf4FoKvfbTtur64siqDjMkYObb6geAT0rak9yQ8iOI2c0UIOeWS7hWo pWYwFtVqOosU6aSbdXcnVAuewqQ0X+AvkjEQvrFdLBdtYDQhRdVY5rqkqT4Baq2s56Fc RQGXN9l8R1edG3aK431Hqw+fZ3smRgL2ExhjDsjaf45mHooLLvfSmpA/WBcnJCOxBVBz ZJ3keEA9VS5rXHnwzMMpg8Dvmjq94hlS73xRnRfzTb7Qpe6oamY6aIeGeAv6Ubn4O6eV CqBrwOiA6nwBX4PuavlT6oGBbOKl7t3fecI9mUTBkQBVrFjg8wuc3PCDM9BDy3pVQzSF pFdw== X-Forwarded-Encrypted: i=1; AJvYcCXEDFqp8CU+Kq3jQAs/0uPdDpI6O4lyWZJM0nbszo4qGrjIDx6WAoEGAn0m4uAuw6kTs0XVNKEu55+EoPsokADJTaU= X-Gm-Message-State: AOJu0YzX5NMR31Qbvam6RgcxOOF+c6kYIjJM0ZjK4l+8uJ7OFO2TCnN8 vAoGmLhG+Zr3vXsgEonMZI5ac65UQABYAnk8jIurBDjATDagCy0pHFiKKet9GA== X-Google-Smtp-Source: AGHT+IGHrlPCBl63akfzuRCMp6XLRSgTopTU+czREYQwtTu7G1RfFozNSVFeOgW1nhmv4D6E1jZ+EA== X-Received: by 2002:a05:6870:eca0:b0:25e:1c9d:f180 with SMTP id 586e51a60fabf-25e2bf2aed9mr8482336fac.50.1720301352822; Sat, 06 Jul 2024 14:29:12 -0700 (PDT) Received: from darker.attlocal.net (172-10-233-147.lightspeed.sntcca.sbcglobal.net. [172.10.233.147]) by smtp.gmail.com with ESMTPSA id 586e51a60fabf-25e71070ba1sm307134fac.34.2024.07.06.14.29.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 06 Jul 2024 14:29:11 -0700 (PDT) Date: Sat, 6 Jul 2024 14:29:00 -0700 (PDT) From: Hugh Dickins To: Kefeng Wang , Andrew Morton cc: Hugh Dickins , Baolin Wang , Nhat Pham , Yang Shi , Zi Yan , Barry Song , David Hildenbrand , Matthew Wilcox , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: Re: [PATCH hotfix] mm: fix crashes from deferred split racing folio migration In-Reply-To: <7b7f2eb7-953a-4aa0-acb0-1ab32c7cc1bf@huawei.com> Message-ID: <68feee73-050e-8e98-7a3a-abf78738d92c@google.com> References: <29c83d1a-11ca-b6c9-f92e-6ccb322af510@google.com> <20240703193536.78bce768a9330da3a361ca8a@linux-foundation.org> <825653a7-a4d4-89f2-278f-4b18f8f8da5d@google.com> <7b7f2eb7-953a-4aa0-acb0-1ab32c7cc1bf@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Stat-Signature: 86ndfmctip1c6raaeqcrzhjj7o1mjqjd X-Rspam-User: X-Rspamd-Queue-Id: 2082B12000D X-Rspamd-Server: rspam02 X-HE-Tag: 1720301353-114373 X-HE-Meta: U2FsdGVkX19fCuR2cR9akdTk797cNdUBpyCY1S272MQbR4cTT5aHGnsCvLmtatm7uEGMTCM//lVx4UBSriXnu9pdWWcceIMakLyOzY033EAZFRrdQh6QZG5986RBV8TrGs9Tu8paZHPEMR8A4j1RFWcBZVA5Ir8HRxqENgIcTfl2nR46rB66nLlt5y4QQ0rhEwyufYYs0zi4/PVuIQDBl6+UNly7Cg2Wj6upFnhyycAheC6/cqQcjvgBXnUjvknJt6Bxb+VVlwb80DPGvrmUwaxhSiEVUrBZ3dPna4B/m44UbYUkI9u87NRajBT2UNKCyVYEzmVLuJmIody2E3mrKB9wmvwKXiDsRmugYXCvJ/rYYe3qaAkyRowl65tePiW96PHN8g4IW1W6tmdWP/mEqOtLnvDTjbq6S5413Gcemy24SBTW8F7w762kvEqSjQJXmUvp6akOkkDCHsvKQ9q+LSqF7JtwFUibfu17COXScKA7ZLdiU4OvUdzcNRiod+iH3MKOBoIwOVF1ZCRH9Kx8c65c/ZHHSFu3garN89sT6lAlCc2UyZZpvBYI0yUhFY4VrX8VaEZDm8uFa5E0R9H9BsY0Bu57QwLNvixHOECOccIVTWPmXHFvH694nkUqT+YZUItmybr75lD/rnq8AOm6LrymTitkkFYK9wHsfF2Hsx90GEuOsSq0ZRhr+fCRqKAYHRjNEvjmw+hfmOepIXeQ5fkgLvbaVluON13AZrE1XyO6C/tBuQS7II0m5jUNP5GG+TqiP1CklvVHA9MKhUYmkQvq80VQAbbGCGoEqDRp7I5tYMlWCZs1vV2eWlL8G1vS/BIqHffoAFssm/luE1j93MAQV5eN/csdMmPMBDa5Ey2AyKi9oplYclwUCpxpZ33flRpgqd9p04oyP1+Jsvzz8Nf9pG3iXb3aSxwaw6cLaUIHFOQyjFwvcIWlA2XznOmkkfsX0gDuEj23U7nWhyL KESwX+O0 uNRX5E4ZT0pvZmMFCTrJ60XEeYL/yMvATLQPpkG2B2sem41kpZsLEd8TFtwhqr0J4NKoGenkn3AY2YLSNymCAtHVje5QNce9BcdSqXhwlFg6JhvnzByWTi75cEcVfXnSFn8NIitYTXFOvL+Q82T6oYu5Djs/32QmEKLg8FxQxCWhiwIN9KJb+EE+gbhtQIvnkFFdlP1sXfwr6QdtBpROvAItBjzQ9aaVZ6zLri6vaYMXLIZfzkUPrSDHf4r+54pKaA136qpvKJ+784PTqJG577K99BXjjqgqy9ERvlmeixJ/D8MqgwXRqXW1p/5SrQW4oOxG+VJKW3biwFHPYWTBeYvTN7g== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, 4 Jul 2024, Kefeng Wang wrote: > On 2024/7/4 11:21, Hugh Dickins wrote: > > On Wed, 3 Jul 2024, Andrew Morton wrote: > >> On Tue, 2 Jul 2024 00:40:55 -0700 (PDT) Hugh Dickins > >> wrote: > >> > ... > > > > And perhaps a conflict with another one of Kefeng's, which deletes a hunk > > in mm/migrate.c just above where I add a hunk: and that's indeed how it > > should end up, hunk deleted by Kefeng, hunk added by me. > > > >> > >> --- mm/memcontrol.c~mm-refactor-folio_undo_large_rmappable > >> +++ mm/memcontrol.c > >> @@ -7832,8 +7832,7 @@ void mem_cgroup_migrate(struct folio *ol > >> * In addition, the old folio is about to be freed after migration, so > >> * removing from the split queue a bit earlier seems reasonable. > >> */ > >> - if (folio_test_large(old) && folio_test_large_rmappable(old)) > >> - folio_undo_large_rmappable(old); > >> + folio_undo_large_rmappable(old); > >> old->memcg_data = 0; > >> } > >> > >> I'm resolving this by simply dropping the above hunk. So Kefeng's > >> patch is now as below. Please check. > > > > Checked, and that is correct, thank you Andrew. Correct, but not quite > > complete: because I'm sure that if Kefeng had written his patch after > > mine, he would have made the equivalent change in mm/migrate.c: > > > > Yes, > > > --- a/mm/migrate.c > > +++ b/mm/migrate.c > > @@ -443,8 +443,7 @@ int folio_migrate_mapping(struct address_space *mapping, > > } > > > > /* Take off deferred split queue while frozen and memcg set */ > > - if (folio_test_large(folio) && folio_test_large_rmappable(folio)) > > - folio_undo_large_rmappable(folio); > > + folio_undo_large_rmappable(folio); > > > > /* > > * Now we know that no one else is looking at the folio: > > > > But there's no harm done if you push out a tree without that additional > > mod: we can add it as a fixup afterwards, it's no more than a cleanup. > > > Maybe we could convert to __folio_undo_large_rmappable() for !maping part, > which will avoid unnecessary freeze/unfreeze for empty deferred > list. > > diff --git a/mm/migrate.c b/mm/migrate.c > index ca65f03acb31..e6af9c25c25b 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -412,10 +412,11 @@ static int __folio_migrate_mapping(struct address_space > *mapping, > if (!mapping) { > /* Take off deferred split queue while frozen and memcg set */ > if (folio_test_large(folio) && > - folio_test_large_rmappable(folio)) { > + folio_test_large_rmappable(folio) && > + !data_race(list_empty(&folio->_deferred_list))) { > if (!folio_ref_freeze(folio, expected_count)) > return -EAGAIN; > - folio_undo_large_rmappable(folio); > + __folio_undo_large_rmappable(folio); > folio_ref_unfreeze(folio, expected_count); > } > What you show above is exactly what I had when I was originally testing over the top of mm-everything (well, not quite exactly, I don't think I bothered with the data_race()). But I grew to feel that probably everyone else would be happier with less of those internals _deferred_list and __folio_undo_large_rmappable() spread about. There are many ways to play it. I had also considered doing it Zi Yan's way, freezing always in the !mapping case as well as in the mapping case: what overhead it adds would probably get lost amidst all the other overhead of page migration. It will not be surprising if changes come later requiring us always to freeze in the anon !swapcache case too, it always seemed a bit surprising not to need freezing there. But for now I decided it's best to keep the freezing to the case where it's known to be needed (but without getting into __s). Many ways to play it, and I've no objection if someone then changes it around later; but we've no need to depart from what Andrew already has. Except, he did ask one of us to send along the -fix removing the unnecessary checks before its second folio_undo_large_rmappable() once your refactor patch goes in: here it is below. [I guess this is the wrong place to say so, but folio_undo_large_rmappable() is a dreadful name: it completely obscures what the function actually does, and gives the false impression that the folio would be !large_rmappable afterwards. I hope that one day the name gets changed to something like folio_unqueue_deferred_split() or folio_cancel_deferred_split().] [PATCH] mm: refactor folio_undo_large_rmappable() fix Now that folio_undo_large_rmappable() is an inline function checking order and large_rmappable for itself (and __folio_undo_large_rmappable() is now declared even when CONFIG_TRANASPARENT_HUGEPAGE is off) there is no need for folio_migrate_mapping() to check large and large_rmappable first (in the mapping case when it has had to freeze anyway). Signed-off-by: Hugh Dickins Reviewed-by: Zi Yan --- For folding in to mm-unstable's "mm: refactor folio_undo_large_rmappable()", unless I'm too late and it's already mm-stable (no problem, just a cleanup). mm/migrate.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) --- a/mm/migrate.c +++ b/mm/migrate.c @@ -443,8 +443,7 @@ int folio_migrate_mapping(struct address_space *mapping, } /* Take off deferred split queue while frozen and memcg set */ - if (folio_test_large(folio) && folio_test_large_rmappable(folio)) - folio_undo_large_rmappable(folio); + folio_undo_large_rmappable(folio); /* * Now we know that no one else is looking at the folio: -- 2.35.3