Date: Thu, 24 Oct 2024 23:57:26 -0700 (PDT)
From: Hugh Dickins
To: Yang Shi
cc: Hugh Dickins, Andrew Morton, Usama Arif, Wei Yang, "Kirill A. Shutemov",
    Matthew Wilcox, David Hildenbrand, Johannes Weiner, Baolin Wang,
    Barry Song, Kefeng Wang, Ryan Roberts, Nhat Pham, Zi Yan, Chris Li,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH hotfix 2/2] mm/thp: fix deferred split unqueue naming and locking
References: <760237a3-69d6-9197-432d-0306d52c048a@google.com>
    <7dc6b280-cd87-acd1-1124-e512e3d2217d@google.com>

On Thu, 24 Oct 2024, Yang Shi wrote:
> On Wed, Oct 23, 2024 at 9:13 PM Hugh Dickins wrote:
> >
> > That goes back to 5.4 commit 87eaceb3faa5 ("mm: thp: make deferred split
> > shrinker memcg aware"): which included a check on swapcache before adding
> > to deferred queue (which can now be removed), but no check on deferred
> > queue before adding THP to swapcache (maybe other circumstances prevented
> > it at that time, but not now).
>
> If I remember correctly, THP just can be added to deferred list when
> there is no PMD map before mTHP swapout, so shrink_page_list() did
> check THP's compound_mapcount (called _entire_mapcount now) before
> adding it to swap cache.
>
> Now the code just checked whether the large folio is on deferred list or not.

I've continued to find it hard to think about, hard to be convinced by
that sequence of checks, without an actual explicit _deferred_list check.

David has brilliantly come up with the failed THP migration example;
and I think now perhaps 5.8's 5503fbf2b0b8 ("khugepaged: allow to
collapse PTE-mapped compound pages") introduced another way?

But I certainly need to reword that wagging finger pointing to your
commit: these are much more exceptional cases than I was thinking there.

I have this evening tried running swapping load on 5.10 and 6.6 and 6.11,
each with just a BUG_ON(!list_empty(the deferred list)) before resetting
memcg in mem_cgroup_swapout() - it would of course be much easier to hit
such a BUG_ON() than for the consequent wrong locking to be so unlucky as
to actually result in list corruption.

None of those BUG_ONs hit; though I was only running each for 1.5 hours,
and looking at vmstats at the end, I see they were really not exercising
deferred split very much at all.

I'd been hoping for an immediate hit (as on 6.12-rc) to confirm my doubt,
but no. That doesn't *prove* you're right, but (excepting David's and my
weird cases) I bet you are right.

> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 4b21a368b4e2..57f64b5d0004 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2681,7 +2681,9 @@ void free_unref_folios(struct folio_batch *folios)
> >  		unsigned long pfn = folio_pfn(folio);
> >  		unsigned int order = folio_order(folio);
> >
> > -		folio_undo_large_rmappable(folio);
> > +		if (mem_cgroup_disabled())
> > +			folio_unqueue_deferred_split(folio);
>
> This looks confusing. It looks all callsites of free_unref_folios()
> have folio_unqueue_deferred_split() and memcg uncharge called before
> it. If there is any problem, memcg uncharge should catch it. Did I
> miss something?

I don't understand what you're suggesting there. But David remarked
on it too, so it seems that I do need at least to add some comment.

I'd better re-examine the memcg/non-memcg forking paths again: but by
strange coincidence (or suggestion?), I'm suddenly now too tired here,
precisely where David stopped too. I'll have to come back to this
tomorrow, sorry.

Hugh
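
The BUG_ON(!list_empty(the deferred list)) experiment described above can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `struct model_folio`, `memcg_id`, and every function name below are hypothetical stand-ins, and the list helpers merely imitate the kernel's circular doubly linked list. The point it illustrates is that, if the deferred split queue and its lock are chosen via the folio's memcg, then resetting the memcg while the folio is still queued would leave a later unqueue taking the wrong lock.

```c
/* Userspace model only: these types imitate, but are not, the kernel's. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Circular doubly linked list node, in the style of the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static inline void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* A node that points at itself is on no list (or is an empty list head). */
static inline bool list_is_empty(const struct list_head *head)
{
	return head->next == head;
}

static inline void list_add_tail_entry(struct list_head *entry,
				       struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink the entry and leave it pointing at itself, i.e. "not queued". */
static inline void list_del_init_entry(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	init_list_head(entry);
}

/* Hypothetical stand-in for a large folio: only what the check needs. */
struct model_folio {
	struct list_head deferred_list;	/* models folio->_deferred_list */
	int memcg_id;			/* 0 models "no memcg" */
};

/*
 * Models the point in swapout where the memcg is reset. Returns false,
 * and leaves the folio untouched, in the case where the debug kernel
 * described above would have hit its BUG_ON().
 */
static inline bool model_swapout_reset_memcg(struct model_folio *folio)
{
	assert(folio != NULL);
	if (!list_is_empty(&folio->deferred_list))
		return false;	/* still queued: the wrong lock would follow */
	folio->memcg_id = 0;
	return true;
}
```

In this model a folio that is still on a queue makes `model_swapout_reset_memcg()` refuse to proceed, while one that has first been removed with `list_del_init_entry()` passes; the real experiment used a fatal BUG_ON() instead of a return value so that any hit would be unmissable.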