From: Kairui Song
Date: Thu, 10 Oct 2024 02:02:52 +0800
Subject: Re: next-20241001: WARNING: at mm/list_lru.c:77 list_lru_del (mm/list_lru.c:212 mm/list_lru.c:200)
To: Dan Carpenter, Andrew Morton
Cc: Naresh Kamboju, open list, lkft-triage@lists.linaro.org, Linux Regressions, linux-mm, Arnd Bergmann, Anders Roxell
References: <62a65418-2393-40ec-b462-151605a5efcf@stanley.mountain> <892332fa-e1d0-4581-9c42-045660d7dc80@stanley.mountain>
In-Reply-To: <892332fa-e1d0-4581-9c42-045660d7dc80@stanley.mountain>

On Thu, Oct 10, 2024 at 12:51 AM Dan Carpenter wrote:
>
> On Thu, Oct 03, 2024 at 02:58:19AM +0800, Kairui Song wrote:
> > On Wed, Oct 2, 2024 at 7:28 PM Dan Carpenter wrote:
> > >
> > > On Wed, Oct 02, 2024 at 02:25:34PM +0300, Dan Carpenter wrote:
> > > > On Wed, Oct 02, 2024 at 02:24:20PM +0300, Dan Carpenter wrote:
> > > > > Let's add Kairui Song to the CC
> > > > > list.
> > > > >
> > > > > One simple thing is that we should add a READ_ONCE() to the comparison. Naresh,
> > > > > could you test the attached diff? I don't know that it will fix it but it's
> > > > > worth checking the easy stuff first.
> > > > >
> > > >
> > > > Actually that's not right. Let me write a different patch.
> > >
> > > Try this one.
> > >
> > > regards,
> > > dan carpenter
> > >
> > > diff --git a/mm/list_lru.c b/mm/list_lru.c
> > > index 79c2d21504a2..2c429578ed31 100644
> > > --- a/mm/list_lru.c
> > > +++ b/mm/list_lru.c
> > > @@ -65,6 +65,7 @@ lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
> > >  		       bool irq, bool skip_empty)
> > >  {
> > >  	struct list_lru_one *l;
> > > +	long nr_items;
> > >  	rcu_read_lock();
> > >  again:
> > >  	l = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(memcg));
> > > @@ -73,8 +74,9 @@ lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
> > >  			spin_lock_irq(&l->lock);
> > >  		else
> > >  			spin_lock(&l->lock);
> > > -		if (likely(READ_ONCE(l->nr_items) != LONG_MIN)) {
> > > -			WARN_ON(l->nr_items < 0);
> > > +		nr_items = READ_ONCE(l->nr_items);
> > > +		if (likely(nr_items != LONG_MIN)) {
> > > +			WARN_ON(nr_items < 0);
> > >  			rcu_read_unlock();
> > >  			return l;
> > >  		}
> > >
> >
> > Thanks. The warning is a newly added sanity check, I'm not sure if this
> > WARN_ON was triggered by an existing list_lru leak or if it's a new issue.
> >
> > And unfortunately so far I can't reproduce it locally on my ARM
> > machine, it should be easily reproducible according to the
> > description. And if the WARN only triggered once, and only during
> > boot, maybe some static data wasn't initialized correctly?
>
> I have a config where it printed twice and the second time wasn't during boot.
>
> https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20241009/testrun/25363339/suite/boot/test/gcc-13-lkftconfig-rcutorture/log
>
> > Or the enablement of memcg caused some list_lru leak
> > (mem_cgroup_from_slab_obj changed from returning NULL to returning
> > the actual memcg, so an item added to rootcg before will be removed
> > from the actual memcg instead, which seems a real race). If it's the
> > latter case, then it's an existing issue caught by the new sanity check.
> >
> > The READ_ONCE patch may be worth trying, I'll also try to do more
> > debugging on this and try to send a fix later.
>
> The READ_ONCE() patch *seemed* to work, but the bug is intermittent so maybe it
> just changed the timing or something. Still, I feel from a correctness
> perspective the READ_ONCE() thing is probably correct, right?
>

Yes, the READ_ONCE fix is absolutely correct. I'm not sure whether it
is possible, even in theory, for the compiler or CPU to use an old
value for the `WARN` but a newly read value for the `if` above. If it
is possible, this READ_ONCE will prevent it.

I think we should just merge the READ_ONCE fix and see whether any
further tests expose this issue again.
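
For reference, here is a minimal userspace sketch of the single-snapshot
pattern the patch applies: read `nr_items` once, then use that one local
copy for both the `LONG_MIN` comparison and the sign check. Everything
named `*_sketch` is hypothetical and for illustration only; in
particular, `read_once_long_sketch()` is a volatile-load stand-in and
not the kernel's actual READ_ONCE() definition, and the `warned` flag
stands in for WARN_ON().

```c
#include <limits.h>

/* Hypothetical stand-in for struct list_lru_one; only the counter matters here. */
struct lru_one_sketch {
	long nr_items;
};

/*
 * Hypothetical userspace stand-in for READ_ONCE() on a long: the volatile
 * access makes the compiler emit exactly one load for this expression.
 */
static inline long read_once_long_sketch(const long *p)
{
	return *(const volatile long *)p;
}

/*
 * Mirrors the shape of the patched check: take one snapshot of nr_items and
 * use it for both tests, so they can never observe two different values.
 * Returns 1 if the list is usable, 0 if it is marked dead (LONG_MIN).
 * *warned is set to 1 where the kernel code would hit WARN_ON().
 */
static int lock_check_sketch(struct lru_one_sketch *l, int *warned)
{
	long nr_items = read_once_long_sketch(&l->nr_items);

	*warned = 0;
	if (nr_items != LONG_MIN) {
		if (nr_items < 0)
			*warned = 1;
		return 1;
	}
	return 0;
}
```

With the original code the `if` and the `WARN_ON` each loaded
`l->nr_items` separately; in the sketch both tests share the local
`nr_items`, which is the property the patch is after.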