From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1D63C67861 for ; Sat, 6 Apr 2024 23:27:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 37B9E6B0082; Sat, 6 Apr 2024 19:27:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 304EE6B0083; Sat, 6 Apr 2024 19:27:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1A5C16B0085; Sat, 6 Apr 2024 19:27:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id EDAB56B0082 for ; Sat, 6 Apr 2024 19:27:40 -0400 (EDT) Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 7832F1C0A6C for ; Sat, 6 Apr 2024 23:27:40 +0000 (UTC) X-FDA: 81980696280.25.2F91E67 Received: from mail-oo1-f41.google.com (mail-oo1-f41.google.com [209.85.161.41]) by imf19.hostedemail.com (Postfix) with ESMTP id A5E321A0008 for ; Sat, 6 Apr 2024 23:27:38 +0000 (UTC) Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=gsQ5cAWB; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf19.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.161.41 as permitted sender) smtp.mailfrom=21cnbao@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1712446058; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=xnH1kiwnCu7HwnCtgNx74KGATsGuzGIwPT6aRu5nxgY=; b=l9uGEZ2IEKfEesANek4eDL0Hk3hPLEg9gXOU03QlYVIlNFM6V8Rm3U1I/pqtXIu+wm5yNf BUFc6FuMf3/XdHoSwN2BZdsLHbC4ci6RuC0Vjs+9OPEW1Ob9HHv87wlmoXwfcE3xFGQlIu mZy9X6B72X1FTMTq9V8VfufL1cxvQvU= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=gsQ5cAWB; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf19.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.161.41 as permitted sender) smtp.mailfrom=21cnbao@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1712446058; a=rsa-sha256; cv=none; b=eGFYBG/2XI3RsEzisekPtvdEDceJ2atZS8Uc8kNzIs35E6agIjHLdNz7vZ0rl8F346oC9A JNamploQgWRfEWLQfYEd/s25u41oi9dglZT5sZn2FfC/1Uv6u2DKAD1+KUZ09qDnkQ/agu XPwES91sKNeuLY9g0pR4R6cqwli9o44= Received: by mail-oo1-f41.google.com with SMTP id 006d021491bc7-5a58009fe88so1783344eaf.0 for ; Sat, 06 Apr 2024 16:27:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1712446057; x=1713050857; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=xnH1kiwnCu7HwnCtgNx74KGATsGuzGIwPT6aRu5nxgY=; b=gsQ5cAWBKjeBqKtFvuk5Thi+ooZqD/3A32Lk/fzctsbqcguvqQaK2AgYMM9i9lFURk TMPTmRQgirkB+yCagEWsN85D9BdZ0GSJqvGpa6HXF8fOJNYvEmJ1zuU4MhwTfkdPHyL5 YPkuSKZGfCzOG5NUQUeAfjpkrH5uucPjxOEIQ72HSDehAsIVk50XlPojVl35BlfqGBdo l3DDIBi/6pagkwjzjIWgPws0lS01ErUSl+qwWqXg+o90vscYCELHD+7AVkBC3B7rnNS2 szdoIwoQpsHR0fkvqXUHP4Grk/a7TlVMr781FAUv9FMyrtZMKHOgFSeU3X65o2QftcUJ f/0A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1712446057; x=1713050857; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=xnH1kiwnCu7HwnCtgNx74KGATsGuzGIwPT6aRu5nxgY=; b=eKy3rtg+/As5x1hDmWXZBRvAFEekCHa/OUsob7rMQrdWY9tZYIaov+OqaDvF38Mc4f qtsGIPRII0+8YaKDzqmp1OAkO12OExBtGMDL5mrMnxKxwxz8r9plGLGzOw0Zd0Exjqt1 DqBswZUc3afc2ADqGfZqirbVdB+gfIqXpsTPnpqwhVSRFgVPVq9fTIAIDPxEjpTg/kYw UZ3gsJvpl5AZ4JxasWvD3hmD0kYflMvzgFWiixnAkNuR2Rimm++HHrpAbuV6kQN0hSK/ d1yNKKNFKGtZJAGcY+BBuP70SS6I9riCqNjvspclCkK/GZXBMlRa/k5zi0NcsHczM6nh Egsw== X-Forwarded-Encrypted: i=1; AJvYcCW9JDAdWmbhZIVheSCLfWfNjvWuNdTy55ra5m3aMCyGRivsGGotjS0eLS6Z829iWKkR1SqeT8mZ96MH/89iMnyi4Ys= X-Gm-Message-State: AOJu0YxW98H2v4PpHUNSfu6y5EXOLTdHXwsIP8JD4pK8HqlDh51qdHlI JJgRdagj/U88wg+aCuCLJnPLmnWKwz+BEPdRQttahdTY9586Z0vawqyXsr2A4OFbPBQ+pEqsxPr FDQPlp2HSptBPyaPfHuk8BMSQ7WE= X-Google-Smtp-Source: AGHT+IF174D7jDPyoJ2fAHftQW7Ge4KNMxdqgGqHD7s8pAgQLxbgZgbSewX8OUdaesaCSaQHpeZZFyu4+7pLvOhRJGQ= X-Received: by 2002:a05:6358:d588:b0:186:1652:7c9c with SMTP id ms8-20020a056358d58800b0018616527c9cmr1422339rwb.31.1712446057451; Sat, 06 Apr 2024 16:27:37 -0700 (PDT) MIME-Version: 1.0 References: <20231025144546.577640-1-ryan.roberts@arm.com> <20240118111036.72641-1-21cnbao@gmail.com> <20240118111036.72641-6-21cnbao@gmail.com> <58efd8d4-28aa-45d6-b384-7463568b24bb@redhat.com> In-Reply-To: From: Barry Song <21cnbao@gmail.com> Date: Sun, 7 Apr 2024 11:27:26 +1200 Message-ID: Subject: Re: [PATCH RFC 5/6] mm: rmap: weaken the WARN_ON in __folio_add_anon_rmap() To: David Hildenbrand Cc: Chris Li , ryan.roberts@arm.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yuzhao@google.com, surenb@google.com, steven.price@arm.com, Barry Song , Chuanhua Han Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: A5E321A0008 X-Rspam-User: X-Rspamd-Server: rspam05 X-Stat-Signature: usfms1st783jxpwaj5uujhcub7t7zz6d X-HE-Tag: 1712446058-51431 X-HE-Meta: U2FsdGVkX1/Pc+QFuzYNEGtiimWvcH6HEDnwvjSkCmlSfOHXD6JvJcFpdUPMStG1f9Kq1c1yoqRq38cnyF8zpFBL/0AzpsP2/rAwTVR+af9+Q+ZcdcB/5mEPYsIVFFYTRF2VIrNsCOrLtuGb3VqXbjJEbeOLxnUT2D5RO3HY/tEeMmC0sWBzvImlt+cDB0Ks1SDnVuR+4oUE0sBPiRiOTcCuODpgylJkv7LImP6hBQRJ7gNZoj/vTU/vaMbRsz3p/F5X149z+tJY7ro8/o4pmbxX8kOo72uKRU+GI7z02sr67ZQ0BBq5suyJiKEV50vAF69ROPSWPvN2z4nygM4YYG8MKRQS1G/KHeXtKnQaW70fEBns1x5/Z5CWiEmM/cPQe5/eCzogqUH+cVYDZzTgP5OyooiFFC8pU5nrv2qEgZu6WtFy0dmbv1wNFqf/c+FtcvapR5Ecv7Uj1Yc5aZNz55PMETl195C9tfcbfX+M1lQ3u+x/4Y6Q6FFxyiZRjCWydr6U4yyxi5HTanEbv0znJFFNIrvUM27QJ4buIWk/N0IyOaS09IXXo4UUg5V6Sz+WH6gQjE7UC9QZlYdt+Yclu9YUIuIv70gbDNXAkGgDiE8J8lqYPQw/z+5mPZHaKiwGhVuloaLRyXqvt3wrrXYOGk520zd4BXtqIynUtNK9YGhXSCLbWapo9SbxV+Wm+YukjywjN4Ivj2lbQk2zDAWuhpuJiLURgK/Rhf/iKBKeTt+QufasQR1CBZEXBkAPa7KDgjo/BL12zfa6s/BMt9T1iIwrKI0Phm4Hbp4uTj/jXFT24v0ngbZcUDg/yltpQh14+1KC2JYWpk00rat3IZn13MCwHJKEP+r4WtQ/C/32IiM8k8zhsDBZphpeHHZKgX4gJXePQSDD87JvK3N6APP59WOpztl6udzCQnqO7gDbbACzJCR+00u1d/0Uhhy7GnynegEMSXYe7zJlui0hE3k t3+aZGtf TsozMz30f7EZ0XezblRkjH7bmTmem4mnWYSpOaMH+IIjthr61O6/Pc91j3sZTx604wLQXE4AC/NhOh+6gIltP6lQcC55fdY8/NLsFNFi0+MvGmyWg+dzmravj/lfzeM0dJzWy/0TMluydTXkIjf90O5ZB3h8b2P6pH6vX42xTZadwew+Xemxsks2TDM8d0maNsD+ICDnM5aZOTQ2JyLtR+7mZX+v669Dpd7aZMkQLzCPxvFc8kpvj0Hz0fF2QQK/elZhxwgfJWqYJNiT8NHeAmMamZrHkykGnvU+2SYMCvRPPbq+xhLBCT2d1vFA7s1S0MId4keW2QpquVD4hLhBtNpp+PNhWCeZGLjpKbcXZha6JjZweHpITXjUF7NuJCqAHwAdc4b+VwiP7q78FmPV27P188Ua3/IC7sWafhQ+iDySTVX3s9AMATqC8gMUkCkzWTgEM X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Jan 29, 2024 at 11:07=E2=80=AFPM David Hildenbrand wrote: > > On 29.01.24 04:25, Chris Li wrote: > > Hi David and Barry, > > > > On Mon, Jan 22, 2024 at 10:49=E2=80=AFPM Barry Song <21cnbao@gmail.com>= wrote: > >> > >>> > >>> > >>> I have on my todo list to move all that !anon handling out of > >>> folio_add_anon_rmap_ptes(), and instead make swapin code call add > >>> folio_add_new_anon_rmap(), where we'll have to pass an exclusive flag > >>> then (-> whole new folio exclusive). > >>> > >>> That's the cleaner approach. > >>> > >> > >> one tricky thing is that sometimes it is hard to know who is the first > >> one to add rmap and thus should > >> call folio_add_new_anon_rmap. > >> especially when we want to support swapin_readahead(), the one who > >> allocated large filio might not > >> be that one who firstly does rmap. > > > > I think Barry has a point. Two tasks might race to swap in the folio > > then race to perform the rmap. > > folio_add_new_anon_rmap() should only call a folio that is absolutely > > "new", not shared. The sharing in swap cache disqualifies that > > condition. > > We have to hold the folio lock. So only one task at a time might do the > folio_add_anon_rmap_ptes() right now, and the > folio_add_new_shared_anon_rmap() in the future [below]. > > Also observe how folio_add_anon_rmap_ptes() states that one must hold > the page lock, because otherwise this would all be completely racy. > > From the pte swp exclusive flags, we know for sure whether we are > dealing with exclusive vs. shared. I think patch #6 does not properly > check that all entries are actually the same in that regard (all > exclusive vs all shared). That likely needs fixing. > > [I have converting per-page PageAnonExclusive flags to a single > per-folio flag on my todo list. I suspect that we'll keep the > per-swp-pte exlusive bits, but the question is rather what we can > actually make work, because swap and migration just make it much more > complicated. Anyhow, future work] > > > > >> is it an acceptable way to do the below in do_swap_page? > >> if (!folio_test_anon(folio)) > >> folio_add_new_anon_rmap() > >> else > >> folio_add_anon_rmap_ptes() > > > > I am curious to know the answer as well. > > > Yes, the end code should likely be something like: > > /* ksm created a completely new copy */ > if (unlikely(folio !=3D swapcache && swapcache)) { > folio_add_new_anon_rmap(folio, vma, vmf->address); > folio_add_lru_vma(folio, vma); > } else if (folio_test_anon(folio)) { > folio_add_anon_rmap_ptes(rmap_flags) > } else { > folio_add_new_anon_rmap(rmap_flags) > } > > Maybe we want to avoid teaching all existing folio_add_new_anon_rmap() > callers about a new flag, and just have a new > folio_add_new_shared_anon_rmap() instead. TBD. right. We need to clarify that the new anon_folio might not necessarily be exclusi= ve. Unlike folio_add_new_anon_rmap, which assumes the new folio is exclusive, folio_add_anon_rmap_ptes is capable of handling both exclusive and non-exclusive new anon folios. The code would be like: if (unlikely(folio !=3D swapcache && swapcache)) { folio_add_new_anon_rmap(folio, vma, vmf->address); folio_add_lru_vma(folio, vma); } else if (!folio_test_anon(folio)) { folio_add_anon_rmap_ptes(rmap_flags); } else { if (exclusive) folio_add_new_anon_rmap(); else folio_add_new_shared_anon_rmap(); } It appears a bit lengthy? > > > > > BTW, that test might have a race as well. By the time the task got > > !anon result, this result might get changed by another task. We need > > to make sure in the caller context this race can't happen. Otherwise > > we can't do the above safely. > Again, folio lock. Observe the folio_lock_or_retry() call that covers > our existing folio_add_new_anon_rmap/folio_add_anon_rmap_pte calls. > > -- > Cheers, > > David / dhildenb Thanks Barry