Date: Thu, 20 Feb 2025 20:49:22 -0500
From: Peter Xu
To: Barry Song <21cnbao@gmail.com>
Cc: david@redhat.com, Liam.Howlett@oracle.com, aarcange@redhat.com,
 akpm@linux-foundation.org, axelrasmussen@google.com, bgeffon@google.com,
 brauner@kernel.org, hughd@google.com, jannh@google.com,
 kaleshsingh@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 lokeshgidra@google.com, mhocko@suse.com, ngeoffray@google.com,
 rppt@kernel.org, ryan.roberts@arm.com, shuah@kernel.org, surenb@google.com,
 v-songbaohua@oppo.com, viro@zeniv.linux.org.uk, willy@infradead.org,
 zhangpeng362@huawei.com, zhengtangquan@oppo.com, yuzhao@google.com,
 stable@vger.kernel.org
Subject: Re: [PATCH RFC] mm: Fix kernel BUG when userfaultfd_move encounters swapcache
References: <69dbca2b-cf67-4fd8-ba22-7e6211b3e7c4@redhat.com> <20250220092101.71966-1-21cnbao@gmail.com>

On Fri, Feb 21, 2025 at 01:07:24PM +1300, Barry Song wrote:
> On Fri, Feb 21, 2025 at 12:32 PM Peter Xu wrote:
> >
> > On Thu, Feb 20, 2025 at 10:21:01PM +1300, Barry Song wrote:
> > > 2. src_anon_vma and its lock – swapcache doesn’t require it(folio is not mapped)
> >
> > Could you help explain what guarantees the rmap walk not happen on a
> > swapcache page?
> >
> > I'm not familiar with this path, though at least I see damon can start a
> > rmap walk on PageAnon almost with no locking..  some explanations would be
> > appreciated.
>
> I am observing the following in folio_referenced(), which the anon_vma lock
> was originally intended to protect.
>
>         if (!pra.mapcount)
>                 return 0;
>
> I assume all other rmap walks should do the same?

Yes, normally there'll be a folio_mapcount() check, however..

>
> int folio_referenced(struct folio *folio, int is_locked,
>                      struct mem_cgroup *memcg, unsigned long *vm_flags)
> {
>
>         bool we_locked = false;
>         struct folio_referenced_arg pra = {
>                 .mapcount = folio_mapcount(folio),
>                 .memcg = memcg,
>         };
>
>         struct rmap_walk_control rwc = {
>                 .rmap_one = folio_referenced_one,
>                 .arg = (void *)&pra,
>                 .anon_lock = folio_lock_anon_vma_read,
>                 .try_lock = true,
>                 .invalid_vma = invalid_folio_referenced_vma,
>         };
>
>         *vm_flags = 0;
>         if (!pra.mapcount)
>                 return 0;
>         ...
> }
>
> By the way, since the folio has been under reclamation in this case and
> isn't in the lru, this should also prevent the rmap walk, right?

.. I'm not sure it always works.

The thing is, anon doesn't even require the folio lock to be held across
(1) checking the mapcount and (2) doing the rmap walk, in all cases similar
to the above.  I see nothing that stops a concurrent thread from zapping
the last mapping in between:

               thread 1                         thread 2
               --------                         --------
        [whatever scanner]
           check folio_mapcount(), non-zero
                                                zap the last map.. then mapcount==0
           rmap_walk()

Not sure if I missed something.

The other thing is, IIUC a swapcache page can also be faulted in, but only
on a read fault, not a write.  I actually had a feeling that your
reproducer triggered that exact path: a read swap-in that reuses the
swapcache page and then somehow hits the sanity check there (though, as
mentioned in the other reply, I don't yet know why the 1st check didn't
seem to work.. as we do check folio->index twice..).

That said, I'm not sure the above concern can happen in this specific case,
as UFFDIO_MOVE is pretty special: we check the exclusive bit in the swp
entry first, so we know it's definitely not mapped elsewhere; meanwhile, if
we hold the pgtable lock, maybe it can't get mapped back..  It is still
tricky, though; at least we do some dances releasing and retaking locks all
over the place.

We could either justify that it's safe, or it may still be OK, and simpler,
to take the anon_vma write lock, making sure nobody can read folio->index
while it's prone to an update.

Thanks,

-- 
Peter Xu