Date: Thu, 1 May 2025 14:35:01 +0000
From: Wei Yang
To: Lorenzo Stoakes
Cc: Wei Yang, Andrew Morton, Vlastimil Babka, Jann Horn,
	"Liam R . Howlett", Suren Baghdasaryan, Matthew Wilcox,
	David Hildenbrand, Pedro Falcato, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2 01/10] mm/mremap: introduce more mergeable mremap via MREMAP_RELOCATE_ANON
Message-ID: <20250501143501.vljk4hriuc3c2yrv@master>
References: <87e668d54927bb4ccdb7d374275e0662de667697.1745307301.git.lorenzo.stoakes@oracle.com>
	<20250430004703.63rumj4znewlbc2h@master>
	<8c052822-5365-4178-8e06-ecd4f917cf8a@lucifer.local>
	<20250430154119.a5ljf5t5tutqzim5@master>
	<20250501011845.ktbfgymor4oz5sok@master>

On Thu, May 01, 2025 at 10:27:47AM +0100, Lorenzo Stoakes wrote:
>On Thu, May 01, 2025 at 01:18:45AM +0000, Wei Yang wrote:
>> On Wed, Apr 30, 2025 at 05:07:40PM +0100, Lorenzo Stoakes wrote:
>> >On Wed, Apr 30, 2025 at 03:41:19PM +0000, Wei Yang wrote:
>> >> On Wed, Apr 30, 2025 at 02:15:24PM +0100, Lorenzo Stoakes wrote:
>> >> >On Wed, Apr 30, 2025 at 12:47:03AM +0000, Wei Yang wrote:
>> >> >> On Tue, Apr 22, 2025 at 09:09:20AM +0100, Lorenzo Stoakes wrote:
>> >> >> [...]
>> >> >> >+bool vma_had_uncowed_children(struct vm_area_struct *vma)
>> >> >> >+{
>> >> >> >+	struct anon_vma *anon_vma = vma ? vma->anon_vma : NULL;
>> >> >> >+	bool ret;
>> >> >> >+
>> >> >> >+	if (!anon_vma)
>> >> >> >+		return false;
>> >> >> >+
>> >> >> >+	/*
>> >> >> >+	 * If we're mmap locked then there's no way for this count to change, as
>> >> >> >+	 * any such change would require this lock not be held.
>> >> >> >+	 */
>> >> >> >+	if (rwsem_is_locked(&vma->vm_mm->mmap_lock))
>> >> >> >+		return anon_vma->num_children > 1;
>> >> >>
>> >> >> Hi, Lorenzo
>> >> >>
>> >> >> May I ask a question here?
>> >> >
>> >> >Just ask the question.
>> >> >
>> >>
>> >> Thanks.
>> >>
>> >> My question is: the function is expected to return true if we have forked a
>> >> vma from this one, right?
>> >>
>> >> IMO there are cases when it has one forked child and anon_vma->num_children == 1,
>> >> which means folios are not exclusively mapped. But the function would return
>> >> false.
>> >>
>> >> Or maybe I misunderstand the logic here.
>> >
>> >I mean, it'd be helpful if you delineated which cases these were?
>> >
>>
>> Sorry, I should be more specific.
>>
>> >Presumably you're thinking of something like:
>> >
>> >1. Process 1: VMA A is established. num_children == 1 (self-reference is counted).
>> >2. Process 2: Process 1 forks, VMA B references A, a->num_children++
>> >3. Process 3: Process 2 forks, VMA C is established (maybe you think b->num_children++?)
>>
>> Maybe this is the key point. Will explain below at ***.
>>
>> >4. Unmap vma B, oops, a->num_children == 1 but it still has C!
>> >
>> >But that won't happen, as VMA C will be referencing a->anon_vma, so in reality
>> >a->anon_vma->num_children == 3, then after unmap == 2.
>> >
>>
>> The case here could be handled well; I am thinking of a slightly different one.
>>
>> Here is the case I am thinking about. If my understanding is wrong, please
>> correct me.
>>
>>           a                     VMA A
>>      +-----------+          +-----------+
>>      |           |   --->   |         av| == a
>>      +-----------+          +-----------+
>>                  \
>>                   \
>>                   |\             VMA B
>>                   | \         +-----------+
>>                   |  >        |         av| == b
>>                   |           +-----------+
>>                   \
>>                    \             VMA C
>>                     \         +-----------+
>>                      >        |         av| == c
>>                               +-----------+
>>
>> 1. Process 1: VMA A is established, num_children == 1
>> 2. Process 2: Process 1 forks, a->num_children++ and b->num_children == 0
>> 3. Process 3: Process 2 forks, b->num_children++ => b->num_children == 1
>>
>> If we call vma_had_uncowed_children(VMA B), we would check b->num_children and
>> return false since it is not greater than 1. But we do have a child, process 3.
>>
>> ***
>>
>> Coming back to b->num_children. After re-reading your example, I guess this is
>> the key point. In anon_vma_fork(), we do anon_vma->parent->num_children++. So
>> when forking VMA C, we increase b->num_children instead of a->num_children.
>>
>> To verify this, I did a quick test in my test cases in
>> test_fork_grand_child[1]. I see b->num_children is increased to 1 after C is
>> forked. Will reply in that thread and hope that would be helpful to
>> communicate the case.
>>
>> Well, if I am not correct, feel free to correct me :-)
>
>OK so you've expressed this in a very confusing way and the diagram is
>wrong but I think I see the point.
>

Sorry for my poor wording; fortunately you got the point :-)

>Because of anon_vma reuse logic in anon_vma_clone() we might end up in the
>situation where num_children (which strictly reports number of anon_vma
>objects whose parent pointer points at that anon_vma) does not actually
>correctly reflect the fact that there are multiple mappings of a folio.
>
>I think correct approach is to also look at num_active_vmas which accounts
>for this, but I think overall we should move these checks to being a 'best
>guess' and remove the WARN_ON() around the multiply-mapped folio
>logic. It's fine to just back out if we guesstimated wrong.
>

Would you mind cc'ing me if you spin another round? I would like to learn more
from your work.

>I'll also add a bunch of tests to assert specific fork scenarios.
>
>>
>> [1]: http://lkml.kernel.org/r/20250429090639.784-3-richard.weiyang@gmail.com
>>
>> >References to the originally faulted-in anon_vma are propagated through the
>> >forks.
>> >
>> >anon_vma logic is tricky, one of many reasons I want to (significantly) rework
>> >it.
>> >
>> >Though sadly there is a lot of _essential_ complexity, I do think we can do
>> >better.
>> >
>>
>> --
>> Wei Yang
>> Help you, Help me

-- 
Wei Yang
Help you, Help me