From: Barry Song <21cnbao@gmail.com>
To: david@redhat.com
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com,
	anshuman.khandual@arm.com, ardb@kernel.org, catalin.marinas@arm.com,
	dvyukov@google.com, glider@google.com, james.morse@arm.com,
	jhubbard@nvidia.com, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	mark.rutland@arm.com, maz@kernel.org, oliver.upton@linux.dev,
	ryabinin.a.a@gmail.com, ryan.roberts@arm.com,
	suzuki.poulose@arm.com, vincenzo.frascino@arm.com,
	wangkefeng.wang@huawei.com, will@kernel.org, willy@infradead.org,
	yuzenghui@huawei.com, yuzhao@google.com, ziy@nvidia.com
Subject: Re: Re: [PATCH v2 01/14] mm: Batch-copy PTE ranges during fork()
Date: Mon, 27 Nov 2023 21:42:17 +1300
Message-Id: <20231127084217.13110-1-v-songbaohua@oppo.com>
In-Reply-To: <271f1e98-6217-4b40-bae0-0ac9fe5851cb@redhat.com>

>> +		for (i = 0; i < nr; i++, page++) {
>> +			if (anon) {
>> +				/*
>> +				 * If this page may have been pinned by the
>> +				 * parent process, copy the page immediately for
>> +				 * the child so that we'll always guarantee the
>> +				 * pinned page won't be randomly replaced in the
>> +				 * future.
>> +				 */
>> +				if (unlikely(page_try_dup_anon_rmap(
>> +						page, false, src_vma))) {
>> +					if (i != 0)
>> +						break;
>> +					/* Page may be pinned, we have to copy. */
>> +					return copy_present_page(
>> +						dst_vma, src_vma, dst_pte,
>> +						src_pte, addr, rss, prealloc,
>> +						page);
>> +				}
>> +				rss[MM_ANONPAGES]++;
>> +				VM_BUG_ON(PageAnonExclusive(page));
>> +			} else {
>> +				page_dup_file_rmap(page, false);
>> +				rss[mm_counter_file(page)]++;
>> +			}
>>  		}
>> -		rss[MM_ANONPAGES]++;
>> -	} else if (page) {
>> -		folio_get(folio);
>> -		page_dup_file_rmap(page, false);
>> -		rss[mm_counter_file(page)]++;
>> +
>> +		nr = i;
>> +		folio_ref_add(folio, nr);
>
> You're changing the order of mapcount vs. refcount increment. Don't.
> Make sure your refcount >= mapcount.
>
> You can do that easily by doing the folio_ref_add(folio, nr) first and
> then decrementing in case of error accordingly. Errors due to pinned
> pages are the corner case.
>
> I'll note that it will make a lot of sense to have batch variants of
> page_try_dup_anon_rmap() and page_dup_file_rmap().
>

I still don't understand why this is not a single entire-map +1, but an
increment on each base page.

As long as it is a CONTPTE large folio, there is not much difference from
a PMD-mapped large folio. It has every chance of becoming DoubleMap and
needing a split.

When A and B share a CONTPTE large folio and we do madvise(MADV_DONTNEED)
or anything similar on part of the large folio in process A, the folio
ends up only partially mapped in A (the CONTPTE bit has to be cleared in
all of its PTEs even though we unmap only part of the folio, as HW
requires consistent CONTPTEs), while it is still entirely mapped in
process B (all PTEs are still CONTPTEs in process B).

Isn't it more sensible for this large folio to have entire_map = 0 (for
process B), and for the subpages that are still mapped in process A to
have map_count = 0 (starting from -1)?

> Especially, the batch variant of page_try_dup_anon_rmap() would only
> check once if the folio maybe pinned, and in that case, you can simply
> drop all references again. So you either have all or no ptes to process,
> which makes that code easier.
>
> But that can be added on top, and I'll happily do that.
>
> --
> Cheers,
>
> David / dhildenb

Thanks
Barry