From: Ryan Roberts <ryan.roberts@arm.com>
Date: Wed, 31 Jan 2024 10:31:51 +0000
Subject: Re: [PATCH v1 9/9] mm/memory: optimize unmap/zap with PTE-mapped THP
To: David Hildenbrand, linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, Andrew Morton, Matthew Wilcox, Catalin Marinas,
 Will Deacon, "Aneesh Kumar K.V", Nick Piggin, Peter Zijlstra,
 Michael Ellerman, Christophe Leroy, "Naveen N. Rao", Heiko Carstens,
 Vasily Gorbik, Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
 Arnd Bergmann, linux-arch@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 linux-s390@vger.kernel.org
Message-ID: <424115a2-a924-4c28-8027-32db6ab9278d@arm.com>
References: <20240129143221.263763-1-david@redhat.com>
 <20240129143221.263763-10-david@redhat.com>

On 31/01/2024 10:21, David Hildenbrand wrote:
>
>>> +
>>> +#ifndef clear_full_ptes
>>> +/**
>>> + * clear_full_ptes - Clear PTEs that map consecutive pages of the same folio.
>>
>> I know it's implied from "pages of the same folio" (and even more so for the
>> above variant due to mention of access/dirty), but I wonder if it's useful to
>> explicitly state that "all ptes being cleared are present at the time of the
>> call"?
>
> "Clear PTEs" -> "Clear present PTEs" ?
>
> That should make it clearer.

Works for me.

>
> [...]
>
>>>       if (!delay_rmap) {
>>> -        folio_remove_rmap_pte(folio, page, vma);
>>> +        folio_remove_rmap_ptes(folio, page, nr, vma);
>>> +
>>> +        /* Only sanity-check the first page in a batch. */
>>>           if (unlikely(page_mapcount(page) < 0))
>>>               print_bad_pte(vma, addr, ptent, page);
>>
>> Is there a case for either removing this altogether or moving it into
>> folio_remove_rmap_ptes()? It seems odd to only check some pages.
>>
>
> I really wanted to avoid another nasty loop here.
>
> In my thinking, for 4k folios, or when zapping subpages of large folios, we
> still perform the exact same checks. Only when batching we don't. So if there
> is some problem, there are ways to get it triggered. And these problems are
> barely ever seen.
>
> folio_remove_rmap_ptes() feels like the better place -- especially because
> the delayed-rmap handling is effectively unchecked. But in there, we cannot
> "print_bad_pte()".
>
> [background: if we had a total mapcount -- iow cheap folio_mapcount(), I'd
> check here that the total mapcount does not underflow, instead of checking
> per-subpage]

All good points... perhaps extend the comment to describe how this could be
solved in future with cheap total_mapcount()? Or in the commit log if you
prefer?
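(For readers following along: the trade-off being discussed can be sketched
with a toy model. This is not the kernel implementation -- `struct toy_folio`,
the fixed folio size, and the function names are invented stand-ins -- it only
illustrates why checking the mapcount of just the first page in a batch still
catches underflow on that page while avoiding a second loop over the batch.)

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a folio with per-subpage mapcounts. */
#define TOY_FOLIO_NR_PAGES 4

struct toy_folio {
	int mapcount[TOY_FOLIO_NR_PAGES];
};

/* Remove 'nr' consecutive page mappings starting at subpage 'first'. */
static void toy_remove_rmap_ptes(struct toy_folio *f, int first, int nr)
{
	for (int i = 0; i < nr; i++)
		f->mapcount[first + i]--;
}

/*
 * Mirror of the discussed policy: after removing the batch's rmap
 * entries, only the first subpage is sanity-checked for underflow
 * (the case that would trigger print_bad_pte() in the real code).
 */
static bool toy_zap_batch_is_bad(struct toy_folio *f, int first, int nr)
{
	toy_remove_rmap_ptes(f, first, nr);
	return f->mapcount[first] < 0;
}
```

With a cheap total mapcount, the check could instead assert that the folio-wide
count does not underflow, covering every subpage in the batch at once.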
>
>>
>>>       }
>>> -    if (unlikely(__tlb_remove_page(tlb, page, delay_rmap))) {
>>> +    if (unlikely(__tlb_remove_folio_pages(tlb, page, nr, delay_rmap))) {
>>>           *force_flush = true;
>>>           *force_break = true;
>>>       }
>>>   }
>>>
>>> -static inline void zap_present_pte(struct mmu_gather *tlb,
>>> +/*
>>> + * Zap or skip one present PTE, trying to batch-process subsequent PTEs
>>> + * that map
>>
>> Zap or skip *at least* one... ?
>
> Ack
>
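(Aside for readers: the batching that makes "at least one" the right wording
can be sketched, very loosely, as follows. This is not the kernel's
implementation -- a "PTE" here is just a pfn, 0 means not-present, and
`toy_pte_batch` is an invented name -- it only shows the idea of counting how
many consecutive PTEs map consecutive pages of the same folio, so the zap path
can hand that whole run to the batched rmap/TLB helpers in one go.)

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model: count how many consecutive present entries, starting at
 * index 0, map consecutively increasing pfns that stay within the same
 * (naturally aligned) folio of 'folio_nr_pages' pages. The result is
 * the batch length 'nr' a batched zap could process at once; it is
 * always >= 1 when the first entry is present.
 */
static size_t toy_pte_batch(const unsigned long *ptes, size_t max,
			    size_t folio_nr_pages)
{
	if (max == 0 || ptes[0] == 0)
		return 0;

	unsigned long folio_start = ptes[0] - (ptes[0] % folio_nr_pages);
	size_t nr = 1;

	while (nr < max &&
	       ptes[nr] == ptes[0] + nr &&              /* consecutive pfns */
	       ptes[nr] < folio_start + folio_nr_pages) /* same folio */
		nr++;

	return nr;
}
```

A run that starts mid-folio or hits a non-consecutive pfn simply yields a
shorter batch; the degenerate result is 1, i.e. the pre-patch one-PTE-at-a-time
behaviour.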