Date: Tue, 18 Aug 2020 17:08:16 +0100
From: Will Deacon
To: Matthew Wilcox
Cc: linux-arch@vger.kernel.org, Vineet Gupta,
	linux-snps-arc@lists.infradead.org, Russell King,
	linux-arm-kernel@lists.infradead.org, Catalin Marinas,
	Thomas Bogendoerfer, linux-mips@vger.kernel.org, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, linuxppc-dev@lists.ozlabs.org,
	"David S. Miller", sparclinux@vger.kernel.org, linux-mm@kvack.org
Subject: Re: Flushing transparent hugepages
Message-ID: <20200818160815.GA16191@willie-the-truck>
References: <20200818150736.GQ17456@casper.infradead.org>
In-Reply-To: <20200818150736.GQ17456@casper.infradead.org>

On Tue, Aug 18, 2020 at 04:07:36PM +0100, Matthew Wilcox wrote:
> For example, arm64 seems confused in this scenario:
>
> void flush_dcache_page(struct page *page)
> {
> 	if (test_bit(PG_dcache_clean, &page->flags))
> 		clear_bit(PG_dcache_clean, &page->flags);
> }
>
> ...
>
> void __sync_icache_dcache(pte_t pte)
> {
> 	struct page *page = pte_page(pte);
>
> 	if (!test_and_set_bit(PG_dcache_clean, &page->flags))
> 		sync_icache_aliases(page_address(page), page_size(page));
> }
>
> So arm64 keeps track on a per-page basis which ones have been flushed.
> page_size() will return PAGE_SIZE if called on a tail page or regular
> page, but will return PAGE_SIZE << compound_order if called on a head
> page.  So this will either over-flush, or it's missing the opportunity
> to clear the bits on all the subpages which have now been flushed.

Hmm, that seems to go all the way back to 2014 as the result of a bug fix
in 923b8f5044da ("arm64: mm: Make icache synchronisation logic huge page
aware") which has a Reported-by Mark and a CC stable, suggesting something
_was_ going wrong at the time :/ Was there a point where the tail pages
could end up with PG_arch_1 uncleared on allocation?

> What would you _like_ to see?
> Would you rather flush_dcache_page()
> were called once for each subpage, or would you rather maintain
> the page-needs-flushing state once per compound page?  We could also
> introduce flush_dcache_thp() if some architectures would prefer it one
> way and one the other, although that brings into question what to do
> for hugetlbfs pages.

For arm64, we'd like to see PG_arch_1 preserved during huge page splitting
[1], but there was a worry that it might break x86 and s390. It's also not
clear to me that we can change __sync_icache_dcache() as it's called when
we're installing the entry in the page-table, so why would it be called
again for the tail pages?

Will

[1] https://lore.kernel.org/linux-arch/20200703153718.16973-8-catalin.marinas@arm.com/
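
For anyone following along, the over-flush concern in the quoted text can be sketched as a small userspace model. This is not kernel code: struct page here is a hypothetical three-field stand-in, PG_dcache_clean is modelled as a plain bit, cache maintenance is replaced by a byte counter, and model_sync_icache_dcache(), model_thp_flush() and MAX_ORDER_MODEL are names invented for illustration. It also assumes that each tail page gets passed in individually after the head page, which is exactly the case questioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE       4096UL
#define PG_dcache_clean (1UL << 0)  /* stand-in for PG_arch_1 */
#define MAX_ORDER_MODEL 9           /* hypothetical cap for this model */

/* Hypothetical, heavily simplified stand-in for the kernel's struct page. */
struct page {
	unsigned long flags;
	unsigned int compound_order; /* only meaningful on the head page */
	bool is_head;
};

/* Byte counter standing in for real cache maintenance. */
static size_t bytes_flushed;

/*
 * Mirrors the behaviour described in the quoted text: the whole compound
 * size for a head page, PAGE_SIZE for a tail or regular page.
 */
static size_t page_size(const struct page *page)
{
	return page->is_head ? PAGE_SIZE << page->compound_order : PAGE_SIZE;
}

/* Same test-and-set shape as the quoted __sync_icache_dcache(). */
static void model_sync_icache_dcache(struct page *page)
{
	if (!(page->flags & PG_dcache_clean)) {
		page->flags |= PG_dcache_clean;
		bytes_flushed += page_size(page);
	}
}

/*
 * Sync via the head page, then via each tail page in turn.
 * Returns the total bytes "flushed"; *after_head receives the count
 * after the head-page call alone.
 */
size_t model_thp_flush(unsigned int order, size_t *after_head)
{
	struct page pages[1u << MAX_ORDER_MODEL] = { 0 };
	size_t nr = (size_t)1 << order;

	assert(order <= MAX_ORDER_MODEL);
	bytes_flushed = 0;

	pages[0].is_head = true;
	pages[0].compound_order = order;

	/* The head-page call covers the whole compound page ... */
	model_sync_icache_dcache(&pages[0]);
	*after_head = bytes_flushed;

	/* ... but only the head's bit was set, so every tail flushes again. */
	for (size_t i = 1; i < nr; i++)
		model_sync_icache_dcache(&pages[i]);

	return bytes_flushed;
}
```

In this model, model_thp_flush(2, &head) leaves head at 16384 and returns 28672, so the three tail pages account for 12288 bytes of redundant flushing. Setting the clean bit on every subpage after the head-page call would avoid that in the model; whether the tail pages are ever passed in again in the real kernel is the open question above.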