Date: Fri, 26 May 2023 17:07:40 +0100
From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Catalin Marinas
Cc: Linus Torvalds, Christoph Hellwig, Robin Murphy, Arnd Bergmann,
 Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
 Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
 Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
 Mike Snitzer, Rafael J. Wysocki
Wysocki" , , , Subject: Re: [PATCH v5 00/15] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8 Message-ID: <20230526170740.000000df@Huawei.com> In-Reply-To: References: <20230524171904.3967031-1-catalin.marinas@arm.com> <20230525133138.000014b4@Huawei.com> Organization: Huawei Technologies Research and Development (UK) Ltd. X-Mailer: Claws Mail 4.1.0 (GTK 3.24.33; x86_64-w64-mingw32) MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.202.227.76] X-ClientProxiedBy: lhrpeml100005.china.huawei.com (7.191.160.25) To lhrpeml500005.china.huawei.com (7.191.163.240) X-CFilter-Loop: Reflected X-Rspamd-Queue-Id: F0C621800C8 X-Rspam-User: X-Stat-Signature: zdp8gqs4nr1frjhgb989tg8qaxqyhy9o X-Rspamd-Server: rspam01 X-HE-Tag: 1685117269-966782 X-HE-Meta: U2FsdGVkX19Xv70rK9geWc4D3FU0J42/7w0sTzEDoh+7uBYIS5TEg9LyR1nQWCUnBQfZDO77JV0gGmkGZFxz1VIKArq9/bwHuMLtRN75hjRncyU3RyI2rnFRibD1R14tA61o96Lt9ukHHdt6lYNr8pYMJjOf9KTnmEQuwv5ca4LkIlH5+uIegMeYZEWQEv/Alf7vOudrzTE5EAYQO2AQAhiU6sEdwSuJX+WpxxXQTqzJRcrtYPJ/7Fj2AJnjpS4y+dMd6dMQsEsfBV8B/IhNHfDtzfSAgIl87NIAgSNfvczHr7ez4H3uiWjPlrVqHRLCA61z4CaqfTA8zwtmb7pyaxjbSRSGZ2y2aoIkHLvhECgK8jAMi5Rvm66knvAFg+EJtiXPqg/DDelkUobOlSIjPNuuPw0H1DeG9hFA8MwxcIpNeHDeDAHXH1cHDvfFziLxsayiMrQpueEZIx5Io9wPh/NpbAlU+ytcWvj7lfftW50ijNTa6fNSDve5ZItDhdaV0vzKxJOx8FKn2EG+WHLOE7U0owwJCOqNHbH0k/axzRmOIARknJdTZjIC/hgokxSJg+aKW47JmPO4QC7bRotLEq5zvjHp8Q9FxyLjL4KDo/01QG0wHaudhxpKFndek28PSgJ9P61BiDTjoM2HF53kNhykrOkdX8LqEDkmkNoWlpqD8aPtTkU2fRXCp9/5+LTrwJ1qs6iR0H8W/RvACPg934mlkZOkxHuRue/4z7nl35N6y+AUv4GbIgAnzE0OWHMLLVIDoEW0GsVUKnUV2sWxiSUl5eVcIKrIqbPmSvia+oGPfVbLoVl7ZwPYIKC/glx8MMuR8SMCXmn3J6FECicq0EVPR0Nv0EUxXi3j2HZz1KZB909sUx3Nmt2PMp/uybvc+V9BiexF941bDg4hHEZUJ+vhChHMDZdyAmDJU2LLf0IQ5GyeDXlmck4i8hun5xZvijnipa/JrIpTU+puI09 I0LkzFc0 aYa7YrMb8Om1ZrdGPxS8y7ka1wXGR9JW9+5WytzEwQNx8i7HGxhjUwiarza7B5X2mcjT3MmcqGtJza9ZUY0KUp+U9w7gxGrulI4LhjY6RCh3XKElvhsgpdu8a/YFHdbIGm/2V3E3yk42+KJyG5cCLoeoELDRV/MoJTZN+jgu67cDiMbxwCHPXU0rYRzlRrntlKTLrTm50KwZnRKWLcWhdSz/9P5mEAOtieglX6qEvBU6OPwgOzD4QQdGwrG8hjV8/kAR8KsSOhhee+eq0L16KlJNE6Q/KVh7CT+6B2Ok4JktaY+j7VqjVGxZMHQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu, 25 May 2023 15:31:34 +0100 Catalin Marinas wrote: > On Thu, May 25, 2023 at 01:31:38PM +0100, Jonathan Cameron wrote: > > On Wed, 24 May 2023 18:18:49 +0100 > > Catalin Marinas wrote: > > > Another version of the series reducing the kmalloc() minimum alignment > > > on arm64 to 8 (from 128). Other architectures can easily opt in by > > > defining ARCH_KMALLOC_MINALIGN as 8 and selecting > > > DMA_BOUNCE_UNALIGNED_KMALLOC. > > > > > > The first 10 patches decouple ARCH_KMALLOC_MINALIGN from > > > ARCH_DMA_MINALIGN and, for arm64, limit the kmalloc() caches to those > > > aligned to the run-time probed cache_line_size(). On arm64 we gain the > > > kmalloc-{64,192} caches. > > > > > > The subsequent patches (11 to 15) further reduce the kmalloc() caches to > > > kmalloc-{8,16,32,96} if the default swiotlb is present by bouncing small > > > buffers in the DMA API. > > > > I think IIO_DMA_MINALIGN needs to switch to ARCH_DMA_MINALIGN as well. > > > > It's used to force static alignement of buffers with larger structures, > > to make them suitable for non coherent DMA, similar to your other cases. > > Ah, I forgot that you introduced that macro. 
> However, at a quick grep, I don't think this forced alignment always
> works as intended (irrespective of this series). Let's take an example:
>
> 	struct ltc2496_driverdata {
> 		/* this must be the first member */
> 		struct ltc2497core_driverdata common_ddata;
> 		struct spi_device *spi;
>
> 		/*
> 		 * DMA (thus cache coherency maintenance) may require the
> 		 * transfer buffers to live in their own cache lines.
> 		 */
> 		unsigned char rxbuf[3] __aligned(IIO_DMA_MINALIGN);
> 		unsigned char txbuf[3];
> 	};
>
> The rxbuf is aligned to IIO_DMA_MINALIGN, and so are the structure and
> its size, but txbuf is at an offset of 3 bytes from that alignment
> boundary. So basically any cache maintenance on rxbuf would corrupt
> txbuf.

That was intentional (though possibly wrong if I've misunderstood the
underlying issue). For SPI controllers at least, my understanding was
that it is safe to assume that they won't trample on themselves. The
driver doesn't touch the buffers when DMA is in flight - to do so would
indeed result in corruption. So whilst we could end up with the SPI
master writing stale data back to txbuf after the transfer, it will
never matter (as the value is unchanged). Any flushes in the other
direction will end up flushing both rxbuf and txbuf anyway, which is
also harmless.

> You need rxbuf to be the only resident of a cache line, therefore the
> next member needs such alignment as well.
>
> With this series and SWIOTLB enabled, however, if you try to transfer 3
> bytes, they will be bounced, so the missing alignment won't matter much.

Only on arm64? If the above is wrong, it might cause trouble on some
other architectures.

As a side note, spi_write_then_read() goes through a bounce buffer dance
to avoid using DMA-unsafe buffers. Superficially that looks to me like
it might now end up with an undersized buffer and hence end up bouncing,
which rather defeats the point of it. It uses SMP_CACHE_BYTES for the
size.

Jonathan
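
A minimal sketch of the alternative layout described above ("the next
member needs such alignment as well"), reusing the ltc2496 example from
the quoted mail. It is untested and not the actual driver code; the only
change from the quoted struct is the added __aligned() on txbuf, and it
assumes the usual kernel headers that provide IIO_DMA_MINALIGN,
struct spi_device and struct ltc2497core_driverdata. Whether SPI really
needs this is exactly the question raised in the reply above.

	/*
	 * Untested sketch: aligning the member that follows rxbuf as well
	 * leaves rxbuf as the sole occupant of its IIO_DMA_MINALIGN-sized
	 * region (IIO_DMA_MINALIGN is at least a cache line), so cache
	 * maintenance on rxbuf cannot touch txbuf or any later member.
	 */
	struct ltc2496_driverdata {
		/* this must be the first member */
		struct ltc2497core_driverdata common_ddata;
		struct spi_device *spi;

		/* rxbuf starts on an IIO_DMA_MINALIGN boundary... */
		unsigned char rxbuf[3] __aligned(IIO_DMA_MINALIGN);
		/* ...and the next member starts on the following one */
		unsigned char txbuf[3] __aligned(IIO_DMA_MINALIGN);
	};

The trade-off is the extra padding per aligned member (up to
IIO_DMA_MINALIGN bytes each), which is why the original layout packs
txbuf directly after rxbuf.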