Date: Thu, 25 May 2023 13:31:38 +0100
From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Catalin Marinas
Cc: Linus Torvalds, Christoph Hellwig, Robin Murphy, Arnd Bergmann,
 Greg Kroah-Hartman, Will Deacon, Marc Zyngier, Andrew Morton,
 Herbert Xu, Ard Biesheuvel, Isaac Manjarres, Saravana Kannan,
 Alasdair Kergon, Daniel Vetter, Joerg Roedel, Mark Brown,
 Mike Snitzer, "Rafael J. Wysocki", linux-mm@kvack.org
Subject: Re: [PATCH v5 00/15] mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8
Message-ID: <20230525133138.000014b4@Huawei.com>
In-Reply-To: <20230524171904.3967031-1-catalin.marinas@arm.com>
References: <20230524171904.3967031-1-catalin.marinas@arm.com>
Organization: Huawei Technologies Research and Development (UK) Ltd.
On Wed, 24 May 2023 18:18:49 +0100
Catalin Marinas wrote:

> Hi,
>
> Another version of the series reducing the kmalloc() minimum alignment
> on arm64 to 8 (from 128). Other architectures can easily opt in by
> defining ARCH_KMALLOC_MINALIGN as 8 and selecting
> DMA_BOUNCE_UNALIGNED_KMALLOC.
>
> The first 10 patches decouple ARCH_KMALLOC_MINALIGN from
> ARCH_DMA_MINALIGN and, for arm64, limit the kmalloc() caches to those
> aligned to the run-time probed cache_line_size(). On arm64 we gain the
> kmalloc-{64,192} caches.
>
> The subsequent patches (11 to 15) further reduce the kmalloc() caches to
> kmalloc-{8,16,32,96} if the default swiotlb is present by bouncing small
> buffers in the DMA API.

Hi Catalin,

I think IIO_DMA_MINALIGN needs to switch to ARCH_DMA_MINALIGN as well.
It's used to force static alignment of buffers within larger structures,
to make them suitable for non-coherent DMA, similar to your other cases.

Thanks,
Jonathan

> Changes since v4:
>
> - Following Robin's suggestions, reworked the iommu handling so that the
>   buffer size checks are done in the dev_use_swiotlb() and
>   dev_use_sg_swiotlb() functions (together with dev_is_untrusted()). The
>   sync operations can now check for the SG_DMA_USE_SWIOTLB flag. Since
>   this flag is no longer specific to kmalloc() bouncing (covers
>   dev_is_untrusted() as well), the sg_is_dma_use_swiotlb() and
>   sg_dma_mark_use_swiotlb() functions are always defined if
>   CONFIG_SWIOTLB.
>
> - Dropped ARCH_WANT_KMALLOC_DMA_BOUNCE, only left the
>   DMA_BOUNCE_UNALIGNED_KMALLOC option, selectable by the arch code. The
>   NEED_SG_DMA_FLAGS is now selected by IOMMU_DMA if SWIOTLB.
>
> - Rather than adding another config option, allow
>   dma_get_cache_alignment() to be overridden by the arch code
>   (Christoph's suggestion).
>
> - Added a comment to the dma_kmalloc_needs_bounce() function on the
>   heuristics behind the bouncing.
>
> - Added acked-by/reviewed-by tags (not adding Ard's tested-by yet as
>   there were some changes).
>
> The updated patches are also available on this branch:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux devel/kmalloc-minalign
>
> Thanks.
>
> Catalin Marinas (14):
>   mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN
>   dma: Allow dma_get_cache_alignment() to be overridden by the arch code
>   mm/slab: Simplify create_kmalloc_cache() args and make it static
>   mm/slab: Limit kmalloc() minimum alignment to
>     dma_get_cache_alignment()
>   drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
>   drivers/gpu: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
>   drivers/usb: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
>   drivers/spi: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
>   drivers/md: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
>   arm64: Allow kmalloc() caches aligned to the smaller cache_line_size()
>   dma-mapping: Force bouncing if the kmalloc() size is not
>     cache-line-aligned
>   iommu/dma: Force bouncing if the size is not cacheline-aligned
>   mm: slab: Reduce the kmalloc() minimum alignment if DMA bouncing
>     possible
>   arm64: Enable ARCH_WANT_KMALLOC_DMA_BOUNCE for arm64
>
> Robin Murphy (1):
>   scatterlist: Add dedicated config for DMA flags
>
>  arch/arm64/Kconfig             |  1 +
>  arch/arm64/include/asm/cache.h |  3 ++
>  arch/arm64/mm/init.c           |  7 +++-
>  drivers/base/devres.c          |  6 ++--
>  drivers/gpu/drm/drm_managed.c  |  6 ++--
>  drivers/iommu/Kconfig          |  1 +
>  drivers/iommu/dma-iommu.c      | 50 +++++++++++++++++++++++-----
>  drivers/md/dm-crypt.c          |  2 +-
>  drivers/pci/Kconfig            |  1 +
>  drivers/spi/spidev.c           |  2 +-
>  drivers/usb/core/buffer.c      |  8 ++---
>  include/linux/dma-map-ops.h    | 61 ++++++++++++++++++++++++++++++++++
>  include/linux/dma-mapping.h    |  4 ++-
>  include/linux/scatterlist.h    | 29 +++++++++++++---
>  include/linux/slab.h           | 14 ++++++--
>  kernel/dma/Kconfig             |  7 ++++
>  kernel/dma/direct.h            |  3 +-
>  mm/slab.c                      |  6 +---
>  mm/slab.h                      |  5 ++-
>  mm/slab_common.c               | 46 +++++++++++++++++++------
>  20 files changed, 213 insertions(+), 49 deletions(-)