Date: Wed, 2 Nov 2022 13:50:23 -0700
From: Isaac Manjarres <isaacmanjarres@google.com>
To: Catalin Marinas
Cc: Christoph Hellwig, Greg Kroah-Hartman, Linus Torvalds, Arnd Bergmann,
    Will Deacon, Marc Zyngier, Andrew Morton, Herbert Xu, Ard Biesheuvel,
    Saravana Kannan, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 2/2] treewide: Add the __GFP_PACKED flag to several non-DMA kmalloc() allocations
References: <20221030091349.GA5600@lst.de> <20221101105919.GA13872@lst.de> <20221101172416.GB20381@lst.de> <20221101173940.GA20821@lst.de>

On Wed, Nov 02, 2022 at 11:05:54AM +0000, Catalin Marinas wrote:
> On Tue, Nov 01, 2022 at 12:10:51PM -0700, Isaac Manjarres wrote:
> > On Tue, Nov 01, 2022 at 06:39:40PM +0100, Christoph Hellwig wrote:
> > > On Tue, Nov 01, 2022 at 05:32:14PM +0000, Catalin Marinas wrote:
> > > > There's also the case of low-end phones with all RAM below 4GB and arm64
> > > > doesn't allocate the swiotlb. Not sure those vendors would go with a
> > > > recent kernel anyway.
> > > >
> > > > So the need for swiotlb now changes from 32-bit DMA to any DMA
> > > > (non-coherent but we can't tell upfront when booting, devices may be
> > > > initialised pretty late).
> >
> > Not only low-end phones, but there are other form factors that can fall
> > into this category and are also memory constrained (e.g. wearable
> > devices), so the memory headroom impact from enabling SWIOTLB might be
> > non-negligible for all of these devices. I also think it's feasible for
> > those devices to use recent kernels.
>
> Another option I had in mind is to disable this bouncing if there's no
> swiotlb buffer, so kmalloc() will return ARCH_DMA_MINALIGN (or the
> typically lower cache_line_size()) aligned objects. That's at least
> until we find a lighter way to do bouncing. Those devices would work as
> before.

The SWIOTLB buffer will not be allocated on devices whose limited RAM
sits entirely below 4 GB. Those devices would still benefit greatly from
kmalloc() using smaller object sizes, so it would be unfortunate to gate
this behaviour on the existence of the SWIOTLB buffer.

> > > Yes. The other option would be to use the dma coherent pool for the
> > > bouncing, which must be present on non-coherent systems anyway. But
> > > it would require us to write a new set of bounce buffering routines.
> >
> > I think in addition to having to write new bounce buffering routines,
> > this approach still suffers from the same problem as SWIOTLB, which is
> > that the memory for SWIOTLB and/or the dma coherent pool is not
> > reclaimable, even when it is not used.
> The dma coherent pool at least has the advantage that its size can be
> increased at run-time and we can start with a small one. Not decreased
> though, but if really needed I guess it can be added.
>
> We'd also skip some cache maintenance here since the coherent pool is
> mapped as non-cacheable already. But to Christoph's point, it does
> require some reworking of the current bouncing code.

Right, I do think it's a good thing that the dma coherent pool starts
small and can grow. I don't think it would be too difficult to add logic
to free the memory back. Perhaps a shrinker would be sufficient to
return memory when the system is under memory pressure, instead of
relying on some threshold?

> I've seen the expression below in a couple of places in the kernel,
> though IIUC in_atomic() doesn't always detect atomic contexts:
>
>	gfpflags = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL;

I'm not too sure about this; I was going more off of how the mapping
callbacks in iommu/dma-iommu.c use the atomic variants of iommu_map().

> > But what about having a pool that has a small amount of memory and is
> > composed of several objects that can be used for small DMA transfers?
> > If the amount of memory in the pool starts falling below a certain
> > threshold, there can be a worker thread--so that we don't have to use
> > GFP_ATOMIC--that can add more memory to the pool?
>
> If the rate of allocation is high, it may end up calling a slab
> allocator directly with GFP_ATOMIC.
>
> The main downside of any memory pool is identifying the original pool in
> dma_unmap_*(). We have a simple is_swiotlb_buffer() check looking just
> at the bounce buffer boundaries. For the coherent pool we have the more
> complex dma_free_from_pool().
>
> With a kmem_cache-based allocator (whether it's behind a mempool or
> not), we'd need something like virt_to_cache() and checking whether it
> is from our DMA cache. I'm not a big fan of digging into the slab
> internals for this. An alternative could be some xarray to remember the
> bounced dma_addr.

Right. I had actually been thinking of something like mm/dma-pool.c and
the dma coherent pool, where the pool is backed by the page allocator
and the objects are of a fixed size (on arm64, for example, that would
be align(192, ARCH_DMA_MINALIGN) == 256, though it would be good to have
a more generic way of calculating this). Determining whether an object
resides in the pool then boils down to scanning the pool's backing
pages, which the dma coherent pool already does.

> Anyway, I propose that we try the swiotlb first and look at optimising
> it from there, initially using the dma coherent pool.

Except for the freeing logic, which can be added if needed as you
pointed out, and Christoph's point about reworking the bouncing code,
the dma coherent pool doesn't sound like a bad idea.

--Isaac