From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Herbert Xu <herbert@gondor.apana.org.au>,
Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
Arnd Bergmann <arnd@arndb.de>,
Andrew Morton <akpm@linux-foundation.org>,
Linux Memory Management List <linux-mm@kvack.org>,
Linux ARM <linux-arm-kernel@lists.infradead.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH 07/10] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN
Date: Thu, 14 Apr 2022 16:52:56 +0200 [thread overview]
Message-ID: <Ylg1SCBA6lJM/k4h@kroah.com> (raw)
In-Reply-To: <CAMj1kXGAd7s1GU=uH+iLOwvbn1zG-=TNXw02Bf-AdiHOQvh02Q@mail.gmail.com>
On Thu, Apr 14, 2022 at 04:36:46PM +0200, Ard Biesheuvel wrote:
> On Thu, 14 Apr 2022 at 16:27, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Apr 14, 2022 at 03:52:53PM +0200, Ard Biesheuvel wrote:
> > > On Thu, 14 Apr 2022 at 07:38, Greg Kroah-Hartman
> > > <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Wed, Apr 13, 2022 at 09:53:24AM -1000, Linus Torvalds wrote:
> ...
> > > > > Honestly, I think it would probably be worth discussing the "kmalloc
> > > > > DMA alignment" issues.
> > > > >
> > > > > 99.9% of kmalloc users don't want to do DMA.
> > > > >
> > > > > And there's actually a fair amount of small kmalloc for random stuff.
> > > > > Right now on my laptop, I have
> > > > >
> > > > > kmalloc-8 16907 18432 8 512 1 : ...
> > > > >
> > > > > according to slabinfo, so almost 17 _thousand_ allocations of 8 bytes.
> > > > >
> > > > > It's all kinds of sad if those allocations need to be 64 bytes in size
> > > > > just because of some silly DMA alignment issue, when none of them want
> > > > > it.
> > > > >
> > >
> > > Actually, the alignment for non-cache coherent DMA is 128 bytes on
> > > arm64, not 64 bytes.
> > >
> > > > > Yeah, yeah, wasting a megabyte of memory is "just a megabyte" these
> > > > > days. Which is crazy. It's literally memory that could have been used
> > > > > for something much more useful than just pure and utter waste.
> > > > >
> > > > > I think we could and should just say "people who actually require DMA
> > > > > accesses should say so at kmalloc time". We literally have that
> > > > > GFP_DMA and ZOME_DMA for various historical reasons, so we've been
> > > > > able to do that before.
> > > > >
> > > > > No, that historical GFP_DMA isn't what arm64 wants - it's the old
> > > > > crazy "legacy 16MB DMA" thing that ISA DMA used to have.
> > > > >
> > > > > But the basic issue was true then, and is true now - DMA allocations
> > > > > are fairly special, and should not be that hard to just mark as such.
> > > >
> > > > "fairly special" == "all USB transactions", so it will take a lot of
> > > > auditing here. I think also many SPI controllers require this and maybe
> > > > I2C? Perhaps other bus types do as well.
> > > >
> > > > So please don't make this change without some way of figuring out just
> > > > what drivers need to be fixed up, as it's going to be a lot...
> > > >
> > >
> > > Yeah, the current de facto contract of being able to DMA map anything
> > > that was allocated via the linear map makes it quite hard to enforce
> > > the use of dma_kmalloc() for this.
> > >
> > > What we might do, given the fact that only inbound non-cache coherent
> > > DMA is problematic, is dropping the kmalloc alignment to 8 like on
> > > x86, and falling back to bounce buffering when a misaligned, non-cache
> > > coherent inbound DMA mapping is created, using the SWIOTLB bounce
> > > buffering code that we already have, and is already in use on most
> > > affected systems for other reasons (i.e., DMA addressing limits)
> >
> > Ick, that's a mess.
> >
> > > This will cause some performance regressions, but in a way that seems
> > > fixable to me: taking network drivers as an example, the RX buffers
> > > that are filled using inbound DMA are typically owned by the driver
> > > itself, which could be updated to round up its allocations and DMA
> > > mappings. Block devices typically operate on quantities that are
> > > aligned sufficiently already. In other cases, we will likely notice
> > > if/when this fallback is taken on a hot path, but if we don't, at
> > > least we know a bounce buffer is being used whenever we cannot perform
> > > the DMA safely in-place.
> >
> > We can move to having an "allocator-per-bus" for memory like this to
> > allow the bus to know if this is a DMA requirement or not.
> >
> > So for all USB drivers, we would have:
> > usb_kmalloc(size, flags);
> > and then it might even be easier to verify with static tools that the
> > USB drivers are sending only properly allocated data. Same for SPI and
> > other busses.
> >
>
> As I pointed out earlier in the thread, alignment/padding requirements
> for non-coherent DMA are a property of the CPU's cache hierarchy, not
> of the device. So I'm not sure I follow how a per-subsystem
> distinction would help here. In the case of USB especially, would that
> mean that block, media and networking subsystems would need to be
> aware of the USB-ness of the underlying transport?
That's what we have required today, yes. That's only because we knew
that for some USB controllers, that was a requirement and we had no way
of passing that information back up the stack so we just made it a
requirement.
But I do agree this is messy. It's even messier for things like USB
where it's not the USB device itself that matters, it's the USB
controller that the USB device is attached to. And that can be _way_ up
the device hierarchy. Attach something like a NFS mount over a PPP
network connection on a USB to serial device and ugh, where do you
begin? :)
And is this always just an issue of the CPU cache hierarchy? And not the
specific bridge that a device is connected to that CPU on? Or am I
saying the same thing here?
I mean take a USB controller for example. We could have a system where
one USB controller is on a PCI bus, while another is on a "platform"
bus. Both of those are connected to the CPU in different ways and so
could have different DMA rules. Do we downgrade everything in the
system for the worst connection possible?
Again, consider a USB driver allocating memory to transfer stuff, should
it somehow know the cache hierarchy that it is connected to? Right now
we punt and do not do that at the expense of a bit of potentially
wasted memory for small allocations.
> > https://lore.kernel.org/r/230a9486fc68ea0182df46255e42a51099403642.1648032613.git.christophe.leroy@csgroup.eu
> > is an example of a SPI driver that has been there "for forever" yet
> > always got it wrong. If we could have had some way to know "only memory
> > allocated with this function is allowed on the bus" that would have
> > fixed the issue a long time ago.
> >
> > Anyway, just an idea, it's up to you all if this is worth it or not.
> > Adding performance regressions at the expense of memory size feels like
> > a rough trade-off to go through until things are fixed up.
> >
>
> Yeah, I hear you. But we already have distinct classes of memory,
> i.e., vmalloc vs kmalloc, where only the latter is permitted for DMA
> (which is why VMAP stack broke that SPI driver), and I'm not sure
> adding more types is going to make this tractable going forward.
Sorry, it would not be a "real" type of memory at all, just something
with a static tag on it at the compiler level (like we do for endian
stuff with sparse today). The real memory would be allocated with the
normal kmalloc() and friends it's just that static tools can be run to
verify "yes, all of the calls to usb_submit_urb() were done with memory
allocated with usb_kmalloc(), and not from the stack"
No runtime changes needed for different types, that would be a mess.
Let's lean on static tools when we can.
thanks,
greg k-h
next prev parent reply other threads:[~2022-04-14 14:53 UTC|newest]
Thread overview: 139+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-05 13:57 [PATCH 00/10] mm, arm64: Reduce ARCH_KMALLOC_MINALIGN below the cache line size Catalin Marinas
2022-04-05 13:57 ` [PATCH 01/10] mm/slab: Decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN Catalin Marinas
2022-04-05 23:59 ` Hyeonggon Yoo
2022-04-06 8:53 ` Catalin Marinas
[not found] ` <CAK8P3a1K0=jwYEHVu=X7oAWk9dzaOYAdFsidwVRKCJVReSV3+g@mail.gmail.com>
2022-04-06 12:09 ` Hyeonggon Yoo
2022-04-08 6:42 ` Hyeonggon Yoo
2022-04-08 9:06 ` Hyeonggon Yoo
2022-04-08 9:11 ` Catalin Marinas
2022-04-11 10:37 ` Hyeonggon Yoo
2022-04-11 14:02 ` Catalin Marinas
2022-04-05 13:57 ` [PATCH 02/10] drivers/base: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN Catalin Marinas
2022-04-11 14:57 ` Andy Shevchenko
2022-04-11 17:39 ` Catalin Marinas
2022-04-05 13:57 ` [PATCH 03/10] drivers/gpu: " Catalin Marinas
2022-04-05 13:57 ` [PATCH 04/10] drivers/md: " Catalin Marinas
2022-04-05 13:57 ` [PATCH 05/10] drivers/spi: " Catalin Marinas
2022-04-05 14:05 ` Mark Brown
2022-04-05 13:57 ` [PATCH 06/10] drivers/usb: " Catalin Marinas
2022-04-05 13:57 ` [PATCH 07/10] crypto: " Catalin Marinas
2022-04-05 22:57 ` Herbert Xu
2022-04-06 6:53 ` Ard Biesheuvel
2022-04-06 8:49 ` Catalin Marinas
2022-04-06 9:41 ` Ard Biesheuvel
2022-04-07 4:30 ` Herbert Xu
2022-04-07 11:01 ` Catalin Marinas
2022-04-07 11:40 ` Herbert Xu
2022-04-07 16:28 ` Catalin Marinas
2022-04-08 3:25 ` Herbert Xu
2022-04-08 9:04 ` Catalin Marinas
2022-04-08 9:11 ` Herbert Xu
2022-04-12 9:32 ` Catalin Marinas
2022-04-12 9:40 ` Herbert Xu
2022-04-12 10:02 ` Catalin Marinas
2022-04-12 10:18 ` Herbert Xu
2022-04-12 12:31 ` Catalin Marinas
2022-04-12 22:01 ` Ard Biesheuvel
2022-04-13 8:47 ` Catalin Marinas
2022-04-13 19:53 ` Linus Torvalds
2022-04-14 5:38 ` Greg Kroah-Hartman
2022-04-14 13:52 ` Ard Biesheuvel
2022-04-14 14:27 ` Greg Kroah-Hartman
2022-04-14 14:36 ` Ard Biesheuvel
2022-04-14 14:52 ` Greg Kroah-Hartman [this message]
2022-04-14 15:01 ` Ard Biesheuvel
2022-04-14 15:10 ` Ard Biesheuvel
2022-04-14 19:49 ` Catalin Marinas
2022-04-14 22:25 ` Linus Torvalds
2022-04-15 6:03 ` Ard Biesheuvel
2022-04-15 11:09 ` Arnd Bergmann
2022-04-16 9:42 ` Catalin Marinas
2022-04-20 19:07 ` Catalin Marinas
2022-04-20 19:33 ` Linus Torvalds
2022-04-14 14:30 ` Ard Biesheuvel
2022-04-15 6:51 ` Herbert Xu
2022-04-15 7:49 ` Ard Biesheuvel
2022-04-15 7:51 ` Herbert Xu
2022-04-15 8:05 ` Ard Biesheuvel
2022-04-15 8:12 ` Herbert Xu
2022-04-15 9:51 ` Ard Biesheuvel
2022-04-15 10:04 ` Ard Biesheuvel
2022-04-15 10:12 ` Herbert Xu
2022-04-15 10:22 ` Ard Biesheuvel
2022-04-15 10:45 ` Herbert Xu
2022-04-15 11:38 ` Ard Biesheuvel
2022-04-17 8:08 ` Herbert Xu
2022-04-17 8:31 ` Catalin Marinas
2022-04-17 8:35 ` Herbert Xu
2022-04-17 8:50 ` Catalin Marinas
2022-04-17 8:58 ` Herbert Xu
2022-04-17 16:30 ` Catalin Marinas
2022-04-18 8:37 ` Herbert Xu
2022-04-18 9:19 ` Catalin Marinas
2022-04-18 16:44 ` Catalin Marinas
2022-04-19 21:50 ` Ard Biesheuvel
2022-04-20 10:36 ` Catalin Marinas
2022-04-20 11:29 ` Arnd Bergmann
2022-04-21 7:20 ` Christoph Hellwig
2022-04-21 7:36 ` Arnd Bergmann
2022-04-21 7:44 ` Christoph Hellwig
2022-04-21 8:05 ` Ard Biesheuvel
2022-04-21 11:06 ` Catalin Marinas
2022-04-21 12:28 ` Arnd Bergmann
2022-04-21 13:25 ` Catalin Marinas
2022-04-21 13:47 ` Arnd Bergmann
2022-04-21 14:44 ` Catalin Marinas
2022-04-21 14:47 ` Arnd Bergmann
2022-05-10 11:03 ` [RFC PATCH 0/7] crypto: Add helpers for allocating with DMA alignment Herbert Xu
2022-05-10 11:07 ` [RFC PATCH 1/7] crypto: Prepare to move crypto_tfm_ctx Herbert Xu
2022-05-10 11:07 ` [RFC PATCH 2/7] crypto: api - Add crypto_tfm_ctx_dma Herbert Xu
2022-05-10 17:10 ` Catalin Marinas
2022-05-12 3:57 ` Herbert Xu
2022-05-10 11:07 ` [RFC PATCH 3/7] crypto: aead - Add ctx helpers with DMA alignment Herbert Xu
2022-05-10 11:07 ` [RFC PATCH 4/7] crypto: hash " Herbert Xu
2022-05-10 11:07 ` [RFC PATCH 5/7] crypto: skcipher " Herbert Xu
2022-05-10 11:07 ` [RFC PATCH 6/7] crypto: api - Increase MAX_ALGAPI_ALIGNMASK to 127 Herbert Xu
2022-05-10 11:07 ` [RFC PATCH 7/7] crypto: caam - Explicitly request DMA alignment Herbert Xu
2022-04-15 12:18 ` [PATCH 07/10] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN Catalin Marinas
2022-04-15 12:25 ` Ard Biesheuvel
2022-04-15 9:51 ` Catalin Marinas
2022-04-15 12:31 ` Catalin Marinas
2022-04-17 8:11 ` Herbert Xu
2022-04-17 8:38 ` Catalin Marinas
2022-04-17 8:43 ` Herbert Xu
2022-04-17 16:29 ` Catalin Marinas
2022-07-15 22:23 ` Isaac Manjarres
2022-07-16 3:25 ` Herbert Xu
2022-07-18 17:53 ` Catalin Marinas
2022-09-21 0:47 ` Isaac Manjarres
2022-09-30 18:32 ` Catalin Marinas
2022-09-30 19:35 ` Linus Torvalds
2022-10-01 22:29 ` Catalin Marinas
2022-10-02 17:00 ` Linus Torvalds
2022-10-02 22:08 ` Ard Biesheuvel
2022-10-02 22:24 ` Linus Torvalds
2022-10-03 17:39 ` Catalin Marinas
2022-10-12 17:45 ` Isaac Manjarres
2022-10-13 16:57 ` Catalin Marinas
2022-10-13 18:58 ` Saravana Kannan
2022-10-14 16:25 ` Catalin Marinas
2022-10-14 20:23 ` Saravana Kannan
2022-10-14 20:44 ` Linus Torvalds
2022-10-16 21:37 ` Catalin Marinas
2022-04-12 10:20 ` Catalin Marinas
2022-04-07 6:14 ` Muchun Song
2022-04-07 9:25 ` Catalin Marinas
2022-04-07 10:00 ` Muchun Song
2022-04-07 11:06 ` Catalin Marinas
2022-04-05 13:57 ` [PATCH 08/10] mm/slab: Allow dynamic kmalloc() minimum alignment Catalin Marinas
2022-04-07 3:46 ` Hyeonggon Yoo
2022-04-07 8:50 ` Catalin Marinas
2022-04-07 9:18 ` Hyeonggon Yoo
2022-04-07 9:35 ` Catalin Marinas
2022-04-07 12:26 ` Hyeonggon Yoo
2022-04-11 11:55 ` Hyeonggon Yoo
2022-04-05 13:57 ` [PATCH 09/10] mm/slab: Simplify create_kmalloc_cache() args and make it static Catalin Marinas
2022-04-05 13:57 ` [PATCH 10/10] arm64: Enable dynamic kmalloc() minimum alignment Catalin Marinas
2022-04-07 14:40 ` [PATCH 00/10] mm, arm64: Reduce ARCH_KMALLOC_MINALIGN below the cache line size Vlastimil Babka
2022-04-07 17:48 ` Catalin Marinas
2022-04-08 14:37 ` Vlastimil Babka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Ylg1SCBA6lJM/k4h@kroah.com \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=ardb@kernel.org \
--cc=arnd@arndb.de \
--cc=catalin.marinas@arm.com \
--cc=davem@davemloft.net \
--cc=herbert@gondor.apana.org.au \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=maz@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox