From: Baoquan He <bhe@redhat.com>
To: Heiko Carstens <hca@linux.ibm.com>, Christoph Hellwig <hch@lst.de>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
akpm@linux-foundation.org, cl@linux.com, 42.hyeyoo@gmail.com,
penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com,
vbabka@suse.cz, David.Laight@aculab.com, david@redhat.com,
herbert@gondor.apana.org.au, davem@davemloft.net,
linux-crypto@vger.kernel.org, steffen.klassert@secunet.com,
netdev@vger.kernel.org, gor@linux.ibm.com,
agordeev@linux.ibm.com, borntraeger@linux.ibm.com,
svens@linux.ibm.com, linux-s390@vger.kernel.org,
michael@walle.cc, linux-i2c@vger.kernel.org, wsa@kernel.org,
Halil Pasic <pasic@linux.ibm.com>,
Vineeth Vijayan <vneethv@linux.ibm.com>,
x86@kernel.org
Subject: Re: [PATCH 00/22] Don't use kmalloc() with GFP_DMA
Date: Tue, 22 Feb 2022 21:26:27 +0800
Message-ID: <YhTkgytf1YnQLcuB@MiWiFi-R3L-srv>
In-Reply-To: <YhThVgoRJoZ7Voyy@MiWiFi-R3L-srv>
On 02/22/22 at 09:12pm, Baoquan He wrote:
> On 02/22/22 at 09:44am, Christoph Hellwig wrote:
> > On Mon, Feb 21, 2022 at 02:57:34PM +0100, Heiko Carstens wrote:
> > > > 1) Kmalloc(GFP_DMA) in s390 platform, under arch/s390 and drivers/s390;
> > >
> > > So, s390 partially requires GFP_DMA allocations for memory areas which
> > > are required by the hardware to be below 2GB. There is not necessarily
> > > a device associated when this is required. E.g. some legacy "diagnose"
> > > calls require buffers to be below 2GB.
> > >
> > > How should something like this be handled? I'd guess that the
> > > dma_alloc API is not the right thing to use in such cases. Of course
> > > we could say, let's waste memory and use full pages instead, however
> > > I'm not sure this is a good idea.
> >
> > Yeah, I don't think the DMA API is the right thing for that. This
> > is one of the very rare cases where a raw allocation makes sense.
> >
> > That being said being able to drop kmalloc support for GFP_DMA would
> > be really useful. How much memory would we waste if switching to the
> > page allocator?
> >
> > > s390 drivers could probably be converted to dma_alloc API, even though
> > > that would cause quite some code churn.
> >
> > I think that would be a very good thing to have.
> >
> > > > For this first patch series, thanks to Hyeonggon for helping with
> > > > review and for great suggestions on improving the patches. We will
> > > > work together to continue the next steps of work.
> > > >
> > > > Any comment, thought, or suggestion is welcome and appreciated,
> > > > including but not limited to:
> > > > 1) whether we should remove dma-kmalloc support in kernel();
> > >
> > > The question is: what would this buy us? As stated above I'd assume
> > > this comes with quite some code churn, so there should be a good
> > > reason to do this.
> >
> > There are two steps here. One is to remove GFP_DMA support from
> > kmalloc, which would help to clean up the slab allocator(s) very nicely,
> > as at that point it can stop being zone aware entirely.
> >
> > The long term goal is to remove ZONE_DMA entirely at least for
> > architectures that only use the small 16MB ISA-style one. It can
> > then be replaced with for example a CMA area and fall into a movable
> > zone. I'd have to prototype this first and see how it applies to the
> > s390 case. It might not be worth it and maybe we should replace
> > ZONE_DMA and ZONE_DMA32 with a ZONE_LIMITED for those use cases as
> > the amount covered tends to not be totally out of line for what we
> > built the zone infrastructure.
> >
> > > From this cover letter I only get that there was a problem with kdump
> > > on x86, and this has been fixed. So why this extra effort?
> > >
> > > > 3) Drop support for allocating DMA memory from slab allocator
> > > > (as Christoph Hellwig said) and convert them to use DMA32
> > > > and see what happens
> > >
> > > Can you please clarify what "convert to DMA32" means? I would assume
> > > this does _not_ mean that passing GFP_DMA32 to slab allocator would
> > > work then?
> >
> > I'm really not sure what this means.
>
> Thanks a lot to Heiko for valuable input, it's very helpful. And thanks
> a lot to Christoph for explaining.
>
> I guess this "convert to DMA32" is similar to "replace ZONE_DMA and
> ZONE_DMA32 with a ZONE_LIMITED".
And by the way, when I searched for SLAB_CACHE_DMA32, which is another
zone-aware slab flag, I found that not everyone abuses
kmalloc(GFP_DMA). There are two places where kmem_cache_create() is
called with SLAB_CACHE_DMA32 to create a slab cache backed by memory
from zone DMA32. Obviously the authors of that code really know the slab
allocator: they use a dma32 slab cache to get memory under 4G.

  drivers/firmware/google/gsmi.c: gsmi_init()
  drivers/iommu/io-pgtable-arm-v7s.c: arm_v7s_alloc_pgtable()
>
> When I used 'git grep "GFP_DMA\>"' to search for all places specifying GFP_DMA,
> I noticed that the main use of kmalloc(GFP_DMA) is to get memory under an
> address limit, not to allocate DMA buffers. Below is what I collected
> earlier when explaining the kdump issue. It helps explain why kmalloc(GFP_DMA)
> is useful on ARCHes w/o ZONE_DMA32, but doesn't make sense on x86_64, which
> has both zone DMA and DMA32. The 16M ZONE_DMA is only for very rarely used
> legacy ISA devices, but most drivers for PCI devices with 32-bit addressing
> abuse kmalloc(GFP_DMA) to get DMA buffers from zone DMA.
> That is obviously unsafe and unreasonable.
>
> RISC-V, for example, doesn't carry the burden of legacy ISA devices, so it
> can get by with only a DMA32 zone. ARM64 has also been adjusted to have only
> one non-empty low zone when not on Raspberry Pi (see the table below). Using
> kmalloc(GFP_DMA) causes them no inconvenience. If we finally end up with a
> dma32-kmalloc, the name may need to be carefully considered, but it seems
> acceptable. We would just need to pick out the ISA device drivers and handle
> their 24-bit addressing DMA well.
>
> For this patchset, I only picked out places where GFP_DMA is
> redundant and can be removed directly, and places where the
> kmalloc(GFP_DMA) + dma_map_*() pair can be replaced with the dma_alloc_xxx()
> API without wasting much memory. I also have patches converting
> kmalloc(GFP_DMA) to alloc_pages(GFP_DMA) where it is not easy to replace with
> dma_alloc_xxx(), but Hyeonggon suggested not adding them to this series.
> I will continue investigating the remaining places to see whether and how we
> can convert them.
>
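
To put a rough number on the memory wasting mentioned above, here is a minimal userspace sketch. It assumes 4 KiB pages and that a page-allocator request is rounded up to a power-of-two number of pages (the idea behind the kernel's get_order()). The names PAGE_SIZE_ASSUMED, order_for_size() and page_alloc_waste() are made up for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. Real page size depends on arch and config. */
#define PAGE_SIZE_ASSUMED 4096UL

/*
 * Smallest order such that (PAGE_SIZE_ASSUMED << order) >= size.
 * Hypothetical helper modelled on the idea of get_order().
 */
static unsigned int order_for_size(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((PAGE_SIZE_ASSUMED << order) < size)
		order++;
	return order;
}

/* Bytes left unused when 'size' bytes are served from 2^order whole pages. */
static size_t page_alloc_waste(size_t size)
{
	return (PAGE_SIZE_ASSUMED << order_for_size(size)) - size;
}
```

Under these assumptions a 512-byte buffer moved from kmalloc() to whole pages leaves 3584 bytes unused, while a 4096-byte buffer wastes nothing, so small allocations are the ones where the conversion costs the most.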
> =============================
> ARCH which has DMA32
>           ZONE_DMA      ZONE_DMA32
> arm64     0~X           X~4G     (X is got from ACPI or DT. Otherwise it's 4G by default, DMA32 is empty)
> ia64      None          0~4G
> mips      0 or 0~16M    X~4G     (zone DMA is empty on SGI_IP22 or SGI_IP28, otherwise 16M by default like i386)
> riscv     None          0~4G
> x86_64    16M           16M~4G
>
>
> =============================
> ARCH which has no DMA32
>             ZONE_DMA
> alpha       0~16M or empty if IOMMU enabled
> arm         0~X (X is reported by fdt, 4G by default)
> m68k        0~total memory
> microblaze  0~total low memory
> powerpc     0~2G
> s390        0~2G
> sparc       0~total low memory
> i386        0~16M
>
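
The tables above can be read as a simple address-range lookup. The following userspace sketch is only a model, not kernel code. It encodes the x86_64 and riscv rows and checks which zone a physical address falls into, and whether a device with a given number of address bits can reach that address. The type and function names (zone_layout, zone_of(), device_can_reach()) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum zone { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL };

/* Exclusive upper bounds of the two low zones; 0 means the zone is absent. */
struct zone_layout {
	uint64_t dma_end;
	uint64_t dma32_end;
};

/* Rows taken from the table: x86_64 has 16M + 16M~4G, riscv has only 0~4G. */
static const struct zone_layout x86_64_layout = { 16ULL << 20, 4ULL << 30 };
static const struct zone_layout riscv_layout  = { 0,           4ULL << 30 };

/* Which zone of the given layout contains physical address 'phys'. */
static enum zone zone_of(const struct zone_layout *l, uint64_t phys)
{
	if (phys < l->dma_end)
		return ZONE_DMA;
	if (phys < l->dma32_end)
		return ZONE_DMA32;
	return ZONE_NORMAL;
}

/* 1 if a device that drives 'addr_bits' address lines can reach 'phys'. */
static int device_can_reach(unsigned int addr_bits, uint64_t phys)
{
	uint64_t mask;

	assert(addr_bits >= 1 && addr_bits <= 64);
	mask = (addr_bits == 64) ? UINT64_MAX : ((1ULL << addr_bits) - 1);
	return phys <= mask;
}
```

With the x86_64 row, an address at 1G lands in ZONE_DMA32 and is still reachable by a 32-bit device, so such a device has no need to squeeze its buffers into the 16M ZONE_DMA. Only a 24-bit ISA device is limited to that zone.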
> >
> > >
> > > btw. there are actually two kmalloc allocations which pass GFP_DMA32;
> > > I guess this is broken(?):
> > >
> > > drivers/hid/intel-ish-hid/ishtp-fw-loader.c: dma_buf = kmalloc(payload_max_size, GFP_KERNEL | GFP_DMA32);
> > > drivers/media/test-drivers/vivid/vivid-osd.c: dev->video_vbase = kzalloc(dev->video_buffer_size, GFP_KERNEL | GFP_DMA32);
> >
> > Yes, this is completely broken.
> >
>