From: Petr Tesarik <ptesarik@suse.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Harry Yoo <harry.yoo@oracle.com>,
Feng Tang <feng.tang@linux.alibaba.com>,
Peng Fan <peng.fan@nxp.com>, Hyeonggon Yoo <42.hyeyoo@gmail.com>,
David Rientjes <rientjes@google.com>,
Christoph Lameter <cl@linux.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Catalin Marinas <Catalin.Marinas@arm.com>
Subject: Re: slub - extended kmalloc redzone and dma alignment
Date: Fri, 4 Apr 2025 15:53:03 +0200
Message-ID: <20250404155303.2e0cdd27@mordecai>
In-Reply-To: <fe9650c9-483c-4325-a4d8-6af623344096@suse.cz>
On Fri, 4 Apr 2025 14:45:14 +0200
Vlastimil Babka <vbabka@suse.cz> wrote:
> On 4/4/25 13:12, Petr Tesarik wrote:
> > On Fri, 4 Apr 2025 19:30:09 +0900
> > Harry Yoo <harry.yoo@oracle.com> wrote:
> >
> >> On Fri, Apr 04, 2025 at 11:30:49AM +0200, Vlastimil Babka wrote:
> >> > Hi,
> >> >
> >> > due to some off-list inquiry I have realized that since 946fa0dbf2d8
> >> > ("mm/slub: extend redzone check to extra allocated kmalloc space than
> >> > requested")
> >> > we might be reporting false positives due to dma writing into the redzone.
> >> >
> >> > It wasn't confirmed (yet) during the conversation but AFAICS it can be
> >> > happening. We have this ARCH_DMA_MINALIGN and kmalloc() will guarantee it,
> >> > but the redzone check doesn't take it into account.
> >>
> >> Sounds valid to me.
> >
> > I'm not sure I understand your concerns.
>
> I'd be happy to be proven wrong and you're more familiar with DMA details
> than me :)
>
> > Are you afraid that another device on the bus caches a copy of the
> > redzone before it was poisoned, so it overwrites the redzone with stale
> > data on a memory write operation? IMO that's buggy, because if a
> > bus-mastering device implements such cache, it is the device driver's
> > responsibility to flush it before starting a DMA transfer. FTR I'm not
> > aware of any such devices, except GPUs, but there's a whole lot to do
> > about CPU<->GPU coherency management, including device-specific ioctl's
> > to expose some gory details all the way down to userspace.
>
> OK, guess not that.
>
> > Or are you concerned about bus data word size? I would again argue that
> > allocating a DMA buffer with a size that is not a multiple of the
> > transfer size is a bug. IOW the driver must make sure the buffer size
> > is a multiple of 4 if it is used for 32-bit DMA transfers, or a
> > multiple of 8 if it is used for 64-bit DMA transfers.
>
> Yeah I think it's that, and I thought drivers don't need to care themselves
> because ARCH_DMA_MINALIGN means kmalloc() layer provides that guarantee
> itself. I also remember this series (incidentally just recently the
> discussion was revived).
>
> https://lore.kernel.org/all/20230612153201.554742-1-catalin.marinas@arm.com/

I remember this series, as well as my confusion about why 192-byte
kmalloc caches were missing on arm64.

Nevertheless, I believe ARCH_DMA_MINALIGN is required to avoid putting
a DMA buffer on the same cache line as some other data that might be
_written_ by the CPU while the corresponding main memory is modified by
another bus-mastering device.

Consider this layout:

... | DMA buffer | other data | ...
    ^                         ^
    +-------------------------+-- cache line boundaries

When you prepare for DMA, you make sure that the DMA buffer is not
cached by the CPU, so you flush the cache line (from all levels). Then
you tell the device to write into the DMA buffer. However, before the
device finishes the DMA transaction, the CPU accesses "other data",
loading this cache line from main memory with partial results. Worse,
if the CPU writes to "other data", it may write the cache line back
into main memory, racing with the device writing to the DMA buffer, and
you end up with corrupted data in the DMA buffer.
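
To make the layout above concrete, here is a minimal user-space sketch of
the check "do these two byte ranges touch a common cache line?". The
64-byte CACHE_LINE_SIZE and the helper names are assumptions made up for
illustration; this is not kernel code, and real line sizes depend on the
architecture.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size, for illustration only. */
#define CACHE_LINE_SIZE 64u

/* Round an address down to the start of its cache line. */
static uintptr_t line_start(uintptr_t addr)
{
	return addr & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
}

/*
 * Return true if [a, a + a_len) and [b, b + b_len) touch at least one
 * common cache line. Both lengths must be non-zero.
 */
static bool shares_cache_line(uintptr_t a, size_t a_len,
			      uintptr_t b, size_t b_len)
{
	uintptr_t a_first = line_start(a);
	uintptr_t a_last = line_start(a + a_len - 1);
	uintptr_t b_first = line_start(b);
	uintptr_t b_last = line_start(b + b_len - 1);

	/* The two line ranges overlap iff each starts before the other ends. */
	return a_first <= b_last && b_first <= a_last;
}
```

With these assumptions, a 32-byte buffer at 0x1000 shares a line with data
at 0x1020, while a buffer padded to the full 64 bytes does not share one
with data at 0x1040.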

But redzone poisoning should happen long before the DMA buffer cache
line is flushed. The device will not overwrite it unless it was given
the wrong buffer length for the transaction, but then that would be a
bug that I'd rather detect.
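
A rough user-space sketch of that argument follows. The helper names, the
0xcc poison byte, and the memset() standing in for a device write are all
assumptions for illustration; this is not how SLUB implements its check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative poison value for the bytes past the requested size. */
#define REDZONE_BYTE 0xcc

/* Poison the slack between the requested and the allocated size. */
static void poison_redzone(unsigned char *obj, size_t requested,
			   size_t allocated)
{
	memset(obj + requested, REDZONE_BYTE, allocated - requested);
}

/* Return true if every redzone byte still holds the poison value. */
static bool redzone_intact(const unsigned char *obj, size_t requested,
			   size_t allocated)
{
	for (size_t i = requested; i < allocated; i++)
		if (obj[i] != REDZONE_BYTE)
			return false;
	return true;
}

/* Stand-in for a device writing len bytes into the buffer. */
static void fake_dma_write(unsigned char *obj, size_t len)
{
	memset(obj, 0x5a, len);
}
```

In this model, a 52-byte request served from a 64-byte object keeps its
redzone intact after a 52-byte transfer, and a transfer programmed with a
length of 56 trips the check.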

@Catalin: I can see you're already in Cc. If I'm still missing
something, please correct me.

Petr T