Date: Wed, 9 Apr 2025 11:05:29 +0200
From: Petr Tesarik
To: Catalin Marinas
Cc: Vlastimil Babka, Feng Tang, Harry Yoo, Peng Fan,
    Hyeonggon Yoo <42.hyeyoo@gmail.com>, David Rientjes,
    Christoph Lameter, "linux-mm@kvack.org", Robin Murphy,
    Sean Christopherson, Halil Pasic
Subject: Re: slub - extended kmalloc redzone and dma alignment
Message-ID: <20250409110529.3ad65b3c@mordecai>
In-Reply-To: <20250409103904.54a19faa@mordecai>
References: <20250404131239.2a987e58@mordecai>
    <20250404155303.2e0cdd27@mordecai>
    <39657cf9-e24d-4b85-9773-45fe26dd16ae@suse.cz>
    <20250408072732.32db7809@mordecai>
    <20250409103904.54a19faa@mordecai>

On Wed, 9 Apr 2025 10:39:04 +0200
Petr Tesarik wrote:

> On Tue, 8 Apr 2025 16:07:19 +0100
> Catalin Marinas wrote:
>
> > On Tue, Apr 08, 2025 at 07:27:32AM +0200, Petr Tesarik wrote:
> > > On Mon, 7 Apr 2025 18:12:09 +0100
> > > Catalin Marinas wrote:
> > > > Thanks for looping me in.
> > > > I'm just catching up with this thread.
> > > >
> > > > On Mon, Apr 07, 2025 at 09:54:41AM +0200, Vlastimil Babka wrote:
> > > > > On 4/7/25 09:21, Feng Tang wrote:
> > > > > > On Sun, Apr 06, 2025 at 10:02:40PM +0800, Feng Tang wrote:
> > > > > > [...]
> > > > > >> > I can remember this series, as well as my confusion why
> > > > > >> > 192-byte kmalloc caches were missing on arm64.
> > > > > >> >
> > > > > >> > Nevertheless, I believe ARCH_DMA_MINALIGN is required to avoid
> > > > > >> > putting a DMA buffer on the same cache line as some other data
> > > > > >> > that might be _written_ by the CPU while the corresponding
> > > > > >> > main memory is modified by another bus-mastering device.
> > > > > >> >
> > > > > >> > Consider this layout:
> > > > > >> >
> > > > > >> > ... | DMA buffer | other data | ...
> > > > > >> >     ^                         ^
> > > > > >> >     +-------------------------+-- cache line boundaries
> > > > > >> >
> > > > > >> > When you prepare for DMA, you make sure that the DMA buffer is
> > > > > >> > not cached by the CPU, so you flush the cache line (from all
> > > > > >> > levels). Then you tell the device to write into the DMA
> > > > > >> > buffer. However, before the device finishes the DMA
> > > > > >> > transaction, the CPU accesses "other data", loading this cache
> > > > > >> > line from main memory with partial results. Worse, if the CPU
> > > > > >> > writes to "other data", it may write the cache line back into
> > > > > >> > main memory, racing with the device writing to DMA buffer, and
> > > > > >> > you end up with corrupted data in DMA buffer.
> > > >
> > > > Yes, cache evictions from 'other data' can override the DMA. Another
> > > > problem, when the DMA completed, the kernel does a cache invalidation
> > > > to remove any speculatively loaded cache lines from the DMA buffer
> > > > but that would also invalidate 'other data', potentially corrupting
> > > > it if it was dirty.
> > > >
> > > > So it's not safe to have DMA into buffers less than ARCH_DMA_MINALIGN
> > > > (and unaligned).
> > >
> > > It's not safe to DMA into buffers that share a CPU cache line with other
> > > data, which could be before or after the DMA buffer, of course.
> >[...]
> > While I think we are ok for arm64, other architectures may invalidate
> > the caches in the arch_sync_dma_for_device() which could discard the red
> > zone data. A quick grep for arch_sync_dma_for_device() shows several
> > architectures invalidating the caches in the FROM_DEVICE case.
>
> Wait. This sounds broken for partial writes into the DMA buffer, i.e.
> where only part of the buffer is updated by a bus-mastering device.
> When I worked on swiotlb, I was told that such partial updates must
> be supported, and that's why the initial swiotlb_bounce() cannot be
> removed from swiotlb_tbl_map_single(). In fact, the comment says:
>
>   /*
>    * When the device is writing memory, i.e. dir == DMA_FROM_DEVICE, copy
>    * the original buffer to the TLB buffer before initiating DMA in order
>    * to preserve the original's data if the device does a partial write,
>    * i.e. if the device doesn't overwrite the entire buffer. Preserving
>    * the original data, even if it's garbage, is necessary to match
>    * hardware behavior. Use of swiotlb is supposed to be transparent,
>    * i.e. swiotlb must not corrupt memory by clobbering unwritten bytes.
>    */
>
> You may want to check commit ddbd89deb7d3 ("swiotlb: fix info leak with
> DMA_FROM_DEVICE"), commit aa6f8dcbab47 ("swiotlb: rework "fix info leak
> with DMA_FROM_DEVICE"") and commit 1132a1dc053e ("swiotlb: rewrite
> comment explaining why the source is preserved on DMA_FROM_DEVICE").
>
> I believe there is potential for a nasty race condition, and maybe even
> info leak. Consider this:
>
> 1. DMA buffer is allocated by kmalloc(). The memory area previously
>    contained sensitive information, which had been written to main
>    memory.
> 2. The DMA buffer is initialized with zeroes, but this new content
>    stays in a CPU cache (because this is kernel memory with a write
>    behind cache policy).
> 3. DMA is set up, but nothing is written to main memory by the
>    bus-mastering device.
> 4. The CPU cache line is now discarded in arch_sync_dma_for_cpu().
>
> IIUC the zeroes were never written to main memory, and previous content
> can now be read by the CPU through the DMA buffer.
>
> I haven't checked if any architecture is affected, but I strongly
> believe that the CPU cache MUST be flushed both before and after the
> DMA transfer. Any architecture which does not do it that way should be
> fixed.
>
> Or did I miss a crucial detail (again)?

Just after sending this, I realized I did. :(

There is a step between 2 and 3:

2a. arch_sync_dma_for_device() invalidates the CPU cache line.
    Architectures which do not write previous content to main memory
    effectively undo the zeroing here.

AFAICS the consequence is still the same: race condition and/or info
leak on partial (or zero) DMA write.

Petr T
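
To make steps 1, 2, 2a, 3 and 4 easier to follow, here is a toy
user-space model of a single write-back cache line. It is only a sketch
and not kernel code: struct toy_line, cpu_write(), cpu_read(),
cache_clean(), cache_invalidate() and run_scenario() are all made-up
names, the 8-byte line size is arbitrary, and 0xAA merely stands in for
the "sensitive information" left in main memory. The model assumes a
device that writes nothing at all (the zero-length DMA case).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define LINE_SIZE 8	/* arbitrary toy cache line size */

/* One cache line plus the main memory behind it (hypothetical model). */
struct toy_line {
	unsigned char ram[LINE_SIZE];	/* main memory content */
	unsigned char cache[LINE_SIZE];	/* CPU cache copy */
	bool valid;			/* cache copy is present */
	bool dirty;			/* cache copy is newer than ram */
};

/* CPU fills the whole line; with write-back, only the cache changes. */
void cpu_write(struct toy_line *l, unsigned char val)
{
	memset(l->cache, val, LINE_SIZE);
	l->valid = true;
	l->dirty = true;
}

/* CPU read; a miss refills the cache line from main memory. */
unsigned char cpu_read(struct toy_line *l, int idx)
{
	if (!l->valid) {
		memcpy(l->cache, l->ram, LINE_SIZE);
		l->valid = true;
		l->dirty = false;
	}
	return l->cache[idx];
}

/* Write a dirty line back to main memory ("clean"). */
void cache_clean(struct toy_line *l)
{
	if (l->valid && l->dirty) {
		memcpy(l->ram, l->cache, LINE_SIZE);
		l->dirty = false;
	}
}

/* Drop the line without writing it back ("invalidate"). */
void cache_invalidate(struct toy_line *l)
{
	l->valid = false;
	l->dirty = false;
}

/* Returns the first byte the CPU sees in the buffer after the "DMA". */
unsigned char run_scenario(bool clean_before_invalidate)
{
	struct toy_line l = { .valid = false, .dirty = false };

	memset(l.ram, 0xAA, LINE_SIZE);	/* 1. old secret is in main memory */
	cpu_write(&l, 0x00);		/* 2. zeroes stay in the cache */
	if (clean_before_invalidate)
		cache_clean(&l);	/* write zeroes back first */
	cache_invalidate(&l);		/* 2a. sync for device */
	/* 3. the device writes nothing */
	cache_invalidate(&l);		/* 4. sync for CPU */
	return cpu_read(&l, 0);
}
```

In this model, run_scenario(false) returns the stale 0xAA, i.e. the
zeroing from step 2 is undone by step 2a, while run_scenario(true)
returns 0x00 because the dirty line is written back before it is
invalidated. Real hardware has more moving parts (speculative loads,
multiple cache levels), so this illustrates only the ordering argument.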