From: Kairui Song
Date: Tue, 25 Feb 2025 01:50:23 +0800
Subject: Re: [PATCH] mm: shmem: fix potential data corruption during shmem swapin
To: Baolin Wang
Cc: akpm@linux-foundation.org, alex_y_xu@yahoo.ca, baohua@kernel.org,
 da.gomez@samsung.com, david@redhat.com, hughd@google.com,
 ioworker0@gmail.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 ryan.roberts@arm.com, wangkefeng.wang@huawei.com, willy@infradead.org,
 ziy@nvidia.com
References: <731904cf-d862-4c0e-ae5b-26444faff253@linux.alibaba.com>
 <53e610af72302667475821e5b3c84c382da4efbc.1740386576.git.baolin.wang@linux.alibaba.com>
In-Reply-To: <53e610af72302667475821e5b3c84c382da4efbc.1740386576.git.baolin.wang@linux.alibaba.com>

On Mon, Feb 24, 2025 at 4:47 PM Baolin Wang wrote:
>
> Alex and Kairui reported some issues (system hang or data corruption) when
> swapping out or swapping in large shmem folios.
> This is especially easy to
> reproduce when the tmpfs is mount with the 'huge=within_size' parameter.
> Thanks to Kairui's reproducer, the issue can be easily replicated.
>
> The root cause of the problem is that swap readahead may asynchronously
> swap in order 0 folios into the swap cache, while the shmem mapping can
> still store large swap entries. Then an order 0 folio is inserted into
> the shmem mapping without splitting the large swap entry, which overwrites
> the original large swap entry, leading to data corruption.
>
> When getting a folio from the swap cache, we should split the large swap
> entry stored in the shmem mapping if the orders do not match, to fix this
> issue.
>
> Fixes: 809bc86517cc ("mm: shmem: support large folio swap out")
> Reported-by: Alex Xu (Hello71)
> Reported-by: Kairui Song

Maybe you can add a Closes:?

> Signed-off-by: Baolin Wang
> ---
>  mm/shmem.c | 31 +++++++++++++++++++++++++++----
>  1 file changed, 27 insertions(+), 4 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 4ea6109a8043..cebbac97a221 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2253,7 +2253,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>  	struct folio *folio = NULL;
>  	bool skip_swapcache = false;
>  	swp_entry_t swap;
> -	int error, nr_pages;
> +	int error, nr_pages, order, split_order;
>
>  	VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
>  	swap = radix_to_swp_entry(*foliop);
> @@ -2272,10 +2272,9 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>
>  	/* Look it up and read it in.. */
>  	folio = swap_cache_get_folio(swap, NULL, 0);
> +	order = xa_get_order(&mapping->i_pages, index);
>  	if (!folio) {
> -		int order = xa_get_order(&mapping->i_pages, index);
>  		bool fallback_order0 = false;
> -		int split_order;
>
>  		/* Or update major stats only when swapin succeeds?? */
>  		if (fault_type) {
> @@ -2339,6 +2338,29 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>  			error = -ENOMEM;
>  			goto failed;
>  		}
> +	} else if (order != folio_order(folio)) {
> +		/*
> +		 * Swap readahead may swap in order 0 folios into swapcache
> +		 * asynchronously, while the shmem mapping can still stores
> +		 * large swap entries. In such cases, we should split the
> +		 * large swap entry to prevent possible data corruption.
> +		 */
> +		split_order = shmem_split_large_entry(inode, index, swap, gfp);
> +		if (split_order < 0) {
> +			error = split_order;
> +			goto failed;
> +		}
> +
> +		/*
> +		 * If the large swap entry has already been split, it is
> +		 * necessary to recalculate the new swap entry based on
> +		 * the old order alignment.
> +		 */
> +		if (split_order > 0) {
> +			pgoff_t offset = index - round_down(index, 1 << split_order);
> +
> +			swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
> +		}
>  	}
>
>  alloced:
> @@ -2346,7 +2368,8 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
>  	folio_lock(folio);
>  	if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
>  	    folio->swap.val != swap.val ||
> -	    !shmem_confirm_swap(mapping, index, swap)) {
> +	    !shmem_confirm_swap(mapping, index, swap) ||
> +	    xa_get_order(&mapping->i_pages, index) != folio_order(folio)) {
>  		error = -EEXIST;
>  		goto unlock;
>  	}
> --
> 2.43.5
>

Thanks for the fix, it works for me.

Tested-by: Kairui Song
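
As a side note on the "recalculate the new swap entry" step in the patch
above, here is a rough userspace sketch of the arithmetic only. It is not
kernel code: recalc_swap_offset() and round_down_pow2() are hypothetical
names used for illustration, and the sketch assumes that the value read from
the mapping is the base swap offset of the large entry and that the entry is
naturally aligned to its order.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's round_down(); size must be a power of two. */
static unsigned long round_down_pow2(unsigned long x, unsigned long size)
{
	return x & ~(size - 1);
}

/*
 * Hypothetical helper (not a kernel function): given the faulting page
 * index, the base swap offset stored in the large entry, and the order
 * the entry had before it was split, return the swap offset of the
 * order-0 entry that covers 'index'.
 */
static unsigned long recalc_swap_offset(unsigned long index,
					unsigned long swap_offset,
					int split_order)
{
	unsigned long offset;

	assert(split_order >= 0);

	/* Distance of 'index' from the start of the old large entry. */
	offset = index - round_down_pow2(index, 1UL << split_order);

	return swap_offset + offset;
}
```

For example, with an order-4 entry covering indices 32..47 whose base swap
offset is 1000, a fault at index 37 gives 1000 + (37 - 32) = 1005.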