From: Yosry Ahmed
Date: Thu, 6 Jun 2024 14:00:47 -0700
Subject: Re: [PATCH] mm: zswap: add VM_BUG_ON() if large folio swapin is attempted
To: Barry Song <21cnbao@gmail.com>
Cc: David Hildenbrand, Andrew Morton, Johannes Weiner, Nhat Pham,
    Chengming Zhou, Baolin Wang, Chris Li, Ryan Roberts, Matthew Wilcox,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240606184818.1566920-1-yosryahmed@google.com>
    <84d78362-e75c-40c8-b6c2-56d5d5292aa7@redhat.com>

On Thu, Jun 6, 2024 at 1:55 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Fri, Jun 7, 2024 at 8:32 AM Yosry Ahmed wrote:
> >
> > On Thu, Jun 6, 2024 at 1:22 PM David Hildenbrand wrote:
> > >
> > > On 06.06.24 20:48, Yosry Ahmed wrote:
> > > > With ongoing work to support large folio swapin, it is important to make
> > > > sure we do not pass large folios to zswap_load() without implementing
> > > > proper support.
> > > >
> > > > For example, if a swapin fault observes that contiguous PTEs are
> > > > pointing to contiguous swap entries and tries to swap them in as a large
> > > > folio, swap_read_folio() will pass in a large folio to zswap_load(), but
> > > > zswap_load() will only effectively load the first page in the folio. If
> > > > the first page is not in zswap, the folio will be read from disk, even
> > > > though other pages may be in zswap.
> > > >
> > > > In both cases, this will lead to silent data corruption.
> > > >
> > > > Proper large folio swapin support needs to go into zswap before zswap
> > > > can be enabled in a system that supports large folio swapin.
> > > >
> > > > Looking at callers of swap_read_folio(), it seems like they are either
> > > > allocated from __read_swap_cache_async() or do_swap_page() in the
> > > > SWP_SYNCHRONOUS_IO path. Both of which allocate order-0 folios, so we
> > > > are fine for now.
> > > >
> > > > Add a VM_BUG_ON() in zswap_load() to make sure that we detect changes in
> > > > the order of those allocations without proper handling of zswap.
> > > >
> > > > Alternatively, swap_read_folio() (or its callers) can be updated to have
> > > > a fallback mechanism that splits large folios or reads subpages
> > > > separately. Similar logic may be needed anyway in case part of a large
> > > > folio is already in the swapcache and the rest of it is swapped out.
> > > >
> > > > Signed-off-by: Yosry Ahmed
>
> Acked-by: Barry Song

Thanks!

>
> this has been observed by me[1], that's why you can find the below
> code in my swapin patch

Thanks for the pointer. I suppose if we add more generic swapin support
we'll have to add a similar check in __read_swap_cache_async(), unless
we are planning to reuse alloc_swap_folio() there.

It would be nice if we can have a global check for this rather than add
it to all different swapin paths (although I suspect there are only two
allocation paths right now). We can always disable zswap if mTHP swapin
is enabled, or always disable mTHP swapin if zswap is enabled. At least
until proper support is added.

>
> +static struct folio *alloc_swap_folio(struct vm_fault *vmf)
> +{
> +       ...
> +       /*
> +        * a large folio being swapped-in could be partially in
> +        * zswap and partially in swap devices, zswap doesn't
> +        * support large folios yet, we might get corrupted
> +        * zero-filled data by reading all subpages from swap
> +        * devices while some of them are actually in zswap
> +        */
> +       if (is_zswap_enabled())
> +               goto fallback;
>
> [1] https://lore.kernel.org/linux-mm/20240304081348.197341-6-21cnbao@gmail.com/
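
For readers following along, here is a rough userspace sketch of the failure mode discussed in this thread. It is not kernel code and not taken from either patch: every name in it (toy_swap, toy_load_page, toy_load_folio_naive, toy_load_folio_split, toy_count_bad_pages) is hypothetical, and the page size and slot count are toy values. It only assumes what the commit message states, namely that a load which lets the first page decide the source for the whole folio returns wrong data when the folio is partly in zswap and partly on the swap device. The per-page variant models the "reads subpages separately" alternative mentioned in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 8   /* toy page size, not the kernel's PAGE_SIZE */
#define NR_SLOTS 4  /* swap slots covered by one toy large folio */

/* Toy backing stores: a slot's data lives either in "zswap" or on "disk". */
struct toy_swap {
	bool in_zswap[NR_SLOTS];
	char zswap[NR_SLOTS][PAGE_SZ];
	char disk[NR_SLOTS][PAGE_SZ]; /* left zeroed when the slot is in zswap */
};

/* Per-page load: consult zswap first, otherwise read the disk copy. */
static void toy_load_page(const struct toy_swap *s, size_t slot, char *dst)
{
	assert(slot < NR_SLOTS);
	memcpy(dst, s->in_zswap[slot] ? s->zswap[slot] : s->disk[slot], PAGE_SZ);
}

/*
 * Models the hazard: only the first slot decides where the whole
 * folio comes from.
 */
static void toy_load_folio_naive(const struct toy_swap *s, size_t nr, char *dst)
{
	if (s->in_zswap[0]) {
		/* Only the first page is filled; the rest of dst is untouched. */
		memcpy(dst, s->zswap[0], PAGE_SZ);
		return;
	}
	/* Everything is read from disk, even slots that are really in zswap. */
	for (size_t i = 0; i < nr; i++)
		memcpy(dst + i * PAGE_SZ, s->disk[i], PAGE_SZ);
}

/* Fallback: treat the folio as nr independent pages. */
static void toy_load_folio_split(const struct toy_swap *s, size_t nr, char *dst)
{
	for (size_t i = 0; i < nr; i++)
		toy_load_page(s, i, dst + i * PAGE_SZ);
}

/* Counts pages whose loaded contents differ from the expected contents. */
static size_t toy_count_bad_pages(const char *got, const char *want, size_t nr)
{
	size_t bad = 0;

	for (size_t i = 0; i < nr; i++)
		if (memcmp(got + i * PAGE_SZ, want + i * PAGE_SZ, PAGE_SZ) != 0)
			bad++;
	return bad;
}
```

With slots 1 and 3 held in zswap and slots 0 and 2 on disk, the naive load returns zero-filled data for the two zswap-backed pages, while the per-page load returns all four pages intact. Disabling large folio allocation whenever zswap is enabled, as in the quoted check, sidesteps the problem by never building such a folio in the first place.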