From: Yosry Ahmed
Date: Thu, 9 Jan 2025 13:34:25 -0800
Subject: Re: [LSF/MM/BPF TOPIC] Large folio (z)swapin
To: Usama Arif
Cc: lsf-pc@lists.linux-foundation.org, Linux Memory Management List, Johannes Weiner, Barry Song <21cnbao@gmail.com>, Shakeel Butt
In-Reply-To: <58716200-fd10-4487-aed3-607a10e9fdd0@gmail.com>

On Thu, Jan 9, 2025 at 12:06 PM Usama Arif wrote:
>
> I would like to propose a session to discuss the work going on
> around large folio swapin, whether it's traditional swap or
> zswap or zram.
>
> Large folios have obvious advantages that have been discussed before
> like fewer page faults, batched PTE and rmap manipulation, reduced
> lru list, TLB coalescing (for arm64 and amd).
> However, swapping in large folios has its own drawbacks like higher
> swap thrashing.
> I had initially sent an RFC of zswapin of large folios in [1]
> but it causes a regression due to swap thrashing in kernel
> build time, which I am confident is happening with zram large
> folio swapin as well (which is merged in kernel).

I am obviously interested in this discussion, but unfortunately I
won't be able to make it this year. I will try to attend remotely
though if possible!

>
> Some of the points we could discuss in the session:
>
> - What is the right (preferably open source) benchmark to test for
>   swapin of large folios? kernel build time in limited
>   memory cgroup shows a regression, microbenchmarks show a massive
>   improvement, maybe there are benchmarks where TLB misses are
>   a big factor and which show an improvement.
>
> - We could have something like
>   /sys/kernel/mm/transparent_hugepage/hugepages-*kB/swapin_enabled
>   to enable/disable swapin but it's going to be difficult to tune, might
>   have different optimum values based on workloads and are likely to be
>   left at their default values. Is there some dynamic way to decide when
>   to swapin large folios and when to fall back to smaller folios?
>   swapin_readahead swapcache path which only supports 4K folios atm has a
>   read ahead window based on hits, however readahead is a folio flag and
>   not a page flag, so this method can't be used as once a large folio
>   is swapped in, we won't get a fault and subsequent hits on other
>   pages of the large folio won't be recorded.
>
> - For zswap and zram, it might be that doing larger block compression/
>   decompression might offset the regression from swap thrashing, but it
>   brings about its own issues. For example, once a large folio is swapped
>   out, it could fail to swapin as a large folio and fall back
>   to 4K, resulting in redundant decompressions.
>   This will also mean swapin of large folios from traditional swap
>   isn't something we should proceed with?
>
> - Should we even support large folio swapin? You often have high swap
>   activity when the system/cgroup is close to running out of memory, at this
>   point, maybe the best way forward is to just swapin 4K pages and let
>   khugepaged [2], [3] collapse them if the surrounding pages are swapped in
>   as well.
>
> [1] https://lore.kernel.org/all/20241018105026.2521366-1-usamaarif642@gmail.com/
> [2] https://lore.kernel.org/all/20250108233128.14484-1-npache@redhat.com/
> [3] https://lore.kernel.org/lkml/20241216165105.56185-1-dev.jain@arm.com/
>
> Thanks,
> Usama
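
To put a rough number on the "redundant decompressions" concern in the quoted zswap/zram point, here is a back-of-the-envelope sketch. It is only an illustration under stated assumptions: the folio is compressed as a single block, every swapin fault has to decompress that whole block, and nothing caches the decompressed output between faults. The function name and the 64K folio size are hypothetical and are not taken from the thread or from any kernel code.

```python
PAGE_SIZE = 4096  # base page size in bytes (the 4K fallback size in the quoted text)


def decompressed_bytes(folio_size: int, swapin_size: int) -> int:
    """Total bytes decompressed to bring one folio back into memory.

    Assumption (illustrative only): the folio was compressed as one block,
    and each swapin fault decompresses the entire block, with no caching
    of the decompressed data between faults.
    """
    if folio_size % swapin_size != 0:
        raise ValueError("folio_size must be a multiple of swapin_size")
    faults = folio_size // swapin_size  # one fault per swapin-sized chunk
    return faults * folio_size          # each fault pays for the full block


folio = 64 * 1024                                 # hypothetical 64K large folio
whole = decompressed_bytes(folio, folio)          # swapped in as one large folio
fallback = decompressed_bytes(folio, PAGE_SIZE)   # fell back to 4K swapin
print(whole, fallback, fallback // whole)
```

Under these assumptions the 4K fallback decompresses 16 times as much data as a single large-folio swapin for a 64K folio. Any caching or partial decompression would reduce that factor.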