From: Yosry Ahmed
Date: Thu, 28 Mar 2024 13:23:42 -0700
Subject: Re: [RFC PATCH 6/9] mm: zswap: drop support for non-zero same-filled pages handling
To: Johannes Weiner
Cc: Andrew Morton, Nhat Pham, Chengming Zhou, linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <20240328193149.GF7597@cmpxchg.org>
References: <20240325235018.2028408-1-yosryahmed@google.com> <20240325235018.2028408-7-yosryahmed@google.com> <20240328193149.GF7597@cmpxchg.org>

On Thu, Mar 28, 2024 at 12:31 PM Johannes Weiner wrote:
>
> On Mon, Mar 25, 2024 at 11:50:14PM +0000, Yosry Ahmed wrote:
> > The current same-filled pages handling supports pages filled with any
> > repeated word-sized pattern. However, in practice, most of these should
> > be zero pages anyway. Other patterns should not be nearly as common.
> >
> > Drop the support for non-zero same-filled pages, but keep the names of
> > knobs exposed to userspace as "same_filled", which isn't entirely
> > inaccurate.
> >
> > This yields some nice code simplification and enables a following patch
> > that eliminates the need to allocate struct zswap_entry for those pages
> > completely.
> >
> > There is also a very small performance improvement observed over 50 runs
> > of kernel build test (kernbench) comparing the mean build time on a
> > skylake machine when building the kernel in a cgroup v1 container with a
> > 3G limit:
> >
> >             base            patched         % diff
> > real        70.167          69.915          -0.359%
> > user        2953.068        2956.147        +0.104%
> > sys         2612.811        2594.718        -0.692%
> >
> > This probably comes from more optimized operations like memchr_inv() and
> > clear_highpage(). Note that the percentage of zero-filled pages during
> > this test was only around 1.5% on average, and was not affected by this
> > patch. Practical workloads could have a larger proportion of such pages
> > (e.g. Johannes observed around 10% [1]), so the performance improvement
> > should be larger.
> >
> > [1]https://lore.kernel.org/linux-mm/20240320210716.GH294822@cmpxchg.org/
> >
> > Signed-off-by: Yosry Ahmed
>
> This is an interesting direction to pursue, but I actually think it
> doesn't go far enough. Either way, I think it needs more data.
>
> 1) How frequent are non-zero-same-filled pages? Difficult to
>    generalize, but if you could gather some from your fleet, that
>    would be useful. If you can devise a portable strategy, I'd also be
>    more than happy to gather this on ours (although I think you have
>    more widespread zswap use, whereas we have more disk swap.)

I am trying to collect the data, but there are.. hurdles. It would
take some time, so I was hoping the data could be collected elsewhere
if possible.

The idea I had was to hook a BPF program to the entry of
zswap_fill_page() and create a histogram of the "value" argument. We
would get more coverage by hooking it to the return of
zswap_is_page_same_filled() and only updating the histogram if the
return value is true, as it includes pages in zswap that haven't been
swapped in.

However, with zswap_is_page_same_filled() the BPF program will run in
all zswap stores, whereas for zswap_fill_page() it will only run when
needed.
Not sure if this makes a practical difference tbh.

>
> 2) The fact that we're doing any of this pattern analysis in zswap at
>    all strikes me as a bit misguided. Being efficient about repetitive
>    patterns is squarely in the domain of a compression algorithm. Do
>    we not trust e.g. zstd to handle this properly?

I thought about this briefly, but I didn't follow through. I could try
to collect some data by swapping out different patterns and observing
how different compression algorithms react. That would be interesting
for sure.

>
>    I'm guessing this goes back to inefficient packing from something
>    like zbud, which would waste half a page on one repeating byte.
>
>    But zsmalloc can do 32 byte objects. It's also a batching slab
>    allocator, where storing a series of small, same-sized objects is
>    quite fast.
>
>    Add to that the additional branches, the additional kmap, the extra
>    scanning of every single page for patterns - all in the fast path
>    of zswap, when we already know that the vast majority of incoming
>    pages will need to be properly compressed anyway.
>
>    Maybe it's time to get rid of the special handling entirely?

We would still be wasting some memory (~96 bytes between zswap_entry
and zsmalloc object), and wasting cycles allocating them. This could be
made up for by cycles saved by removing the handling. We will be
saving some branches for sure. I am not worried about kmap as I think
it's a noop in most cases.

I am interested to see how much we could save by removing scanning for
patterns. We may not save much if we abort after reading a few words
in most cases, but I guess we could also be scanning a considerable
amount before aborting. On the other hand, we would be reading the
page contents into cache anyway for compression, so maybe it doesn't
really matter?

I will try to collect some data about this. I will start by trying to
find out how the compression algorithms handle same-filled pages.
If they can compress it efficiently, then I will try to get more data on the tradeoff from removing the handling. Thanks for the insights.
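
For illustration only, the two open questions above (how many words the
pattern scan reads before it aborts, and how small a compressor makes a
same-filled page) can be poked at from userspace with a rough sketch
like the one below. This is not the kernel code: same_filled_scan(),
compressed_size() and filled_page() are hypothetical helpers, the 4 KiB
page and 64-bit word sizes are assumptions, and zlib is only a stand-in
for the compressors zswap can actually be configured with (e.g. zstd),
so any numbers it prints say nothing definitive about zswap itself.

```python
import struct
import zlib

PAGE_SIZE = 4096   # assumption: 4 KiB pages
WORD_SIZE = 8      # assumption: 64-bit words
WORDS_PER_PAGE = PAGE_SIZE // WORD_SIZE


def same_filled_scan(page: bytes):
    """Return (is_same_filled, value, words_read) for one page.

    words_read models a scan that compares each word against the first
    one and aborts at the first mismatch. It is only a count: this
    sketch decodes the whole page up front for simplicity.
    """
    if len(page) != PAGE_SIZE:
        raise ValueError("expected exactly one page of data")
    words = struct.unpack(f"<{WORDS_PER_PAGE}Q", page)
    first = words[0]
    for index in range(1, len(words)):
        if words[index] != first:
            # Mismatch at position `index`, so index + 1 words were read.
            return False, None, index + 1
    return True, first, len(words)


def compressed_size(page: bytes, level: int = 6) -> int:
    """Size in bytes of the page after zlib compression (stand-in only)."""
    return len(zlib.compress(page, level))


def filled_page(value: int) -> bytes:
    """Build a page filled with one repeated 64-bit word."""
    return struct.pack("<Q", value) * WORDS_PER_PAGE


if __name__ == "__main__":
    zero = filled_page(0)
    cases = {
        "zero-filled": zero,
        "non-zero same-filled": filled_page(0xDEADBEEFDEADBEEF),
        "mismatch in second word": zero[:8] + b"\x01" + zero[9:],
        "mismatch in last byte": zero[:-1] + b"\x01",
    }
    for name, page in cases.items():
        same, value, words_read = same_filled_scan(page)
        print(f"{name}: same_filled={same} words_read={words_read} "
              f"compressed={compressed_size(page)} bytes")
```

The "mismatch in last byte" case is the worst case for the scan (every
word is read before giving up), while the "mismatch in second word"
case is the cheap early abort mentioned above.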