From: Yosry Ahmed <yosryahmed@google.com>
Date: Fri, 29 Mar 2024 11:56:46 -0700
Subject: Re: [RFC PATCH 6/9] mm: zswap: drop support for non-zero same-filled pages handling
To: Johannes Weiner
Cc: Nhat Pham, Andrew Morton, Chengming Zhou, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Srividya Desireddy
In-Reply-To: <20240329173759.GI7597@cmpxchg.org>
References: <20240325235018.2028408-1-yosryahmed@google.com> <20240325235018.2028408-7-yosryahmed@google.com> <20240328193149.GF7597@cmpxchg.org> <20240328210709.GH7597@cmpxchg.org> <20240329173759.GI7597@cmpxchg.org>

[..]
> > > > The context guy is here :)
> > > >
> > > > Not arguing for one way or another, but I did find the original patch
> > > > that introduced same-filled page handling:
> > > >
> > > > https://github.com/torvalds/linux/commit/a85f878b443f8d2b91ba76f09da21ac0af22e07f
> > > >
> > > > https://lore.kernel.org/all/20171018104832epcms5p1b2232e2236258de3d03d1344dde9fce0@epcms5p1/T/#u
> > >
> > > Thanks for digging this up. I don't know why I didn't start there :)
> > >
> > > Following in your footsteps, and given that zram has the same feature,
> > > I found the patch that added support for non-zero same-filled pages in
> > > zram:
> > > https://lore.kernel.org/all/1483692145-75357-1-git-send-email-zhouxianrong@huawei.com/#t
> > >
> > > Both of them confirm that most same-filled pages are zero pages, but
> > > they show a more significant portion of same-filled pages being
> > > non-zero (17% to 40%). I suspect this will be less in data centers
> > > compared to consumer apps.
> > >
> > > The zswap patch also reports significant performance improvements from
> > > the same-filled handling, but this is with 17-22% same-filled pages.
> > > Johannes mentioned around 10% in your data centers, so the performance
> > > improvement would be less. In the kernel build tests I ran with only
> > > around 1.5% same-filled pages, I observed 1.4% improvements just by
> > > optimizing them (only zero-filled, skipping allocations).
> > >
> > > So I think removing the same-filled pages handling completely may be
> > > too aggressive, because it doesn't only affect the memory efficiency,
> > > but also cycles spent when handling those pages. Just avoiding going
> > > through the allocator and compressor has to account for something :)
> >
> > Here is another data point. I tried removing the same-filled handling
> > code completely with the diff Johannes sent upthread. I saw 1.3%
> > improvement in the kernel build test, very similar to the improvement
> > from this patch series. _However_, the kernel build test only produces
> > ~1.5% zero-filled pages in my runs. More realistic workloads have
> > significantly higher percentages, as demonstrated upthread.
> >
> > In other words, the kernel build test (at least in my runs) seems to
> > be the worst case scenario for same-filled/zero-filled pages. Since
> > the improvement from removing same-filled handling is quite small in
> > this case, I suspect there will be no improvement, but possibly a
> > regression, on real workloads.
> >
> > As the zero-filled pages ratio increases:
> > - The performance with this series will improve.
> > - The performance with removing same-filled handling completely will
> >   become worse.
>
> Sorry, this thread is still really lacking practical perspective.
>
> As do the numbers that initially justified the patch. Sure, the stores
> of same-filled pages are faster. What's the cost of prechecking 90% of
> the other pages that need compression?

Well, that's exactly what I was trying to find out :) The kernel build
test has over 98% pages that are not same-filled in my runs, so we are
paying the cost of prechecking 98% of pages for same-filled contents
needlessly. However, removing same-filled handling only resulted in a
1.4% performance improvement. We should expect even less for workloads
that have 90% non-same-filled pages, right? It seems like the cost of
prechecking is not that bad at all, potentially because the page
contents are read into cache anyway. Did I miss something?

> Also, this is the swap path we're talking about. There is vmscan, swap
> slot allocations, page table walks, TLB flushes, zswap tree inserts;
> then a page fault and everything in reverse.
>
> I perf'd zswapping out data that is 10% same-filled and 90% data that
> always needs compression. It does nothing but madvise(MADV_PAGEOUT),
> and the zswap_store() stack is already only ~60% of the cycles.
>
> Using zsmalloc + zstd, this is the diff between vanilla and my patch:
>
> # Baseline  Delta Abs  Shared Object      Symbol
> # ........  .........  .................  .....................................
> #
>      4.34%     -3.02%  [kernel.kallsyms]  [k] zswap_store
>     11.07%     +1.41%  [kernel.kallsyms]  [k] ZSTD_compressBlock_doubleFast
>     15.55%     +0.91%  [kernel.kallsyms]  [k] FSE_buildCTable_wksp
>
> As expected, we have to compress a bit more; on the other hand we're
> removing the content scan for same-filled for the 90% of pages that
> don't benefit from it. They almost amortize each other. Let's round it
> up and the remaining difference is ~1%.

Thanks for the data, this is very useful.

There is also the load path. The original patch that introduced
same-filled pages reports more gains on the load side, which also
happens to be more latency-sensitive. Of course, the data could be
outdated -- but it's a different type of workload than what will be
running in a data center fleet AFAICT. Is there also no noticeable
difference on the load side in your data?

Also, how much increase did you observe in the size of compressed data
with your patch? Just curious how zstd ended up handling those pages.

> It's difficult to make the case that this matters to any real
> workloads with actual think time in between paging.

If the difference in performance is 1%, I surely agree. The patch
reported 19-32% improvement in store time for same-filled pages,
depending on the workload and platform. For 10% same-filled pages, this
should roughly correspond to 2-3% overall improvement, which is not too
far from the 1% you observed.

The patch reported much larger improvements for load times (which
matter more): 49-85% for same-filled pages. If this corresponds to a
5-8% overall improvement in zswap load time for a workload with 10%
same-filled pages, that would be very significant. I don't expect to
see such improvements tbh, but we should check. Let's CC Srividya here
for visibility.
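[For reference, the same-filled handling being debated boils down to a
word-by-word scan at store time and a pattern fill at load time. A
minimal userspace sketch -- PAGE_SIZE and the function names here are
illustrative, not the actual kernel symbols:]

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/* Store side: scan the page word by word and bail out on the first
 * mismatch. For the ~90% of pages that are not same-filled, this is
 * the "precheck" cost under discussion -- usually cheap because the
 * scan aborts early and the page is about to be read for compression
 * anyway, so its contents are pulled into cache regardless. */
static bool page_same_filled(const void *page, unsigned long *value)
{
	const unsigned long *data = page;
	size_t nwords = PAGE_SIZE / sizeof(unsigned long);

	for (size_t i = 1; i < nwords; i++) {
		if (data[i] != data[0])
			return false;
	}
	*value = data[0];
	return true;
}

/* Load side: no decompression at all, just refill the page with the
 * stored pattern. This is why the original patch saw much larger
 * gains on loads than on stores. */
static void page_fill(void *page, unsigned long value)
{
	unsigned long *data = page;

	for (size_t i = 0; i < PAGE_SIZE / sizeof(unsigned long); i++)
		data[i] = value;
}
```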
>
> But let's say you do make the case that zero-filled pages are worth
> optimizing for.

I am not saying they are for sure, but removing the same-filled
checking didn't seem to improve performance much in my testing, so the
cost seems to be mostly in maintenance. This makes it seem to me that
it *could* be a good tradeoff to only handle zero-filled pages. We can
make them slightly faster and we can trim the complexity -- as shown by
this patch.

> Why is this in zswap? Why not do it in vmscan with a
> generic zero-swp_entry_t, and avoid the swap backend altogether? No
> swap slot allocation, no zswap tree, no *IO on disk swap*.

That part I definitely agree with, especially since the logic is
duplicated in zram.

> However you slice it, I fail to see how this has a place in
> zswap. It's trying to optimize the slow path of a slow path, at the
> wrong layer of the reclaim stack.

Agreed for the store path, though we still need some clarity on the
load path. But yeah, all in all zswap is not the correct place for such
optimizations; it's just where the handling currently lives.
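[For completeness, the zero-only tradeoff discussed above: dropping
non-zero pattern support means no fill value has to be stored per
entry -- a single "zero page" flag suffices -- and the load side
collapses to clearing the page. A rough userspace sketch with
hypothetical names; in-kernel, the store-side check would be a
memchr_inv(page, 0, PAGE_SIZE) == NULL test:]

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Zero-only check: no fill value to track and no pattern to record in
 * the zswap entry, which is where most of the trimmed complexity is. */
static bool page_zero_filled(const void *page)
{
	static const unsigned char zeroes[PAGE_SIZE];

	return memcmp(page, zeroes, PAGE_SIZE) == 0;
}

/* Load side becomes a plain clear instead of a pattern fill. */
static void load_zero_page(void *page)
{
	memset(page, 0, PAGE_SIZE);
}
```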