References: <20230310182851.2579138-1-shr@devkernel.io> <20230328160914.5b6b66e4a5ad39e41fd63710@linux-foundation.org> <37dcd52a-2e32-c01d-b805-45d862721fbc@redhat.com>
User-agent: mu4e 1.6.11; emacs 28.2.50
From: Stefan Roesch
To: David Hildenbrand
Cc: Johannes Weiner, Andrew Morton, kernel-team@fb.com, linux-mm@kvack.org, riel@surriel.com, mhocko@suse.com, linux-kselftest@vger.kernel.org, linux-doc@vger.kernel.org, Hugh Dickins
Subject: Re: [PATCH v4 0/3] mm: process/cgroup ksm support
Date: Mon, 03 Apr 2023 09:34:59 -0700
MIME-Version: 1.0
Content-Type: text/plain
David Hildenbrand writes:

>>>> Obviously we could spend months analysing which exact allocations are
>>>> identical, and then more months or years reworking the architecture to
>>>> deduplicate them by hand and in userspace. But this isn't practical,
>>>> and KSM is specifically for cases where this isn't practical.
>>>>
>>>> Based on your request in the previous thread, we investigated whether
>>>> the boost was coming from the unintended side effects of KSM splitting
>>>> THPs. This wasn't the case.
>>>>
>>>> If you have other theories on how the results could be bogus, we'd be
>>>> happy to investigate those as well.
>>>> But you have to let us know what you're looking for.
>>>
>>> Maybe I'm bad at making such requests but
>>>
>>> "Stefan, can you do me a favor and investigate which pages we end up
>>> deduplicating -- especially if it's mostly only the zeropage and if it's
>>> still that significant when disabling THP?"
>>>
>>> "In any case, it would be nice to get a feeling for how much variety in
>>> these 20% of deduplicated pages are."
>>>
>>> is pretty clear to me. And shouldn't take months.
>
> Just to clarify: the details I requested are not meant to decide whether to
> reject the patch set (I understand that it can be beneficial to have); I
> primarily want to understand if we're really dealing with a workload where
> KSM is able to deduplicate pages that are non-trivial, to maybe figure out
> if there are other workloads that could similarly benefit -- or if we could
> optimize KSM for these specific cases or avoid the memory deduplication
> altogether.
>
> In contrast to e.g.:
>
> 1) THP resulted in many zeropages we end up deduplicating again. The THP
>    placement was unfortunate.
>
> 2) Unoptimized memory allocators that leave many identical pages mapped
>    after freeing up memory (e.g., zeroed pages, pages all filled with
>    poison values) instead of e.g., using MADV_DONTNEED to free up that
>    memory.

I repeated an experiment with and without KSM. In terms of THP there is no
huge difference between the two. On a 64GB main-memory machine I see between
100 and 400MB in AnonHugePages.

>> /sys/kernel/mm/ksm/pages_shared is over 10000 when we run this on an
>> Instagram workload. The workload consists of 36 processes plus a few
>> sidecar processes.
>
> Thanks! To which value is /sys/kernel/mm/ksm/max_page_sharing set in that
> environment?

It's set to the standard value of 256. In the meantime I have run experiments
with different settings for pages_to_scan. With the default value of 100, we
only get a relatively small benefit from KSM.
If I increase the value to, for instance, 2000 or 3000, the savings are
substantial. (The workload is memory-bound, not CPU-bound.)

Here are some stats for setting pages_to_scan to 3000:

full_scans: 560
general_profit: 20620539008
max_page_sharing: 256
merge_across_nodes: 1
pages_shared: 125446
pages_sharing: 5259506
pages_to_scan: 3000
pages_unshared: 1897537
pages_volatile: 12389223
run: 1
sleep_millisecs: 20
stable_node_chains: 176
stable_node_chains_prune_millisecs: 2000
stable_node_dups: 2604
use_zero_pages: 0
zero_pages_sharing: 0

> What would be interesting is pages_shared after max_page_sharing was set to
> a very high number such that pages_shared does not include duplicates. Then
> pages_shared actually expresses how many different pages we deduplicate. No
> need to run without THP in that case.

That's on my list for the next set of experiments.

> Similarly, enabling "use_zero_pages" could highlight if your workload ends
> up deduplicating a lot of zeropages. But maxing out max_page_sharing would
> be sufficient to understand what's happening.

I already ran experiments with use_zero_pages, but they didn't make a
difference. I'll repeat the experiment with a higher pages_to_scan value.

>> Each of these individual processes has around 500MB in KSM pages.
>
> That's really a lot, thanks.

>> Also to give some idea for individual VMAs:
>> 7ef5d5600000-7ef5e5600000 rw-p 00000000 00:00 0 (Size: 262144 KB, KSM: 73160 KB)

> I'll have a look at the patches today.
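For reference, the tuning used in the experiment above boils down to a few
writes to the standard /sys/kernel/mm/ksm knobs. A minimal sketch (to be run
as root on a KSM-enabled kernel; the mktemp fallback is only there so the
script can be exercised unprivileged for illustration and does not touch the
real knobs in that case):

```shell
#!/bin/sh
# Sketch: apply the pages_to_scan=3000 setting from the experiment above.
ksm=/sys/kernel/mm/ksm
# Fall back to a scratch directory when we cannot write the real knobs
# (non-root, or KSM not built in) -- illustration only.
[ -w "$ksm/run" ] || ksm=$(mktemp -d)

echo 1    > "$ksm/run"             # start the ksmd scanner
echo 3000 > "$ksm/pages_to_scan"   # pages scanned per wake-up (default: 100)
echo 20   > "$ksm/sleep_millisecs" # pause between scan batches (default: 20)
```

Raising pages_to_scan trades ksmd CPU time for faster convergence, which is
why it pays off here: the workload is memory-bound, so the extra scanning CPU
is essentially free.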
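As a back-of-the-envelope check on the counters above (assuming the usual
4 KiB page size): every mapping counted in pages_sharing beyond the
pages_shared KSM copies that back them is roughly one page KSM freed.

```shell
#!/bin/sh
# Rough saving estimate from the pages_to_scan=3000 stats above.
pages_shared=125446    # distinct KSM pages kept
pages_sharing=5259506  # mappings deduplicated onto them
# Pages freed ~= sharing mappings minus the shared copies; 4 KiB each.
saved_kib=$(( (pages_sharing - pages_shared) * 4 ))
echo "approx ${saved_kib} KiB (~$(( saved_kib >> 20 )) GiB) saved"
```

That lands in the same ballpark as the reported general_profit of
20620539008 bytes (~19 GiB), which is the same idea minus the rmap-item
metadata overhead KSM itself consumes.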