References: <20230822180539.1424843-1-shr@devkernel.io>
From: Stefan Roesch
To: David Hildenbrand
Cc: kernel-team@fb.com, akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, hannes@cmpxchg.org, riel@surriel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v4] proc/ksm: add ksm stats to /proc/pid/smaps
Date: Wed, 23 Aug 2023 09:41:56 -0700

David Hildenbrand writes:

> On 22.08.23 20:05, Stefan Roesch wrote:
>> With madvise and prctl KSM can be enabled for different VMA's. Once it
>> is enabled we can query how effective KSM is overall. However we cannot
>> easily query if an individual VMA benefits from KSM.
>>
>> This commit adds a KSM section to the /proc/<pid>/smaps file. It reports
>> how many of the pages are KSM pages. The returned value for KSM is
>> independent of the use of the shared zeropage.
>
> Maybe phrase that to something like "The returned value for KSM includes
> KSM-placed zeropages, so we can observe the actual KSM benefit independent
> of the usage of the shared zeropage for KSM.".
>
> But thinking about it (see below), maybe we really just let any user figure that out by
> temporarily disabling the shared zeropage.
>
> So this would be
>
> "It reports how many of the pages are KSM pages. Note that KSM-placed zeropages
> are not included, only actual KSM pages."
>

I'll replace the commit message and the documentation with the above
sentence.

>> Here is a typical output:
>>
>> 7f420a000000-7f421a000000 rw-p 00000000 00:00 0
>> Size:             262144 kB
>> KernelPageSize:        4 kB
>> MMUPageSize:           4 kB
>> Rss:               51212 kB
>> Pss:                8276 kB
>> Shared_Clean:        172 kB
>> Shared_Dirty:      42996 kB
>> Private_Clean:       196 kB
>> Private_Dirty:      7848 kB
>> Referenced:        15388 kB
>> Anonymous:         51212 kB
>> KSM:               41376 kB
>> LazyFree:              0 kB
>> AnonHugePages:         0 kB
>> ShmemPmdMapped:        0 kB
>> FilePmdMapped:         0 kB
>> Shared_Hugetlb:        0 kB
>> Private_Hugetlb:       0 kB
>> Swap:             202016 kB
>> SwapPss:            3882 kB
>> Locked:                0 kB
>> THPeligible:           0
>> ProtectionKey:         0
>> ksm_state:             0
>> ksm_skip_base:         0
>> ksm_skip_count:        0
>> VmFlags: rd wr mr mw me nr mg anon
>>
>> This information also helps with the following workflow:
>> - First enable KSM for all the VMA's of a process with prctl.
>> - Then analyze with the above smaps report which VMA's benefit the most
>> - Change the application (if possible) to add the corresponding madvise
>>   calls for the VMA's that benefit the most
>>
>> Signed-off-by: Stefan Roesch
>> ---
>>  Documentation/filesystems/proc.rst |  4 ++++
>>  fs/proc/task_mmu.c                 | 16 +++++++++++-----
>>  2 files changed, 15 insertions(+), 5 deletions(-)
>>
>> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
>> index 7897a7dafcbc..d5bdfd59f5b0 100644
>> --- a/Documentation/filesystems/proc.rst
>> +++ b/Documentation/filesystems/proc.rst
>> @@ -461,6 +461,7 @@ Memory Area, or VMA) there is a series of lines such as the following::
>>      Private_Dirty:         0 kB
>>      Referenced:          892 kB
>>      Anonymous:             0 kB
>> +    KSM:                   0 kB
>>      LazyFree:              0 kB
>>      AnonHugePages:         0 kB
>>      ShmemPmdMapped:        0 kB
>> @@ -501,6 +502,9 @@ accessed.
>>  a mapping associated with a file may contain anonymous pages: when MAP_PRIVATE
>>  and a page is modified, the file page is replaced by a private anonymous copy.
>>
>> +"KSM" shows the amount of anonymous memory that has been de-duplicated. The
>> +value is independent of the use of shared zeropage.
>
> Maybe here as well.
>
>> +
>>  "LazyFree" shows the amount of memory which is marked by madvise(MADV_FREE).
>>  The memory isn't freed immediately with madvise(). It's freed in memory
>>  pressure if the memory is clean. Please note that the printed value might
>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>> index 51315133cdc2..4532caa8011c 100644
>> --- a/fs/proc/task_mmu.c
>> +++ b/fs/proc/task_mmu.c
>> @@ -4,6 +4,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include
>>  #include
>>  #include
>> @@ -396,6 +397,7 @@ struct mem_size_stats {
>>  	unsigned long swap;
>>  	unsigned long shared_hugetlb;
>>  	unsigned long private_hugetlb;
>> +	unsigned long ksm;
>>  	u64 pss;
>>  	u64 pss_anon;
>>  	u64 pss_file;
>> @@ -435,9 +437,9 @@ static void smaps_page_accumulate(struct mem_size_stats *mss,
>>  	}
>>  }
>>
>> -static void smaps_account(struct mem_size_stats *mss, struct page *page,
>> -		bool compound, bool young, bool dirty, bool locked,
>> -		bool migration)
>> +static void smaps_account(struct mem_size_stats *mss, pte_t *pte,
>> +		struct page *page, bool compound, bool young, bool dirty,
>> +		bool locked, bool migration)
>>  {
>>  	int i, nr = compound ? compound_nr(page) : 1;
>>  	unsigned long size = nr * PAGE_SIZE;
>> @@ -452,6 +454,9 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
>>  		mss->lazyfree += size;
>>  	}
>>
>> +	if (PageKsm(page) && (!pte || !is_ksm_zero_pte(*pte)))
>
> I think this won't work either way, because smaps_pte_entry() never ends up calling
> this function with !page. And the shared zeropage here always gives us !page.
>
> What would work is:
>
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 15ddf4653a19..ef6f39d7c5a2 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -528,6 +528,9 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
>  		page = vm_normal_page(vma, addr, ptent);
>  		young = pte_young(ptent);
>  		dirty = pte_dirty(ptent);
> +
> +		if (!page && is_ksm_zero_pte(ptent))
> +			mss->ksm += size;
>  	} else if (is_swap_pte(ptent)) {
>  		swp_entry_t swpent = pte_to_swp_entry(ptent);
>
> That means that "KSM" can be bigger than "Anonymous" and "RSS" when the shared
> zeropage is used.
>
> Interestingly, right now we account each KSM page individually towards
> "Anonymous" and "RSS".
>
> So if we have 100 times the same KSM page in a VMA, we will have 100 times anon
> and 100 times rss.
>
> Thinking about it, I guess considering the KSM-placed zeropage indeed adds more
> confusion to that. Eventually, we might just want a separate "Shared-zeropages" count.
>
>
> So maybe v3 is better, clarifying the documentation a bit, that the
> KSM-placed zeropage is not considered.
>
> Sorry for changing my mind :D Thoughts?

I agree, I think v3 is better. I'll revert to v3 and add the above
documentation change.
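
For anyone who wants to try the workflow quoted above (enable KSM with prctl,
then look for the VMAs that benefit the most), the per-VMA numbers can be
pulled out of smaps with a little userspace C. This is only a rough sketch and
not part of the patch: the names vma_ksm, parse_smaps_ksm() and best_ksm_vma()
are made up for illustration, and it assumes the "KSM:" field shows up in the
layout of the sample output.

```c
#include <assert.h>
#include <stdio.h>

/* Per-VMA summary taken from smaps text (hypothetical helper type). */
struct vma_ksm {
	unsigned long start;	/* VMA start address */
	unsigned long end;	/* VMA end address */
	unsigned long rss_kb;	/* "Rss:" value in kB */
	unsigned long ksm_kb;	/* "KSM:" value in kB, 0 if the field is absent */
};

/*
 * Read smaps-formatted text from 'in' and fill 'out' with at most 'max'
 * entries. Returns the number of VMAs stored.
 */
static size_t parse_smaps_ksm(FILE *in, struct vma_ksm *out, size_t max)
{
	char line[512];
	unsigned long start, end, val;
	size_t n = 0;

	assert(in != NULL && out != NULL);

	while (fgets(line, sizeof(line), in)) {
		/* A VMA header looks like "7f420a000000-7f421a000000 rw-p ..." */
		if (sscanf(line, "%lx-%lx", &start, &end) == 2) {
			if (n == max)
				break;
			out[n].start = start;
			out[n].end = end;
			out[n].rss_kb = 0;
			out[n].ksm_kb = 0;
			n++;
		} else if (n > 0 && sscanf(line, "Rss: %lu kB", &val) == 1) {
			out[n - 1].rss_kb = val;
		} else if (n > 0 && sscanf(line, "KSM: %lu kB", &val) == 1) {
			out[n - 1].ksm_kb = val;
		}
	}
	return n;
}

/* Return the VMA with the largest non-zero KSM value, or NULL if none. */
static const struct vma_ksm *best_ksm_vma(const struct vma_ksm *v, size_t n)
{
	const struct vma_ksm *best = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (v[i].ksm_kb > 0 && (!best || v[i].ksm_kb > best->ksm_kb))
			best = &v[i];
	return best;
}
```

A caller would fopen() the smaps file of the process of interest, pass the
stream to parse_smaps_ksm(), and then consider adding madvise calls for the
mapping returned by best_ksm_vma(). Note that with v3 semantics the "KSM:"
value counts only actual KSM pages, so KSM-placed zeropages are not reflected
in these numbers.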