Date: Thu, 30 Mar 2023 16:40:48 +0200
From: David Hildenbrand
Organization: Red Hat
To: Johannes Weiner
Cc: Andrew Morton, Stefan Roesch, kernel-team@fb.com, linux-mm@kvack.org, riel@surriel.com, mhocko@suse.com, linux-kselftest@vger.kernel.org, linux-doc@vger.kernel.org, Hugh Dickins
References: <20230310182851.2579138-1-shr@devkernel.io> <20230328160914.5b6b66e4a5ad39e41fd63710@linux-foundation.org> <37dcd52a-2e32-c01d-b805-45d862721fbc@redhat.com>
Subject: Re: [PATCH v4 0/3] mm: process/cgroup ksm support

On 30.03.23 16:26, Johannes Weiner wrote:
> On Thu, Mar 30, 2023 at 06:55:31AM +0200, David Hildenbrand wrote:
>> On 29.03.23 01:09, Andrew Morton wrote:
>>> On Fri, 10 Mar 2023 10:28:48 -0800 Stefan Roesch wrote:
>>>
>>>> So far KSM can only be enabled by calling madvise for memory regions. To
>>>> be able to use KSM for more workloads, KSM needs to have the ability to be
>>>> enabled / disabled at the process / cgroup level.
>>>
>>> Review on this series has been a bit thin. Are we OK with moving this
>>> into mm-stable for the next merge window?
>>
>> I still want to review (traveling this week), but I also don't want to block
>> this forever.
>>
>> I think I didn't get a reply from Stefan to my question [1] yet (only some
>> comments from Johannes). I would still be interested in the variance of
>> pages we end up de-duplicating for processes.
>>
>> The 20% statement in the cover letter is rather useless and possibly
>> misleading if no details about the actual workload are shared.
>
> The workload is instagram. It forks off Django runtimes on-demand
> until it saturates whatever hardware it's running on. This benefits
> from merging common heap/stack state between instances. Since that
> runtime is quite large, the 20% number is not surprising, and matches
> our expectations of duplicative memory between instances.

Thanks for this explanation. It's valuable to get at least a feeling for
the workload because it doesn't seem to apply to other workloads at all.

>
> Obviously we could spend months analysing which exact allocations are
> identical, and then more months or years reworking the architecture to
> deduplicate them by hand and in userspace. But this isn't practical,
> and KSM is specifically for cases where this isn't practical.
>
> Based on your request in the previous thread, we investigated whether
> the boost was coming from the unintended side effects of KSM splitting
> THPs. This wasn't the case.
>
> If you have other theories on how the results could be bogus, we'd be
> happy to investigate those as well. But you have to let us know what
> you're looking for.
>

Maybe I'm bad at making such requests but

"Stefan, can you do me a favor and investigate which pages we end up
deduplicating -- especially if it's mostly only the zeropage and if it's
still that significant when disabling THP?"

"In any case, it would be nice to get a feeling for how much variety in
these 20% of deduplicated pages are. "

is pretty clear to me. And shouldn't take months.

> Beyond that, I don't think we need to prove from scratch that KSM can

I never expected a proof. I was merely trying to understand if it's
really KSM that helps here. Also with the intention to figure out if KSM
is really the right tool to use here or if it simply "helps by luck" as
with the shared zeropage. That end result could have been valuable to
your use case as well, because KSM overhead is real.

> be a worthwhile optimization. It's been established that it can
> be. This series is about enabling it in scenarios where madvise()
> isn't practical, that's it, and it's yielding the expected results.

I'm sorry to say, but you sound a bit aggressive and annoyed. I also
have no idea why Stefan isn't replying to me but always you.

Am I asking the wrong questions? Do you want me to stop looking at KSM code?

-- 
Thanks,

David / dhildenb