Date: Wed, 15 Sep 2021 10:42:06 +0200
From: Vlastimil Babka <vbabka@suse.cz>
To: David Rientjes
Cc: linux-mm@kvack.org, Christoph Lameter, Joonsoo Kim, Pekka Enberg, Jann Horn, linux-kernel@vger.kernel.org, Roman Gushchin
Subject: Re: [RFC PATCH] mm, slub: change percpu partial accounting from objects to pages
References: <20210913170148.10992-1-vbabka@suse.cz>

On 9/15/21 07:32, David Rientjes wrote:
> On Mon, 13 Sep 2021, Vlastimil Babka wrote:
> 
>> While this is no longer a problem in kmemcg context thanks to the accounting
>> rewrite in 5.9, the memory waste is still not ideal, and it's questionable
>> whether it makes sense to perform free-object-count-based control when object
>> counts can so easily become inaccurate. So this patch converts the
>> accounting to be based on the number of pages only (which is precise) and
>> removes the page->pobjects field completely. This is also ultimately simpler.
>> 
> 
> Thanks for the very detailed explanation, this is very timely for us.
> 
> I'm wondering if we should be concerned about the memory waste even being
> possible, though, now that we have the kmemcg accounting change?
> 
> IIUC, because we're accounting objects and not pages, then it *seems* like
> we could have a high number of pages but very few objects charged per
> page, so this memory waste could go unconstrained by any kmemcg
> limitation.

So the main problem before 5.9 was that there were separate kmem caches per
memcg, each with its own percpu partial lists, so the memory used was
determined by caches x cpus x memcgs; now the caches are shared, so it's just
caches x cpus. What you're describing would also be true, but it's a much
smaller issue than what we had before 5.9.

>> To retain the existing set_cpu_partial() heuristic, first calculate the target
>> number of objects as previously, but then convert it to a target number of
>> pages by assuming the pages will be half-filled on average. This assumption
>> might obviously also be inaccurate in practice, but it cannot degrade to the
>> actual number of pages being equal to the target number of objects.
>> 
> 
> I think that's a fair heuristic.
> 
>> We could also skip the intermediate step with the target number of objects and
>> rewrite the heuristic in terms of pages. However, we still have the sysfs file
>> cpu_partial, which uses the number of objects and could break existing users
>> if it suddenly became the number of pages, so this patch doesn't do that.
>> 
>> In practice, after this patch the heuristics limit the size of the percpu
>> partial list to up to 2 pages. In case of a reported regression (which would
>> mean some workload has benefited from the previous imprecise object-based
>> counting), we can tune the heuristics to get a better compromise within the
>> new scheme, while still avoiding unexpectedly long percpu partial lists.
>> 
> 
> Curious if you've tried netperf TCP_RR with this change? This benchmark
> was the most significantly improved benchmark that I recall with the
> introduction of per-cpu partial slabs for SLUB.
> If there are any
> regressions to be introduced by such an approach, I'm willing to bet they
> would be surfaced by that benchmark.

I'll try, thanks for the tip.