Date: Thu, 9 Jul 2020 14:32:33 +0000 (UTC)
From: Christopher Lameter
To: Pekka Enberg
cc: Xunlei Pang, Andrew Morton, Wen Yang, Yang Shi, Roman Gushchin,
    "linux-mm@kvack.org", LKML
Subject: Re: [PATCH 1/2] mm/slub: Introduce two counters for the partial objects
References: <1593678728-128358-1-git-send-email-xlpang@linux.alibaba.com>
    <7374a9fd-460b-1a51-1ab4-25170337e5f2@linux.alibaba.com>

On Tue, 7 Jul 2020, Pekka Enberg wrote:

> On Fri, Jul 3, 2020 at 12:38 PM xunlei wrote:
> >
> > On 2020/7/2 PM 7:59, Pekka Enberg wrote:
> > > On Thu, Jul 2, 2020 at 11:32 AM Xunlei Pang wrote:
> > >> count_partial() holds the node list_lock and spends a long time
> > >> iterating when the partial page lists are large, which can cause a
> > >> thundering-herd effect on list_lock contention; e.g. it causes
> > >> business response-time jitter when "/proc/slabinfo" is accessed in
> > >> our production environments.
> > >
> > > Would you have any numbers to share to quantify this jitter? I have no
> >
> > We have HSF RT (High-speed Service Framework Response-Time) monitors;
> > the RT figures fluctuated randomly, so we deployed a tool that detects
> > "irq off" and "preempt off" periods and dumps the culprit's calltrace.
> > It captured the list_lock being held for up to 100ms with irqs off,
> > triggered by "ss", which also caused network timeouts.
>
> Thanks for the follow-up. This sounds like a good enough motivation
> for this patch, but please include it in the changelog.

Well, this is access via sysfs causing a holdoff. A way of getting at the
same information without adding atomics and counters would be best.
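For reference, the path being complained about boils down to roughly the
following (a simplified sketch of count_partial() in mm/slub.c of that era;
exact details vary between kernel versions). A slabinfo read calls this per
cache and per node:

        /*
         * Sketch: walk the node's partial list with n->list_lock held and
         * interrupts disabled for the whole walk. With very long partial
         * lists this walk is where the reported ~100ms irq-off holdoffs
         * come from.
         */
        static unsigned long count_partial(struct kmem_cache_node *n,
                                           int (*get_count)(struct page *))
        {
                unsigned long flags;
                unsigned long x = 0;
                struct page *page;

                spin_lock_irqsave(&n->list_lock, flags);
                list_for_each_entry(page, &n->partial, slab_list)
                        x += get_count(page);
                spin_unlock_irqrestore(&n->list_lock, flags);
                return x;
        }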
> > I also have no idea what the standard SLUB benchmark for regression
> > testing is; any specific suggestion?
>
> I don't know what people use these days. When I did benchmarking in
> the past, hackbench and netperf were known to be slab-allocation
> intensive macro-benchmarks. Christoph also had some SLUB
> micro-benchmarks, but I don't think we ever merged them into the tree.

They are still where they have been for the last decade or so: in my git
tree on kernel.org. They were also reworked a couple of times and posted
to linux-mm. There are historical posts going back over the years where
individuals have modified them and used them to create multiple other
tests.
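To make the earlier point about atomics concrete, the counter approach under
discussion amounts to something like the sketch below. The struct, field and
helper names here are illustrative only, not the patch's actual ones; the
trade-off being debated is extra atomic updates on the partial-list
add/remove (and, for an exact free count, the alloc/free) paths in exchange
for an O(1) read at slabinfo time:

        /* Illustrative only -- not the actual patch. Keep per-node object
         * counters up to date where slabs enter/leave n->partial, so a
         * slabinfo read becomes a counter read instead of a list walk
         * under n->list_lock.
         */
        struct partial_counters {                  /* hypothetical per-node state */
                atomic_long_t partial_free_objs;   /* free objects on partial slabs */
                atomic_long_t partial_total_objs;  /* total objects on partial slabs */
        };

        /* Called where a slab page is added to the node partial list. */
        static inline void partial_counters_add(struct partial_counters *pc,
                                                struct page *page)
        {
                atomic_long_add(page->objects - page->inuse, &pc->partial_free_objs);
                atomic_long_add(page->objects, &pc->partial_total_objs);
        }

        /* Called where a slab page is removed from the node partial list. */
        static inline void partial_counters_del(struct partial_counters *pc,
                                                struct page *page)
        {
                atomic_long_sub(page->objects - page->inuse, &pc->partial_free_objs);
                atomic_long_sub(page->objects, &pc->partial_total_objs);
        }

        /* O(1), lockless, but only approximately consistent. */
        static inline long partial_free_objs(struct partial_counters *pc)
        {
                return atomic_long_read(&pc->partial_free_objs);
        }

Note that the free count also changes whenever objects are allocated from or
freed back to a slab that is already sitting on the partial list, which is
exactly the hot path where adding atomics is the concern raised above.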