From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 5 Oct 2021 14:16:22 +0000
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@gentwo.de>, linux-mm@kvack.org,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org
Subject: Re: Queueing is outside of SLUB nowdays
Message-ID: <20211005141622.GC2760@kvm.asia-northeast3-a.c.our-ratio-313919.internal>
References: <20210927090347.GA2533@linux.asia-northeast3-a.c.our-ratio-313919.internal>
 <8aa15f4b-71de-5283-5ebc-d8d1a323473d@suse.cz>
 <20211001003908.GA2657@linux.asia-northeast3-a.c.our-ratio-313919.internal>
 <alpine.DEB.2.22.394.2110041648220.294708@gentwo.de>
 <09ca489a-ecfb-dd5e-b057-dc9c59c8585e@suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <09ca489a-ecfb-dd5e-b057-dc9c59c8585e@suse.cz>
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>

On Tue, Oct 05, 2021 at 10:19:32AM +0200, Vlastimil Babka wrote:
> On 10/4/21 16:56, Christoph Lameter wrote:
> > On Fri, 1 Oct 2021, Hyeonggon Yoo wrote:
> > 
> >> Looking at other layers, they implemented queuing layer outside of SLUB.
> >> See commit 795bb1c00dd ("net: bulk free infrastructure for NAPI context,
> >> use napi_consume_skb") for example. They made skb cache because SLUB is
> >> not suitable for intensive alloc/free.
> >>
> >> And because the queue is outside of slab, it can go lockless
> >> depending on it's context. (But it's not easy to do so in slab because
> >> slab is general purpose allocator.)
> > 
> > The queuing within in SLUB/SLAB is lockless.
> >

Oh, yes. Both SLAB and SLUB have lockless queueing.

I misused the word 'lockless'. What I meant was lockless and also
without disabling interrupts.

> >> So current approach on place where slab's performance is critical
> >> is implementing queuing layer on top of slab.
> > 
> > If you have to use object specific characteristics to optimize then yes
> > you can optimize further. However, the slab allocators implement each
> > their own form of queuing that is generic.
> >
> >> Then new question arising:
> >>     - Is that proper way to solve fundamental problem?
> > 
> > There is a problem?
> 
> If someone benefits from implementing a caching layer on top of SL*B, it
> probably indicates a problem.
>

Before I say anything, I want to ask why Christoph stopped
implementing SLUB+Q at that time.

And yeah, I think there are some problems.

If objects are managed outside of the slab allocator and most allocs/frees
are done outside of slab, it's a waste of memory.

To take an extreme case (even if it's not a common situation): how does
implementing a queueing layer on top of SLAB make sense on a system with
many NUMA nodes? It wastes a lot of memory.

Also, objects sitting unused in a queue outside slab are still treated as
'allocated', so that memory is unreclaimable.
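
To make the accounting point concrete, here is a minimal userspace sketch
(not kernel code). The names obj_cache, cache_alloc, cache_free and
cache_drain are hypothetical, and malloc()/free() only stand in for the
slab allocator. It assumes a simple array-based LIFO cache:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SLOTS 64

/* Hypothetical array-based LIFO cache layered on top of an allocator. */
struct obj_cache {
	void *slot[CACHE_SLOTS];
	size_t nr;          /* objects parked in the cache, unused */
	size_t obj_size;
	size_t outstanding; /* objects the underlying allocator sees as in use */
};

static void *cache_alloc(struct obj_cache *c)
{
	void *p;

	if (c->nr > 0)
		return c->slot[--c->nr]; /* fast path: allocator is not involved */

	p = malloc(c->obj_size);     /* slow path: stand-in for the slab allocator */
	if (p)
		c->outstanding++;
	return p;
}

static void cache_free(struct obj_cache *c, void *p)
{
	if (!p)
		return;
	assert(c->nr <= CACHE_SLOTS);
	if (c->nr < CACHE_SLOTS) {
		c->slot[c->nr++] = p;    /* parked: still "allocated" to the allocator */
		return;
	}
	free(p);                     /* cache full: really give the object back */
	c->outstanding--;
}

/* Only an explicit drain hands parked objects back to the allocator. */
static void cache_drain(struct obj_cache *c)
{
	while (c->nr > 0) {
		free(c->slot[--c->nr]);
		c->outstanding--;
	}
}
```

In this sketch, after every object has been passed to cache_free(),
'outstanding' stays unchanged until cache_drain() runs: the underlying
allocator cannot tell a parked object from one that is in use.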

I think that if objects are mostly allocated and freed outside of the slab
allocator, it does not need to be on top of the slab allocator.

Also, implementing the same queueing layer in each similar situation is
code duplication and increases maintenance cost.

So what I tried was generalizing the optimizations that are done in some
layers (block and network). That was not an easy task, though.

=============================================================================

And what surprised me recently was:

    I asked Jens (who recently added a bio caching layer on top of slab):
    "I think it would have better performance if you run benchmarks with SLAB?"
    Because there are lots of allocations (millions per second), it's
    likely that the cache-friendly characteristics of SLAB would result
    in better performance.

    But the response was "I would be surprised if SLAB was better, SLAB
    is considered legacy and everybody uses SLUB."

    And the solution (implementing a queuing layer) was too
    SLUB-specific. I say it's too SLUB-specific because SLAB's cache
    utilization features weren't even considered.

    That's why I started this thread in the first place.

> >>       - why not use SLAB if they need queuing?
> > 
> > SLAB is LIFO queuing whereas SLUB uses spatial considerations and queues
> > within a page before going outside.
> 
> IIUC SLUB queueing works well for allocation (we just consume a per-cpu
> freelist that nobody else can touch) but freeing uses the corresponding
> page's freelist so the atomics are more expensive. In both cases the linked
> freelists might be also worse for cache locality than an array of pointers.
> So perhaps some workload still benefit from a array-based cache on top of
> SLUB and it would be great if they didn't have to implement own solutions?
>

I wonder if a page-based policy will work well with queueing.
What should we do if the page is full and we must take a new page to
satisfy a request?

And what if the queue is mixed with objects from different pages?
That might lose some of the spatial locality of SLUB.
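
A rough way to picture the mixing concern, as a userspace sketch only: the
helpers page_of() and page_switches() are hypothetical, and the 4096-byte
page size is an assumption. It counts how often consecutive pops from a
LIFO pointer array land on a different page:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u /* assumption: 4 KiB pages */

/* Page number an object lives in, under the assumed page size. */
static uintptr_t page_of(const void *p)
{
	return (uintptr_t)p / PAGE_SIZE_ASSUMED;
}

/*
 * Count how many consecutive pops from a LIFO queue cross a page
 * boundary. Pop order is queue[nr - 1], queue[nr - 2], ..., queue[0].
 */
static size_t page_switches(void *const *queue, size_t nr)
{
	size_t switches = 0;
	size_t i;

	for (i = nr; i > 1; i--)
		if (page_of(queue[i - 1]) != page_of(queue[i - 2]))
			switches++;
	return switches;
}
```

With two objects from each of two pages, a queue filled page by page gives
one switch, while a queue filled in interleaved order gives three. This
only shows the effect on ordering; it says nothing about the actual cost
on real hardware.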

So if what you need is queueing, I think SLAB might be better
than adding queueing on top of SLUB, because adding queueing on SLUB
might make it lose its own characteristics.

It seems it's really difficult to consider all situations
in a single memory allocator... T.T.

> > Slab requires disabling interrupts,
> > SLUB is optimized to rely on per cpu atomics and there are numerous other
> > differences.
> > 
> >>       - how does this approach work on SLAB?
> > 
> > SLAB has a lockless layer that is only requiring disabling interrupts. It
> > provides a generic queuing layer as well.
> > 
> > See my talk on Slab allocators awhile back.
> > 
> > https://www.youtube.com/watch?v=h0VMLXavx30

Thank you for sharing that! I had read the presentation before
but didn't know that there was a video too!

It's very useful, and I became more familiar with the slab allocators.

> > 
> 

If I have misunderstood something, please tell me.
I'm so excited to talk about this topic.

Thanks,
Hyeonggon.