Date: Thu, 21 Oct 2021 11:41:12 +0000
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Vlastimil Babka
Cc: linux-kernel@vger.kernel.org, Christoph Lameter, Pekka Enberg,
    David Rientjes, Joonsoo Kim, Andrew Morton, linux-mm@kvack.org,
    Matthew Wilcox, Dave Taht
Subject: Re: [RFC PATCH] mm, slob: Rewrite SLOB using segregated free list
Message-ID: <20211021114112.GA4004@kvm.asia-northeast3-a.c.our-ratio-313919.internal>
References: <20211020135535.517236-1-42.hyeyoo@gmail.com>

Hmm.. I think I need to clarify my intention. I'm not saying this
should be merged, or that we should put effort into turning SLOB into
a lightweight SLUB. I rewrote it just for fun: I wanted to know how
small a segregated free list allocator can be.

While rewriting it, I wondered who the users of SLOB are and where
SLOB should be used. I think SLOB was useful when there was only SLAB
and no SLUB, but I wonder where SLOB should be used now.

When I compared SLOB with SLUB without cpu partials, the difference in
Slab memory was about 300kB. So is SLOB used where a 300kB difference
is that important?
But I think we need at least 16MB of RAM to run Linux.

So I'm not saying we need to turn SLOB into a lightweight SLUB; I
wanted to talk about these questions:

> > But after rewriting, I thought I need to discuss what SLOB is for.
> > According to Matthew, SLOB is for small machines whose
> > memory is 1~16 MB.
> >
> > I wonder adding 48kB on SLOB memory for speed/lower latency
> > is worth or harmful.
> >
> > So.. questions in my head now:
> > - Who is users of SLOB?
> > - Is it harmful to add some kilobytes of memory into SLOB?
> > - Is it really possible to run linux under 10MB of RAM?
> >   (I failed with tinyconfig.)
> > - What is the boundary to make decision between SLOB and SLUB?

On Thu, Oct 21, 2021 at 10:46:46AM +0200, Vlastimil Babka wrote:
> On 10/20/21 15:55, Hyeonggon Yoo wrote:
> > Hello linux-mm, I rewrote SLOB using segregated free list,
> > to understand SLOB and SLUB more. It uses more kilobytes
> > of memory (48kB on 32bit tinyconfig) and became 9~10x faster.
> >
> > But after rewriting, I thought I need to discuss what SLOB is for.
> > According to Matthew, SLOB is for small machines whose
> > memory is 1~16 MB.
> >
> > I wonder adding 48kB on SLOB memory for speed/lower latency
> > is worth or harmful.
> >
> > So.. questions in my head now:
> > - Who is users of SLOB?
> > - Is it harmful to add some kilobytes of memory into SLOB?
> > - Is it really possible to run linux under 10MB of RAM?
> >   (I failed with tinyconfig.)
> > - What is the boundary to make decision between SLOB and SLUB?
> >
> > Anyway, below is my work.
> > Any comments/opinions will be appreciated!
> >
> > SLOB uses sequential fit method. the advantages of this method
> > is the fact that it is simple and does not have complex metadata.
> >
> > But big downside of sequential fit method is its high latency
> > in allocation/deallocation and fast fragmentation.
> >
> > High latency comes from iterating pages and also iterating objects
> > in the page to find suitable free object. And fragmentation easily
> > happens because objects of difference size is allocated in same page.
> >
> > This patch tries to minimize both its latency and fragmentation by
> > re-implmenting SLOB using segregated free list method and adding
> > support for slab merging. it looks like lightweight SLUB but more
> > compact than SLUB.
>
> My immediate reaction is that we probably don't want to turn SLOB into
> lightweight SLUB. SLOB choses the tradeoff of low memory usage over speed
> and shifting it towards more speed kinda defeats this purpose. Also it's a
> major rewrite, so without a very clear motivation there will be resistance
> to that.
>

Yes, I agree that SLOB is for memory efficiency, not performance.
That's why I said:

> > I wonder adding 48kB on SLOB memory for speed/lower latency
> > is worth or harmful.

But on the other hand, I wonder when SLOB is more useful than SLUB.
Is it for really tiny Linux systems that have under 1MB of RAM?
But can Linux be that small?

> SLUB itself could be probably tuned to less memory overhead if needed. Most
> of the debug options effectively disable percpu slabs, we could add a mode
> that disables them without the rest of the debugging overhead. Allocation
> order can be lowered (although some object sizes might benefit from less
> fragmentation with a higher order).

Yes, that's what I was curious about.
As SLUB is not that big, I wonder where SLOB is useful.

> > One notable difference is after this patch SLOB uses kmalloc_caches
> > like SL[AU]B.
> >
> > Below is performance impacts of this patch.
> >
> > Memory usage was measured on 32 bit + tinyconfig + slab merging.
> >
> > Before:
> >     MemTotal:          29668 kB
> >     MemFree:           19364 kB
> >     MemAvailable:      18396 kB
> >     Slab:                668 kB
> >
> > After:
> >     MemTotal:          29668 kB
> >     MemFree:           19420 kB
> >     MemAvailable:      18452 kB
> >     Slab:                716 kB
> >
> > This patch adds about 48 kB after boot.
> >
> > hackbench was measured on 64 bit typical buildroot configuration.
> > After this patch it's 9~10x faster than before.
> >
> > Before:
> >   memory usage:
> >     after boot:
> >       Slab: 7908 kB
> >     after hackbench:
> >       Slab: 8544 kB
> >
> >   Time: 189.947
> >   Performance counter stats for 'hackbench -g 4 -l 10000':
> >        379413.20 msec cpu-clock         #    1.997 CPUs utilized
> >          8818226      context-switches  #   23.242 K/sec
> >           375186      cpu-migrations    #  988.859 /sec
> >             3954      page-faults       #   10.421 /sec
> >     269923095290      cycles            #    0.711 GHz
> >     212341582012      instructions      #    0.79  insn per cycle
> >       2361087153      branch-misses
> >      58222839688      cache-references  #  153.455 M/sec
> >       6786521959      cache-misses      #   11.656 % of all cache refs
> >
> >     190.002062273 seconds time elapsed
> >
> >       3.486150000 seconds user
> >     375.599495000 seconds sys
> >
> > After:
> >   memory usage:
> >     after boot:
> >       Slab: 7560 kB
> >     after hackbench:
> >       Slab: 7836 kB
>
> Interesting that the memory usage in this test is actually lower with your
> patch.

I didn't mention that because, if we have enough memory, I think there
is no reason to use SLOB (why not use SLUB?). I thought memory usage
on small machines was what mattered.

>
> >   hackbench:
> >   Time: 20.780
> >   Performance counter stats for 'hackbench -g 4 -l 10000':
> >         41509.79 msec cpu-clock         #    1.996 CPUs utilized
> >           630032      context-switches  #   15.178 K/sec
> >             8287      cpu-migrations    #  199.640 /sec
> >             4036      page-faults       #   97.230 /sec
> >      57477161020      cycles            #    1.385 GHz
> >      62775453932      instructions      #    1.09  insn per cycle
> >        164902523      branch-misses
> >      22559952993      cache-references  #  543.485 M/sec
> >        832404011      cache-misses      #    3.690 % of all cache refs
> >
> >      20.791893590 seconds time elapsed
> >
> >       1.423282000 seconds user
> >      40.072449000 seconds sys
>
> That's significant, but also hackbench is kind of worst case test, so in
> practice the benefit won't be that prominent.
>
> > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > ---
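
As a quick cross-check of the figures quoted above, the short Python
sketch below recomputes the Slab deltas and the hackbench speedup. The
numbers are copied from the quoted measurements; the variable and
function names are made up for illustration and do not come from the
patch.

```python
# Numbers copied from the quoted measurements; the names are illustrative only.
slab_32bit_kb = {"before": 668, "after": 716}            # 32-bit tinyconfig, after boot
slab_64bit_boot_kb = {"before": 7908, "after": 7560}     # 64-bit buildroot, after boot
slab_64bit_bench_kb = {"before": 8544, "after": 7836}    # 64-bit buildroot, after hackbench
hackbench_time_s = {"before": 189.947, "after": 20.780}  # the "Time:" lines


def delta(pair):
    """Return after minus before for one measurement pair."""
    return pair["after"] - pair["before"]


def ratio(pair):
    """Return before divided by after, i.e. how many times smaller 'after' is."""
    return pair["before"] / pair["after"]


slab_cost_kb = delta(slab_32bit_kb)
speedup = ratio(hackbench_time_s)

print(f"32-bit Slab change after boot:      {slab_cost_kb:+d} kB")
print(f"64-bit Slab change after boot:      {delta(slab_64bit_boot_kb):+d} kB")
print(f"64-bit Slab change after hackbench: {delta(slab_64bit_bench_kb):+d} kB")
print(f"hackbench speedup:                  {speedup:.2f}x")
```

With these inputs the 32-bit Slab change is +48 kB, the 64-bit Slab
changes are -348 kB after boot and -708 kB after hackbench, and the
hackbench speedup is about 9.1x, which matches the "about 48 kB" and
"9~10x" statements in the quoted description.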