Date: Tue, 26 Oct 2021 13:22:02 -0400
From: Kent Overstreet
To: Matthew Wilcox
Cc: linux-mm@kvack.org, Johannes Weiner
Subject: Re: Dynamically allocated memory descriptors

On Mon, Oct 25, 2021 at 08:55:21PM +0100, Matthew Wilcox wrote:
> Kent asked:
> > I ran into a major roadblock when I tried converting buddy allocator
> > freelists to radix trees: freeing a page may require allocating a new
> > page for the radix tree freelist, which is fine normally - we're freeing
> > a page after all - but not if it's highmem. So right now I'm not sure
> > if getting struct page down to two words is even possible. Oh well.
>
> I don't think I can answer this without explaining the whole design
> I have in mind, so here goes ... this is far more complicated than
> I would like it to be, but I think it *works*.

So you've got two separately allocated structs per compound page -
struct buddy, for allocator/freelist state, and struct folio or slab or
whatever, for allocatee state. This lets you get struct page - our 4k
page tax - down to a single pointer.
But the shenanigans required for separately allocating struct buddy make
me want to go back to my proposal :)

The difference between your proposal and mine is that in mine, we don't
separately allocate struct buddy; instead we only shrink struct page
down to two words/pointers, not one.

We can get the state for a free page down to two words if we replace the
doubly linked freelists with a deque implemented as a radix tree: the
second word in struct page will be a pointer to allocatee state for
allocated pages, but for free pages it will be an index into the
freelist.

As you also noted, splitting page->flags up between allocator state and
allocatee state (i.e. moving some of it to the folio) means we'll be
able to fit compound/buddy order in page->flags; that becomes the
allocator state word in my model.

The issue I ran into was where we have to allocate new pages for the
freelist radix tree: normally there's no issue here because we can just
consume the page we're trying to free. But if the page is highmem - oof.

So I've been kicking around the idea of implementing a version of my
lib/generic-radix-tree.c code where we use the low bit of pointers to
nodes to indicate when they're highmem pages that need to be
kmap_local()'d. I think done this way the performance overhead will be
negligible in practice.

So, I'm gonna cook this up and see how it comes out...
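
To make the low-bit idea concrete, here is a rough userspace sketch of the
tagging only. It is not the actual lib/generic-radix-tree.c change:
node_ref, node_ref_make(), node_map() and node_unmap() are made-up names,
and the map/unmap pair just counts calls at the point where a kernel
version would presumably use kmap_local_page()/kunmap_local(). It also
assumes nodes are at least 2-byte aligned so the low bit is free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tag: low bit set means "highmem node, map before use". */
#define NODE_HIGHMEM ((uintptr_t)1)

typedef uintptr_t node_ref;

/* Pack a node address and a highmem flag into a single word. */
static inline node_ref node_ref_make(void *node, bool highmem)
{
	uintptr_t v = (uintptr_t)node;

	assert((v & NODE_HIGHMEM) == 0);	/* low bit must be free */
	return v | (highmem ? NODE_HIGHMEM : 0);
}

static inline bool node_ref_is_highmem(node_ref r)
{
	return (r & NODE_HIGHMEM) != 0;
}

/* Strip the tag bit to recover the address the reference was built from. */
static inline void *node_ref_addr(node_ref r)
{
	return (void *)(r & ~NODE_HIGHMEM);
}

/*
 * Userspace stand-ins for a temporary mapping. Here every node is
 * directly addressable, so we only count outstanding "mappings" to show
 * that the untagged (lowmem) path does no extra work.
 */
static int nr_mapped;

static void *node_map(node_ref r)
{
	if (node_ref_is_highmem(r))
		nr_mapped++;
	return node_ref_addr(r);
}

static void node_unmap(node_ref r)
{
	if (node_ref_is_highmem(r))
		nr_mapped--;
}
```

The point of the sketch is that the lowmem path costs one bit test and a
mask on top of a plain pointer dereference; only tagged nodes pay for the
map/unmap pair.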