Message-ID: <4cccc03f-1a9b-a45f-082f-77a4b37f6761@redhat.com>
Date: Thu, 14 Oct 2021 09:22:13 +0200
To: Matthew Wilcox, Johannes Weiner
Cc: Kent Overstreet, linux-mm@kvack.org
References: <20211004134650.4031813-1-willy@infradead.org>
 <20211004134650.4031813-4-willy@infradead.org>
 <02a055cd-19d6-6e1d-59bb-e9e5f9f1da5b@redhat.com>
 <425cd66f-2040-4278-6149-69a329a82f79@redhat.com>
 <842357c1-bec2-654e-c782-569b1fd627b2@redhat.com>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [PATCH 03/62] mm: Split slab into its own type

On 13.10.21 20:31, Matthew Wilcox wrote:
> On Wed, Oct 13, 2021 at 02:08:57PM -0400, Johannes Weiner wrote:
>> Btw, I think slab_nid() is an interesting thing when it comes to page
>> polymorphy. We want to know the nid for all sorts of memory types:
>> slab, file, anon, buddy etc.
>> In the goal of distilling page down to
>> the fewest number of bytes, this is probably something that should
>> remain in the page rather than be replicated in all subtypes.
>
> Oh, this is a really interesting point.
>
> Node ID is typically 10 bits (I checked Debian & Oracle configs for
> various architectures). That's far more than we can store in the bottom
> bits of a single word, and it's even a good chunk of a second word.
>
> I was assuming that, for the page allocator's memory descriptor and for
> that of many allocators (such as slab), it would be stored *somewhere*
> in the memory descriptor. It wouldn't necessarily have to be the same
> place for all memory descriptors, and maybe (if it's accessed rarely),
> we delegate finding it to the page allocator's knowledge.
>
> But not all memory descriptors want/need/can know this. For example,
> vmalloc() might well spread its memory across multiple nodes. As long
> as we can restore the node assignment again once the pages are vfree(),
> there's no particular need for the vmalloc memory descriptor to know
> what node an individual page came from (and the concept of asking
> vmalloc what node a particular allocation came from is potentially
> nonsense, unless somebody used vmalloc_node() or one of the variants).
>
> Not sure there's an obviously right answer here. I was assuming that at
> first we'd enforce memdesc->flags being the first word of every memory
> descriptor and so we could keep passing page->flags around. That could
> then change later, but it'd be a good first step?
>

It's really hard to make an educated guess here without having a full
design proposal of what we actually want to achieve and especially how
we're going to treat all the corner cases (as raised already in a
different context).

I'm all for simplifying struct page and *eventually* being able to
shrink it, even if we end up only shrinking by a little.
However, I'm not sold on doing that at any cost (e.g., I cannot agree
to any fundamental page allocator rewrite without an idea of what it
does to performance and complexity). There may always be a space vs.
performance trade-off, and saving space by sacrificing performance isn't
necessarily a good idea. But again, it's really hard to make an
educated guess.

Again, I'm all for cleanups and simplifications as long as they really
make things cleaner. So I'm going to comment on the current state and
on how the cleanups make sense with the current state.

Node/zone is a property of a base page and belongs in struct page OR
has to be very easily accessible without any kind of heavy locking. The
node/zone is determined once memory gets exposed to the system (e.g., to
the buddy during boot or during memory onlining) and is stable until
memory is offlined again (at least as of right now; one could imagine
changing zones at runtime).

For example, node/zone information is required for (almost) lockless PFN
walkers in memory offlining context, to figure out if all pages we're
dealing with belong to one node/zone, but also to properly shrink
zones+nodes to eventually be able to offline complete nodes. I recall
that there are other PFN walkers (page compaction) that need this
information to be easily accessible.

-- 
Thanks,

David / dhildenb
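
To make the node/zone argument above concrete, here is a minimal
userspace sketch, not kernel code. It assumes the node and zone IDs are
packed into a per-page flags word, and shows a walker that checks
whether a PFN range belongs to a single node/zone without taking any
lock. The 10-bit node width comes from the quoted text; the 2-bit zone
width, the placement at the top of a 64-bit word, and every toy_* name
are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: node ID in the top 10 bits, zone ID in the next 2. */
#define NODE_BITS  10u
#define ZONE_BITS  2u
#define NODE_SHIFT (64u - NODE_BITS)
#define ZONE_SHIFT (NODE_SHIFT - ZONE_BITS)
#define NODE_MASK  ((UINT64_C(1) << NODE_BITS) - 1)
#define ZONE_MASK  ((UINT64_C(1) << ZONE_BITS) - 1)

/* Hypothetical stand-in for a per-page descriptor. */
struct toy_page {
	uint64_t flags;
};

/* Set once when the page is exposed to the system (boot or onlining). */
static void toy_set_node_zone(struct toy_page *p, unsigned nid, unsigned zid)
{
	assert(nid <= NODE_MASK && zid <= ZONE_MASK);
	p->flags &= ~((NODE_MASK << NODE_SHIFT) | (ZONE_MASK << ZONE_SHIFT));
	p->flags |= ((uint64_t)nid << NODE_SHIFT) | ((uint64_t)zid << ZONE_SHIFT);
}

static unsigned toy_page_nid(const struct toy_page *p)
{
	return (unsigned)((p->flags >> NODE_SHIFT) & NODE_MASK);
}

static unsigned toy_page_zid(const struct toy_page *p)
{
	return (unsigned)((p->flags >> ZONE_SHIFT) & ZONE_MASK);
}

/*
 * Walk the PFN range [start, end) and report whether every page sits in
 * the same node and zone. Only the flags word of each page is read.
 */
static bool toy_range_one_zone(const struct toy_page *map,
			       size_t start, size_t end)
{
	if (start >= end)
		return false;

	unsigned nid = toy_page_nid(&map[start]);
	unsigned zid = toy_page_zid(&map[start]);

	for (size_t pfn = start + 1; pfn < end; pfn++) {
		if (toy_page_nid(&map[pfn]) != nid ||
		    toy_page_zid(&map[pfn]) != zid)
			return false;
	}
	return true;
}
```

A caller would fill an array of struct toy_page with
toy_set_node_zone() and then call toy_range_one_zone(map, start, end)
before treating the range as one unit. If the node ID lived in
per-type descriptors instead, each step of such a walk would first have
to work out the type of the page, which is the cost being weighed in
this thread.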