From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 10 Jul 2024 13:26:54 +0200
From: Oscar Salvador
To: David Hildenbrand
Cc: Peter Xu, Andrew Morton, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Muchun Song, SeongJae Park, Miaohe Lin,
 Michal Hocko, Matthew Wilcox, Christophe Leroy, Jason Gunthorpe
Subject: Re: [PATCH 00/45] hugetlb pagewalk unification
References: <20240704043132.28501-1-osalvador@suse.de>
 <617169bc-e18c-40fa-be3a-99c118a6d7fe@redhat.com>
 <84d4e799-90da-487e-adba-6174096283b5@redhat.com>
 <9d5980e3-72e6-4848-b1ac-83ffab8522c4@redhat.com>
In-Reply-To: <9d5980e3-72e6-4848-b1ac-83ffab8522c4@redhat.com>

On Wed, Jul 10, 2024 at 05:52:43AM +0200, David Hildenbrand wrote:
> I understand that. And it would all be easier+more straightforward if we
> wouldn't have that hugetlb CONT-PTE / CONT-PMD stuff in there that works
> similar, but different to "ordinary" cont-pte for thp.
>
> I'm sure you stumbled over the set_huge_pte_at() on arm64 for example. If
> we, at one point *don't* use these hugetlb functions right now to modify
> hugetlb entries, we might be in trouble.
>
> That's why I think we should maybe invest our time and effort in having a
> new pagewalker that will just batch such things naturally, and users that
> can operate on that naturally. For example: a hugetlb cont-pte-mapped folio
> will just naturally be reported as a "fully mapped folio", just like a THP
> would be if mapped in a compatible way.
>
> Yes, this requires more work, but as raised in some patches here, working on
> individual PTEs/PMDs for hugetlb is problematic.
>
> You have to batch every operation, to essentially teach ordinary code to do
> what the hugetlb_* special code would have done on cont-pte/cont-pmd things.
>
> (as a side note, cont-pte/cont-pmd should primarily be a hint from arch code
> on how many entries we can batch, like we do in folio_pte_batch(); point is
> that we want to batch also on architectures where we don't have such bits,
> and prepare for architectures that implement various sizes of batching;
> IMHO, having cont-pte/cont-pmd checks in common code is likely the wrong
> approach. Again, folio_pte_batch() is where we tackled the problem
> differently from the THP perspective)

I must say I have not looked at folio_pte_batch() and do not know what it
does or how. I will have a look.

> I have an idea for a better page table walker API that would try batching
> most entries (under one PTL), and walkers can just register for the types
> they want. Hoping I will find some time to at least sketch the user
> interface soon.
>
> That doesn't mean that this should block your work, but the
> cont-pte/cont-pmd hugetlb stuff is really nasty to handle here, and I don't
> particularly like where this is going.

Ok, let me take a step back then.
Previous versions of that RFC did not handle cont-{pte,pmd} out in the open,
so let me go back to the drawing board and come up with something that does
not fiddle with cont- stuff in that way.

I might post a small diff here just to see if we are on the same page.

As usual, thanks a lot for your comments David!

-- 
Oscar Salvador
SUSE Labs