From: Oscar Salvador
Date: Fri, 7 Jun 2024 07:07:27 +0200
To: Peter Xu
Cc: Matthew Wilcox, Khalid Aziz, Vishal Moola, Jane Chu, Muchun Song, linux-mm@kvack.org
Subject: Re: Unifying page table walkers

On Thu, Jun 06, 2024 at 05:49:30PM -0400, Peter Xu wrote:
> On Thu, Jun 06, 2024 at 07:29:22PM +0100, Matthew Wilcox wrote:
> > The reason we have a separate hugetlb_entry from pmd_entry and pud_entry
> > is that it has a different locking context.
> > It is called with the hugetlb_vma_lock held for read (nb: this is not
> > the same as the vma lock; see walk_hugetlb_range()). Why do we need
> > this? Because of page table sharing.
>
> Just to quickly comment on this one: I think it's more than the per-vma
> lock. Oscar is actually working together with me (we had plenty of
> discussions but so far all offlist...), and the lock context is as simple
> as this after refactor for hugetlb_entry() path:
>
> https://github.com/leberus/linux/commit/88e56c1ecaf8c64ba9165aeba74335bdc15d1b56

Yes, I reached out to Peter after LSFMM because I was highly interested
in helping out here.
We agreed that I would take the pagewalk part, and I already have some
patches in the works [1][2], based on a patchset I have been reviewing
that removes hugepd on powerpc [3].

Ideally, we should remove the exclusive use of 'pte' from hugetlb (unless
it is CONTPTE) and have it use pud/pmd where needed.

E.g., if we look at the huge_ptep_get() version from s390, which I would
say is the most special one:

 huge_ptep_get()->__rste_to_pte

or the other way around (__pte_to_rste), what it does is convert a
pud/pmd entry into a pte or vice versa, since hugetlb "can" only work
with ptes, and so you end up with all these casts back and forth spread
all over.

I started by merging all the .hugetlb_entry functions into the
.pmd_entry ones (not done yet, I am half-way through) and creating
.pud_entry callbacks, because we will need them: hugetlb can be
PUD-mapped, unlike THP (well, yes, there is devmap, but most walkers do
not care about it, so they did not create a .pud_entry).

Then I will run some tests on x86_64/arm64/ppc64le and s390 (not sure
whether I will be able to grab one, but let us see), and then I will post
a patchset as an RFC to gather some feedback.

[1] https://github.com/leberus/linux/tree/hugetlb-pagewalk-v2
[2] Do not stare too closely, as they are very much WIP, and ignore the
    last 4 commits, as they are half-done.
[3] https://patchwork.kernel.org/project/linux-mm/cover/cover.1716815901.git.christophe.leroy@csgroup.eu/

-- 
Oscar Salvador
SUSE Labs