Date: Fri, 3 Feb 2023 02:27:51 +0300
From: "Kirill A. Shutemov"
To: Matthew Wilcox
Cc: linux-arch@vger.kernel.org, Yin Fengwei, linux-mm@kvack.org
Subject: Re: API for setting multiple PTEs at once
Message-ID: <20230202232751.q4qfm2qrauwtz5bs@box.shutemov.name>
References: <20230202214858.btrzrcevzxjfk6wg@box.shutemov.name>

On Thu, Feb 02, 2023 at 10:49:38PM +0000, Matthew Wilcox wrote:
> On Fri, Feb 03, 2023 at 12:48:58AM +0300, Kirill A. Shutemov wrote:
> > On Thu, Feb 02, 2023 at 09:14:23PM +0000, Matthew Wilcox wrote:
> > > For those of you not subscribed, linux-mm is currently discussing
> > > how best to handle page faults on large folios. I simply made it work
> > > when adding large folio support. Now Yin Fengwei is working on
> > > making it fast.
> > >
> > > https://lore.kernel.org/linux-mm/Y9qjn0Y+1ir787nc@casper.infradead.org/
> > > is perhaps the best place to start as it pertains to what the
> > > architecture will see.
> > >
> > > At the bottom of that function, I propose
> > >
> > > +	for (i = 0; i < nr; i++) {
> > > +		set_pte_at(vma->vm_mm, addr, vmf->pte + i, entry);
> > > +		/* no need to invalidate: a not-present page won't be cached */
> > > +		update_mmu_cache(vma, addr, vmf->pte + i);
> > > +		addr += PAGE_SIZE;
> > > +		entry = pte_next(entry);
> > > +	}
> > >
> > > (or I would have, had I not forgotten that pte_t isn't an integral type)
> > >
> > > But I think that some architectures want to mark PTEs specially for
> > > "This is part of a contiguous range" -- ARM, perhaps? So would you like
> > > an API like:
> > >
> > > 	arch_set_ptes(mm, addr, vmf->pte, entry, nr);
> >
> > Maybe just set_ptes(). arch_ doesn't contribute much.
>
> Sure.
>
> > > 	update_mmu_cache_range(vma, addr, vmf->pte, nr);
> > >
> > > There are some challenges here. For example, folios may be mapped
> > > askew (ie not naturally aligned). Another problem is that folios may
> > > be unmapped in part (eg mmap(), fault, followed by munmap() of one of
> > > the pages in the folio), and I presume you'd need to go and unmark the
> > > other PTEs in that case. So it's not as simple as just checking whether
> > > 'addr' and 'nr' are in some way compatible.
> > >
> > I think the key question is who is responsible for 'nr' being safe. Like
> > is it caller or set_ptes() need to check that it belong to the same PTE
> > page table, folio, VMA, etc.
> >
> > I think it has to be done by caller and set_pte() has to be as simple as
> > possible.
>
> Caller guarantees that 'nr' is bounded by all of (vma, PMD table, folio).

Also, the caller is responsible for taking all relevant locks.

> We don't currently allocate folios larger than PMD size, but perhaps we
> should prepare for that and as part of this same exercise define
>
> set_pmds(mm, addr, vmf->pmd, entry, nr);
>
> ... where 'nr' is the number of PMDs to set, not number of pages.

Sounds good to me.

--
  Kiryl Shutsemau / Kirill A. Shutemov
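
For illustration only, the contract discussed above (the caller bounds 'nr' and holds the locks, while set_ptes() stays as simple as possible) could be modelled in userspace roughly as below. This is a sketch, not kernel code: the pte_t layout, the PAGE_SHIFT value, the body of pte_next() and the three-argument set_ptes() signature are all assumptions made for the example. The thread only names pte_next() and proposes set_ptes(mm, addr, ptep, entry, nr); the mm and addr arguments are dropped here because the toy model has no MMU to update.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                     /* assumed 4 KiB pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/*
 * Toy stand-in for the kernel's pte_t. It is wrapped in a struct, so it is
 * not an integral type and "entry + PAGE_SIZE" would not compile.
 */
typedef struct { uint64_t val; } pte_t;

/*
 * Hypothetical pte_next(): advance the entry to the next page frame while
 * leaving the low flag bits untouched. A real architecture may encode the
 * frame number differently.
 */
static pte_t pte_next(pte_t pte)
{
	pte.val += PAGE_SIZE;
	return pte;
}

/*
 * Simplified model of the proposed set_ptes(). It performs no checks:
 * the caller is assumed to guarantee that 'nr' stays within one VMA,
 * one PMD's page table and one folio, and to hold the relevant locks.
 */
static void set_ptes(pte_t *ptep, pte_t entry, unsigned int nr)
{
	for (unsigned int i = 0; i < nr; i++) {
		ptep[i] = entry;
		entry = pte_next(entry);
	}
}
```

A caller would fill a run of consecutive slots from a single starting entry, for example set_ptes(table, first, 4) for a four-page folio. An architecture that wants to mark contiguous ranges specially would replace the loop body rather than the calling convention.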