Date: Sat, 30 Apr 2022 13:44:16 +0000
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Mike Rapoport
Cc: linux-mm@kvack.org, Andrew Morton, Andy Lutomirski, Dave Hansen,
	Ira Weiny, Kees Cook, Mike Rapoport, Peter Zijlstra, Rick Edgecombe,
	Vlastimil Babka, linux-kernel@vger.kernel.org, x86@kernel.org
Subject: Re: [RFC PATCH 0/3] Prototype for direct map awareness in page allocator
Message-ID: <20220430134415.GA25819@ip-172-31-27-201.ap-northeast-1.compute.internal>
References: <20220127085608.306306-1-rppt@kernel.org>

On Tue, Apr 26, 2022 at 06:21:57PM +0300, Mike Rapoport wrote:
> Hello Hyeonggon,
>
> On Tue, Apr 26, 2022 at 05:54:49PM +0900, Hyeonggon Yoo wrote:
> > On Thu, Jan 27, 2022 at 10:56:05AM +0200, Mike Rapoport wrote:
> > > From: Mike Rapoport
> > >
> > > Hi,
> > >
> > > This is a second attempt to make the page allocator aware of the
> > > direct map layout and allow grouping of the pages that must be
> > > mapped at PTE level in the direct map.
> > >
> >
> > Hello Mike, it may be a silly question...
> >
> > Looking at the implementation of set_memory*(), they only split
> > PMD/PUD-sized entries. But why not _merge_ them when all entries
> > have the same permissions after changing the permission of an entry?
> >
> > I think grouping __GFP_UNMAPPED allocations would help reduce direct
> > map fragmentation, but IMHO merging split entries is better done in
> > those helpers than in the page allocator.
>
> Maybe. I didn't get as far as trying to merge split entries in the direct
> map. IIRC, Kirill sent a patch for collapsing huge pages in the direct map
> some time ago, but there still was something that had to initiate the
> collapse.

But in this case the buddy allocator's view of the direct map is quite
limited. It cannot merge 2M entries into a 1G entry, because it does not
support allocations that big. It also cannot merge entries for pages freed
during boot, because those were never allocated from the page allocator.
And it will become harder still when pages in MIGRATE_UNMAPPED are
borrowed from another migrate type...

So it would be nice if we could efficiently merge mappings in
change_page_attr_set(); that approach can handle all of the cases above.
I think grouping allocations and merging mappings should be done
separately.
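Roughly, the merge check I have in mind would look something like the
sketch below (completely untested; can_merge_into_pmd() is a made-up
name, and pte_present()/pte_pfn()/pte_pgprot()/IS_ALIGNED() are the
usual x86/kernel helpers):

/*
 * Sketch only: decide whether the 512 PTEs covering one split 2M
 * region could be collapsed back into a single PMD entry.
 */
static bool can_merge_into_pmd(pte_t *ptes)
{
	unsigned long pfn = pte_pfn(ptes[0]);
	pgprot_t prot = pte_pgprot(ptes[0]);
	int i;

	if (!pte_present(ptes[0]))
		return false;

	/* a 2M entry needs a 2M-aligned physical start */
	if (!IS_ALIGNED(pfn, PTRS_PER_PTE))
		return false;

	for (i = 1; i < PTRS_PER_PTE; i++) {
		/* every entry must map the next physical page... */
		if (!pte_present(ptes[i]) || pte_pfn(ptes[i]) != pfn + i)
			return false;
		/* ...with identical permissions */
		if (pgprot_val(pte_pgprot(ptes[i])) != pgprot_val(prot))
			return false;
	}

	return true;
}

set_memory_rw() and friends could run this over the region they just
touched and collapse the page table one level whenever it returns true.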
> > For example:
> > 1) set_memory_ro() splits 1 RW PMD entry into 511 RW PTE
> > entries and 1 RO PTE entry.
> >
> > 2) before freeing the pages, we call set_memory_rw() and we have
> > 512 RW PTE entries. Then we can merge them into 1 RW PMD entry.
>
> For this we need to check permissions of all 512 pages to make sure we
> can use a PMD entry to map them.

Of course that may be slow. Maybe one way to optimize it is to use some
bits in struct page, something like this: each bit of
page->direct_map_split (an unsigned long) is set when at least one entry
in the corresponding group of PTRS_PER_PTE / BITS_PER_LONG = 512 / 64 = 8
entries has special permissions.

Then we only need to set the corresponding bit when splitting a mapping,
and iterate over 8 entries when changing permissions back, clearing the
bit once all 8 entries have the default permissions again. We can decide
whether to merge simply by checking that page->direct_map_split is zero,
and when we do scan, 8 entries fit into one cacheline. (A rough sketch of
this bookkeeping is at the end of this mail.)

Any other ideas?

> Not sure that doing the scan in each set_memory call won't cause an
> overall slowdown.

I think we can evaluate it by measuring boot time and BPF/module
load/unload time. Is there any other workload whose performance directly
depends on set_memory*()?

> > 3) after 2) we can do the same thing for PMD-sized entries and merge
> > them into 1 PUD entry if 512 PMD entries have the same permissions.

[...]

> > > Mike Rapoport (3):
> > >   mm/page_alloc: introduce __GFP_UNMAPPED and MIGRATE_UNMAPPED
> > >   mm/secretmem: use __GFP_UNMAPPED to allocate pages
> > >   EXPERIMENTAL: x86/module: use __GFP_UNMAPPED in module_alloc
> >
> > --
> > Thanks,
> > Hyeonggon
>
> --
> Sincerely yours,
> Mike.
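P.S. here is a rough, untested sketch of the page->direct_map_split
bookkeeping I described above. The field itself is hypothetical (nothing
like it exists in struct page today); __set_bit()/__clear_bit() and
pte_pgprot() are the usual kernel/x86 helpers:

#define PTES_PER_BIT	(PTRS_PER_PTE / BITS_PER_LONG)	/* 512 / 64 = 8 */

/*
 * Called after the permission of PTE 'idx' within a split 2M region
 * changed; 'ptes' is the PTE page of that region, 'page' its head page.
 */
static void update_direct_map_split(struct page *page, pte_t *ptes,
				    unsigned int idx)
{
	unsigned int bit = idx / PTES_PER_BIT;
	unsigned int i, base = bit * PTES_PER_BIT;

	/* scan one cacheline worth of PTEs for special permissions */
	for (i = base; i < base + PTES_PER_BIT; i++) {
		/* simplified; real code would mask off SW/A/D bits */
		if (pgprot_val(pte_pgprot(ptes[i])) !=
		    pgprot_val(PAGE_KERNEL)) {
			__set_bit(bit, &page->direct_map_split);
			return;
		}
	}
	__clear_bit(bit, &page->direct_map_split);
}

/* merging back into a PMD is worth attempting only when this is true */
static bool direct_map_may_merge(struct page *page)
{
	return page->direct_map_split == 0;
}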