From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 15 Dec 2021 21:55:20 +0000
From: Matthew Wilcox <willy@infradead.org>
To: "Kirill A. Shutemov"
Cc: linux-mm@kvack.org, Hugh Dickins, David Hildenbrand, Mike Kravetz
Subject: folio mapcount

I've been trying to understand whether we can simplify the mapcount
handling for folios from the current situation with THPs.  Let me quote
the commit message from 53f9263baba6:

> mm: rework mapcount accounting to enable 4k mapping of THPs
>
> We're going to allow mapping of individual 4k pages of a THP compound
> page.  That means we need to track the mapcount on a per-small-page
> basis.
>
> The straightforward approach is to use ->_mapcount in all subpages to
> track how many times each subpage is mapped, with PMDs and PTEs
> combined.  But this is rather expensive: mapping or unmapping a THP
> with a PMD would require HPAGE_PMD_NR atomic operations instead of
> the single one we have now.
>
> The idea is to store separately how many times the page was mapped as
> a whole -- compound_mapcount.  This frees up ->_mapcount in subpages
> to track the PTE mapcount.
>
> We use the same approach as with the compound page destructor and
> compound order to store compound_mapcount: use space in the first
> tail page, ->mapping this time.
> Any time we map/unmap a whole compound page (THP or hugetlb), we
> increment/decrement compound_mapcount.  When we map part of a
> compound page with a PTE, we operate on ->_mapcount of the subpage.
>
> page_mapcount() counts both PTE and PMD mappings of the page.
>
> Basically, we have the mapcount for a subpage spread over two
> counters.  That makes it tricky to detect when the last mapcount for
> a page goes away.
>
> We introduced PageDoubleMap() for this.  When we split a THP PMD for
> the first time and there's another PMD mapping left, we offset
> ->_mapcount in all subpages up by one and set PG_double_map on the
> compound page.  These additional references go away with the last
> compound_mapcount.
>
> This approach provides a way to detect when the last mapcount goes
> away on a per-small-page basis without introducing new overhead for
> the most common cases.

What breaks if we simply track any mapping (whether by PMD or PTE) as
an increment to the head page's (aka folio's) mapcount?  Essentially,
we make the head mapcount 'the number of VMAs which contain a
reference to any page in this folio'.  We can remove PageDoubleMap.
The tail mapcounts will all be 0.  If it's useful, we could introduce
a 'partial_mapcount' which would be <= mapcount (but I don't know
whether it's useful).

Splitting a PMD would not change ->_mapcount.  Splitting the folio
already causes the folio to be unmapped, so page faults will naturally
re-increment ->_mapcount of each subpage.

We might need some additional logic to treat a large folio (aka
compound page) as a single unit; that is, when we fault on one page,
we place entries for all pages in this folio (that fit ...) into the
page tables, so that we only account it once, even if it's not
compatible with using a PMD.