From: Yang Shi
Date: Mon, 6 Feb 2023 14:55:09 -0800
Subject: Re: Folio mapcount
To: Matthew Wilcox
Cc: linux-mm@kvack.org, Vishal Moola, Hugh Dickins, Rik van Riel, David Hildenbrand, "Yin, Fengwei"

On Mon, Feb 6, 2023 at 12:34 PM Matthew Wilcox wrote:
>
> On Tue, Jan 24, 2023 at 06:13:21PM +0000, Matthew Wilcox wrote:
> > Once we get to the part of the folio journey where we have
> > one-pointer-per-page, we can't afford to maintain per-page state.
> > Currently we maintain a per-page mapcount, and that will have to go.
> > We can maintain extra state for a multi-page folio, but it has to be a
> > constant amount of extra state no matter how many pages are in the folio.
> >
> > My proposal is that we maintain a single mapcount per folio, and its
> > definition is the number of (vma, page table) tuples which have a
> > reference to any pages in this folio.
>
> I've been thinking about this a lot more, and I have changed my
> mind. It works fine to answer the question "Is any page in this
> folio mapped", but it's now hard to answer the question "I have it
> mapped, does anybody else?" That question is asked, for example,
> in madvise_cold_or_pageout_pte_range().
>
> With this definition, if the mapcount is 1, it's definitely only mapped
> by us. If it's more than 2, it's definitely mapped by somebody else (*).
> If it's 2, maybe we have the folio mapped twice, and maybe we have it
> mapped once and somebody else has it mapped once, so we have to consult
> the rmap to find out. Not fun times.
>
> (*) If we support folios larger than PMD size, then the answer is more
> complex.
>
> I now think the mapcount has to be defined as "How many VMAs have
> one-or-more pages of this folio mapped".

IIRC it may still be possible that the folio's mapcount is two while it
is only mapped by us (for example, two VMAs from the same process).

>
> That means that our future folio_add_file_rmap_range() looks a bit
> like this:
>
> {
>         bool add_mapcount = true;
>
>         if (nr < folio_nr_pages(folio))
>                 add_mapcount = !folio_has_ptes(folio, vma);
>         if (add_mapcount)
>                 atomic_inc(&folio->_mapcount);
>
>         __lruvec_stat_mod_folio(folio, NR_FILE_MAPPED, nr);
>         if (nr == HPAGE_PMD_NR)
>                 __lruvec_stat_mod_folio(folio, folio_test_swapbacked(folio) ?
>                                 NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED, nr);
>
>         mlock_vma_folio(folio, vma, nr == HPAGE_PMD_NR);
> }
>
> bool folio_mapped_in_vma(struct folio *folio, struct vm_area_struct *vma)
> {
>         unsigned long address = vma_address(&folio->page, vma);
>         DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);
>
>         if (!page_vma_mapped_walk(&pvmw))
>                 return false;
>         page_vma_mapped_walk_done(&pvmw);
>         return true;
> }
>
> ... some details to be fixed here; particularly this will currently
> deadlock on the PTL, so we'd need not only to exclude the current
> PMD from being examined, but also avoid a deadly embrace between
> two threads (do we currently have a locking order defined for
> page table locks at the same height of the tree?)
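
To make the two-VMAs-in-one-process case above concrete, here is a toy
userspace model. It is only a sketch, not kernel code: struct toy_vma,
toy_mapcount() and toy_mapped_by_other_mm() are hypothetical names invented
for illustration, and the model assumes the "number of VMAs mapping one or
more pages" definition quoted above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VMA; this is NOT the kernel's vm_area_struct. */
struct toy_vma {
	int mm_id;       /* which address space (process) owns this VMA */
	bool maps_folio; /* does this VMA map at least one page of the folio? */
};

/* Mapcount under the assumed definition: count of VMAs mapping >= 1 page. */
static int toy_mapcount(const struct toy_vma *vmas, size_t n)
{
	int count = 0;

	for (size_t i = 0; i < n; i++)
		if (vmas[i].maps_folio)
			count++;
	return count;
}

/* True if some VMA belonging to an address space other than mm_id maps the folio. */
static bool toy_mapped_by_other_mm(const struct toy_vma *vmas, size_t n, int mm_id)
{
	for (size_t i = 0; i < n; i++)
		if (vmas[i].maps_folio && vmas[i].mm_id != mm_id)
			return true;
	return false;
}
```

With one address space mapping the folio through two VMAs, and with two
address spaces mapping it through one VMA each, toy_mapcount() returns 2 in
both cases, but only the second is shared with somebody else. So in this
model the count alone cannot answer "I have it mapped, does anybody else?".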