Subject: Re: [External] Re: [PATCH v2 6/9] mm: free user PTE page table pages
To:
    David Hildenbrand, akpm@linux-foundation.org, tglx@linutronix.de,
    hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com,
    kirill.shutemov@linux.intel.com, mika.penttila@nextfour.com
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, songmuchun@bytedance.com
References: <20210819031858.98043-1-zhengqi.arch@bytedance.com>
    <20210819031858.98043-7-zhengqi.arch@bytedance.com>
    <5aa3020c-fcf2-87bd-31fe-e2b5c2aafcf2@redhat.com>
From: Qi Zheng
Message-ID: <74bfdf9c-f054-906c-f533-9b5e53ba057d@bytedance.com>
Date: Thu, 19 Aug 2021 18:18:05 +0800
In-Reply-To: <5aa3020c-fcf2-87bd-31fe-e2b5c2aafcf2@redhat.com>

On 2021/8/19 PM3:01, David Hildenbrand wrote:
>>
>> In this patch series, we add a pte_refcount field to the
>> struct page of page table to track how many users of PTE page
>> table. Similar to the mechanism of page refcount, the user of
>> PTE page table should hold a refcount to it before accessing.
>> The PTE page table page will be freed when the last refcount
>> is dropped.
>>
>> While we access ->pte_refcount of a PTE page table, any of the
>> following ensures the pmd entry corresponding to the PTE page
>> table stability:
>>
>>     - mmap_lock
>>     - anon_lock
>>     - i_mmap_lock
>>     - parallel threads are excluded by other means which
>>       can make ->pmd stable(e.g. gup case)
>>
>> This patch does not support THP temporarily, it will be
>> supported in the next patch.
>
> Can you clarify (and document here) who exactly takes a reference on the
> page table? Do I understand correctly that
>
> a) each !pte_none() entry inside a page table take a reference to the
> page it's containted in.
> b) each page table walker temporarily grabs a page table reference
> c) The PMD tables the PTE is referenced in (->currently only ever a
> single one) does *not* take a reference.

Yes. Both the !pte_none() entries and the page table walkers can be
regarded as users of the PTE page table, so each of them needs to hold
a ->pte_refcount for as long as it uses the table. The pte_refcount
field of struct page is only for PTE page tables, so the PMD page
tables do *not* take a ->pte_refcount.

>
> So if there are no PTE entries left and nobody walks the page tables,
> you can remove it? You should really extend the

Yes. If there are no PTE entries left and nobody walks the page tables,
the table has no users, so we can remove it when we drop the last
->pte_refcount.

> description/documentation to make it clearer how exactly it's supposed
> to work

I'm sorry that there is no clear description of how pte_refcount is
used. I will write documentation to describe it.

>
>
> It feels kind of strange to not introduce the CONFIG_FREE_USER_PTE
> Kconfig option in this patch. At least it took me a while to identify it
> in the previous patch.

CONFIG_FREE_USER_PTE and the related APIs are all introduced in the
previous patch ([PATCH v2 5/9] mm: pte_refcount infrastructure). This
patch and the next one use that infrastructure to free user PTE page
table pages.

>
> Maybe you should introduce the empty stubs and use them in a separate
> patch, and then have this patch just introduce CONFIG_FREE_USER_PTE
> along with the actual refcounting magic inside the !stub implementation.
>

Hmm, let me think about this suggestion.

Thanks,
Qi
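
To make the rule discussed above concrete, here is a minimal user-space
sketch of it. This is not the code from the series: struct
pte_table_sketch and all sketch_*() helpers are hypothetical names made
up for illustration, the counter is a plain int rather than an atomic,
and no locking is modeled. It only shows that each !pte_none() entry and
each walker holds one reference, and that the table is freed when the
last reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a PTE page table page (illustration only). */
#define SKETCH_PTRS_PER_PTE 512

struct pte_table_sketch {
	int pte_refcount;                       /* mapped entries + active walkers */
	unsigned long pte[SKETCH_PTRS_PER_PTE]; /* 0 plays the role of pte_none() */
};

/* Allocate an empty table; the caller starts as a walker holding one reference. */
struct pte_table_sketch *sketch_table_alloc(void)
{
	struct pte_table_sketch *t = calloc(1, sizeof(*t));

	if (t)
		t->pte_refcount = 1;
	return t;
}

/* A walker (or a new entry) takes a reference before using the table. */
void sketch_get(struct pte_table_sketch *t)
{
	t->pte_refcount++;
}

/* Drop one reference; free the table on the last one. Returns true if freed. */
bool sketch_put(struct pte_table_sketch *t)
{
	assert(t->pte_refcount > 0);
	if (--t->pte_refcount == 0) {
		free(t);
		return true;
	}
	return false;
}

/* Installing a !pte_none() entry takes a reference on behalf of that entry. */
void sketch_set_pte(struct pte_table_sketch *t, int idx, unsigned long val)
{
	assert(val != 0 && t->pte[idx] == 0);
	t->pte[idx] = val;
	sketch_get(t);
}

/* Clearing an entry drops the reference that the entry was holding. */
bool sketch_clear_pte(struct pte_table_sketch *t, int idx)
{
	assert(t->pte[idx] != 0);
	t->pte[idx] = 0;
	return sketch_put(t);
}
```

In this sketch a walker is expected to hold its own reference while it
sets or clears entries, so the table can only go away when the walker
itself calls sketch_put() after the last entry has been cleared. The PMD
level takes no reference here, matching point c) above.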