From: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
To: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>,
Andrew Morton <akpm@linux-foundation.org>,
Vasily Gorbik <gor@linux.ibm.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Matthew Wilcox <willy@infradead.org>,
David Hildenbrand <david@redhat.com>,
Suren Baghdasaryan <surenb@google.com>,
Qi Zheng <zhengqi.arch@bytedance.com>,
Yang Shi <shy828301@gmail.com>,
Mel Gorman <mgorman@techsingularity.net>,
Peter Xu <peterx@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Will Deacon <will@kernel.org>, Yu Zhao <yuzhao@google.com>,
Alistair Popple <apopple@nvidia.com>,
Ralph Campbell <rcampbell@nvidia.com>,
Ira Weiny <ira.weiny@intel.com>,
Steven Price <steven.price@arm.com>,
SeongJae Park <sj@kernel.org>,
Lorenzo Stoakes <lstoakes@gmail.com>,
Huang Ying <ying.huang@intel.com>,
Naoya Horiguchi <naoya.horiguchi@nec.com>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Zack Rusin <zackr@vmware.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Pasha Tatashin <pasha.tatashin@soleen.com>,
Miaohe Lin <linmiaohe@huawei.com>,
Minchan Kim <minchan@kernel.org>,
Christoph Hellwig <hch@infradead.org>, Song Liu <song@kernel.org>,
Thomas Hellstrom <thomas.hellstrom@linux.intel.com>,
Russell King <linux@armlinux.org.uk>,
"David S. Miller" <davem@davemloft.net>,
Michael Ellerman <mpe@ellerman.id.au>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
Heiko Carstens <hca@linux.ibm.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Claudio Imbrenda <imbrenda@linux.ibm.com>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Jann Horn <jannh@google.com>,
Vishal Moola <vishal.moola@gmail.com>,
Vlastimil Babka <vbabka@suse.cz>,
linux-arm-kernel@lists.infradead.org, sparclinux@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page
Date: Mon, 3 Jul 2023 18:10:32 +0200
Message-ID: <20230703181032.5803c333@thinkpad-T15>
In-Reply-To: <edaa96f-80c1-1252-acbb-71c4f045b035@google.com>
On Thu, 29 Jun 2023 23:00:07 -0700 (PDT)
Hugh Dickins <hughd@google.com> wrote:
> On Thu, 29 Jun 2023, Gerald Schaefer wrote:
> > On Thu, 29 Jun 2023 12:22:24 -0300
> > Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > On Wed, Jun 28, 2023 at 10:08:08PM -0700, Hugh Dickins wrote:
> > > > On Wed, 28 Jun 2023, Gerald Schaefer wrote:
> > > > >
> > > > > As discussed in the other thread, we would rather go with less complexity,
> > > > > possibly switching to an approach w/o the list and fragment re-use in the
> > > > > future. For now, as a first step in that direction, we can try with not
> > > > > adding fragments back only for pte_free_defer(). Here is an adjusted
> > > > > version of your patch, copying most of your pte_free_defer() logic and
> > > > > also description, tested with LTP and all three of your patch series applied:
> > > >
> > > > Thanks, Gerald: I don't mind abandoning my 13/12 SLAB_TYPESAFE_BY_RCU
> > > > patch (posted with fewer Cc's to the s390 list last week), and switching
> > > > to your simpler who-cares-if-we-sometimes-don't-make-maximal-use-of-page
> > > > patch.
> > > >
> > > > But I didn't get deep enough into it today to confirm it - and disappointed
> > > > that you've found it necessary to play with pt_frag_refcount in addition to
> > > > _refcount and HH bits. No real problem with that, but my instinct says it
> > > > should be simpler.
> >
> > Yes, I also found it a bit awkward, but it seemed "good and simple enough",
> > to have something to go forward with, while my instinct was in line with yours.
> >
> > >
> > > Is there any reason it should be any different at all from what PPC is
> > > doing?
> > >
> > > I still think the right thing to do here is make the PPC code common
> > > (with Hugh's proposed RCU modification) and just use it in both
> > > arches....
> >
> > With the current approach, we would not add back fragments _only_ for
> > the new pte_free_defer() path, while keeping our cleverness for the other
> > paths. Not having a good overview of the negative impact wrt potential
> > memory waste, I would rather take small steps, if possible.
> >
> > If we later switch to never adding back fragments, of course we should
> > try to be in line with PPC implementation.
>
> I find myself half-agreeing with everyone.
>
> I agree with Gerald that s390 should keep close to what it is already
> doing (except for adding pte_free_defer()): that changing its strategy
> and implementation to be much more like powerpc, is a job for some other
> occasion (and would depend on gathering data about how well each does).
>
> But I agree with Jason that the powerpc solution we ended up with cut
> out a lot of unnecessary complication: it shifts the RCU delay from
> when pte_free_defer() is called, to when the shared page comes to be
> freed; which may be a lot later, and might not be welcome in a common
> path, but is quite okay for the uncommon pte_free_defer().

OK, I must admit that I completely ignored the latest progress in the
powerpc thread, and therefore was not up to date. I still had the older
approach in mind, where you also checked pt_frag_refcount to avoid a
double call_rcu().

The new approach sounds very reasonable, and at first glance I also like
your latest s390 patch. I need to get more up to date with PageActive and
maybe also the powerpc approach, and will give this a proper review tomorrow.

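To make the idea quoted above concrete: the RCU delay moves from the
pte_free_defer() call to the point where the last fragment of the shared
page goes away. What follows is only a toy user-space model in C, not
kernel code. The struct, the enum and both function names are hypothetical,
and the boolean merely stands in for a per-page flag such as PageActive.

```c
#include <stdbool.h>

/* Toy stand-in for one page that is shared by several page-table fragments. */
struct shared_pt_page {
	int frags_in_use;      /* fragments still allocated from this page */
	bool defer_requested;  /* models a per-page flag such as PageActive */
};

/* What happens to the shared page when one fragment is released. */
enum free_outcome {
	PAGE_STILL_IN_USE,             /* other fragments remain, nothing freed */
	PAGE_FREED_IMMEDIATELY,        /* last fragment gone, no deferral needed */
	PAGE_FREE_AFTER_GRACE_PERIOD,  /* last fragment gone, free via RCU */
};

/* Common free path: the deferral decision is taken only for the last fragment. */
static enum free_outcome frag_free(struct shared_pt_page *page)
{
	if (--page->frags_in_use > 0)
		return PAGE_STILL_IN_USE;

	return page->defer_requested ? PAGE_FREE_AFTER_GRACE_PERIOD
				     : PAGE_FREED_IMMEDIATELY;
}

/* Deferred free path: only marks the page, then takes the common path. */
static enum free_outcome frag_free_defer(struct shared_pt_page *page)
{
	page->defer_requested = true;
	return frag_free(page);
}
```

In this model a deferred free of the first fragment costs nothing right
away; the mark is only acted on when the last fragment is released, which
in the kernel would correspond to handing the page to call_rcu() instead
of freeing it directly.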
>
> And I agree with Alexander that pte_free_lower() and pte_free_upper()
> are better names than pte_free_now0() and pte_free_now1(): I was going
> to make that change, except all those functions disappear if we follow
> Jason's advice and switch the call_rcu() to when freeing the page.
>
> (Lower and upper seem unambiguous to me: Gerald, does your confusion
> come just from the way they are shown the wrong way round in the PP AA
> diagram? I corrected that in my patch, but you reverted it in yours.)

Ah yes, that could well be, and unfortunately I did not notice that you
fixed that in the comment. I only saw that you "fixed" the bit numbering
from 01234567 to 76543210, which I think is wrong on big-endian s390,
and therefore I simply removed that complete hunk.

But thanks a lot for pointing that out! We will certainly want to fix that
comment in a later patch, to reduce some or maybe all of the (at least
my) upper/lower confusion.
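
The two numberings mentioned above (01234567 vs. 76543210) differ only in
which end of the byte is called bit 0. A minimal sketch in C, with
hypothetical helper names that appear in no patch of this series, assuming
the usual s390 documentation convention of counting from the most
significant bit:

```c
/* MSB-0 numbering ("01234567"): bit 0 is the most significant bit. */
static unsigned int bit_msb0(unsigned char byte, unsigned int n)
{
	return (byte >> (7u - n)) & 1u;
}

/* LSB-0 numbering ("76543210"): bit 0 is the least significant bit. */
static unsigned int bit_lsb0(unsigned char byte, unsigned int n)
{
	return (byte >> n) & 1u;
}
```

The same physical bit gets a different index under each scheme: for the
value 0x80, the set bit is bit 0 in MSB-0 numbering and bit 7 in LSB-0
numbering.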