From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pl1-f199.google.com (mail-pl1-f199.google.com [209.85.214.199]) by kanga.kvack.org (Postfix) with ESMTP id 76D0D6B4C95 for ; Wed, 28 Nov 2018 05:59:19 -0500 (EST) Received: by mail-pl1-f199.google.com with SMTP id h10so25976125plk.12 for ; Wed, 28 Nov 2018 02:59:19 -0800 (PST) Received: from mail-sor-f65.google.com (mail-sor-f65.google.com. [209.85.220.65]) by mx.google.com with SMTPS id h9sor8538599pgs.76.2018.11.28.02.59.17 for (Google Transport Security); Wed, 28 Nov 2018 02:59:17 -0800 (PST) Date: Wed, 28 Nov 2018 13:59:12 +0300 From: "Kirill A. Shutemov" Subject: Re: [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem() Message-ID: <20181128105911.ggngeqq5xevxpmsk@kshutemo-mobl1> References: <20181127075945.m5nbflc6nqto6f2i@kshutemo-mobl1> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: owner-linux-mm@kvack.org List-ID: To: Hugh Dickins Cc: Andrew Morton , "Kirill A. Shutemov" , Matthew Wilcox , linux-mm@kvack.org On Tue, Nov 27, 2018 at 12:23:32PM -0800, Hugh Dickins wrote: > On Tue, 27 Nov 2018, Kirill A. Shutemov wrote: > > On Mon, Nov 26, 2018 at 03:27:52PM -0800, Hugh Dickins wrote: > > > Several cleanups in collapse_shmem(): most of which probably do not > > > really matter, beyond doing things in a more familiar and reassuring > > > order. Simplify the failure gotos in the main loop, and on success > > > update stats while interrupts still disabled from the last iteration. > > > > > > Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages") > > > Signed-off-by: Hugh Dickins > > > Cc: Kirill A. Shutemov > > > Cc: stable@vger.kernel.org # 4.8+ > > > --- > > > mm/khugepaged.c | 72 ++++++++++++++++++++++--------------------------- > > > 1 file changed, 32 insertions(+), 40 deletions(-) > > > > > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > > > index 1c402d33547e..9d4e9ff1af95 100644 > > > --- a/mm/khugepaged.c > > > +++ b/mm/khugepaged.c > > > @@ -1329,10 +1329,10 @@ static void collapse_shmem(struct mm_struct *mm, > > > goto out; > > > } > > > > > > + __SetPageLocked(new_page); > > > + __SetPageSwapBacked(new_page); > > > new_page->index = start; > > > new_page->mapping = mapping; > > > - __SetPageSwapBacked(new_page); > > > - __SetPageLocked(new_page); > > > BUG_ON(!page_ref_freeze(new_page, 1)); > > > > > > /* > > > @@ -1366,13 +1366,13 @@ static void collapse_shmem(struct mm_struct *mm, > > > if (index == start) { > > > if (!xas_next_entry(&xas, end - 1)) { > > > result = SCAN_TRUNCATED; > > > - break; > > > + goto xa_locked; > > > } > > > xas_set(&xas, index); > > > } > > > if (!shmem_charge(mapping->host, 1)) { > > > result = SCAN_FAIL; > > > - break; > > > + goto xa_locked; > > > } > > > xas_store(&xas, new_page + (index % HPAGE_PMD_NR)); > > > nr_none++; > > > @@ -1387,13 +1387,12 @@ static void collapse_shmem(struct mm_struct *mm, > > > result = SCAN_FAIL; > > > goto xa_unlocked; > > > } > > > - xas_lock_irq(&xas); > > > - xas_set(&xas, index); > > > } else if (trylock_page(page)) { > > > get_page(page); > > > + xas_unlock_irq(&xas); > > > } else { > > > result = SCAN_PAGE_LOCK; > > > - break; > > > + goto xa_locked; > > > } > > > > > > /* > > > > I'm puzzled by locking change here. > > The locking change here is to not re-get xas_lock_irq (in shmem_getpage > case) just before we drop it anyway: you point out that it used to cover > /* > * The page must be locked, so we can drop the i_pages lock > * without racing with truncate. > */ > VM_BUG_ON_PAGE(!PageLocked(page), page); > VM_BUG_ON_PAGE(!PageUptodate(page), page); > VM_BUG_ON_PAGE(PageTransCompound(page), page); > if (page_mapping(page) != mapping) { > but now does not. > > But the comment you wrote there originally (ah, git blame shows > that Matthew has made it say i_pages lock instead of tree_lock), > "The page must be locked, so we can drop...", was correct all along, > I'm just following what it says. > > It would wrong if the trylock_page came after the xas_unlock_irq, > but it comes before (as before): holding i_pages lock across the > lookup makes sure we look up the right page (no RCU racing) and > trylock_page makes sure that it cannot be truncated or hole-punched > or migrated or whatever from that point on - so can drop i_pages lock. You are right. I confused myself. > Actually, I think we could VM_BUG_ON(page_mapping(page) != mapping), > couldn't we? Not that I propose to make such a change at this stage. Yeah it should be safe. We may put WARN there. > > Isn't the change responsible for the bug you are fixing in 09/10? > > In which I relaxed the VM_BUG_ON_PAGE(PageTransCompound(page), page) > that appears in the sequence above. > > Well, what I was thinking of in 9/10 was a THP being inserted at some > stage between selecting this range for collapse and reaching the last > (usually first) xas_lock_irq(&xas) in the "This will be less messy..." > loop above: I don't see any locking against that possibility. (And it > has to be that initial xas_lock_irq(&xas), because once the PageLocked > head of new_page is inserted in the i_pages tree, there is no more > chance for truncation and a competing THP to be inserted there.) > > So 9/10 would be required anyway; but you're thinking that the page > we looked up under i_pages lock and got trylock_page on, could then > become Compound once i_pages lock is dropped? I don't think so: pages > don't become Compound after they've left the page allocator, do they? > And if we ever manage to change that, I'm pretty sure it would be with > page locks held and page refcounts frozen. > > IIRC, my intend for the locking scheme was to protect against > > truncate-repopulate race. > > > > What do I miss? > > The stage in between selecting the range for collapse, and getting > the initial i_pages lock? Pages not becoming Compound underneath > you, with or without page lock, with or without i_pages lock? Page > lock being sufficient protection against truncation and migration? Agreed on all fronts. Sorry for the noise. > > The rest of the patch *looks* okay, but I found it hard to follow. > > Splitting it up would make it easier. > > It needs some time, I admit: thanks a lot for persisting with it. > And thanks (to you and to Matthew) for the speedy Acks elsewhere. > > Hugh -- Kirill A. Shutemov