linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Andres Freund <andres@anarazel.de>
Cc: David Stevens <stevensd@chromium.org>,
	linux-mm@kvack.org, Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Yang Shi <shy828301@gmail.com>,
	David Hildenbrand <david@redhat.com>,
	Jiaqi Yan <jiaqiyan@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 4/4] mm/khugepaged: maintain page cache uptodate flag
Date: Tue, 20 Jun 2023 17:11:30 -0400	[thread overview]
Message-ID: <ZJIWAvTczl0rHJBv@x1n> (raw)
In-Reply-To: <20230620205547.qzmivkjox2hkpzmm@awork3.anarazel.de>

On Tue, Jun 20, 2023 at 01:55:47PM -0700, Andres Freund wrote:
> Hi,

Hi, Andres,

> 
> On 2023-04-04 21:01:17 +0900, David Stevens wrote:
> > From: David Stevens <stevensd@chromium.org>
> > 
> > Make sure that collapse_file doesn't interfere with checking the
> > uptodate flag in the page cache by only inserting hpage into the page
> > cache after it has been updated and marked uptodate. This is achieved by
> > simply not replacing present pages with hpage when iterating over the
> > target range.
> > 
> > The present pages are already locked, so replacing them with the locked
> > hpage before the collapse is finalized is unnecessary. However, it is
> > necessary to stop freezing the present pages after validating them,
> > since leaving long-term frozen pages in the page cache can lead to
> > deadlocks. Simply checking the reference count is sufficient to ensure
> > that there are no long-term references hanging around that would the
> > collapse would break. Similar to hpage, there is no reason that the
> > present pages actually need to be frozen in addition to being locked.
> > 
> > This fixes a race where folio_seek_hole_data would mistake hpage for
> > an fallocated but unwritten page. This race is visible to userspace via
> > data temporarily disappearing from SEEK_DATA/SEEK_HOLE. This also fixes
> > a similar race where pages could temporarily disappear from mincore.
> > 
> > Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
> > Signed-off-by: David Stevens <stevensd@chromium.org>
> 
> I noticed that recently MADV_COLLAPSE stopped being able to collapse a
> binary's executable code, always failing with EAGAIN. I bisected it down to
> a2e17cc2efc7 - this commit.
> 
> Using perf trace -e 'huge_memory:*' -a I see
> 
>   1000.433 postgres.2/1872144 huge_memory:mm_khugepaged_collapse_file(mm: 0xffff889e800bdf00, hpfn: 46720000, index: 1537, is_shmem: 1, filename: "postgres.2", result: 17)
>   1000.445 postgres.2/1872144 huge_memory:mm_khugepaged_scan_file(mm: 0xffff889e800bdf00, pfn: -1, filename: "postgres.2", present: 512, result: 17)
>   1000.485 postgres.2/1872144 huge_memory:mm_khugepaged_collapse_file(mm: 0xffff889e800bdf00, hpfn: 46720000, index: 2049, is_shmem: 1, filename: "postgres.2", result: 17)
>   1000.489 postgres.2/1872144 huge_memory:mm_khugepaged_scan_file(mm: 0xffff889e800bdf00, pfn: -1, filename: "postgres.2", present: 512, result: 17)
>   1000.526 postgres.2/1872144 huge_memory:mm_khugepaged_collapse_file(mm: 0xffff889e800bdf00, hpfn: 46720000, index: 2561, is_shmem: 1, filename: "postgres.2", result: 17)
>   1000.532 postgres.2/1872144 huge_memory:mm_khugepaged_scan_file(mm: 0xffff889e800bdf00, pfn: -1, filename: "postgres.2", present: 512, result: 17)
>   1000.570 postgres.2/1872144 huge_memory:mm_khugepaged_collapse_file(mm: 0xffff889e800bdf00, hpfn: 46720000, index: 3073, is_shmem: 1, filename: "postgres.2", result: 17)
>   1000.575 postgres.2/1872144 huge_memory:mm_khugepaged_scan_file(mm: 0xffff889e800bdf00, pfn: -1, filename: "postgres.2", present: 512, result: 17)
> 
> for every attempt at doing madvise(MADV_COLLAPSE).
> 
> 
> I'm sad about that, because MADV_COLLAPSE was the first thing that allowed
> using huge pages for executable code that wasn't entirely completely gross.
> 
> 
> I don't yet have a standalone repro, but can write one if that's helpful.

There's a fix:

https://lore.kernel.org/all/20230607053135.2087354-1-stevensd@google.com/

Already in today's Andrew's pull for rc7:

https://lore.kernel.org/all/20230620123828.813b1140d9c13af900e8edb3@linux-foundation.org/

-- 
Peter Xu



  reply	other threads:[~2023-06-20 21:11 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-04 12:01 [PATCH v6 0/4] mm/khugepaged: fixes for khugepaged+shmem David Stevens
2023-04-04 12:01 ` [PATCH v6 1/4] mm/khugepaged: drain lru after swapping in shmem David Stevens
2023-04-04 12:01 ` [PATCH v6 2/4] mm/khugepaged: refactor collapse_file control flow David Stevens
2023-04-04 12:01 ` [PATCH v6 3/4] mm/khugepaged: skip shmem with userfaultfd David Stevens
2023-04-04 12:01 ` [PATCH v6 4/4] mm/khugepaged: maintain page cache uptodate flag David Stevens
2023-04-04 21:21   ` Peter Xu
2023-04-19  4:37     ` Hugh Dickins
2023-06-20 20:55   ` Andres Freund
2023-06-20 21:11     ` Peter Xu [this message]
2023-06-20 21:41       ` Andres Freund

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZJIWAvTczl0rHJBv@x1n \
    --to=peterx@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=andres@anarazel.de \
    --cc=david@redhat.com \
    --cc=hughd@google.com \
    --cc=jiaqiyan@google.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=shy828301@gmail.com \
    --cc=stevensd@chromium.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox