From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Yang Shi <yang.shi@linux.alibaba.com>,
Alexander Duyck <alexander.duyck@gmail.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
David Hildenbrand <david@redhat.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Matthew Wilcox <willy@infradead.org>,
Andrea Arcangeli <aarcange@redhat.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] huge tmpfs: try to split_huge_page() when punching hole
Date: Thu, 27 Feb 2020 11:47:04 +0300 [thread overview]
Message-ID: <20200227084704.aolem5nktpricrzo@box> (raw)
In-Reply-To: <alpine.LSU.2.11.2002261959020.10801@eggly.anvils>
On Wed, Feb 26, 2020 at 08:06:33PM -0800, Hugh Dickins wrote:
> Yang Shi writes:
>
> Currently, when truncating a shmem file, if the range is partly in a THP
> (start or end is in the middle of THP), the pages actually will just get
> cleared rather than being freed, unless the range covers the whole THP.
> Even though all the subpages are truncated (randomly or sequentially),
> the THP may still be kept in page cache.
>
> This might be fine for some usecases which prefer preserving THP, but
> balloon inflation is handled in base page size. So when using shmem THP
> as memory backend, QEMU inflation actually doesn't work as expected since
> it doesn't free memory. But the inflation usecase really needs to get
> the memory freed. (Anonymous THP will also not get freed right away,
> but will be freed eventually when all subpages are unmapped: whereas
> shmem THP still stays in page cache.)
>
> Split THP right away when doing partial hole punch, and if split fails
> just clear the page so that read of the punched area will return zeroes.
>
> Hugh Dickins adds:
>
> Our earlier "team of pages" huge tmpfs implementation worked in the way
> that Yang Shi proposes; and we have been using this patch to continue to
> split the huge page when hole-punched or truncated, since converting over
> to the compound page implementation. Although huge tmpfs gives out huge
> pages when available, if the user specifically asks to truncate or punch
> a hole (perhaps to free memory, perhaps to reduce the memcg charge), then
> the filesystem should do so as best it can, splitting the huge page.
I'm still uncomfortable with proposition to use truncate or punch a hole
operations to manage memory footprint. These operations are about managing
storage footprint, not memory. This happens to be the same for tmpfs.
I wounder if we should consider limiting the behaviour to the operation
that explicitly combines memory and storage managing: MADV_REMOVE. This
way we can avoid future misunderstandings with THP backed by a real
filesystem.
> }
>
> /*
> + * Check whether a hole-punch or truncation needs to split a huge page,
> + * returning true if no split was required, or the split has been successful.
> + *
> + * Eviction (or truncation to 0 size) should never need to split a huge page;
> + * but in rare cases might do so, if shmem_undo_range() failed to trylock on
> + * head, and then succeeded to trylock on tail.
> + *
> + * A split can only succeed when there are no additional references on the
> + * huge page: so the split below relies upon find_get_entries() having stopped
> + * when it found a subpage of the huge page, without getting further references.
> + */
> +static bool shmem_punch_compound(struct page *page, pgoff_t start, pgoff_t end)
> +{
> + if (!PageTransCompound(page))
> + return true;
> +
> + /* Just proceed to delete a huge page wholly within the range punched */
> + if (PageHead(page) &&
> + page->index >= start && page->index + HPAGE_PMD_NR <= end)
> + return true;
> +
> + /* Try to split huge page, so we can truly punch the hole or truncate */
> + return split_huge_page(page) >= 0;
> +}
I wanted to recommend taking into account khugepaged_max_ptes_none here,
but it will nullify usefulness of the feature for ballooning. Oh, well...
--
Kirill A. Shutemov
next prev parent reply other threads:[~2020-02-27 8:46 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-27 4:06 Hugh Dickins
2020-02-27 8:47 ` Kirill A. Shutemov [this message]
2020-02-28 4:04 ` Hugh Dickins
2020-02-28 4:26 ` Matthew Wilcox
2020-02-28 12:54 ` Kirill A. Shutemov
2020-02-28 22:28 ` Yang Shi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200227084704.aolem5nktpricrzo@box \
--to=kirill@shutemov.name \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=alexander.duyck@gmail.com \
--cc=david@redhat.com \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mst@redhat.com \
--cc=willy@infradead.org \
--cc=yang.shi@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox