From: Andreas Dilger <adilger@dilger.ca>
To: Theodore Ts'o <tytso@mit.edu>
Cc: linux-mm <linux-mm@kvack.org>, linux-ext4@vger.kernel.org
Subject: Re: Best way to pin a page in ext4?
Date: Mon, 15 Sep 2014 14:57:23 -0600
Message-ID: <36321733-F488-49E3-8733-C6758F83DFA1@dilger.ca>
In-Reply-To: <20140915185102.0944158037A@closure.thunk.org>
On Sep 15, 2014, at 12:51 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> In ext4, we currently use the page cache to store the allocation
> bitmaps. The pages are associated with an internal, in-memory inode
> which is located in EXT4_SB(sb)->s_buddy_cache. Since the pages can be
> reconstructed at will, either by reading them from disk (in the case of
> the actual allocation bitmap), or by calculating the buddy bitmap from
> the allocation bitmap, normally we allow the VM to eject the pages as
> necessary.
>
> For a specialty use case, I've been requested to have an optional mode
> where the on-disk bitmaps are pinned into memory; this is a situation
> where the file system size is known in advance, and the user is willing
> to trade off the locked-down memory for the latency gains required by
> this use case.
As discussed in http://lists.openwall.net/linux-ext4/2013/03/25/15
the bitmap pages were being evicted under memory pressure even while
they were in active use. That turned out to be an MM problem rather than
an ext4 problem, and it was fixed in commit c53954a092d in 3.11, in case
you are running an older kernel.

There was also a discussion of whether the ext4 code makes all of the
right calls to mark_page_accessed() to ensure that these bitmaps are
kept at the hot end of the LRU.
> It seems that the simplest way to do that is to use mlock_vma_page()
> when the file system is first mounted, and then use munlock_vma_page()
> when the file system is unmounted. However, these functions are in
> mm/internal.h, so I figured I'd better ask permission before using
> them. Does this sound like a sane way to do things?
>
> The other approach would be to keep an elevated refcount on the pages in
> question, but it seemed it would be more efficient to use the mlock
> facility since that keeps the pages on an unevictable list.
It doesn't seem unreasonable to just grab an extra refcount on the pages
when they are first loaded. However, the memory usage may be fairly
high (32MB per 1TB of disk), so this definitely can't be enabled in
general, and it would be nice to make sure that ext4 is already doing
the right thing to keep these important pages in cache.
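
For reference, the 32MB figure follows from one bitmap bit per filesystem
block. A minimal sketch of the arithmetic, assuming 4KB blocks and counting
only the block allocation bitmaps (the helper name bitmap_bytes() is made
up for illustration and is not an ext4 function):

```c
#include <stdint.h>

/*
 * Bytes of block bitmap needed to cover fs_bytes of disk at one bit per
 * filesystem block.  The block size is an assumption here (4096 is the
 * usual ext4 default); it is not stated in the thread.
 */
static uint64_t bitmap_bytes(uint64_t fs_bytes, uint32_t block_size)
{
	uint64_t blocks = fs_bytes / block_size;

	return (blocks + 7) / 8;	/* round up to whole bytes */
}
```

With fs_bytes = 1ULL << 40 (1TB) and block_size = 4096 this gives
1 << 25 bytes, i.e. 32MB. A smaller block size raises the cost
proportionally.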
The other option is to improve the in-memory description of free blocks
and use an extent map or rbtree to handle this instead of bitmaps. That
may also speed up allocation in general, but is a lot more work...
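
To illustrate the extent idea, here is a rough userspace sketch (not ext4
code) that collapses a block bitmap into a list of free extents. A set bit
means the block is in use, as in the on-disk bitmaps. struct free_extent,
block_in_use() and bitmap_to_extents() are hypothetical names; a real
implementation would presumably keep the extents in an rbtree or similar
and update them incrementally instead of rescanning the bitmap:

```c
#include <stddef.h>
#include <stdint.h>

/* One run of contiguous free blocks (hypothetical type, for illustration). */
struct free_extent {
	uint32_t start;		/* first free block in the run */
	uint32_t len;		/* number of free blocks in the run */
};

/* Little-endian bit order: bit N lives in byte N / 8, position N % 8. */
static int block_in_use(const uint8_t *bitmap, uint32_t bit)
{
	return (bitmap[bit >> 3] >> (bit & 7)) & 1;
}

/*
 * Scan nbits of bitmap and record each run of clear bits in out[].
 * Returns the number of extents written, at most max_out.
 */
static size_t bitmap_to_extents(const uint8_t *bitmap, uint32_t nbits,
				struct free_extent *out, size_t max_out)
{
	size_t n = 0;
	uint32_t i = 0;

	while (i < nbits && n < max_out) {
		/* Skip over allocated blocks. */
		while (i < nbits && block_in_use(bitmap, i))
			i++;
		if (i == nbits)
			break;

		/* Measure the free run that starts here. */
		uint32_t start = i;
		while (i < nbits && !block_in_use(bitmap, i))
			i++;

		out[n].start = start;
		out[n].len = i - start;
		n++;
	}
	return n;
}
```

A mostly empty group collapses to a handful of extents, while a badly
fragmented one can need more memory than the bitmap itself, so the
savings depend on the workload.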
> Does using the mlock/munlock_vma_page() functions make sense? Any
> pitfalls I should worry about? Note that these pages are never mapped
> into userspace, so there is no associated vma; fortunately the functions
> don't take a vma argument, their name notwithstanding.....
>
> Thanks,
>
> - Ted
Cheers, Andreas