From: Miklos Szeredi <miklos@szeredi.hu>
To: Jan Kara <jack@suse.cz>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-btrfs@vger.kernel.org, lsf-pc@lists.linux-foundation.org
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] sharing pages between mappings
Date: Wed, 11 Jan 2017 15:13:19 +0100
Message-ID: <CAJfpeguuBgypYh3G1Ew1a37o4WuRozPzLe=D_gh2BbtYXE=zzg@mail.gmail.com>
In-Reply-To: <20170111115143.GJ16116@quack2.suse.cz>

On Wed, Jan 11, 2017 at 12:51 PM, Jan Kara <jack@suse.cz> wrote:
> On Wed 11-01-17 11:29:28, Miklos Szeredi wrote:
>> I know there's work on this for xfs, but could this be done in generic mm
>> code?
>>
>> What are the obstacles? page->mapping and page->index are the obvious
>> ones.
>
> Yes, these two are the main that come to my mind. Also you'd need to
> somehow share the mapping->i_mmap tree so that unmap_mapping_range() works.
>
>> If that's too difficult is it maybe enough to share mappings between
>> files while they are completely identical and clone the mapping when
>> necessary?
>
> Well, but how would the page->mapping->host indirection work? Even if you
> have identical contents of the mappings, you still need to be aware there
> are several inodes behind them and you need to pick the right one
> somehow...

When do we actually need page->mapping->host? The only place where
it's not available is page writeback. At that point we know that the
original page has already been COWed, and once COWed, the page
belongs to a single inode only.

What happens, then, if the newly written data is cloned before being
written back? Either we write back the page during the clone, so
that only clean pages are ever shared, or we let dirty pages be
shared between inodes. In the latter case the question is: do we
care which inode we use for writing back the data? Is the inode
needed at all? I don't know enough about filesystem internals to see
clearly what happens in such a situation.

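To make the indirection concrete, here is a toy userspace model of the
page->mapping->host chain. It is only a sketch: the toy_* structs and
toy_page_host() are hypothetical stand-ins, not the kernel definitions,
and they keep just the fields named in this thread.

```c
#include <stddef.h>

/* Toy userspace stand-ins, NOT the kernel's struct inode / address_space / page. */
struct toy_inode {
    unsigned long ino;
};

struct toy_address_space {
    struct toy_inode *host;            /* the single inode behind this mapping */
};

struct toy_page {
    struct toy_address_space *mapping; /* single back-pointer to one mapping */
    unsigned long index;               /* single offset within that mapping */
};

/* Follow page->mapping->host; returns NULL if the page has no mapping. */
static struct toy_inode *toy_page_host(const struct toy_page *page)
{
    if (page == NULL || page->mapping == NULL)
        return NULL;
    return page->mapping->host;
}
```

With one mapping pointer and one index per page, a page shared by two
files can name only one of them at a time, which is exactly the case
where "which inode do we write back through" becomes a question.
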
>> All COW filesystems would benefit, as well as layered ones: lots of
>> fuse fs, and in some cases overlayfs too.
>>
>> Related: what can DAX do in the presence of cloned block?
>
> For DAX handling a block COW should be doable if that is what you are
> asking about. Handling of blocks that can be written to while they are
> shared will be rather difficult (you have problems with keeping dirty bits
> in the radix tree consistent if nothing else).

What happens if you do:

 - clone_file_range(A, off1, B, off2, len);
 - mmap both A and B using DAX.

Both mappings will then contain the same struct page, no?
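
For reference, a rough userspace sketch of that sequence, assuming the
clone is requested through the FICLONERANGE ioctl from <linux/fs.h> and
that both files live on a filesystem that supports reflink (and is
mounted with DAX, for the case in question). The helper names
clone_range() and map_shared() are made up for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <linux/fs.h>   /* FICLONERANGE, struct file_clone_range */

/* Share len bytes of fd_a at off1 into fd_b at off2. Returns 0 or -errno. */
static int clone_range(int fd_a, off_t off1, int fd_b, off_t off2, size_t len)
{
    struct file_clone_range fcr = {
        .src_fd      = fd_a,
        .src_offset  = (__u64)off1,
        .src_length  = (__u64)len,
        .dest_offset = (__u64)off2,
    };

    /* The ioctl is issued on the destination file descriptor. */
    if (ioctl(fd_b, FICLONERANGE, &fcr) < 0)
        return -errno;
    return 0;
}

/* Map len bytes of fd at off, shared and writable; MAP_FAILED on error. */
static void *map_shared(int fd, off_t off, size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, off);
}
```

Calling clone_range(A, off1, B, off2, len) and then map_shared() on
both A and B gives two shared mappings backed by the same blocks. The
offsets and length generally need to be block aligned, and the clone
fails on filesystems without reflink support.
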

Thanks,
Miklos