From: Piotr Sarna <p.sarna@tlen.pl>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] hugetlbfs: add O_TMPFILE support
Date: Wed, 23 Oct 2019 09:14:32 +0200
Message-ID: <36c17999-caf6-9f0a-d63a-cc6e4b5fabb8@tlen.pl>
In-Reply-To: <d29bc957-a074-22f6-51d7-e043719d5f98@oracle.com>

On 10/23/19 4:55 AM, Mike Kravetz wrote:
> On 10/22/19 12:09 AM, Piotr Sarna wrote:
>> On 10/21/19 7:17 PM, Mike Kravetz wrote:
>>> On 10/15/19 4:37 PM, Mike Kravetz wrote:
>>>> On 10/15/19 3:50 AM, Michal Hocko wrote:
>>>>> On Tue 15-10-19 11:01:12, Piotr Sarna wrote:
>>>>>> With hugetlbfs, a common pattern for mapping anonymous huge pages
>>>>>> is to create a temporary file first.
>>>>>
>>>>> Really? I thought that this is normally done by shmget(SHM_HUGETLB) or
>>>>> mmap(MAP_HUGETLB). Or maybe I misunderstood your definition of anonymous
>>>>> huge pages.
>>>>>
>>>>>> Currently libraries like
>>>>>> libhugetlbfs and seastar create these with a standard mkstemp+unlink
>>>>>> trick,
>>>>
>>>> I would guess that much of libhugetlbfs was written before MAP_HUGETLB
>>>> was implemented.  So, that is why it does not make (more) use of that
>>>> option.
>>>>
>>>> The implementation looks to be straightforward.  However, I really do
>>>> not want to add more functionality to hugetlbfs unless there is a specific
>>>> use case that needs it.
>>>
>>> It was not my intention to shut down discussion on this patch.  I was just
>>> asking if there was a (new) use case for such a change.  I am checking with
>>> our DB team as I seem to remember them using the create/unlink approach for
>>> hugetlbfs in one of their upcoming models.
>>>
>>> Is there a new use case you were thinking about?
>>>
>>
>> Oh, I indeed thought it was a shutdown. The use case I was thinking
>> about was in Seastar, where the create+unlink trick is used for
>> creating temporary files (in a generic way, not only for hugetlbfs).
>> I simply intended to migrate it to a newer approach - O_TMPFILE.
>> However, for the specific case of hugetlbfs it indeed makes more sense
>> to skip it and use mmap's MAP_HUGETLB, so perhaps it's not worth it to
>> patch a perfectly good and stable file system just to provide support
>> for a semi-useful flag. My implementation of tmpfile for hugetlbfs is
>> straightforward indeed, but the MAP_HUGETLB argument made me realize
>> that it may not be worth the trouble, especially since MAP_HUGETLB has
>> been available since 2.6.32 and O_TMPFILE was only introduced in v3.11,
>> so the mmap way looks more portable.
>>
>> tldr: I'd be very happy to get my patch accepted, but the use case I had in mind can be easily solved with MAP_HUGETLB, so I don't insist.
> 
> If you really are after something like 'anonymous memory' for Seastar,
> then MAP_HUGETLB would be the better approach.

Just to clarify - my original goal was to migrate Seastar's temporary 
file implementation (which is fs-agnostic, based on descriptors) from 
the current create+unlink to O_TMPFILE, for robustness. One of the 
internal usages of this generic mechanism was to create a tmpfile on 
hugetlbfs and that's why I sent this patch. However, this particular
internal usage can be easily switched to the more portable MAP_HUGETLB,
which will also mean that the generic tmpfile implementation will no
longer be used internally for hugetlbfs.
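
To make the MAP_HUGETLB alternative concrete, here is a rough sketch of an
anonymous huge page mapping that needs neither a file nor a hugetlbfs mount.
This is not Seastar's code: the helper name map_anon_prefer_huge() and the
fallback to regular pages are assumptions made purely for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map `len` bytes of anonymous memory, preferring
 * huge pages. `len` should be a multiple of the huge page size (commonly
 * 2 MiB on x86-64). Returns MAP_FAILED on error; *used_huge reports
 * whether the huge page attempt succeeded.
 */
void *map_anon_prefer_huge(size_t len, int *used_huge)
{
    assert(used_huge != NULL);

    int flags = MAP_PRIVATE | MAP_ANONYMOUS;

    /* First attempt: huge pages, no backing file, fd is -1. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   flags | MAP_HUGETLB, -1, 0);
    *used_huge = (p != MAP_FAILED);

    if (p == MAP_FAILED) {
        /* Commonly ENOMEM when no huge pages are reserved: fall back
         * to regular pages (an illustrative policy, not a requirement). */
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    }
    return p;
}
```

A caller would check the result against MAP_FAILED and later release the
region with munmap(), passing the same length.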

There *may* still be value in being able to support hugetlbfs once 
Seastar's tmpfile implementation migrates to O_TMPFILE, since the 
library offers creating temporary files in its public API, but there's 
no immediate use case I can apply it to.
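
For reference, a generic descriptor-based helper of the kind described above
could look roughly like the sketch below: try O_TMPFILE first, and fall back
to the mkstemp+unlink trick where the flag is unavailable. The function name
open_tmpfile_in() and the "tmpXXXXXX" template are made up for this example,
and the set of errno values that trigger the fallback is my reading of
open(2), not something taken from Seastar.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: return an fd for an unnamed temporary file in
 * directory `dir`, or -1 with errno set.
 */
int open_tmpfile_in(const char *dir)
{
    assert(dir != NULL);

    /* Preferred path: the file never has a name in the directory. */
    int fd = open(dir, O_TMPFILE | O_RDWR | O_CLOEXEC, 0600);
    if (fd >= 0)
        return fd;

    /*
     * open(2) documents EOPNOTSUPP for a filesystem without O_TMPFILE
     * support and EISDIR for kernels that predate the flag. Any other
     * error is reported to the caller unchanged.
     */
    if (errno != EOPNOTSUPP && errno != EISDIR)
        return -1;

    /* Fallback: create a named file, then remove its name right away. */
    char path[PATH_MAX];
    int n = snprintf(path, sizeof path, "%s/tmpXXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof path) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = mkstemp(path);
    if (fd >= 0)
        unlink(path); /* brief window in which the name is visible */
    return fd;
}
```

The fallback leaves a short window in which the name exists, and it does not
set close-on-exec, which is part of why O_TMPFILE is the more robust option
where it is supported.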

> 
> I'm still checking with Oracle DB team as they may have a use for O_TMPFILE
> in an upcoming release.  In their use case, they want an open fd to work with.
> If it looks like they will proceed in this direction, we can work to get
> your patch moved forward.
> 
> Thanks,

Great, if it turns out that my patch helps anyone with their O_TMPFILE 
usage, I'd be very glad to see it merged.


