From: Matthew Wilcox <willy@infradead.org>
To: Dmitry Dolgov <9erthalion6@gmail.com>
Cc: linux-mm@kvack.org
Subject: Re: [QUESTION] Resizing shared mapping without clashing with others
Date: Sun, 1 Dec 2024 20:57:07 +0000
Message-ID: <Z0zNoxn-qkHYh6Pq@casper.infradead.org>
In-Reply-To: <20241201184410.gl2huwqkbdwm6jvj@erthalion.local>

On Sun, Dec 01, 2024 at 07:44:10PM +0100, Dmitry Dolgov wrote:
> > On Sun, Dec 01, 2024 at 11:55:37AM +0000, Matthew Wilcox wrote:
> > On Sat, Nov 30, 2024 at 05:24:13PM +0100, Dmitry Dolgov wrote:
> > > Hi,
> > >
> > > While working on PostgreSQL [1] we've stumbled upon a question regarding
> > > resizing of shared mappings without conflicting with any other possible
> > > mappings. Before making any wrong conclusions, I would love to get some
> > > consultation from kernel folks on that topic.
> > >
> > > To put it into a context, PostgreSQL uses anonymous shared memory
> > > mapping as a buffer cache for data. The mapping size is configured at
> > > the start and cannot be changed without a restart. Now, we would
> > > like to make it more flexible and allow changing it at runtime, ideally
> > > without changing already used addresses and copying stuff back and
> > > forth.
> > >
> > > The idea is to place the shared mapping at a specified address (with
> > > MAP_FIXED if needed) with a gap, then use mremap to resize it into the
> > > gap. This approach has an open question -- how to make sure there will
> > > be no other mapping created within the same address space, where we
> > > want to expand the shared mapping? E.g. the shared mapping was created,
> > > then large memory allocation caused another mapping to be created close
> > > to it, so that expanding is not possible.
> >
> > I think there's a very straightforward answer, which is to mmap() it to
> > the larger size to begin with. If, say, you create a file of 1GB, you
> > can mmap() the first 100GB of that file. If you access the last 99GB of
> > the mapping, you'll get SIGBUS, but you can truncate() the file larger
> > and gain access to the new memory that way. Does that work for you?
> >
> > Or if you're doing MAP_ANON | MAP_SHARED, just don't access the last
> > 99GB until your configuration changes. Memory is allocated on demand,
> > so you won't be charged for it until you use it.
>
> Right, mapping with a larger size than needed is one option we're
> considering. But there are a few arguments against that:
>
> * Folks are wary of unnecessarily large shared mappings, since in the
> past there were issues with the OOM killer making decisions unfavorable
> to postgres because of that. It might have changed over time, but
> confirming that will require some investigation.
>
> * It can cause memory accounting problems. E.g. if we use hugetlb inside
> a cgroup with reservation limits set (something like
> hugetlb.2MB.rsvd.limit_in_bytes), then such mmap() will be counted
> against the limit, even though the memory wasn't allocated -- meaning
> that we claim some resource without using it.

If it does turn out to be a problem, you can use a similar trick to how
ld.so maps binaries:

mmap(NULL, 2055640, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f221a758000
mmap(0x7f221a780000, 1462272, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x28000) = 0x7f221a780000
mmap(0x7f221a8e5000, 352256, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x18d000) = 0x7f221a8e5000
mmap(0x7f221a93b000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e2000) = 0x7f221a93b000
mmap(0x7f221a941000, 52696, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f221a941000

Although you wouldn't want to do consecutive mmaps, you'd want to use
mremap() with MREMAP_FIXED -- not to change new_address, but to expand
length over the initial reserving-space mapping.
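
A minimal sketch of the reserve-then-map-over idea, under stated
assumptions. It is not the mremap() route described above: it uses
consecutive MAP_FIXED mmap() calls on a memfd, the way the ld.so trace
does, because mremap(2) documents EINVAL when MREMAP_FIXED is given
overlapping old and new ranges. The use of memfd_create(), the struct
region type and the region_init()/region_grow() helpers are all
hypothetical and do not come from this thread. The sketch uses ordinary
pages, not hugetlb, and all sizes must be multiples of the page size.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping type for this sketch. */
struct region {
	char *base;      /* start of the reserved address range */
	size_t reserved; /* bytes of address space set aside */
	size_t size;     /* bytes currently backed by the memfd */
	int fd;          /* memfd that provides the shared memory */
};

static int page_aligned(size_t n)
{
	return n % (size_t)sysconf(_SC_PAGESIZE) == 0;
}

/* Reserve `reserved` bytes of addresses, map the first `initial` shared. */
int region_init(struct region *r, size_t reserved, size_t initial)
{
	assert(page_aligned(reserved) && page_aligned(initial));
	assert(initial > 0 && initial <= reserved);

	r->fd = memfd_create("resizable", MFD_CLOEXEC);
	if (r->fd < 0)
		return -1;
	if (ftruncate(r->fd, (off_t)initial) < 0)
		goto err_close;

	/* PROT_NONE + MAP_NORESERVE: claims addresses, not memory. */
	r->base = mmap(NULL, reserved, PROT_NONE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (r->base == MAP_FAILED)
		goto err_close;

	/* MAP_FIXED replaces the start of the reservation in one step. */
	if (mmap(r->base, initial, PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_FIXED, r->fd, 0) == MAP_FAILED) {
		munmap(r->base, reserved);
		goto err_close;
	}
	r->reserved = reserved;
	r->size = initial;
	return 0;

err_close:
	close(r->fd);
	return -1;
}

/* Grow in place; existing addresses and contents are left untouched. */
int region_grow(struct region *r, size_t new_size)
{
	assert(page_aligned(new_size));
	if (new_size <= r->size || new_size > r->reserved)
		return -1;
	if (ftruncate(r->fd, (off_t)new_size) < 0)
		return -1;
	/* Map only the new tail over the still-reserved addresses. */
	if (mmap(r->base + r->size, new_size - r->size,
		 PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,
		 r->fd, (off_t)r->size) == MAP_FAILED)
		return -1;
	r->size = new_size;
	return 0;
}
```

Because the PROT_NONE reservation already occupies the whole range, no
unrelated mmap() can land in the gap, and each MAP_FIXED call swaps
reserved addresses for file-backed ones without a window in which they
are unmapped. Other processes that share the memory would have to map
the new tail themselves, which this sketch does not cover.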