linux-mm.kvack.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: Dmitry Dolgov <9erthalion6@gmail.com>
Cc: linux-mm@kvack.org
Subject: Re: [QUESTION] Resizing shared mapping without clashing with others
Date: Sun, 1 Dec 2024 20:57:07 +0000
Message-ID: <Z0zNoxn-qkHYh6Pq@casper.infradead.org>
In-Reply-To: <20241201184410.gl2huwqkbdwm6jvj@erthalion.local>

On Sun, Dec 01, 2024 at 07:44:10PM +0100, Dmitry Dolgov wrote:
> > On Sun, Dec 01, 2024 at 11:55:37AM +0000, Matthew Wilcox wrote:
> > On Sat, Nov 30, 2024 at 05:24:13PM +0100, Dmitry Dolgov wrote:
> > > Hi,
> > >
> > > While working on PostgreSQL [1] we've stumbled upon a question regarding
> > > resizing of shared mappings without conflicting with any other possible
> > > mappings. Before making any wrong conclusions, I would love to get some
> > > consultation from kernel folks on that topic.
> > >
> > > To put it into a context, PostgreSQL uses anonymous shared memory
> > > mapping as a buffer cache for data. The mapping size is configured at
> > > the start, and cannot be changed without a restart. Now, we would
> > > like to make it more flexible and allow changing it at runtime, ideally
> > > without changing already used addresses and copying stuff back and
> > > forth.
> > >
> > > The idea is to place the shared mapping at a specified address (with
> > > MAP_FIXED if needed) with a gap, then use mremap to resize it into the
> > > gap. This approach has an open question -- how to make sure there will
> > > be no other mapping created within the same address space, where we
> > > want to expand the shared mapping? E.g. the shared mapping was created,
> > > then large memory allocation caused another mapping to be created close
> > > to it, so that expanding is not possible.
> >
> > I think there's a very straightforward answer, which is to mmap() it to
> > the larger size to begin with.  If, say, you create a file of 1GB, you
> > can mmap() the first 100GB of that file.  If you access the last 99GB of
> > the mapping, you'll get SIGBUS, but you can truncate() the file larger
> > and gain access to the new memory that way.  Does that work for you?
> >
> > Or if you're doing MAP_ANON | MAP_SHARED, just don't access the last
> > 99GB until your configuration changes.  Memory is allocated on demand,
> > so you won't be charged for it until you use it.
> 
> Right, mapping with a larger size than needed is one option we're
> considering. But there are a few arguments against that:
> 
> * Folks are wary of unnecessarily large shared mappings, since in the
>   past there were issues with the OOM killer making decisions
>   unfavorable to postgres because of that. It might have changed over
>   time, but confirming that will require some investigation.
> 
> * It can cause memory accounting problems. E.g. if we use hugetlb inside
>   a cgroup with reservation limits set (something like
>   hugetlb.2MB.rsvd.limit_in_bytes), then such mmap() will be counted
>   against the limit, even though the memory wasn't allocated -- meaning
>   that we claim some resource without using it.

If it does turn out to be a problem, you can use a similar trick to how
ld.so maps binaries:

mmap(NULL, 2055640, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f221a758000
mmap(0x7f221a780000, 1462272, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x28000) = 0x7f221a780000
mmap(0x7f221a8e5000, 352256, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x18d000) = 0x7f221a8e5000
mmap(0x7f221a93b000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e2000) = 0x7f221a93b000
mmap(0x7f221a941000, 52696, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f221a941000

Although you wouldn't want to do consecutive mmaps, you'd want to use
mremap() with MREMAP_FIXED -- not to change new_address, but to expand
length over the initial reserving-space mapping.
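
A rough sketch of the reserve-then-grow idea, under stated assumptions and
not necessarily the exact call sequence meant above. It assumes the shared
memory is backed by a file descriptor supplied by the caller, and it grows
the region with ftruncate() plus a MAP_FIXED mmap() over the reservation,
the same overlay pattern as the ld.so trace, rather than with mremap().
The names struct region, region_init and region_grow are hypothetical, and
all sizes are assumed to be multiples of the page size.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one resizable shared region. */
struct region {
    char  *base;      /* start of the reserved address range */
    size_t reserved;  /* bytes of address space set aside */
    size_t used;      /* bytes currently backed by the file */
    int    fd;        /* shared file supplied by the caller */
};

/* Reserve `reserved` bytes of address space, then map the first
 * `initial` bytes of fd over the start of the reservation. */
static int region_init(struct region *r, int fd, size_t reserved, size_t initial)
{
    if (initial == 0 || initial > reserved)
        return -1;
    if (ftruncate(fd, (off_t)initial) < 0)
        return -1;

    /* PROT_NONE + MAP_NORESERVE claims addresses only; nothing here
     * is accessible until it is replaced by a real mapping. */
    void *p = mmap(NULL, reserved, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* MAP_FIXED replaces the start of the reservation in one call. */
    if (mmap(p, initial, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_FIXED, fd, 0) == MAP_FAILED) {
        munmap(p, reserved);
        return -1;
    }

    r->base = p;
    r->reserved = reserved;
    r->used = initial;
    r->fd = fd;
    return 0;
}

/* Grow the usable part to new_size without moving existing addresses. */
static int region_grow(struct region *r, size_t new_size)
{
    if (new_size > r->reserved || new_size < r->used)
        return -1;
    if (new_size == r->used)
        return 0;

    /* Extend the file first so the new pages are backed. */
    if (ftruncate(r->fd, (off_t)new_size) < 0)
        return -1;

    /* Overlay only the newly added tail; the pages already in use
     * keep both their addresses and their contents. */
    if (mmap(r->base + r->used, new_size - r->used,
             PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,
             r->fd, (off_t)r->used) == MAP_FAILED)
        return -1;

    r->used = new_size;
    return 0;
}
```

Because the reservation already occupies the whole range, no unrelated
mapping can land in the gap, and each grow step only swaps reserved pages
for file-backed ones. Other processes sharing the region would each have
to repeat the grow step in their own address space.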

