linux-mm.kvack.org archive mirror
From: "Petr Špaček" <pspacek@isc.org>
To: Pedro Falcato <pedro.falcato@gmail.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Vlastimil Babka <vbabka@suse.cz>,
	Liam Howlett <liam.howlett@oracle.com>
Subject: Re: [PATCH RFC] mm: mmap: Change DEFAULT_MAX_MAP_COUNT to INT_MAX
Date: Fri, 30 Aug 2024 19:00:33 +0200
Message-ID: <b8436b35-0c39-4471-baf7-ec9a07537f9f@isc.org>
In-Reply-To: <mftebk5inxamd52k46frhq2llopoiiewsgdkrjbamg4yukyhqw@vf4jzz6lmgcu>

On 30. 08. 24 17:04, Pedro Falcato wrote:
> On Fri, Aug 30, 2024 at 04:28:33PM GMT, Petr Špaček wrote:
>> Now I understand your concern. From the docs and code comments I've seen it
>> was not clear that the limit serves _another_ purpose than mere
>> compatibility shim for old ELF tools.
>>
>>> It is a NACK, but it's a NACK because of the limit being so high.
>>>
>>> With steam I believe it is a product of how it performs allocations, and
>>> unfortunately this causes it to allocate quite a bit more than you would
>>> expect.
>>
>> FTR select non-game applications:
>>
>> ElasticSearch and OpenSearch insist on at least 262144.
>> DNS server BIND 9.18.28 linked to jemalloc 5.2.1 was observed with usage
>> around 700000.
>> OpenJDK GC sometimes weeps about values < 737280.
>> SAP docs I was able to access use 1000000.
>> MariaDB is being tested by their QA with 1048576.
>> Fedora, Ubuntu, NixOS, and Arch distros went with value 1048576.
>>
>> Is it worth sending a patch with the default raised to 1048576?
>>
>>
>>> With jemalloc() that seems strange, perhaps buggy behaviour?
>>
>> Good question. In case of BIND DNS server, jemalloc handles mmap() and we
>> keep statistics about bytes requested from malloc().
>>
>> When we hit max_map_count limit the
>> (sum of not-yet-freed malloc(size)) / (vm.max_map_count)
>> gives average size of mmaped block ~ 100 k.
>>
>> Is 100 k way too low / does it indicate a bug? It does not seem terrible to
>> me - the application is handling ~ 100-1500 B packets at rate somewhere
>> between 10-200 k packets per second so it's expected it does lots of small
>> short lived allocations.
>>
>> A complicating factor is that the process itself does not see the current
>> counter value (unless BPF is involved) so it's hard to monitor this until
>> the limit is hit.
> 
> Can you get us a dump of the /proc/<pid>/maps? It'd be interesting to see how
> exactly you're hitting this.

The only thing I have immediately available is a coredump from hitting
the default limit. GDB apparently does not show these regions in "info
proc mappings", but I was able to extract section addresses from the
coredump:
https://users.isc.org/~pspacek/sf1717/elf-sections.csv

The distribution of section sizes and their counts, in "size,count"
format, is here:
https://users.isc.org/~pspacek/sf1717/sizes.csv
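
For a live process, a similar "size,count" distribution could be derived
from /proc/<pid>/maps instead of a coredump. The following is only a
minimal sketch, not the script used to produce the CSV above: the
function name size_histogram and the SAMPLE input are made up for
illustration, and it assumes the usual "start-end perms ..." line layout
with hexadecimal addresses.

```python
from collections import Counter


def size_histogram(maps_text):
    """Map each mapping size in bytes to the number of mappings of that size."""
    sizes = Counter()
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        # The first field of a /proc/<pid>/maps line is "start-end" in hex.
        addr_range = line.split()[0]
        start, end = (int(part, 16) for part in addr_range.split("-"))
        sizes[end - start] += 1
    return sizes


# Hypothetical input for illustration only; not data from the coredump.
SAMPLE = """\
7f0000000000-7f0000001000 rw-p 00000000 00:00 0
7f0000002000-7f0000003000 rw-p 00000000 00:00 0
7f0000010000-7f0000030000 rw-p 00000000 00:00 0
"""

if __name__ == "__main__":
    # Print one "size,count" row per distinct mapping size.
    for size, count in sorted(size_histogram(SAMPLE).items()):
        print(f"{size},{count}")
```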

Some cumulative stats are available as an OpenDocument spreadsheet here:
https://users.isc.org/~pspacek/sf1717/sizes.ods

From a quick glance it is obvious that single-page blocks eat most of
the quota.
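
To put a number on that, the share of mappings that are exactly one page
can be computed from "size,count" rows. This is a rough sketch under
stated assumptions: sizes are in bytes, the page size is 4096, and any
non-numeric row (such as a header) is skipped. The function name
single_page_share is hypothetical and the sample rows in the test call
are invented, not taken from sizes.csv.

```python
import csv
import io


def single_page_share(csv_text, page_size=4096):
    """Return the fraction of all mappings whose size equals page_size."""
    total = 0
    single = 0
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank lines, malformed rows, and a possible header row.
        if len(row) != 2 or not row[0].strip().isdigit():
            continue
        size, count = int(row[0]), int(row[1])
        total += count
        if size == page_size:
            single += count
    return single / total if total else 0.0


if __name__ == "__main__":
    # Invented rows for illustration: three 4 KiB mappings, one 8 KiB mapping.
    print(single_page_share("4096,3\n8192,1\n"))
```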

I don't know whether this is a bug or just memory fragmentation caused
by a long-running server application.

I can try to get data from a production system to you next week if needed.
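
One possible way to collect such data on a running system is to count
the lines of /proc/<pid>/maps and compare the result with
/proc/sys/vm/max_map_count. This is a sketch only: the line count should
approximate the kernel's per-process mapping count but is not guaranteed
to match it exactly, reading the file can be slow for a process with
hundreds of thousands of mappings, and the helper names below are
hypothetical.

```python
def count_mappings(maps_path="/proc/self/maps"):
    """Count non-empty lines, roughly one per mapping, in a maps file."""
    with open(maps_path) as f:
        return sum(1 for line in f if line.strip())


def read_limit(limit_path="/proc/sys/vm/max_map_count"):
    """Read the vm.max_map_count sysctl value as an integer."""
    with open(limit_path) as f:
        return int(f.read().strip())


def usage_ratio(maps_path="/proc/self/maps",
                limit_path="/proc/sys/vm/max_map_count"):
    """Return the approximate fraction of the mapping limit in use."""
    return count_mappings(maps_path) / read_limit(limit_path)


if __name__ == "__main__":
    # Linux only: inspects the current process by default.
    print(f"{count_mappings()} mappings, limit {read_limit()}, "
          f"ratio {usage_ratio():.4f}")
```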

-- 
Petr Špaček
Internet Systems Consortium

