From: John Hubbard <jhubbard@nvidia.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Mikael Pettersson <mikpelinux@gmail.com>,
linux-mm@kvack.org,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
linux-fsdevel@vger.kernel.org,
Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH] mm: disable `vm.max_map_count' sysctl limit
Date: Tue, 28 Nov 2017 21:14:23 -0800
Message-ID: <5ca7d54b-5ae4-646d-f3a0-9b85129c9ccf@nvidia.com>
In-Reply-To: <20171128081259.gnkiw5227dtmfm4l@dhcp22.suse.cz>
On 11/28/2017 12:12 AM, Michal Hocko wrote:
> On Mon 27-11-17 15:26:27, John Hubbard wrote:
> [...]
>> Let me add a belated report, then: we ran into this limit while implementing
>> an early version of Unified Memory[1], back in 2013. The implementation
>> at the time depended on tracking that assumed "one allocation == one vma".
>
> And you tried hard to make those VMAs really separate? E.g. with
> prot_none gaps?
We didn't do that, and in fact I'm probably failing to grasp the underlying
design idea that you have in mind there...hints welcome...
What we did was to hook into the mmap callbacks in the kernel driver, after
userspace mmap'd a region (via a custom allocator API). And we had an ioctl
in there, to connect up other allocation attributes that couldn't be passed
through via mmap. Again, this was for regions of memory that were to be
migrated between CPU and device (GPU).
>
>> So, with only 64K vmas, we quickly ran out, and changed the design to work
>> around that. (And later, the design was *completely* changed to use a separate
>> tracking system altogether).
>>
>> The existing limit seems rather too low, at least from my perspective. Maybe
>> it would be better, if expressed as a function of RAM size?
>
> Dunno. Whenever we tried to do RAM scaling it turned out to be a bad idea
> after years, when memory had grown much more than the code author expected.
> Just look how we scaled hash table sizes... But maybe you can come up
> with something clever. In any case tuning this from the userspace is a
> trivial thing to do and I am somehow skeptical that any early boot code
> would trip over the limit.
>
I agree that this is not a limit that boot code is likely to hit. And maybe
tuning from userspace really is the right approach here, considering that
there is a real cost to going too large.
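For anyone who wants to see how close a given process is to the limit before tuning it, here is a minimal sketch. It assumes a Linux /proc layout; the helper names (read_max_map_count, count_mappings) are made up for illustration and are not part of any existing tool.

```python
from pathlib import Path


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Return the current vm.max_map_count, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Not on Linux, no /proc mounted, or unexpected file contents.
        return None


def count_mappings(maps_text):
    """Count mappings in the text of a /proc/<pid>/maps file.

    Each non-empty line of that file describes one mapping (one vma).
    """
    return sum(1 for line in maps_text.splitlines() if line.strip())


if __name__ == "__main__":
    limit = read_max_map_count()
    try:
        used = count_mappings(Path("/proc/self/maps").read_text())
    except OSError:
        used = None
    print(f"vm.max_map_count = {limit}, mappings in this process = {used}")
```

Raising the limit itself is then a matter of writing a new value to the same sysctl as root, for example with sysctl -w vm.max_map_count=<value>.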
Just philosophically here, hard limits like this seem a little awkward if they
are set once in, say, 1999 (gross exaggeration here, for effect) and then not
updated to keep up with the times, right? In other words, one should not routinely
need to tune most things. That's why I was wondering if something crude and silly
would work, such as just a ratio of RAM to vma count. (I'm mostly trying to
understand the "rules" here, rather than to debate. I don't have a strong opinion
on this.)
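To make the "crude and silly" idea concrete, here is a rough sketch of a RAM-proportional limit. The ratio (one vma per 128 KiB of RAM) and the function name are purely hypothetical, chosen only for illustration. The floor of 65530 is the kernel's current default for vm.max_map_count. None of this is an actual kernel proposal.

```python
# Current kernel default for vm.max_map_count.
DEFAULT_MAX_MAP_COUNT = 65530

# Hypothetical ratio, for illustration only: allow one vma per 128 KiB of RAM.
BYTES_PER_VMA = 128 * 1024


def scaled_map_limit(ram_bytes, bytes_per_vma=BYTES_PER_VMA):
    """Return a vma limit that grows with RAM but never drops below the default."""
    return max(DEFAULT_MAX_MAP_COUNT, ram_bytes // bytes_per_vma)


if __name__ == "__main__":
    for gib in (1, 8, 64, 512):
        print(f"{gib:4d} GiB RAM -> limit {scaled_map_limit(gib * 2**30)}")
```

With this particular ratio, machines with up to roughly 8 GiB of RAM stay at about today's default, and the limit grows linearly above that. Whether linear growth is sane at very large memory sizes is exactly the kind of thing that would need careful thought.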
The fact that this apparently failed with hash tables is interesting; I'd
love to read more if you have any notes or links. I spotted a 2014 LWN article
(https://lwn.net/Articles/612100) about hash table resizing, and some commits
that fixed resizing bugs, such as
12311959ecf8a ("rhashtable: fix shift by 64 when shrinking")
...was it just a storm of bugs that showed up?
thanks,
John Hubbard