Message-ID: <8360ec33-65f1-497f-8230-665b4328e1c0@valvesoftware.com>
Date: Tue, 16 Apr 2024 18:04:30 -0700
Subject: Re: Increase Default vm_max_map_count to Improve Compatibility with Modern Games
To: "Liam R. Howlett", David Hildenbrand, Oleksandr Natalenko, Andrew Morton
References: <566168554.272637693.1710968734203.JavaMail.root@zimbra54-e10.priv.proxad.net>
 <13499186.uLZWGnKmhe@natalenko.name>
 <1a91e772-4150-4d28-9c67-cb6d0478af79@redhat.com>
 <8f6e2d69-b4df-45f3-aed4-5190966e2dea@valvesoftware.com>
From: "Pierre-Loup A. Griffais"
Sender: owner-linux-mm@kvack.org

On 4/15/24 12:57 PM, Liam R. Howlett wrote:
> * Pierre-Loup A. Griffais [240414 20:22]:
>>
>>
> ...
>
>>>
>>> To be clear, what you are doing here is akin to adding more memory to
>>> your system when there is a memory leak.  This is not the solution you
>>> should be pushing.  Ironically, this is using more memory and performing
>>> worse than it should.  At best, the limit increase is a workaround for
>>> buggy programs.
>>>
>>> At worst, you are enabling bad things to keep happening and normalising
>>> poor programming choices.  Please put pressure on the applications that
>>> clearly have issues.
>>
>> We don't get to prescribe what those applications do. The fact of the matter
>> is that there are several high-performance memory allocators in wide use by
>> game applications that make heavy internal use of mmap(), and that using
>> hundreds of thousands of different memory mappings is well supported on the
>> platform those applications were written for. (or mapping regions with
>> different permissions, which results in different regions after platform
>> translation to Linux happens within Wine)
>
> Thank you for the information on the situation that causes the kernel to
> use such a large number of vmas.
>
> The mmap operations will run faster if there are significantly less
> vmas.
> Having such a large number of objects will cause the faulting of
> information into the memory to be slower, and that would hold true for
> all platforms.
>
> If this is for high-performance, then it would be unlikely that it was
> designed to run with 65,530 objects to search.  It is also odd that
> there are several allocators running into the same issue.  If I were to
> guess, the allocators are trying to bypass the operating system's use of
> memory and implement another way of tracking it specific to your usecase
> for speed.  It sounds like it is being translated incorrectly and
> causing a monster data structure to track it on the kernel side.
>
> If it's a translation layer in wine making a decision on how to
> translate a particular set of calls then it could be fixed, or at least
> examined for inefficiencies.

I mentioned translation because it can play a role if the original
mappings contain regions with different permissions, as it would need to
translate those into several different mappings on Linux, but I wouldn't
expect it to have a meaningful effect. By and large, I think
those mappings are coming as-is through the app.

>
> Either way, the performance will be sub-optimal on the page fault path
> (probably the most common) and any other path that uses such a large
> number of vmas.
>
>>
>> Pointing out that there exists one game that doesn't happen to do that is
>> not terribly useful for the purpose of this discussion.
>
> I provided the data I could collect reasonably quickly, but the scale of
> the difference was the important part of my statement.
>
>>
>> The problem statement seems pretty simple - distributions that want to
>> support those usecases out of the box can make that change, like we've done
>> for years on SteamOS.
>> On those that don't, users of those applications will
>> have to discover and learn to apply the change by hand after having a likely
>> sub-par experience trying to get their game up and running.
>
> This number of vmas is indicating an issue with the utilisation of the
> virtual memory areas.  Increasing the limit is allowing the game to
> run, but it will not be performant.  It is unfortunate that the solution
> was to increase the value.

Games don't necessarily care if mmap() (and ensuing faults) are a bit
slower than the fastest case. Doing such an operation is already
considered a relatively slow path and would likely happen on a resource
loading thread instead of the hot main loop.

>
>>
>> I've yet to hear a specific downside of making the change other than a real
>> concern about DoS of kernel memory in another discussion - it seems to me
>> like there is much lower hanging fruit for DoSing a Linux system you have
>> shell access to, at the moment.
>
> Poor performance is the downside.  The specific downside is the overly
> large data structure that the kernel has to navigate on every page fault
> or any other vma operation.  This isn't specific to changing the number,
> but to the fact that it needed to be changed in the first place.
>
> Is there an upper limit of vmas that you have seen?  Can you provide a
> copy of the mappings when you see this for testing?  This works out to a
> 5 level maple tree.

I don't really know of an upper limit.
I can provide a contrasting anecdote that seems to use a fair number of
mappings - running the title `Hogwarts Legacy` after having loaded into
interactive gameplay in the initial area:

plagman@redcore:~$ cat /proc/2009007/maps | wc -l
27217

Here's a copy of that /proc/2009007/maps if you're curious:
https://www.dropbox.com/scl/fi/rf970vdxoexsx8u1otufl/hogwarts_maps?rlkey=ws8uwz9ivjo6rh0y9h15nsbna&dl=0

I'm guessing there is a guard page after each of those mmap()ed
mini-arenas the allocator creates, effectively doubling the mapping count.

Thanks,
 - Pierre-Loup

>
> Thanks,
> Liam
>
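
For anyone who wants to check the guard-page guess against a maps dump
like the one linked above, here is a minimal sketch. It counts one
mapping per line, as `wc -l` does, and tallies the permission column.
Treating every no-access ("---p") entry as a guard page is an assumption
of this sketch, since such entries can also be plain address-space
reservations. The function name and the sample text are hypothetical and
are not taken from the Hogwarts Legacy dump.

```python
from collections import Counter

# Default vm.max_map_count, the 65,530 figure cited earlier in this thread.
DEFAULT_MAX_MAP_COUNT = 65530


def summarize_maps(text):
    """Count mappings in /proc/<pid>/maps text and tally them by permissions."""
    perms = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        perms[fields[1]] += 1  # second column is the permission string
    total = sum(perms.values())
    return total, perms


# Hypothetical sample: two small read-write arenas, each followed by a
# no-access mapping of one page, which is what a guard page would look like.
SAMPLE = """\
7f0000000000-7f0000010000 rw-p 00000000 00:00 0
7f0000010000-7f0000011000 ---p 00000000 00:00 0
7f0000020000-7f0000030000 rw-p 00000000 00:00 0
7f0000030000-7f0000031000 ---p 00000000 00:00 0
"""

total, perms = summarize_maps(SAMPLE)
guards = perms["---p"]  # assumption: no-access mappings are guard pages
print(f"{total} mappings, {guards} with no access, "
      f"{DEFAULT_MAX_MAP_COUNT - total} below the default limit")
```

To run it against a live process, pass the contents of that process's
maps file instead of the sample, for example
`summarize_maps(open("/proc/2009007/maps").read())`. If roughly half of
the entries turn out to be no-access, that would be consistent with the
doubling described above, though it would not prove it.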