Date: Wed, 26 Jul 2023 16:38:11 -0700
From: Roman Gushchin
To: Johannes Weiner
Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Rik van Riel, Joonsoo Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: page_alloc: consume available CMA space first
In-Reply-To: <20230726145304.1319046-1-hannes@cmpxchg.org>
References: <20230726145304.1319046-1-hannes@cmpxchg.org>

On Wed, Jul 26, 2023 at 10:53:04AM -0400, Johannes Weiner wrote:
> On a
memcache setup with heavy anon usage and no swap, we routinely
> see premature OOM kills with multiple gigabytes of free space left:
>
>   Node 0 Normal free:4978632kB [...] free_cma:4893276kB
>
> This free space turns out to be CMA. We set CMA regions aside for
> potential hugetlb users on all of our machines, figuring that even if
> there aren't any, the memory is available to userspace allocations.
>
> When the OOMs trigger, it's from unmovable and reclaimable allocations
> that aren't allowed to dip into CMA. The non-CMA regions meanwhile are
> dominated by the anon pages.
>
> Because we have more options for CMA pages, change the policy to
> always fill up CMA first. This reduces the risk of premature OOMs.

I suspect it might cause regressions on small(er) devices, where a
relatively small cma area (a few MBs) is often reserved for use by
various device drivers, which can't handle allocation failures well
(even interim allocation failures). Startup time can regress too:
migrating pages out of cma takes time. And given the velocity of
kernel upgrades on such devices, we won't learn about it for the next
couple of years.

> Movable pages can be migrated out of CMA when necessary, but we don't
> have a mechanism to migrate them *into* CMA to make room for unmovable
> allocations. The only recourse we have for these pages is reclaim,
> which due to a lack of swap is unavailable in our case.

Idk, should we introduce such a mechanism? Or use some alternative
heuristic that strikes a better compromise between those who need cma
allocations to always succeed and those who use large cma areas for
opportunistic huge page allocations?

Of course, we can add a boot flag/sysctl/per-cma-area flag, but I
doubt we really want this.

Thanks!