Date: Sat, 22 Mar 2025 23:46:57 -0400
From: Johannes Weiner
To: Brendan Jackman
Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, Zi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/5] mm: page_alloc: defrag_mode
Message-ID: <20250323034657.GD1894930@cmpxchg.org>
References: <20250313210647.1314586-1-hannes@cmpxchg.org> <20250313210647.1314586-4-hannes@cmpxchg.org> <20250323005823.GB1894930@cmpxchg.org> <20250323013405.GC1894930@cmpxchg.org>
In-Reply-To: <20250323013405.GC1894930@cmpxchg.org>

On Sat, Mar 22, 2025 at 09:34:09PM -0400, Johannes Weiner wrote:
> On Sat, Mar 22, 2025 at 08:58:27PM -0400, Johannes Weiner wrote:
> > On Sat, Mar 22, 2025 at 04:05:52PM +0100, Brendan Jackman wrote:
> > > On Thu Mar 13, 2025 at 10:05 PM CET, Johannes Weiner wrote:
> > > > +	/* Reclaim/compaction failed to prevent the fallback */
> > > > +	if (defrag_mode) {
> > > > +		alloc_flags &= ALLOC_NOFRAGMENT;
> > > > +		goto retry;
> > > > +	}
> > >
> > > I can't see where ALLOC_NOFRAGMENT gets cleared, is it supposed to be
> > > here (i.e. should this be ~ALLOC_NOFRAGMENT)?
>
> Please ignore my previous email, this is actually a much more severe
> issue than I thought at first. The screwed up clearing is bad, but
> this will also not check the flag before retrying, which means the
> thread will retry reclaim/compaction and never reach OOM.
>
> This code has weeks of load testing, with workloads fine-tuned to
> *avoid* OOM. A blatant OOM test shows this problem immediately.
>
> A simple fix, but I'll put it through the wringer before sending it.

Ok, here is the patch. I verified this with intentional OOMing 100
times in a loop; this would previously lock up on first try in
defrag_mode, but kills and recovers reliably with this applied.

I also re-ran the full THP benchmarks, to verify that erroneous
looping here did not accidentally contribute to fragmentation
avoidance and thus THP success & latency rates.
They were in fact not; the improvements claimed for defrag_mode are
unchanged with this fix:

                                              VANILLA    defrag_mode=1-OOMFIX
Hugealloc Time mean               52739.45 (  +0.00%)     27342.44 ( -48.15%)
Hugealloc Time stddev             56541.26 (  +0.00%)     33227.16 ( -41.23%)
Kbuild Real time                    197.47 (  +0.00%)       196.32 (  -0.58%)
Kbuild User time                   1240.49 (  +0.00%)      1231.89 (  -0.69%)
Kbuild System time                   70.08 (  +0.00%)        58.75 ( -15.95%)
THP fault alloc                   46727.07 (  +0.00%)     62669.93 ( +34.12%)
THP fault fallback                21910.60 (  +0.00%)      5966.40 ( -72.77%)
Direct compact fail                 195.80 (  +0.00%)        50.53 ( -73.81%)
Direct compact success                7.93 (  +0.00%)         4.07 ( -43.28%)
Compact daemon scanned migrate  3369601.27 (  +0.00%)   1588238.93 ( -52.87%)
Compact daemon scanned free     5075474.47 (  +0.00%)   1441944.27 ( -71.59%)
Compact direct scanned migrate   161787.27 (  +0.00%)     64838.53 ( -59.92%)
Compact direct scanned free      163467.53 (  +0.00%)     37243.00 ( -77.22%)
Compact total migrate scanned   3531388.53 (  +0.00%)   1653077.47 ( -53.19%)
Compact total free scanned      5238942.00 (  +0.00%)   1479187.27 ( -71.77%)
Alloc stall                        2371.07 (  +0.00%)       553.00 ( -76.64%)
Pages kswapd scanned            2160926.73 (  +0.00%)   4052539.93 ( +87.54%)
Pages kswapd reclaimed           533191.07 (  +0.00%)    765447.47 ( +43.56%)
Pages direct scanned             400450.33 (  +0.00%)    358933.93 ( -10.37%)
Pages direct reclaimed            94441.73 (  +0.00%)     26991.60 ( -71.42%)
Pages total scanned             2561377.07 (  +0.00%)   4411473.87 ( +72.23%)
Pages total reclaimed            627632.80 (  +0.00%)    792439.07 ( +26.26%)
Swap out                          47959.53 (  +0.00%)    128511.80 (+167.96%)
Swap in                            7276.00 (  +0.00%)     27736.20 (+281.16%)
File refaults                    138043.00 (  +0.00%)    206198.40 ( +49.37%)

Many thanks for your careful review, Brendan.

---

From c84651a46910448c6cfaf44885644fdb215f7f6a Mon Sep 17 00:00:00 2001
From: Johannes Weiner
Date: Sat, 22 Mar 2025 19:21:45 -0400
Subject: [PATCH] mm: page_alloc: fix defrag_mode's retry & OOM path

Brendan points out that defrag_mode doesn't properly clear
ALLOC_NOFRAGMENT on its last-ditch attempt to allocate.
But looking closer, the problem is actually more severe: it doesn't
actually *check* whether it's already retried, and keeps looping. This
means the OOM path is never taken, and the thread can loop
indefinitely.

This is verified with an intentional OOM test on defrag_mode=1, which
results in the machine hanging. After this patch, it triggers the OOM
kill reliably and recovers.

Clear ALLOC_NOFRAGMENT properly, and only retry once.

Fixes: e3aa7df331bc ("mm: page_alloc: defrag_mode")
Reported-by: Brendan Jackman
Signed-off-by: Johannes Weiner
---
 mm/page_alloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0c01998cb3a0..582364d42906 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4543,8 +4543,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		goto retry;
 
 	/* Reclaim/compaction failed to prevent the fallback */
-	if (defrag_mode) {
-		alloc_flags &= ALLOC_NOFRAGMENT;
+	if (defrag_mode && (alloc_flags & ALLOC_NOFRAGMENT)) {
+		alloc_flags &= ~ALLOC_NOFRAGMENT;
 		goto retry;
 	}
 
-- 
2.49.0
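
For anyone following the thread, the failure mode described in the
commit message can be modeled in a few lines of userspace C. This is
only a sketch under stated assumptions, not kernel code: the flag
values, ALLOC_OTHER, enum fallback_variant and retries_before_oom()
are hypothetical names made up for illustration, every allocation
attempt is assumed to fail, and a capped retry counter stands in for
the real reclaim/compaction loop.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration; the kernel's differ. */
#define ALLOC_NOFRAGMENT 0x100u
#define ALLOC_OTHER      0x001u	/* stand-in for any unrelated flag */

enum fallback_variant { VARIANT_BUGGY, VARIANT_FIXED };

/*
 * Model only the tail of the slowpath, assuming every allocation
 * attempt fails. Returns the number of retries taken before control
 * reaches the OOM path, or -1 if max_retries is exhausted, which
 * stands in for a thread that loops forever.
 */
static int retries_before_oom(enum fallback_variant v, bool defrag_mode,
			      unsigned int alloc_flags, int max_retries)
{
	int retries = 0;

retry:
	if (retries >= max_retries)
		return -1;

	if (v == VARIANT_BUGGY) {
		/* No check of the flag: this branch is taken every time. */
		if (defrag_mode) {
			/* Masks everything *except* the flag; it stays set. */
			alloc_flags &= ALLOC_NOFRAGMENT;
			retries++;
			goto retry;
		}
	} else {
		/* Retry only while the flag is still set, i.e. once. */
		if (defrag_mode && (alloc_flags & ALLOC_NOFRAGMENT)) {
			alloc_flags &= ~ALLOC_NOFRAGMENT;
			retries++;
			goto retry;
		}
	}

	/* Falling through here corresponds to reaching the OOM path. */
	return retries;
}
```

In this model, the buggy variant with defrag_mode set exhausts any
retry budget it is given, while the fixed variant retries exactly once
and then falls through to the OOM path.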