From: Yu Zhao
Date: Sun, 27 Oct 2024 14:51:42 -0600
Subject: Re: [PATCH mm-unstable v2] mm/page_alloc: keep track of free highatomic
To: Vlastimil Babka
Cc: Andrew Morton, Johannes Weiner, Zi Yan, Mel Gorman, Matt Fleming,
 David Rientjes, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Link Lin
References: <20241026033625.2237102-1-yuzhao@google.com>
 <37a28ef7-e477-40b0-a8e4-3d74b747e323@suse.cz>
 <8459b884-5877-41bd-a882-546e046b9dad@suse.cz>
In-Reply-To: <8459b884-5877-41bd-a882-546e046b9dad@suse.cz>

On Sun, Oct 27, 2024 at 2:36 PM Vlastimil Babka wrote:
>
> On 10/27/24 21:17, Yu Zhao wrote:
> > On Sun, Oct 27, 2024 at 1:53 PM Vlastimil Babka wrote:
> >>
> >> On 10/26/24 05:36, Yu Zhao wrote:
> >> > OOM kills due to vastly overestimated free highatomic reserves were
> >> > observed:
> >> >
> >> >   ... invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0 ...
> >> >   Node 0 Normal free:1482936kB boost:0kB min:410416kB low:739404kB high:1068392kB reserved_highatomic:1073152KB ...
> >> >   Node 0 Normal: 1292*4kB (ME) 1920*8kB (E) 383*16kB (UE) 220*32kB (ME) 340*64kB (E) 2155*128kB (UE) 3243*256kB (UE) 615*512kB (U) 1*1024kB (M) 0*2048kB 0*4096kB = 1477408kB
> >> >
> >> > The second line above shows that the OOM kill was due to the following
> >> > condition:
> >> >
> >> >   free (1482936kB) - reserved_highatomic (1073152kB) = 409784KB < min (410416kB)
> >> >
> >> > And the third line shows there were no free pages in any
> >> > MIGRATE_HIGHATOMIC pageblocks, which otherwise would show up as type
> >> > 'H'. Therefore __zone_watermark_unusable_free() underestimated the
> >> > usable free memory by over 1GB, which resulted in the unnecessary OOM
> >> > kill above.
> >> >
> >> > The comments in __zone_watermark_unusable_free() warns about the
> >> > potential risk, i.e.,
> >> >
> >> >   If the caller does not have rights to reserves below the min
> >> >   watermark then subtract the high-atomic reserves. This will
> >> >   over-estimate the size of the atomic reserve but it avoids a search.
> >> >
> >> > However, it is possible to keep track of free pages in reserved
> >> > highatomic pageblocks with a new per-zone counter nr_free_highatomic
> >> > protected by the zone lock, to avoid a search when calculating the
> >>
> >> It's only possible to track this reliably since the "mm: page_alloc:
> >> freelist migratetype hygiene" patchset was merged, which explains why
> >> nr_reserved_highatomic was used until now, even if it's imprecise.
> >
> > I just refreshed my memory by quickly going through the discussion
> > around that series and didn't find anything that helps me understand
> > the above. More pointers please?
>
> For example:
>
> - a page is on pcplist in MIGRATE_MOVABLE list
> - we reserve its pageblock as highatomic, which does nothing to the page on
> the pcplist
> - page above is flushed from pcplist to zone freelist, but it remembers it
> was MIGRATE_MOVABLE, merges with another buddy/buddies from the
> now-highatomic list, the resulting order-X page ends up on the movable
> freelist despite being in highatomic pageblock. The counter of free
> highatomic is now wrong wrt the freelist reality

This is the part I don't follow: how is it wrong w.r.t. the freelist
reality? The new nr_free_highatomic should reflect how many pages are
exactly on free_list[MIGRATE_HIGHATOMIC], because it's updated
accordingly. (My current understanding is that, in this case, the
reservation itself is messed up, i.e., under-reserved.)

> The series has addressed various scenarios like that, where page can end up
> on the wrong freelist.
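
The three-step scenario quoted above can be walked through with a toy
userspace model. This is only a sketch, not kernel code: every toy_* name
is hypothetical, and it assumes the counter is adjusted whenever pages
enter or leave the highatomic free list, including when a buddy is pulled
off that list for merging. Under that assumption the counter stays equal
to the length of the highatomic list, while the merged page of the
reserved pageblock sits on the movable list.

```c
#include <assert.h>

/* Toy userspace model; all toy_* names are hypothetical, not kernel APIs. */
enum toy_migratetype { TOY_MOVABLE, TOY_HIGHATOMIC, TOY_NR_TYPES };

struct toy_zone {
	/* Number of free pages currently on each free list. */
	unsigned long free_list_pages[TOY_NR_TYPES];
	/* Counter meant to mirror free_list[MIGRATE_HIGHATOMIC]. */
	unsigned long nr_free_highatomic;
};

/* Put nr pages on a free list; the counter follows the list, not the pageblock. */
static void toy_add(struct toy_zone *z, enum toy_migratetype mt, unsigned long nr)
{
	z->free_list_pages[mt] += nr;
	if (mt == TOY_HIGHATOMIC)
		z->nr_free_highatomic += nr;
}

/* Take nr pages off a free list (assumption: removal is accounted by list). */
static void toy_del(struct toy_zone *z, enum toy_migratetype mt, unsigned long nr)
{
	assert(z->free_list_pages[mt] >= nr);
	z->free_list_pages[mt] -= nr;
	if (mt == TOY_HIGHATOMIC)
		z->nr_free_highatomic -= nr;
}

struct toy_zone toy_run_scenario(void)
{
	struct toy_zone z = { { 0 }, 0 };

	/*
	 * Step 1: one page of the pageblock sits on a pcplist (so it is on
	 * no free list); its order-0 buddy is free on the movable list.
	 */
	toy_add(&z, TOY_MOVABLE, 1);

	/*
	 * Step 2: the pageblock is reserved as highatomic. Its free pages
	 * move to the highatomic list; the pcplist page is not touched.
	 */
	toy_del(&z, TOY_MOVABLE, 1);
	toy_add(&z, TOY_HIGHATOMIC, 1);

	/*
	 * Step 3: the pcplist page is flushed with its stale MOVABLE type,
	 * merges with the buddy taken from the highatomic list, and the
	 * order-1 result (2 pages) lands on the movable list.
	 */
	toy_del(&z, TOY_HIGHATOMIC, 1);
	toy_add(&z, TOY_MOVABLE, 2);

	return z;
}
```

After toy_run_scenario(), nr_free_highatomic and the highatomic list are
both 0, and the two free pages of the reserved pageblock are counted on
the movable list. In this model that reads as an under-delivered
reservation rather than a counter/freelist mismatch. If the buddy's
removal were instead accounted under the stale MOVABLE type, the counter
would drift from the list.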