Date: Wed, 22 Jan 2025 09:42:40 -0500
From: Johannes Weiner
To: yangge1116@126.com
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, 21cnbao@gmail.com, david@redhat.com, baolin.wang@linux.alibaba.com, vbabka@suse.cz, liuzixing@hygon.cn
Subject: Re: [PATCH V2] mm: compaction: use the actual allocation context to determine the watermarks for costly order during async memory compaction
Message-ID: <20250122144240.GA217180@cmpxchg.org>
References: <1736991214-29069-1-git-send-email-yangge1116@126.com>
In-Reply-To: <1736991214-29069-1-git-send-email-yangge1116@126.com>

On Thu, Jan 16, 2025 at 09:33:34AM +0800, yangge1116@126.com wrote:
> From: yangge
>
> There are 4 NUMA nodes on my machine, and each NUMA node has 32GB
> of memory. I have configured 16GB of CMA memory on each NUMA node,
> and starting a 32GB virtual machine with device passthrough is
> extremely slow, taking almost an hour.
>
> Long term GUP cannot allocate memory from CMA area, so a maximum of
> 16 GB of no-CMA memory on a NUMA node can be used as virtual machine
> memory. There is 16GB of free CMA memory on a NUMA node, which is
> sufficient to pass the order-0 watermark check, causing the
> __compaction_suitable() function to consistently return true.
>
> For costly allocations, if the __compaction_suitable() function always
> returns true, it causes the __alloc_pages_slowpath() function to fail
> to exit at the appropriate point. This prevents timely fallback to
> allocating memory on other nodes, ultimately resulting in excessively
> long virtual machine startup times.
> Call trace:
> __alloc_pages_slowpath
>     if (compact_result == COMPACT_SKIPPED ||
>         compact_result == COMPACT_DEFERRED)
>         goto nopage; // should exit __alloc_pages_slowpath() from here
>
> We could use the real unmovable allocation context to have
> __zone_watermark_unusable_free() subtract CMA pages, and thus we won't
> pass the order-0 check anymore once the non-CMA part is exhausted. There
> is some risk that in some different scenario the compaction could in
> fact migrate pages from the exhausted non-CMA part of the zone to the
> CMA part and succeed, and we'll skip it instead. But only __GFP_NORETRY
> allocations should be affected in the immediate "goto nopage" when
> compaction is skipped, others will attempt with DEF_COMPACT_PRIORITY
> anyway and won't fail without trying to compact-migrate the non-CMA
> pageblocks into CMA pageblocks first, so it should be fine.
>
> After this fix, it only takes a few tens of seconds to start a 32GB
> virtual machine with device passthrough functionality.
>
> Link: https://lore.kernel.org/lkml/1736335854-548-1-git-send-email-yangge1116@126.com/
> Signed-off-by: yangge
> Acked-by: Vlastimil Babka

Acked-by: Johannes Weiner
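
For readers following the changelog, the effect of subtracting free CMA
pages in the order-0 check can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not the kernel
code: struct zone_model, MODEL_ALLOC_CMA, model_unusable_free() and
model_watermark_ok() are hypothetical names invented for illustration,
and the model ignores highatomic reserves, lowmem_reserve and the
per-order free list scan that the real watermark check performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag for this sketch; stands in for the kernel's ALLOC_CMA. */
#define MODEL_ALLOC_CMA 0x1u

/* Hypothetical, heavily simplified stand-in for the per-zone counters. */
struct zone_model {
	long free_pages;     /* all free pages in the zone, CMA included */
	long free_cma_pages; /* free pages that sit in CMA pageblocks */
	long watermark;      /* watermark the check compares against */
};

/*
 * Pages that are counted as free but that this allocation context
 * cannot use. A context without MODEL_ALLOC_CMA (for example an
 * unmovable allocation) must not count free CMA pages.
 */
static long model_unusable_free(const struct zone_model *z,
				unsigned int order, unsigned int alloc_flags)
{
	long unusable = (1L << order) - 1;

	assert(z != NULL);
	if (!(alloc_flags & MODEL_ALLOC_CMA))
		unusable += z->free_cma_pages;
	return unusable;
}

/* True when the usable free pages are above the watermark. */
static bool model_watermark_ok(const struct zone_model *z,
			       unsigned int order, unsigned int alloc_flags)
{
	long usable = z->free_pages - model_unusable_free(z, order, alloc_flags);

	return usable > z->watermark;
}
```

Take a zone with 16GB of free CMA memory (4194304 pages, assuming 4KB
pages) and, as made-up numbers, 1000 free non-CMA pages and a watermark
of 16384 pages. The order-0 check passes when MODEL_ALLOC_CMA is set
and fails when it is not. That is the difference between compaction
being judged suitable again and again and the allocation bailing out
early to try another node.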