From: Andi Kleen
To: Usama Arif
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, hannes@cmpxchg.org,
 riel@surriel.com, shakeel.butt@linux.dev, roman.gushchin@linux.dev,
 yuzhao@google.com, david@redhat.com, baohua@kernel.org,
 ryan.roberts@arm.com, rppt@kernel.org, willy@infradead.org,
 cerasuolodomenico@gmail.com, corbet@lwn.net,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 kernel-team@meta.com
Subject: Re: [PATCH v3 0/6] mm: split underutilized THPs
In-Reply-To: <20240813120328.1275952-1-usamaarif642@gmail.com> (Usama Arif's message of "Tue, 13 Aug 2024 13:02:43 +0100")
References: <20240813120328.1275952-1-usamaarif642@gmail.com>
Date: Tue, 13 Aug 2024 10:22:48 -0700
Message-ID: <87y150mj6f.fsf@linux.intel.com>

Usama Arif writes:
>
> This patch-series is an attempt to mitigate the issue of running out of
> memory when THP is always enabled. During runtime whenever a THP is being
> faulted in or collapsed by khugepaged, the THP is added to a list.
> Whenever memory reclaim happens, the kernel runs the deferred_split
> shrinker which goes through the list and checks if the THP was underutilized,
> i.e. how many of the base 4K pages of the entire THP were zero-filled.

Sometimes when writing a benchmark I fill things with zero explicitly
to avoid faults later. For example, if you want to measure memory
read bandwidth you need to fault the pages in first, but that fault
pattern may well be zero.

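To make that benchmark pattern concrete, here is a minimal userspace
sketch in C. The helper names (alloc_prefaulted, read_all) are made up
for illustration and are not part of the patch series. The zero stores
go through a volatile pointer so the compiler cannot turn the fill into
a lazily zeroed allocation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate a buffer and write a zero to every
 * 64-bit word, so each page is faulted in before any timed read.
 * The volatile stores cannot be elided by the compiler.
 */
static uint64_t *alloc_prefaulted(size_t bytes)
{
	uint64_t *buf = malloc(bytes);
	volatile uint64_t *w = buf;

	if (!buf)
		return NULL;
	for (size_t i = 0; i < bytes / sizeof(*buf); i++)
		w[i] = 0;	/* touches the page; content stays all zero */
	return buf;
}

/*
 * Hypothetical helper: the part a read-bandwidth benchmark would time.
 * It sums every word so that the loads are not optimized away.
 */
static uint64_t read_all(const uint64_t *buf, size_t bytes)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < bytes / sizeof(*buf); i++)
		sum += buf[i];
	return sum;
}
```

After alloc_prefaulted() every page of the buffer is resident and
zero-filled, which is the kind of content the zero-page check described
in the quoted text looks for.
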
With your patch, if there is memory pressure, there are two effects:

- If things are remapped to the zero page, the benchmark
reading memory may give unrealistically good results, because
what it thinks is a big memory area is actually backed
by only a single page.

- If I expect to write, I may end up with an unexpected zeropage->real
memory fault if the pages got remapped.

I expect such patterns can happen without benchmarking too.
I could see it being a problem for latency-sensitive applications.

Now you could argue that this all should only happen under memory
pressure, and when that happens things may be slow anyways and your
patch will still be an improvement. Maybe that's true, but there might
still be corner cases which are negatively impacted by this. I don't
have a good solution other than a tunable, but I expect it will cause
problems for someone.

The other problem I have with your patch is that it may cause the kernel
to pollute CPU caches in the background, which again will cause noise in
the system. Instead of plain memchr_inv, you should probably use some
primitive to bypass caches, or use an NTA prefetch hint at least.

-Andi
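
A rough userspace sketch of the NTA prefetch idea mentioned above, not
kernel code and not what the patch does. The function name
region_is_zero_nta, the 64-byte line size, and the prefetch distance are
assumptions for illustration. __builtin_prefetch is the GCC/Clang
builtin; its third argument 0 requests no temporal locality, which
compilers typically lower to PREFETCHNTA on x86. It is only a hint, so
how much cache pollution it actually avoids depends on the CPU.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINE	64			/* assumed cache line size */
#define PREFETCH_AHEAD	(4 * CACHE_LINE)	/* arbitrary, untuned distance */

/*
 * Hypothetical helper: return true if all len bytes at start are zero.
 * Scans one cache line at a time and issues a non-temporal prefetch
 * hint for a line further ahead before checking the current one.
 */
static bool region_is_zero_nta(const void *start, size_t len)
{
	const unsigned char *p = start;
	size_t off = 0;

	while (off < len) {
		size_t chunk = len - off < CACHE_LINE ? len - off : CACHE_LINE;

		/* read access (0), no temporal locality (0) */
		if (off + PREFETCH_AHEAD < len)
			__builtin_prefetch(p + off + PREFETCH_AHEAD, 0, 0);

		for (size_t i = 0; i < chunk; i++)
			if (p[off + i])
				return false;	/* first non-zero byte ends the scan */
		off += chunk;
	}
	return true;
}
```

Like memchr_inv, the scan stops at the first non-zero byte, so a page
with real data near its start costs very little to reject.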