Date: Fri, 19 Jan 2024 13:51:38 +0100
From: Michal Hocko
To: Lance Yang
Cc: akpm@linux-foundation.org, zokeefe@google.com, david@redhat.com,
    songmuchun@bytedance.com, shy828301@gmail.com, peterx@redhat.com,
    mknyszek@google.com, minchan@kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/1] mm/madvise: add MADV_F_COLLAPSE_LIGHT to process_madvise()
References: <20240118120347.61817-1-ioworker0@gmail.com>

On Fri 19-01-24 10:03:05, Lance Yang wrote:
> Hey Michal,
>
> Thanks for taking the time to review!
>
> On Thu, Jan 18, 2024 at 9:40 PM Michal Hocko wrote:
> >
> > On Thu 18-01-24 20:03:46, Lance Yang wrote:
> > [...]
> >
> > before we discuss the semantic, let's focus on the usecase.
> >
> > > Use Cases
> > >
> > > An immediate user of this new functionality is the Go runtime heap allocator
> > > that manages memory in hugepage-sized chunks.
> > > In the past, whether it was a
> > > newly allocated chunk through mmap() or a reused chunk released by
> > > madvise(MADV_DONTNEED), the allocator attempted to eagerly back memory with
> > > huge pages using madvise(MADV_HUGEPAGE)[2] and madvise(MADV_COLLAPSE)[3]
> > > respectively. However, both approaches resulted in performance issues; for
> > > both scenarios, there could be entries into direct reclaim and/or compaction,
> > > leading to unpredictable stalls[4]. Now, the allocator can confidently use
> > > process_madvise(MADV_F_COLLAPSE_LIGHT) to attempt the allocation of huge pages.
> >
> > IIUC the primary reason is the cost of the huge page allocation which
> > can be really high if the memory is heavily fragmented and it is called
> > synchronously from the process directly, correct? Can that be worked
>
> Yes, that's correct.
>
> > around by process_madvise and performing the operation from a different
> > context? Are there any other reasons to have a different mode?
>
> In latency-sensitive scenarios, some applications aim to enhance performance
> by utilizing huge pages as much as possible. At the same time, in case of
> allocation failure, they prefer a quick return without triggering direct memory
> reclamation and compaction.

Could you elaborate some more on why?

> > I mean I can think of a more relaxed (opportunistic) MADV_COLLAPSE -
> > e.g. non blocking one to make sure that the caller doesn't really block
> > on resource contention (be it locks or memory availability) because that
> > matches our non-blocking interface in other areas but having a LIGHT
> > operation sounds really vague and the exact semantic would be
> > implementation specific and might change over time. Non-blocking has a
> > clear semantic but it is not really clear whether that is what you
> > really need/want.
>
> Could you provide me with some suggestions regarding the naming of a
> more relaxed (opportunistic) MADV_COLLAPSE?

Naming is not all that important at this stage (it could be
MADV_COLLAPSE_NOBLOCK for example). The primary question is whether
non-blocking in general is the desired behavior or the implementation
should try but not too hard.
-- 
Michal Hocko
SUSE Labs
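
For reference, the synchronous path whose latency is being discussed can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions, not code from the patch or from the Go runtime: it uses the existing madvise(MADV_COLLAPSE) call, assumes 2 MiB PMD-sized huge pages, and the helper names map_aligned_chunk() and collapse_chunk() are hypothetical. Neither MADV_F_COLLAPSE_LIGHT nor MADV_COLLAPSE_NOBLOCK is used, since both are only proposals in this thread.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

/* MADV_COLLAPSE exists since Linux 6.1; older libc headers may lack it. */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/* Assumption: 2 MiB PMD-sized huge pages (true on x86-64 by default). */
#define CHUNK_SIZE (2UL << 20)

/* Hypothetical helper: map one CHUNK_SIZE-aligned anonymous chunk. */
static void *map_aligned_chunk(void)
{
    /* Over-allocate so an aligned chunk is guaranteed to fit, then trim. */
    size_t len = 2 * CHUNK_SIZE;
    char *raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    uintptr_t start = ((uintptr_t)raw + CHUNK_SIZE - 1) & ~(CHUNK_SIZE - 1);
    char *chunk = (char *)start;
    char *end = raw + len;

    if (chunk > raw)                      /* unmap the unaligned head */
        munmap(raw, (size_t)(chunk - raw));
    if (end > chunk + CHUNK_SIZE)         /* unmap the unused tail */
        munmap(chunk + CHUNK_SIZE, (size_t)(end - (chunk + CHUNK_SIZE)));
    return chunk;
}

/*
 * Hypothetical helper: collapse the chunk synchronously and time the call.
 * Returns 0 on success or a positive errno value on failure (for example
 * EINVAL when the kernel does not support MADV_COLLAPSE).
 */
static int collapse_chunk(void *chunk, double *elapsed_ms)
{
    struct timespec t0, t1;

    memset(chunk, 1, CHUNK_SIZE);         /* fault in the base pages first */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    int rc = madvise(chunk, CHUNK_SIZE, MADV_COLLAPSE);
    int err = rc ? errno : 0;             /* save errno before other calls */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *elapsed_ms = (t1.tv_sec - t0.tv_sec) * 1e3 +
                  (t1.tv_nsec - t0.tv_nsec) / 1e6;
    return err;
}
```

The caller blocks inside madvise() for the whole duration measured here, including any reclaim or compaction the kernel performs on its behalf. That is the cost a non-blocking or "try but not too hard" mode would be meant to bound.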