Date: Thu, 18 Jan 2024 14:40:19 +0100
From: Michal Hocko
To: Lance Yang
Cc: akpm@linux-foundation.org, zokeefe@google.com, david@redhat.com,
    songmuchun@bytedance.com, shy828301@gmail.com, peterx@redhat.com,
    mknyszek@google.com, minchan@kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/1] mm/madvise: add MADV_F_COLLAPSE_LIGHT to process_madvise()
References: <20240118120347.61817-1-ioworker0@gmail.com>
In-Reply-To: <20240118120347.61817-1-ioworker0@gmail.com>

On Thu 18-01-24 20:03:46, Lance Yang wrote:
[...]

Before we discuss the semantics, let's focus on the use case.

> Use Cases
>
> An immediate user of this new functionality is the Go runtime heap allocator
> that manages memory in hugepage-sized chunks. In the past, whether it was a
> newly allocated chunk through mmap() or a reused chunk released by
> madvise(MADV_DONTNEED), the allocator attempted to eagerly back memory with
> huge pages using madvise(MADV_HUGEPAGE)[2] and madvise(MADV_COLLAPSE)[3]
> respectively. However, both approaches resulted in performance issues; for
> both scenarios, there could be entries into direct reclaim and/or compaction,
> leading to unpredictable stalls[4]. Now, the allocator can confidently use
> process_madvise(MADV_F_COLLAPSE_LIGHT) to attempt the allocation of huge pages.

IIUC the primary reason is the cost of the huge page allocation, which
can be really high if the memory is heavily fragmented and it is called
synchronously from the process directly, correct? Can that be worked
around by using process_madvise and performing the operation from a
different context? Are there any other reasons to have a different mode?

I mean, I can think of a more relaxed (opportunistic) MADV_COLLAPSE,
e.g. a non-blocking one that makes sure the caller doesn't really block
on resource contention (be it locks or memory availability), because
that matches our non-blocking interfaces in other areas. But a LIGHT
operation sounds really vague: the exact semantics would be
implementation specific and might change over time. Non-blocking has
clear semantics, but it is not really clear whether that is what you
really need/want.

> [1] https://github.com/torvalds/linux/commit/7d8faaf155454f8798ec56404faca29a82689c77
> [2] https://github.com/golang/go/commit/8fa9e3beee8b0e6baa7333740996181268b60a3a
> [3] https://github.com/golang/go/commit/9f9bb26880388c5bead158e9eca3be4b3a9bd2af
> [4] https://github.com/golang/go/issues/63334
>
> [v1] https://lore.kernel.org/lkml/20240117050217.43610-1-ioworker0@gmail.com/

-- 
Michal Hocko
SUSE Labs
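
A minimal sketch of the "different context" idea asked about above, under
several assumptions: it uses the existing MADV_COLLAPSE advice (Linux 6.1+)
rather than the proposed MADV_F_COLLAPSE_LIGHT, it assumes 2 MiB PMD-sized
huge pages, and the helper names collapse_range() and alloc_hpage_chunk()
are hypothetical. The intent is that a helper thread or process, not the
latency-sensitive allocator path, calls collapse_range(), so the huge page
allocation is performed by the helper.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Fallbacks for older userspace headers (generic syscall numbering). */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_process_madvise
#define SYS_process_madvise 440
#endif

/* Assumption: PMD-sized huge pages are 2 MiB (true on x86-64 with 4K pages). */
#define HPAGE_SIZE (2UL << 20)

/* Hypothetical helper: map one huge-page-aligned anonymous chunk.
 * Over-allocates and rounds up; the unaligned slack stays mapped. */
void *alloc_hpage_chunk(void)
{
    char *raw = mmap(NULL, 2 * HPAGE_SIZE, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;
    uintptr_t aligned = ((uintptr_t)raw + HPAGE_SIZE - 1) &
                        ~(uintptr_t)(HPAGE_SIZE - 1);
    return (void *)aligned;
}

/* Hypothetical helper: ask the kernel to collapse [addr, addr + len) in
 * process `pid` into huge pages. Meant to be called from a helper context.
 * Returns 0 on success or a positive errno value on failure (for example
 * EPERM without sufficient privileges, EINVAL if the advice is not
 * supported, ENOSYS on kernels without the syscall). */
int collapse_range(pid_t pid, void *addr, size_t len)
{
    struct iovec iov = { .iov_base = addr, .iov_len = len };
    int pidfd = (int)syscall(SYS_pidfd_open, pid, 0U);
    if (pidfd < 0)
        return errno;

    long ret = syscall(SYS_process_madvise, pidfd, &iov, (size_t)1,
                       MADV_COLLAPSE, 0U);
    int err = ret < 0 ? errno : 0;
    close(pidfd);
    return err;
}
```

A caller would typically hand the chunk address to the helper and carry on
without waiting. Failure is expected to be common and harmless: the range
simply stays backed by base pages.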