From: Lance Yang
Date: Fri, 19 Jan 2024 10:03:05 +0800
Subject: Re: [PATCH v2 1/1] mm/madvise: add MADV_F_COLLAPSE_LIGHT to process_madvise()
To: Michal Hocko
Cc: akpm@linux-foundation.org, zokeefe@google.com, david@redhat.com,
    songmuchun@bytedance.com, shy828301@gmail.com, peterx@redhat.com,
    mknyszek@google.com, minchan@kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
References: <20240118120347.61817-1-ioworker0@gmail.com>

Hey Michal,

Thanks for taking the time to review!

On Thu, Jan 18, 2024 at 9:40 PM Michal Hocko wrote:
>
> On Thu 18-01-24 20:03:46, Lance Yang wrote:
> [...]
>
> before we discuss the semantic, let's focus on the usecase.
>
> > Use Cases
> >
> > An immediate user of this new functionality is the Go runtime heap allocator
> > that manages memory in hugepage-sized chunks. In the past, whether it was a
> > newly allocated chunk through mmap() or a reused chunk released by
> > madvise(MADV_DONTNEED), the allocator attempted to eagerly back memory with
> > huge pages using madvise(MADV_HUGEPAGE)[2] and madvise(MADV_COLLAPSE)[3]
> > respectively. However, both approaches resulted in performance issues; for
> > both scenarios, there could be entries into direct reclaim and/or compaction,
> > leading to unpredictable stalls[4]. Now, the allocator can confidently use
> > process_madvise(MADV_F_COLLAPSE_LIGHT) to attempt the allocation of huge pages.
>
> IIUC the primary reason is the cost of the huge page allocation which
> can be really high if the memory is heavily fragmented and it is called
> synchronously from the process directly, correct? Can that be worked

Yes, that's correct.

> around by process_madvise and performing the operation from a different
> context? Are there any other reasons to have a different mode?

In latency-sensitive scenarios, some applications aim to improve performance
by using huge pages as much as possible. At the same time, if the huge page
allocation fails, they prefer a quick return without entering direct reclaim
or compaction.

>
> I mean I can think of a more relaxed (opportunistic) MADV_COLLAPSE -
> e.g. non blocking one to make sure that the caller doesn't really block
> on resource contention (be it locks or memory availability) because that
> matches our non-blocking interface in other areas but having a LIGHT
> operation sounds really vague and the exact semantic would be
> implementation specific and might change over time. Non-blocking has a
> clear semantic but it is not really clear whether that is what you
> really need/want.

Could you provide me with some suggestions regarding the naming of a
more relaxed (opportunistic) MADV_COLLAPSE?

Thanks again for your review and your suggestion!
Lance

>
> > [1] https://github.com/torvalds/linux/commit/7d8faaf155454f8798ec56404faca29a82689c77
> > [2] https://github.com/golang/go/commit/8fa9e3beee8b0e6baa7333740996181268b60a3a
> > [3] https://github.com/golang/go/commit/9f9bb26880388c5bead158e9eca3be4b3a9bd2af
> > [4] https://github.com/golang/go/issues/63334
> >
> > [v1] https://lore.kernel.org/lkml/20240117050217.43610-1-ioworker0@gmail.com/
> --
> Michal Hocko
> SUSE Labs
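
For readers following the thread, the existing synchronous path discussed
above can be sketched roughly as below. This is only an illustration, not
code from the patch or from the Go runtime. The helper names
hpage_align_up() and try_collapse_chunk() are hypothetical, the 2 MiB huge
page size is an assumption (the x86-64 default), and the fallback value 25
for MADV_COLLAPSE is the one from asm-generic/mman-common.h. The sketch
deliberately does not use MADV_F_COLLAPSE_LIGHT, since that flag is only
proposed in this series and has no mainline definition.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Assumption: 2 MiB PMD-sized huge pages (x86-64 default). */
#define HPAGE_SIZE ((size_t)2 << 20)

/*
 * MADV_COLLAPSE exists since Linux 6.1; older libc headers may not
 * define it. 25 is the asm-generic value; a few architectures differ.
 */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/* Round an address up to the next huge-page boundary. */
static uintptr_t hpage_align_up(uintptr_t addr)
{
	return (addr + HPAGE_SIZE - 1) & ~(uintptr_t)(HPAGE_SIZE - 1);
}

/*
 * Hypothetical helper: map one huge-page-aligned chunk, fault it in,
 * and request a synchronous collapse of that chunk.
 *
 * Returns 0 on success or a negative errno. A failure is not fatal to
 * the caller: the chunk simply stays backed by base pages. The call
 * may enter direct reclaim and/or compaction before it returns, which
 * is the source of the stalls described in the use case.
 */
static int try_collapse_chunk(void)
{
	size_t len = 2 * HPAGE_SIZE;	/* over-map so an aligned chunk fits */
	void *raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	void *chunk;
	int ret = 0;

	if (raw == MAP_FAILED)
		return -errno;

	chunk = (void *)hpage_align_up((uintptr_t)raw);
	memset(chunk, 0xA5, HPAGE_SIZE);	/* populate with base pages */

	if (madvise(chunk, HPAGE_SIZE, MADV_COLLAPSE) != 0)
		ret = -errno;	/* e.g. EINVAL on kernels without the advice */

	munmap(raw, len);
	return ret;
}
```

An allocator would treat any negative return as "no huge page this time"
and carry on; the open question in this thread is how to make that attempt
cheap when memory is fragmented.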