From: Lance Yang
Date: Sat, 20 Jan 2024 10:09:32 +0800
Subject: Re: [PATCH v2 1/1] mm/madvise: add MADV_F_COLLAPSE_LIGHT to process_madvise()
To: Michal Hocko
Cc: akpm@linux-foundation.org, zokeefe@google.com, david@redhat.com, songmuchun@bytedance.com, shy828301@gmail.com, peterx@redhat.com, mknyszek@google.com, minchan@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240118120347.61817-1-ioworker0@gmail.com>

On Fri, Jan 19, 2024 at 8:51 PM Michal Hocko wrote:
>
> On Fri 19-01-24 10:03:05, Lance Yang wrote:
> > Hey Michal,
> >
> > Thanks for taking the time to review!
> >
> > On Thu, Jan 18, 2024 at 9:40 PM Michal Hocko wrote:
> > >
> > > On Thu 18-01-24 20:03:46, Lance Yang wrote:
> > > [...]
> > >
> > > before we discuss the semantic, let's focus on the usecase.
> > >
> > > > Use Cases
> > > >
> > > > An immediate user of this new functionality is the Go runtime heap allocator
> > > > that manages memory in hugepage-sized chunks.
> > > > In the past, whether it was a
> > > > newly allocated chunk through mmap() or a reused chunk released by
> > > > madvise(MADV_DONTNEED), the allocator attempted to eagerly back memory with
> > > > huge pages using madvise(MADV_HUGEPAGE)[2] and madvise(MADV_COLLAPSE)[3]
> > > > respectively. However, both approaches resulted in performance issues; for
> > > > both scenarios, there could be entries into direct reclaim and/or compaction,
> > > > leading to unpredictable stalls[4]. Now, the allocator can confidently use
> > > > process_madvise(MADV_F_COLLAPSE_LIGHT) to attempt the allocation of huge pages.
> > >
> > > IIUC the primary reason is the cost of the huge page allocation which
> > > can be really high if the memory is heavily fragmented and it is called
> > > synchronously from the process directly, correct? Can that be worked
> >
> > Yes, that's correct.
> >
> > > around by process_madvise and performing the operation from a different
> > > context? Are there any other reasons to have a different mode?
> >
> > In latency-sensitive scenarios, some applications aim to enhance performance
> > by utilizing huge pages as much as possible. At the same time, in case of
> > allocation failure, they prefer a quick return without triggering direct memory
> > reclamation and compaction.
>
> Could you elaborate some more on why?
>
> > > I mean I can think of a more relaxed (opportunistic) MADV_COLLAPSE -
> > > e.g. non blocking one to make sure that the caller doesn't really block
> > > on resource contention (be it locks or memory availability) because that
> > > matches our non-blocking interface in other areas but having a LIGHT
> > > operation sounds really vague and the exact semantic would be
> > > implementation specific and might change over time. Non-blocking has a
> > > clear semantic but it is not really clear whether that is what you
> > > really need/want.
> >
> > Could you provide me with some suggestions regarding the naming of a
> > more relaxed (opportunistic) MADV_COLLAPSE?
>
> Naming is not all that important at this stage (it could be
> MADV_COLLAPSE_NOBLOCK for example). The primary question is whether
> non-blocking in general is the desired behavior or the implementation
> should try but not too hard.

Hey Michal,

Thanks for your suggestion!

It seems that "the implementation should try but not too hard" aligns
well with my desired behavior. Non-blocking in general is also a great
idea. Perhaps in the future, we can add a MADV_F_COLLAPSE_NOBLOCK flag
for scenarios where latency is extremely critical.

Thanks again,
Lance

>
> --
> Michal Hocko
> SUSE Labs
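
For reference, here is a minimal userspace sketch of the best-effort pattern
discussed in this thread, written against the existing madvise(MADV_COLLAPSE)
call only. It does not use MADV_F_COLLAPSE_LIGHT or MADV_COLLAPSE_NOBLOCK,
since those are proposals in this discussion and not an existing ABI, so the
call below may still enter direct reclaim and/or compaction. CHUNK_SIZE,
alloc_chunk() and try_collapse() are hypothetical names, and the 2 MiB huge
page size is an assumption (the x86-64 default).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* Linux 6.1+ uapi value, for older libc headers */
#endif

/* Assumption: 2 MiB PMD-sized huge pages (the x86-64 default). */
#define CHUNK_SIZE ((size_t)2 << 20)

/*
 * Map one CHUNK_SIZE-aligned anonymous chunk and touch it so that it has
 * populated pages for a later collapse. Returns NULL on failure.
 */
static void *alloc_chunk(void)
{
	/* Over-allocate by one chunk so that an aligned chunk fits inside. */
	char *raw = mmap(NULL, 2 * CHUNK_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return NULL;

	uintptr_t aligned = ((uintptr_t)raw + CHUNK_SIZE - 1) &
			    ~(uintptr_t)(CHUNK_SIZE - 1);
	assert((aligned & (CHUNK_SIZE - 1)) == 0);

	/* Trim the unaligned head and tail of the over-allocation. */
	size_t head = aligned - (uintptr_t)raw;
	size_t tail = CHUNK_SIZE - head;
	if (head)
		munmap(raw, head);
	if (tail)
		munmap((char *)aligned + CHUNK_SIZE, tail);

	memset((void *)aligned, 0, CHUNK_SIZE); /* fault the pages in */
	return (void *)aligned;
}

/*
 * Best-effort collapse: returns 0 if madvise() reports success, or the
 * errno value otherwise (for example EINVAL on kernels without
 * MADV_COLLAPSE, or EAGAIN/ENOMEM when no huge page could be obtained).
 * The chunk stays usable with small pages either way.
 */
static int try_collapse(void *chunk)
{
	if (madvise(chunk, CHUNK_SIZE, MADV_COLLAPSE) == 0)
		return 0;
	return errno;
}
```

A caller would do something like "void *c = alloc_chunk(); if (c &&
try_collapse(c) != 0) { /* keep using small pages */ }". The point of the
sketch is only that failure is treated as non-fatal; how hard the kernel
tries before returning is exactly the semantic still under discussion above.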