From: zhenwei pi
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, virtualization@lists.linux.dev
Cc: mst@redhat.com, david@redhat.com, jasowang@redhat.com, xuanzhuo@linux.alibaba.com, akpm@linux-foundation.org, zhenwei pi
Subject: [PATCH 3/3] virtio_balloon: introduce memory scan/reclaim info
Date: Thu, 18 Apr 2024 14:26:02 +0800
Message-Id: <20240418062602.1291391-4-pizhenwei@bytedance.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240418062602.1291391-1-pizhenwei@bytedance.com>
References: <20240418062602.1291391-1-pizhenwei@bytedance.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Expose memory scan/reclaim information to the host side via virtio
balloon device.

Now we have a metric to analyze the memory performance:

y: counter increases
n: counter does not change
h: the rate of counter change is high
l: the rate of counter change is low
*: any value

OOM: VIRTIO_BALLOON_S_OOM_KILL
STALL: VIRTIO_BALLOON_S_ALLOC_STALL
ASCAN: VIRTIO_BALLOON_S_ASYNC_SCAN
DSCAN: VIRTIO_BALLOON_S_DIRECT_SCAN
ARCLM: VIRTIO_BALLOON_S_ASYNC_RECLAIM
DRCLM: VIRTIO_BALLOON_S_DIRECT_RECLAIM

- OOM[y], STALL[*], ASCAN[*], DSCAN[*], ARCLM[*], DRCLM[*]:
  the guest runs under really critical memory pressure.

- OOM[n], STALL[h], ASCAN[*], DSCAN[l], ARCLM[*], DRCLM[l]:
  memory allocation stalls because of a cgroup limit, not because of
  global memory pressure.

- OOM[n], STALL[h], ASCAN[*], DSCAN[h], ARCLM[*], DRCLM[h]:
  memory allocation stalls because of global memory pressure, and
  performance gets hurt a lot. A high DRCLM/DSCAN ratio shows that
  direct reclaim is quite effective.

- OOM[n], STALL[h], ASCAN[*], DSCAN[h], ARCLM[*], DRCLM[l]:
  memory allocation stalls because of global memory pressure. The
  DRCLM/DSCAN ratio is low, so the guest OS is thrashing heavily. In
  serious cases this leads to poor performance and makes
  troubleshooting difficult. For example, sshd may block on memory
  allocation when accepting new connections, so a user can't log in
  to the VM over ssh.

- OOM[n], STALL[n], ASCAN[h], DSCAN[n], ARCLM[l], DRCLM[n]:
  the low ARCLM/ASCAN ratio shows that the guest tries to reclaim
  more memory, but can't. Once more memory is required in the
  future, it will struggle to reclaim it.

Signed-off-by: zhenwei pi
---
 drivers/virtio/virtio_balloon.c     |  9 +++++++++
 include/uapi/linux/virtio_balloon.h | 12 ++++++++++--
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index e88e6573afa5..bc9332c1ae85 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -356,6 +356,15 @@ static unsigned int update_balloon_stats(struct virtio_balloon *vb)
 	stall += events[ALLOCSTALL_MOVABLE];
 	update_stat(vb, idx++, VIRTIO_BALLOON_S_ALLOC_STALL, stall);
 
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_ASYNC_SCAN,
+		    pages_to_bytes(events[PGSCAN_KSWAPD]));
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_DIRECT_SCAN,
+		    pages_to_bytes(events[PGSCAN_DIRECT]));
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_ASYNC_RECLAIM,
+		    pages_to_bytes(events[PGSTEAL_KSWAPD]));
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_DIRECT_RECLAIM,
+		    pages_to_bytes(events[PGSTEAL_DIRECT]));
+
 #ifdef CONFIG_HUGETLB_PAGE
 	update_stat(vb, idx++, VIRTIO_BALLOON_S_HTLB_PGALLOC,
 		    events[HTLB_BUDDY_PGALLOC]);
diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index 487b893a160e..ee35a372805d 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -73,7 +73,11 @@ struct virtio_balloon_config {
 #define VIRTIO_BALLOON_S_HTLB_PGFAIL   9  /* Hugetlb page allocation failures */
 #define VIRTIO_BALLOON_S_OOM_KILL      10 /* OOM killer invocations */
 #define VIRTIO_BALLOON_S_ALLOC_STALL   11 /* Stall count of memory allocatoin */
-#define VIRTIO_BALLOON_S_NR       12
+#define VIRTIO_BALLOON_S_ASYNC_SCAN    12 /* Amount of memory scanned asynchronously */
+#define VIRTIO_BALLOON_S_DIRECT_SCAN   13 /* Amount of memory scanned directly */
+#define VIRTIO_BALLOON_S_ASYNC_RECLAIM 14 /* Amount of memory reclaimed asynchronously */
+#define VIRTIO_BALLOON_S_DIRECT_RECLAIM 15 /* Amount of memory reclaimed directly */
+#define VIRTIO_BALLOON_S_NR       16
 
 #define VIRTIO_BALLOON_S_NAMES_WITH_PREFIX(VIRTIO_BALLOON_S_NAMES_prefix) { \
 	VIRTIO_BALLOON_S_NAMES_prefix "swap-in", \
@@ -87,7 +91,11 @@ struct virtio_balloon_config {
 	VIRTIO_BALLOON_S_NAMES_prefix "hugetlb-allocations", \
 	VIRTIO_BALLOON_S_NAMES_prefix "hugetlb-failures", \
 	VIRTIO_BALLOON_S_NAMES_prefix "oom-kills", \
-	VIRTIO_BALLOON_S_NAMES_prefix "alloc-stalls" \
+	VIRTIO_BALLOON_S_NAMES_prefix "alloc-stalls", \
+	VIRTIO_BALLOON_S_NAMES_prefix "async-scans", \
+	VIRTIO_BALLOON_S_NAMES_prefix "direct-scans", \
+	VIRTIO_BALLOON_S_NAMES_prefix "async-reclaims", \
+	VIRTIO_BALLOON_S_NAMES_prefix "direct-reclaims" \
 }
 
 #define VIRTIO_BALLOON_S_NAMES VIRTIO_BALLOON_S_NAMES_WITH_PREFIX("")
-- 
2.34.1
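
For illustration only, and not part of the patch: the decision table in the commit message could be applied on the host side roughly as sketched below. The sketch assumes a monitor that samples the balloon stats periodically and feeds per-interval deltas into a classifier. The struct, the enum, the function names and all three thresholds (STALL_HIGH, SCAN_HIGH, RATIO_LOW_PCT) are hypothetical; the patch defines only the counters, and what counts as "high" or "low" is left to the consumer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-interval deltas of the guest counters (not a kernel type).
 * Scan/reclaim values are in bytes, as the driver reports pages_to_bytes(). */
struct balloon_delta {
	uint64_t oom_kill;
	uint64_t alloc_stall;
	uint64_t async_scan;
	uint64_t direct_scan;
	uint64_t async_reclaim;
	uint64_t direct_reclaim;
};

enum pressure {
	PRESSURE_NONE,
	PRESSURE_OOM,               /* OOM[y] */
	PRESSURE_CGROUP_STALL,      /* STALL[h], DSCAN[l] */
	PRESSURE_GLOBAL_EFFECTIVE,  /* STALL[h], DSCAN[h], DRCLM[h] */
	PRESSURE_GLOBAL_THRASHING,  /* STALL[h], DSCAN[h], DRCLM[l] */
	PRESSURE_ASYNC_INEFFECTIVE, /* STALL[n], ASCAN[h], ARCLM[l] */
};

/* Assumed thresholds; real values would need tuning per workload. */
#define STALL_HIGH    100ULL         /* stalls per interval */
#define SCAN_HIGH     (64ULL << 20)  /* bytes scanned per interval */
#define RATIO_LOW_PCT 30ULL          /* reclaimed/scanned below 30% is "low" */

/* True if reclaimed/scanned < RATIO_LOW_PCT percent (no division needed). */
static int ratio_is_low(uint64_t reclaimed, uint64_t scanned)
{
	return reclaimed * 100 < scanned * RATIO_LOW_PCT;
}

static enum pressure classify(const struct balloon_delta *d)
{
	assert(d != NULL);

	if (d->oom_kill > 0)
		return PRESSURE_OOM;

	if (d->alloc_stall >= STALL_HIGH) {
		/* Stalls without much direct scanning: cgroup limit. */
		if (d->direct_scan < SCAN_HIGH)
			return PRESSURE_CGROUP_STALL;
		return ratio_is_low(d->direct_reclaim, d->direct_scan) ?
			PRESSURE_GLOBAL_THRASHING : PRESSURE_GLOBAL_EFFECTIVE;
	}

	/* No stalls and no direct reclaim, but kswapd scans a lot for little. */
	if (d->alloc_stall == 0 && d->direct_scan == 0 &&
	    d->async_scan >= SCAN_HIGH &&
	    ratio_is_low(d->async_reclaim, d->async_scan))
		return PRESSURE_ASYNC_INEFFECTIVE;

	return PRESSURE_NONE;
}
```

A caller would subtract the previous sample from the current one, fill a struct balloon_delta and pass it to classify(). The cgroup case is keyed on DSCAN alone here, on the assumption that low direct scanning implies low direct reclaim in the same interval.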