From: Pasha Tatashin
Date: Thu, 1 Feb 2024 16:06:27 -0500
Subject: Re: [PATCH] iommu/iova: use named kmem_cache for iova magazines
To: Robin Murphy
Cc: joro@8bytes.org, will@kernel.org, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org, rientjes@google.com
References: <20240201193014.2785570-1-tatashin@google.com> <02610629-05ef-4956-a122-36b6ac98fbc2@arm.com>
In-Reply-To: <02610629-05ef-4956-a122-36b6ac98fbc2@arm.com>

On Thu, Feb 1, 2024 at 3:56 PM Robin Murphy wrote:
>
> On 2024-02-01 7:30 pm, Pasha Tatashin wrote:
> > From: Pasha Tatashin
> >
> > The magazine buffers can take gigabytes of kmem memory, dominating all
> > other allocations. For observability purpose create named slab cache so
> > the iova magazine memory overhead can be clearly observed.
> >
> > With this change:
> >
> >> slabtop -o | head
> >   Active / Total Objects (% used)    : 869731 / 952904 (91.3%)
> >   Active / Total Slabs (% used)      : 103411 / 103974 (99.5%)
> >   Active / Total Caches (% used)     : 135 / 211 (64.0%)
> >   Active / Total Size (% used)       : 395389.68K / 411430.20K (96.1%)
> >   Minimum / Average / Maximum Object : 0.02K / 0.43K / 8.00K
> >
> >     OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> >   244412 244239  99%    1.00K  61103        4    244412K iommu_iova_magazine
> >    91636  88343  96%    0.03K    739      124      2956K kmalloc-32
> >    75744  74844  98%    0.12K   2367       32      9468K kernfs_node_cache
> >
> > On this machine it is now clear that magazines use 242M of kmem memory.
>
> Hmm, something smells there...
>
> In the "worst" case there should be a maximum of 6 * 2 *
> num_online_cpus() empty magazines in the iova_cpu_rcache structures,
> i.e., 12KB per CPU. Under normal use those will contain at least some
> PFNs, but mainly every additional magazine stored in a depot is full
> with 127 PFNs, and each one of those PFNs is backed by a 40-byte struct
> iova, i.e. ~5KB per 1KB magazine. Unless that machine has many thousands
> of CPUs, if iova_magazine allocations are the top consumer of memory
> then something's gone wrong.

This is an upstream kernel plus a few drivers, booted on an AMD EPYC
machine with 128 CPUs.

It has allocation stacks like these:

init_iova_domain+0x1ed/0x230
iommu_setup_dma_ops+0xf8/0x4b0
amd_iommu_probe_finalize

There are also init_iova_domain() calls for Google's TPU drivers.

242M is actually not that much compared to the size of the system.

Pasha

>
> Thanks,
> Robin.
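
For reference, the worst-case estimate quoted above can be written out as a few lines of userspace C. This is only a rough sketch: every constant is copied from this thread (6 size classes, 2 magazines per CPU, the 1.00K object size from slabtop, 127 PFNs per magazine, 40-byte struct iova), nothing is read from a running kernel, and the helper names are made up for illustration. It also assumes the 6 * 2 * num_online_cpus() bound applies per IOVA domain and ignores depot magazines entirely.

```c
#include <assert.h>

/* Assumed constants, taken from the numbers quoted in this thread. */
#define RCACHE_SIZE_CLASSES 6ULL    /* the "6" in the estimate */
#define MAGS_PER_CPU        2ULL    /* the "2" in the estimate */
#define MAG_BYTES           1024ULL /* OBJ SIZE 1.00K in the slabtop output */
#define PFNS_PER_FULL_MAG   127ULL
#define STRUCT_IOVA_BYTES   40ULL

/* Worst-case bytes of empty per-CPU magazines (assumed: for one domain). */
static unsigned long long percpu_magazine_bytes(unsigned long long cpus)
{
	return RCACHE_SIZE_CLASSES * MAGS_PER_CPU * cpus * MAG_BYTES;
}

/* Bytes of struct iova that back one completely full magazine. */
static unsigned long long iova_bytes_per_full_magazine(void)
{
	return PFNS_PER_FULL_MAG * STRUCT_IOVA_BYTES;
}

/*
 * Hypothetical helper: how many times the per-CPU bound would have to be
 * repeated to account for an observed number of magazine objects, rounded up.
 */
static unsigned long long domains_to_reach(unsigned long long observed_objs,
					   unsigned long long cpus)
{
	unsigned long long bound = percpu_magazine_bytes(cpus);

	assert(cpus > 0);
	return (observed_objs * MAG_BYTES + bound - 1) / bound;
}
```

With the numbers from this thread, percpu_magazine_bytes(128) is 1572864 bytes (1.5MB), iova_bytes_per_full_magazine() is 5080 bytes (the "~5KB" above), and domains_to_reach(244412, 128) is 160. Read loosely, the per-CPU magazines alone would only explain the slabtop figure if that bound were paid roughly 160 times over.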
>
> > Signed-off-by: Pasha Tatashin
> > ---
> >  drivers/iommu/iova.c | 57 +++++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 54 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> > index d30e453d0fb4..617bbc2b79f5 100644
> > --- a/drivers/iommu/iova.c
> > +++ b/drivers/iommu/iova.c
> > @@ -630,6 +630,10 @@ EXPORT_SYMBOL_GPL(reserve_iova);
> >
> >  #define IOVA_DEPOT_DELAY msecs_to_jiffies(100)
> >
> > +static struct kmem_cache *iova_magazine_cache;
> > +static unsigned int iova_magazine_cache_users;
> > +static DEFINE_MUTEX(iova_magazine_cache_mutex);
> > +
> >  struct iova_magazine {
> >  	union {
> >  		unsigned long size;
> > @@ -654,11 +658,51 @@ struct iova_rcache {
> >  	struct delayed_work work;
> >  };
> >
> > +static int iova_magazine_cache_init(void)
> > +{
> > +	int ret = 0;
> > +
> > +	mutex_lock(&iova_magazine_cache_mutex);
> > +
> > +	iova_magazine_cache_users++;
> > +	if (iova_magazine_cache_users > 1)
> > +		goto out_unlock;
> > +
> > +	iova_magazine_cache = kmem_cache_create("iommu_iova_magazine",
> > +						sizeof(struct iova_magazine),
> > +						0, SLAB_HWCACHE_ALIGN, NULL);
> > +
> > +	if (!iova_magazine_cache) {
> > +		pr_err("Couldn't create iova magazine cache\n");
> > +		ret = -ENOMEM;
> > +	}
> > +
> > +out_unlock:
> > +	mutex_unlock(&iova_magazine_cache_mutex);
> > +
> > +	return ret;
> > +}
> > +
> > +static void iova_magazine_cache_fini(void)
> > +{
> > +	mutex_lock(&iova_magazine_cache_mutex);
> > +
> > +	if (WARN_ON(!iova_magazine_cache_users))
> > +		goto out_unlock;
> > +
> > +	iova_magazine_cache_users--;
> > +	if (!iova_magazine_cache_users)
> > +		kmem_cache_destroy(iova_magazine_cache);
> > +
> > +out_unlock:
> > +	mutex_unlock(&iova_magazine_cache_mutex);
> > +}
> > +
> >  static struct iova_magazine *iova_magazine_alloc(gfp_t flags)
> >  {
> >  	struct iova_magazine *mag;
> >
> > -	mag = kmalloc(sizeof(*mag), flags);
> > +	mag = kmem_cache_alloc(iova_magazine_cache, flags);
> >  	if (mag)
> >  		mag->size = 0;
> >
> > @@ -667,7 +711,7 @@ static struct iova_magazine *iova_magazine_alloc(gfp_t flags)
> >
> >  static void iova_magazine_free(struct iova_magazine *mag)
> >  {
> > -	kfree(mag);
> > +	kmem_cache_free(iova_magazine_cache, mag);
> >  }
> >
> >  static void
> > @@ -766,11 +810,17 @@ int iova_domain_init_rcaches(struct iova_domain *iovad)
> >  	unsigned int cpu;
> >  	int i, ret;
> >
> > +	ret = iova_magazine_cache_init();
> > +	if (ret)
> > +		return -ENOMEM;
> > +
> >  	iovad->rcaches = kcalloc(IOVA_RANGE_CACHE_MAX_SIZE,
> >  				 sizeof(struct iova_rcache),
> >  				 GFP_KERNEL);
> > -	if (!iovad->rcaches)
> > +	if (!iovad->rcaches) {
> > +		iova_magazine_cache_fini();
> >  		return -ENOMEM;
> > +	}
> >
> >  	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
> >  		struct iova_cpu_rcache *cpu_rcache;
> > @@ -948,6 +998,7 @@ static void free_iova_rcaches(struct iova_domain *iovad)
> >
> >  	kfree(iovad->rcaches);
> >  	iovad->rcaches = NULL;
> > +	iova_magazine_cache_fini();
> >  }
> >
> >  /*