From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=0.3 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 798FAC4361B for ; Mon, 14 Dec 2020 07:43:31 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B193522838 for ; Mon, 14 Dec 2020 07:43:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B193522838 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D25D96B0036; Mon, 14 Dec 2020 02:43:29 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id CD5B26B005C; Mon, 14 Dec 2020 02:43:29 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B76896B005D; Mon, 14 Dec 2020 02:43:29 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0076.hostedemail.com [216.40.44.76]) by kanga.kvack.org (Postfix) with ESMTP id 9F2376B0036 for ; Mon, 14 Dec 2020 02:43:29 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 64BFC18012CE1 for ; Mon, 14 Dec 2020 07:43:29 +0000 (UTC) X-FDA: 77591097738.22.jar12_2c0254327419 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin22.hostedemail.com (Postfix) with ESMTP id 4485B1803737A for ; Mon, 14 Dec 2020 
07:43:29 +0000 (UTC) X-HE-Tag: jar12_2c0254327419 X-Filterd-Recvd-Size: 10203 Received: from mail-il1-f196.google.com (mail-il1-f196.google.com [209.85.166.196]) by imf41.hostedemail.com (Postfix) with ESMTP for ; Mon, 14 Dec 2020 07:43:28 +0000 (UTC) Received: by mail-il1-f196.google.com with SMTP id q1so14968496ilt.6 for ; Sun, 13 Dec 2020 23:43:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=6+4qnaiUHeN8+vAT4uVsBxcWZ3MWhgK1Wu1+mXqbFG0=; b=YJD4HQ68YzGELMiZfQMb1fLdgZmDlrve1VWv+OIb3b8bpgJu2e0HVrf2ukB/Kq7LfO 6knEEL+opCVG3fMXWtuOs7qdphMQI0z0yVFHSqEIHnaGOKaw1G6t1bnmwEDxgdOQ4kZA RmDZnDEf2EDwZRjU6Vjo+te75hOD5uB6HdDyjzo1TF15nejcVWhsOmZa0JzDESoAWFxw eYAjbj3g4GsV29V5gfHgXOY551m3PpKtLuJxt2CIJ1dhg85dTpx3zjU0GnoKVqnkpkG5 wxv8h45CmSDi+RuD1MuT1Qs2DY8OGGx9M7AnSuZ/hABgEP42MpMazpC2O7iQdxR5mzjR 79qQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=6+4qnaiUHeN8+vAT4uVsBxcWZ3MWhgK1Wu1+mXqbFG0=; b=qEgxyqdOOZ6ZsdbY9esU/b8ekO3KVejpLSsmC2JWlLfBqCTQpn7M/9ml0M0nOu6qpN DtzFssKJo7pgwn98gEptecnQYOPrptsxpAmGA0A4hkwlkMZ4Wqenljj8EX4+zrl1Dy+I KmGLWy4FjUfE8FV2MNUqlfN2/6FekrAMYUaw6ZsZrKJ2oRyXedg26QeTAZtnq8sT+apX ClBcNaTgeCLW6xAmo/f6CqJw2eLOAEbbLsChXPsdqPu683eLbO46LoqaRfxpgjLmyWrP avntHlBbIh0REiXIaJtosV0vSRS62RU/TO1V9PbGkMxYS8advodkDPav5ApvfhbpyyuB orkA== X-Gm-Message-State: AOAM533okvDG7ZyXIiSWo6Xb/Y35vRws0Nlv2DKZjCy5hCEgufbBNe+a CkHN2ohY2UejID/QH1hYTl1jmM1pFdKwZFJhJ3640OPxpo0= X-Google-Smtp-Source: ABdhPJwgS7LAZO0XsiuLaNPOIsbbc7KYc75yK1W7S+62mm/gJfHs0YCED2LIlkid2wF/DP2P+YcODbjWYskIx6yvG2k= X-Received: by 2002:a92:cb0d:: with SMTP id s13mr34099117ilo.73.1607931808297; Sun, 13 Dec 2020 23:43:28 -0800 (PST) MIME-Version: 1.0 References: <158893941613.200862.4094521350329937435.stgit@buzz> <97ece625-2799-7ae6-28b5-73c52c7c497b@oracle.com> 
In-Reply-To:
From: Konstantin Khlebnikov
Date: Mon, 14 Dec 2020 10:43:17 +0300
Message-ID:
Subject: Re: [PATCH RFC 0/8] dcache: increase poison resistance
To: Junxiao Bi
Cc: Konstantin Khlebnikov, Linux Kernel Mailing List, linux-fsdevel, linux-mm@kvack.org, Alexander Viro, Waiman Long, Gautham Ananthakrishna, matthew.wilcox@oracle.com
Content-Type: multipart/alternative; boundary="00000000000041b8c905b667cc17"
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

--00000000000041b8c905b667cc17
Content-Type: text/plain; charset="UTF-8"

On Sun, Dec 13, 2020 at 9:52 PM Junxiao Bi wrote:

> On 12/11/20 11:32 PM, Konstantin Khlebnikov wrote:
>
> > On Thu, Dec 10, 2020 at 2:01 AM Junxiao Bi wrote:
> >
> >     Hi Konstantin,
> >
> >     We tested this patch set recently and found it limiting negative
> >     dentry to a small part of total memory. And also we don't see any
> >     performance regression on it. Do you have any plan to integrate it
> >     into mainline? It will help a lot on memory fragmentation issue
> >     causing by dentry slab, there were a lot of customer cases where
> >     sys% was very high since most cpu were doing memory compaction,
> >     dentry slab was taking too much memory and nearly all dentry there
> >     were negative.
> >
> > Right now I don't have any plans for this. I suspect such problems will
> > appear much more often since machines are getting bigger.
> > So, somebody will take care of it.
> We already had a lot of customer cases. It made no sense to leave so
> many negative dentry in the system, it caused memory fragmentation and
> not much benefit.

Dcache can grow this big only if the system lacks memory pressure.

The simplest solution is a cron job which provides such pressure by
creating a sparse file on a disk-based fs and then reading it.
This should wash away all inactive caches with no IO and zero chance of OOM.

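A minimal sketch of the sparse-file job described above, in Python. The
function name `apply_cache_pressure`, its parameters, and the choice of
directory and size are all illustrative assumptions; the message does not
say how large the file should be or how often the job should run.

```python
import os
import tempfile


def apply_cache_pressure(directory, size_bytes, chunk_bytes=1 << 20):
    """Create a sparse file in `directory`, read it once, then delete it.

    Hypothetical helper: the name and parameters are not from the thread.
    Returns the number of bytes read.
    """
    fd, path = tempfile.mkstemp(dir=directory, prefix="pressure-")
    try:
        with os.fdopen(fd, "r+b", buffering=0) as f:
            # Growing an empty file with truncate() leaves a hole; on
            # filesystems that support sparse files no data blocks are
            # allocated for it.
            f.truncate(size_bytes)
            total = 0
            while True:
                # Reading the hole returns zeros through the page cache.
                chunk = f.read(chunk_bytes)
                if not chunk:
                    break
                total += len(chunk)
        return total
    finally:
        # Remove the file whether or not the read completed.
        os.unlink(path)
```

A cron entry could call this with a directory on a disk-based filesystem
and a size comparable to the caches that should be displaced; both values
are site-specific and would need tuning.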
> >
> > First part which collects negative dentries at the end list of
> > siblings could be done in a more obvious way by splitting the list
> > in two.
> > But this touches much more code.
> That would add new field to dentry?

Yep. The decision is up to the maintainers.

> > Last patch isn't very rigid but does non-trivial changes.
> > Probably it's better to call some garbage collector thingy periodically.
> > Lru list needs pressure to age and reorder entries properly.
>
> Swap the negative dentry to the head of hash list when it get accessed?
> Extra ones can be easily trimmed when swapping, using GC is to reduce
> perf impact?

The reclaimer/shrinker scans dentries in the LRU lists; that is another list.
My patch uses the order in hash lists in a very unusual way. Don't be confused.

There are four lists:
parent - siblings
hashtable - hashchain
LRU
inode - alias

> Thanks,
>
> Junxioao.
>
> >
> > Gc could be off by default or thresholds set very high (50% of ram for
> > example).
> > Final setup could be left up to owners of large systems, which needs
> > fine tuning.

--00000000000041b8c905b667cc17
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--00000000000041b8c905b667cc17--