From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20200205163402.42627-1-david@redhat.com> <20200205163402.42627-4-david@redhat.com>
In-Reply-To: <20200205163402.42627-4-david@redhat.com>
From: Tyler Sanderson
Date: Wed, 5 Feb 2020 14:37:15 -0800
Message-ID:
Subject: Re: [PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
To: David Hildenbrand
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, virtualization@lists.linux-foundation.org, "Michael S . Tsirkin", Wei Wang, Alexander Duyck, David Rientjes, Nadav Amit, Michal Hocko
Content-Type: multipart/alternative; boundary="00000000000000d6c8059ddbcdcf"
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

--00000000000000d6c8059ddbcdcf
Content-Type: text/plain; charset="UTF-8"

On Wed, Feb 5, 2020 at 8:34 AM David Hildenbrand <david@redhat.com> wrote:

> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> changed the behavior when deflation happens automatically. Instead of
> deflating when called by the OOM handler, the shrinker is used.
>
> However, the balloon is not simply some slab cache that should be
> shrunk when under memory pressure. The shrinker does not have a concept of
> priorities, so this behavior cannot be configured.
>
> There was a report that this results in undesired side effects when
> inflating the balloon to shrink the page cache. [1]
>         "When inflating the balloon against page cache (i.e. no free memory
>          remains) vmscan.c will both shrink page cache, but also invoke the
>          shrinkers -- including the balloon's shrinker. So the balloon
>          driver allocates memory which requires reclaim, vmscan gets this
>          memory by shrinking the balloon, and then the driver adds the
>          memory back to the balloon. Basically a busy no-op."
>
> The name "deflate on OOM" makes it pretty clear when deflation should
> happen - after other approaches to reclaim memory failed, not while
> reclaiming. This allows to minimize the footprint of a guest - memory
> will only be taken out of the balloon when really needed.
>
> Especially, a drop_slab() will result in the whole balloon getting
> deflated - undesired. While handling it via the OOM handler might not be
> perfect, it keeps existing behavior. If we want a different behavior, then
> we need a new feature bit and document it properly (although, there should
> be a clear use case and the intended effects should be well described).
>
> Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
> this has no such side effects. Always register the shrinker with
> VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
> pages that are still to be processed by the guest. The hypervisor takes
> care of identifying and resolving possible races between processing a
> hinting request and the guest reusing a page.
>
> In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
> notifier with shrinker"), don't add a module parameter to configure the
> number of pages to deflate on OOM. Can be re-added if really needed.
> Also, pay attention that leak_balloon() returns the number of 4k pages -
> convert it properly in virtio_balloon_oom_notify().
>
> Note1: using the OOM handler is frowned upon, but it really is what we
>        need for this feature.
>
> Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU) we
>        could actually skip sending deflation requests to our hypervisor,
>        making the OOM path *very* simple. Basically freeing pages and
>        updating the balloon. If the communication with the host ever
>        becomes a problem on this call path.
>
> [1] https://www.spinics.net/lists/linux-virtualization/msg40863.html
>
> Reported-by: Tyler Sanderson <tysand@google.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Wei Wang <wei.w.wang@intel.com>
> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Nadav Amit <namit@vmware.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  drivers/virtio/virtio_balloon.c | 107 +++++++++++++-------------------
>  1 file changed, 44 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 7e5d84caeb94..e7b18f556c5e 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -14,6 +14,7 @@
>  #include <linux/slab.h>
>  #include <linux/module.h>
>  #include <linux/balloon_compaction.h>
> +#include <linux/oom.h>
>  #include <linux/wait.h>
>  #include <linux/mm.h>
>  #include <linux/mount.h>
> @@ -27,7 +28,9 @@
>   */
>  #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
>  #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
> -#define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
> +/* Maximum number of (4k) pages to deflate on OOM notifications. */
> +#define VIRTIO_BALLOON_OOM_NR_PAGES 256
> +#define VIRTIO_BALLOON_OOM_NOTIFY_PRIORITY 80
>
>  #define VIRTIO_BALLOON_FREE_PAGE_ALLOC_FLAG (__GFP_NORETRY | __GFP_NOWARN | \
>                                              __GFP_NOMEMALLOC)
> @@ -112,8 +115,11 @@ struct virtio_balloon {
>         /* Memory statistics */
>         struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
>
> -       /* To register a shrinker to shrink memory upon memory pressure */
> +       /* Shrinker to return free pages - VIRTIO_BALLOON_F_FREE_PAGE_HINT */
>         struct shrinker shrinker;
> +
> +       /* OOM notifier to deflate on OOM - VIRTIO_BALLOON_F_DEFLATE_ON_OOM */
> +       struct notifier_block oom_nb;
>  };
>
>  static struct virtio_device_id id_table[] = {
> @@ -786,50 +792,13 @@ static unsigned long shrink_free_pages(struct virtio_balloon *vb,
>         return blocks_freed * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>  }
>
> -static unsigned long leak_balloon_pages(struct virtio_balloon *vb,
> -                                          unsigned long pages_to_free)
> -{
> -       return leak_balloon(vb, pages_to_free * VIRTIO_BALLOON_PAGES_PER_PAGE) /
> -               VIRTIO_BALLOON_PAGES_PER_PAGE;
> -}
> -
> -static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
> -                                         unsigned long pages_to_free)
> -{
> -       unsigned long pages_freed = 0;
> -
> -       /*
> -        * One invocation of leak_balloon can deflate at most
> -        * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> -        * multiple times to deflate pages till reaching pages_to_free.
> -        */
> -       while (vb->num_pages && pages_freed < pages_to_free)
> -               pages_freed += leak_balloon_pages(vb,
> -                                                 pages_to_free - pages_freed);
> -
> -       update_balloon_size(vb);
> -
> -       return pages_freed;
> -}
> -
>  static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
>                                                   struct shrink_control *sc)
>  {
> -       unsigned long pages_to_free, pages_freed = 0;
>         struct virtio_balloon *vb = container_of(shrinker,
>                                         struct virtio_balloon, shrinker);
>
> -       pages_to_free = sc->nr_to_scan;
> -
> -       if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> -               pages_freed = shrink_free_pages(vb, pages_to_free);
> -
> -       if (pages_freed >= pages_to_free)
> -               return pages_freed;
> -
> -       pages_freed += shrink_balloon_pages(vb, pages_to_free - pages_freed);
> -
> -       return pages_freed;
> +       return shrink_free_pages(vb, sc->nr_to_scan);
>  }
>
>  static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
> @@ -837,26 +806,22 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
>  {
>         struct virtio_balloon *vb = container_of(shrinker,
>                                         struct virtio_balloon, shrinker);
> -       unsigned long count;
> -
> -       count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> -       count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>
> -       return count;
> +       return vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>  }
>
> -static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb)
> +static int virtio_balloon_oom_notify(struct notifier_block *nb,
> +                                    unsigned long dummy, void *parm)
>  {
> -       unregister_shrinker(&vb->shrinker);
> -}
> +       struct virtio_balloon *vb = container_of(nb,
> +                                                struct virtio_balloon, oom_nb);
> +       unsigned long *freed = parm;
>
> -static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
> -{
> -       vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> -       vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> -       vb->shrinker.seeks = DEFAULT_SEEKS;
> +       *freed += leak_balloon(vb, VIRTIO_BALLOON_OOM_NR_PAGES) /
> +                 VIRTIO_BALLOON_PAGES_PER_PAGE;
> +       update_balloon_size(vb);
>
> -       return register_shrinker(&vb->shrinker);
> +       return NOTIFY_OK;
>  }
>
>  static int virtballoon_probe(struct virtio_device *vdev)
> @@ -933,22 +898,35 @@ static int virtballoon_probe(struct virtio_device *vdev)
>                         virtio_cwrite(vb->vdev, struct virtio_balloon_config,
>                                       poison_val, &poison_val);
>                 }
> -       }
> -       /*
> -        * We continue to use VIRTIO_BALLOON_F_DEFLATE_ON_OOM to decide if a
> -        * shrinker needs to be registered to relieve memory pressure.
> -        */
> -       if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
> -               err = virtio_balloon_register_shrinker(vb);
> +
> +               /*
> +                * We're allowed to reuse any free pages, even if they are
> +                * still to be processed by the host.
>

It is important to clarify that pages that are on the inflate queue but not
ACKed by the host (the queue entry has not been returned) are _not_ okay to
reuse. If the host is going to do something destructive to the page (like
deback it) then that needs to happen before the entry is returned.

> +                */
> +               vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> +               vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> +               vb->shrinker.seeks = DEFAULT_SEEKS;
> +               err = register_shrinker(&vb->shrinker);
>                 if (err)
>                         goto out_del_balloon_wq;
>         }
> +       if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
> +               vb->oom_nb.notifier_call = virtio_balloon_oom_notify;
> +               vb->oom_nb.priority = VIRTIO_BALLOON_OOM_NOTIFY_PRIORITY;
> +               err = register_oom_notifier(&vb->oom_nb);
> +               if (err < 0)
> +                       goto out_unregister_shrinker;
> +       }
> +
>         virtio_device_ready(vdev);
>
>         if (towards_target(vb))
>                 virtballoon_changed(vdev);
>         return 0;
>
> +out_unregister_shrinker:
> +       if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> +               unregister_shrinker(&vb->shrinker);
>  out_del_balloon_wq:
>         if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
>                 destroy_workqueue(vb->balloon_wq);
> @@ -987,8 +965,11 @@ static void virtballoon_remove(struct virtio_device *vdev)
>  {
>         struct virtio_balloon *vb = vdev->priv;
>
> -       if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> -               virtio_balloon_unregister_shrinker(vb);
> +       if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> +               unregister_oom_notifier(&vb->oom_nb);
> +       if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> +               unregister_shrinker(&vb->shrinker);
> +
>         spin_lock_irq(&vb->stop_update_lock);
>         vb->stop_update = true;
>         spin_unlock_irq(&vb->stop_update_lock);
> --
> 2.24.1
>
>

--00000000000000d6c8059ddbcdcf
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--00000000000000d6c8059ddbcdcf--
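
The quoted commit message notes that leak_balloon() returns a count of 4k pages and that virtio_balloon_oom_notify() has to convert it. The following is a minimal userspace sketch of that arithmetic, not driver code. The helper names balloon_pages_per_page() and balloon_to_guest_pages() and the page_size parameter are hypothetical, introduced only for illustration; the only parts taken from the driver are the 4 KiB balloon page unit (VIRTIO_BALLOON_PFN_SHIFT is 12) and the division by the pages-per-page ratio that the patch performs.

```c
#include <assert.h>

/* The balloon protocol counts in 4 KiB units (VIRTIO_BALLOON_PFN_SHIFT). */
#define BALLOON_PFN_SHIFT 12

/*
 * Hypothetical helper: number of 4 KiB balloon pages that make up one
 * guest page of the given size. Mirrors VIRTIO_BALLOON_PAGES_PER_PAGE,
 * but takes the page size as a parameter so it can run in userspace.
 */
static unsigned long balloon_pages_per_page(unsigned long page_size)
{
	/* A guest page is assumed to be at least one balloon page. */
	assert(page_size >= (1UL << BALLOON_PFN_SHIFT));
	return page_size >> BALLOON_PFN_SHIFT;
}

/*
 * Hypothetical helper: convert a count of 4 KiB balloon pages (the unit
 * the commit message attributes to leak_balloon()) into whole guest
 * pages, the way the patch does before adding to the "freed" counter.
 * Integer division truncates, so a partial guest page counts as zero.
 */
static unsigned long balloon_to_guest_pages(unsigned long balloon_pages,
					    unsigned long page_size)
{
	return balloon_pages / balloon_pages_per_page(page_size);
}
```

With the patch's cap of 256 balloon pages per notification, a guest with 4 KiB pages would report up to 256 freed pages per call, and a guest with 64 KiB pages up to 16. A result smaller than one guest page rounds down to zero.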