linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: David Miller <davem@davemloft.net>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.cz>,
	Vladimir Davydov <vdavydov@virtuozzo.com>,
	Tejun Heo <tj@kernel.org>,
	netdev@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com
Subject: [PATCH 0/8] mm: memcontrol: account socket memory in unified hierarchy v2
Date: Wed,  4 Nov 2015 17:22:06 -0500	[thread overview]
Message-ID: <1446675734-25671-1-git-send-email-hannes@cmpxchg.org> (raw)

Hi,

this is version 2 of the patches to add socket memory accounting to
the unified hierarchy memory controller. Changes from v1 include:

- No accounting overhead unless a dedicated cgroup is created and the
  memory controller instructed to track that group's memory footprint.
  Distribution kernels enable CONFIG_MEMCG, and users (incl. systemd)
  might create cgroups only for process control or resources other
  than memory. As noted by David and Michal, these setups shouldn't
  pay any overhead for this.

- Continue to enter the socket pressure state when hitting the memory
  controller's hard limit. Vladimir noted that there is at least some
  value in telling other sockets in the cgroup to not increase their
  transmit windows when one of them is already dropping packets.

- Drop the controversial vmpressure rework. Instead of changing the
  level where pressure is noted, keep noting pressure in its origin
  and then make the pressure check hierarchical. As noted by Michal
  and Vladimir, we shouldn't risk changing user-visible behavior.

---

Socket buffer memory can make up a significant share of a workload's
memory footprint that can be directly linked to userspace activity,
and so it needs to be part of the memory controller to provide proper
resource isolation/containment.

Historically, socket buffers were accounted in a separate counter,
without any pressure equalization between anonymous memory, page
cache, and the socket buffers. When the socket buffer pool was
exhausted, buffer allocations would fail hard and cause network
performance to tank, regardless of whether there was still memory
available to the group or not. Likewise, struggling anonymous or cache
workingsets could not dip into an idle socket memory pool. Because of
this, the feature was not usable for many real life applications.

To not repeat this mistake, the new memory controller will account all
types of memory pages it is tracking on behalf of a cgroup in a single
pool. Upon pressure, the VM reclaims and shrinks and puts pressure on
whatever memory consumer in that pool is within its reach.

For socket memory, pressure feedback is provided through vmpressure
events. When the VM has trouble freeing memory, the network code is
instructed to stop growing the cgroup's transmit windows.

---

This series begins with a rework of the existing tcp memory controller
that simplifies and cleans up the code while allowing us to have only
one set of networking hooks for both memory controller versions. The
original behavior of the existing tcp controller should be preserved.

It then adds socket accounting to the v2 memory controller, including
the use of the per-cpu charge cache and async memory.high enforcement
from socket memory charges.

Lastly, vmpressure is hooked up to the socket code so that it stops
growing transmit windows when the VM has trouble reclaiming memory.

 include/linux/memcontrol.h   |  98 ++++++++-------
 include/linux/page_counter.h |   6 +-
 include/net/sock.h           | 137 ++-------------------
 include/net/tcp.h            |   5 +-
 include/net/tcp_memcontrol.h |   7 --
 mm/backing-dev.c             |   2 +-
 mm/hugetlb_cgroup.c          |   3 +-
 mm/memcontrol.c              | 262 +++++++++++++++++++++++++----------------
 mm/page_counter.c            |  14 +--
 mm/vmpressure.c              |  25 +++-
 mm/vmscan.c                  |  31 ++---
 net/core/sock.c              |  78 +++---------
 net/ipv4/sysctl_net_ipv4.c   |   1 -
 net/ipv4/tcp.c               |   3 +-
 net/ipv4/tcp_ipv4.c          |   9 +-
 net/ipv4/tcp_memcontrol.c    | 147 ++++-------------------
 net/ipv4/tcp_output.c        |   6 +-
 net/ipv6/tcp_ipv6.c          |   3 -
 18 files changed, 328 insertions(+), 509 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

             reply	other threads:[~2015-11-04 22:22 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-04 22:22 Johannes Weiner [this message]
2015-11-04 22:22 ` [PATCH 1/8] mm: memcontrol: export root_mem_cgroup Johannes Weiner
2015-11-04 22:22 ` [PATCH 2/8] mm: vmscan: simplify memcg vs. global shrinker invocation Johannes Weiner
2015-11-04 22:22 ` [PATCH 3/8] mm: page_counter: let page_counter_try_charge() return bool Johannes Weiner
2015-11-04 22:22 ` [PATCH 4/8] net: tcp_memcontrol: remove bogus hierarchy pressure propagation Johannes Weiner
2015-11-04 22:22 ` [PATCH 5/8] net: tcp_memcontrol: consolidate socket buffer tracking and accounting Johannes Weiner
2015-11-04 22:22 ` [PATCH 6/8] mm: memcontrol: prepare for unified hierarchy socket accounting Johannes Weiner
2015-11-04 22:22 ` [PATCH 7/8] mm: memcontrol: account socket memory in unified hierarchy memory controller Johannes Weiner
2015-11-04 22:22 ` [PATCH 8/8] mm: memcontrol: hook up vmpressure to socket pressure Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1446675734-25671-1-git-send-email-hannes@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=davem@davemloft.net \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=netdev@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=vdavydov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox