linux-mm.kvack.org archive mirror
* [RFC PATCH] net, mm: account sock objects to kmemcg
@ 2018-06-27 20:41 Shakeel Butt
  2018-06-27 21:51 ` Eric Dumazet
  0 siblings, 1 reply; 4+ messages in thread
From: Shakeel Butt @ 2018-06-27 20:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Vladimir Davydov, Greg Thelen, Roman Gushchin,
	David S . Miller, Eric Dumazet, Kirill Tkhai, linux-kernel,
	netdev, linux-mm, Shakeel Butt

Currently the kernel accounts the memory for network traffic through
the mem_cgroup_[un]charge_skmem() interface. However, the memory
accounted includes only the truesize of each sk_buff, not the size of
the sock objects themselves. In our production environment, with
opt-out kmem accounting, the sock kmem caches (TCP[v6], UDP[v6],
RAW[v6], UNIX) are among the most heavily charged kmem caches and
consume a significant amount of memory, which cannot be left as system
overhead. So this patch converts the kmem caches of the more important
sock objects to SLAB_ACCOUNT.

Signed-off-by: Shakeel Butt <shakeelb@google.com>
---
 net/ipv4/raw.c      | 1 +
 net/ipv4/tcp_ipv4.c | 2 +-
 net/ipv4/udp.c      | 1 +
 net/ipv6/raw.c      | 1 +
 net/ipv6/tcp_ipv6.c | 2 +-
 net/ipv6/udp.c      | 1 +
 net/unix/af_unix.c  | 1 +
 7 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index abb3c9490c55..2c4b04c6461a 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -988,6 +988,7 @@ struct proto raw_prot = {
 	.hash		   = raw_hash_sk,
 	.unhash		   = raw_unhash_sk,
 	.obj_size	   = sizeof(struct raw_sock),
+	.slab_flags	   = SLAB_ACCOUNT,
 	.useroffset	   = offsetof(struct raw_sock, filter),
 	.usersize	   = sizeof_field(struct raw_sock, filter),
 	.h.raw_hash	   = &raw_v4_hashinfo,
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fed3f1c66167..9ae31979aefa 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2459,7 +2459,7 @@ struct proto tcp_prot = {
 	.sysctl_rmem_offset	= offsetof(struct net, ipv4.sysctl_tcp_rmem),
 	.max_header		= MAX_TCP_HEADER,
 	.obj_size		= sizeof(struct tcp_sock),
-	.slab_flags		= SLAB_TYPESAFE_BY_RCU,
+	.slab_flags		= SLAB_TYPESAFE_BY_RCU | SLAB_ACCOUNT,
 	.twsk_prot		= &tcp_timewait_sock_ops,
 	.rsk_prot		= &tcp_request_sock_ops,
 	.h.hashinfo		= &tcp_hashinfo,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 9bb27df4dac5..26e07b8a83cc 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2657,6 +2657,7 @@ struct proto udp_prot = {
 	.sysctl_wmem_offset	= offsetof(struct net, ipv4.sysctl_udp_wmem_min),
 	.sysctl_rmem_offset	= offsetof(struct net, ipv4.sysctl_udp_rmem_min),
 	.obj_size		= sizeof(struct udp_sock),
+	.slab_flags		= SLAB_ACCOUNT,
 	.h.udp_table		= &udp_table,
 #ifdef CONFIG_COMPAT
 	.compat_setsockopt	= compat_udp_setsockopt,
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index ce6f0d15b5dd..044ed44e7c16 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -1272,6 +1272,7 @@ struct proto rawv6_prot = {
 	.hash		   = raw_hash_sk,
 	.unhash		   = raw_unhash_sk,
 	.obj_size	   = sizeof(struct raw6_sock),
+	.slab_flags	   = SLAB_ACCOUNT,
 	.useroffset	   = offsetof(struct raw6_sock, filter),
 	.usersize	   = sizeof_field(struct raw6_sock, filter),
 	.h.raw_hash	   = &raw_v6_hashinfo,
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index b620d9b72e59..7187609ca25f 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1973,7 +1973,7 @@ struct proto tcpv6_prot = {
 	.sysctl_rmem_offset	= offsetof(struct net, ipv4.sysctl_tcp_rmem),
 	.max_header		= MAX_TCP_HEADER,
 	.obj_size		= sizeof(struct tcp6_sock),
-	.slab_flags		= SLAB_TYPESAFE_BY_RCU,
+	.slab_flags		= SLAB_TYPESAFE_BY_RCU | SLAB_ACCOUNT,
 	.twsk_prot		= &tcp6_timewait_sock_ops,
 	.rsk_prot		= &tcp6_request_sock_ops,
 	.h.hashinfo		= &tcp_hashinfo,
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index e6645cae403e..47c9a3c74981 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1582,6 +1582,7 @@ struct proto udpv6_prot = {
 	.sysctl_wmem_offset     = offsetof(struct net, ipv4.sysctl_udp_wmem_min),
 	.sysctl_rmem_offset     = offsetof(struct net, ipv4.sysctl_udp_rmem_min),
 	.obj_size		= sizeof(struct udp6_sock),
+	.slab_flags		= SLAB_ACCOUNT,
 	.h.udp_table		= &udp_table,
 #ifdef CONFIG_COMPAT
 	.compat_setsockopt	= compat_udpv6_setsockopt,
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 95b02a71fd47..5e3e377a7269 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -742,6 +742,7 @@ static struct proto unix_proto = {
 	.name			= "UNIX",
 	.owner			= THIS_MODULE,
 	.obj_size		= sizeof(struct unix_sock),
+	.slab_flags		= SLAB_ACCOUNT,
 };
 
 static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
-- 
2.18.0.rc2.346.g013aa6912e-goog

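The per-protocol opt-in in the patch above can be pictured with a small userspace sketch. Everything in it is hypothetical: `struct mock_proto`, the `MOCK_SLAB_*` values, and the helper names are stand-ins invented for illustration, not the kernel's definitions. It only shows that a protocol whose `slab_flags` initializer is left untouched (the DCCP entry here is illustrative) stays unaccounted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in values; the real flags live in include/linux/slab.h. */
#define MOCK_SLAB_TYPESAFE_BY_RCU 0x1u
#define MOCK_SLAB_ACCOUNT         0x2u

/* Reduced, hypothetical version of struct proto: only the fields used here. */
struct mock_proto {
	const char *name;
	unsigned int slab_flags;
};

/* Per-protocol opt-in as in the patch; the DCCP entry is illustrative. */
static const struct mock_proto mock_protos[] = {
	{ "TCP",  MOCK_SLAB_TYPESAFE_BY_RCU | MOCK_SLAB_ACCOUNT },
	{ "UDP",  MOCK_SLAB_ACCOUNT },
	{ "RAW",  MOCK_SLAB_ACCOUNT },
	{ "UNIX", MOCK_SLAB_ACCOUNT },
	{ "DCCP", MOCK_SLAB_TYPESAFE_BY_RCU },	/* not touched, so not accounted */
};

#define MOCK_NPROTOS (sizeof(mock_protos) / sizeof(mock_protos[0]))

/* Returns 1 if the protocol's cache would carry the accounting flag. */
static int is_accounted(const struct mock_proto *p)
{
	assert(p != NULL);
	return (p->slab_flags & MOCK_SLAB_ACCOUNT) != 0;
}

/* Counts how many of the n protocols opted in to accounting. */
static size_t count_accounted(const struct mock_proto *protos, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (is_accounted(&protos[i]))
			count++;
	return count;
}
```

With this approach, every protocol that should be charged needs its own initializer change, so any protocol that is overlooked silently remains system overhead.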

* Re: [RFC PATCH] net, mm: account sock objects to kmemcg
  2018-06-27 20:41 [RFC PATCH] net, mm: account sock objects to kmemcg Shakeel Butt
@ 2018-06-27 21:51 ` Eric Dumazet
  2018-06-27 22:03   ` Shakeel Butt
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2018-06-27 21:51 UTC (permalink / raw)
  To: Shakeel Butt, Andrew Morton
  Cc: Johannes Weiner, Vladimir Davydov, Greg Thelen, Roman Gushchin,
	David S . Miller, Eric Dumazet, Kirill Tkhai, linux-kernel,
	netdev, linux-mm



On 06/27/2018 01:41 PM, Shakeel Butt wrote:
> Currently the kernel accounts the memory for network traffic through
> the mem_cgroup_[un]charge_skmem() interface. However, the memory
> accounted includes only the truesize of each sk_buff, not the size of
> the sock objects themselves. In our production environment, with
> opt-out kmem accounting, the sock kmem caches (TCP[v6], UDP[v6],
> RAW[v6], UNIX) are among the most heavily charged kmem caches and
> consume a significant amount of memory, which cannot be left as system
> overhead. So this patch converts the kmem caches of the more important
> sock objects to SLAB_ACCOUNT.
> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> ---
>  net/ipv4/raw.c      | 1 +
>  net/ipv4/tcp_ipv4.c | 2 +-
>  net/ipv4/udp.c      | 1 +
>  net/ipv6/raw.c      | 1 +
>  net/ipv6/tcp_ipv6.c | 2 +-
>  net/ipv6/udp.c      | 1 +
>  net/unix/af_unix.c  | 1 +
>  7 files changed, 7 insertions(+), 2 deletions(-)


Hey, you just disclosed we do not use DCCP ;)

Joking aside, what about simply factoring this out?

diff --git a/net/core/sock.c b/net/core/sock.c
index bcc41829a16d50714bdd3c25c976c0b7296fab84..b6714f8d7e9ba313723a6f619799c56230ff5fd4 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3243,7 +3243,8 @@ static int req_prot_init(const struct proto *prot)
 
        rsk_prot->slab = kmem_cache_create(rsk_prot->slab_name,
                                           rsk_prot->obj_size, 0,
-                                          prot->slab_flags, NULL);
+                                          SLAB_ACCOUNT | prot->slab_flags,
+                                          NULL);
 
        if (!rsk_prot->slab) {
                pr_crit("%s: Can't create request sock SLAB cache!\n",
@@ -3258,7 +3259,8 @@ int proto_register(struct proto *prot, int alloc_slab)
        if (alloc_slab) {
                prot->slab = kmem_cache_create_usercopy(prot->name,
                                        prot->obj_size, 0,
-                                       SLAB_HWCACHE_ALIGN | prot->slab_flags,
+                                       SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT |
+                                       prot->slab_flags,
                                        prot->useroffset, prot->usersize,
                                        NULL);
 

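The factored variant proposed in the reply above ORs the accounting flag in at cache-creation time, so no per-protocol initializer has to change. The sketch below is a userspace model under stated assumptions: `struct mock_proto`, the `MOCK_SLAB_*` values, and both helper functions are hypothetical names, and only the flag arithmetic from the two hunks is mirrored, not the actual kmem_cache_create() calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in values; the real flags live in include/linux/slab.h. */
#define MOCK_SLAB_HWCACHE_ALIGN   0x1u
#define MOCK_SLAB_TYPESAFE_BY_RCU 0x2u
#define MOCK_SLAB_ACCOUNT         0x4u

/* Reduced, hypothetical version of struct proto: only the fields used here. */
struct mock_proto {
	const char *name;
	unsigned int slab_flags;
};

/* Flag expression from the proto_register() hunk: main sock cache. */
static unsigned int mock_sock_cache_flags(const struct mock_proto *prot)
{
	assert(prot != NULL);
	return MOCK_SLAB_HWCACHE_ALIGN | MOCK_SLAB_ACCOUNT | prot->slab_flags;
}

/* Flag expression from the req_prot_init() hunk: request sock cache. */
static unsigned int mock_req_cache_flags(const struct mock_proto *prot)
{
	assert(prot != NULL);
	return MOCK_SLAB_ACCOUNT | prot->slab_flags;
}
```

Because the flag is added where the caches are created, a protocol that never mentions accounting in its own `slab_flags` is still covered, and flags it does set (such as the RCU type-safety flag) pass through unchanged.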

* Re: [RFC PATCH] net, mm: account sock objects to kmemcg
  2018-06-27 21:51 ` Eric Dumazet
@ 2018-06-27 22:03   ` Shakeel Butt
  2018-06-27 22:06     ` Eric Dumazet
  0 siblings, 1 reply; 4+ messages in thread
From: Shakeel Butt @ 2018-06-27 22:03 UTC (permalink / raw)
  To: eric.dumazet
  Cc: Andrew Morton, Johannes Weiner, Vladimir Davydov, Greg Thelen,
	Roman Gushchin, davem, Eric Dumazet, Kirill Tkhai, LKML, netdev,
	Linux MM

On Wed, Jun 27, 2018 at 2:51 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 06/27/2018 01:41 PM, Shakeel Butt wrote:
> > Currently the kernel accounts the memory for network traffic through
> > the mem_cgroup_[un]charge_skmem() interface. However, the memory
> > accounted includes only the truesize of each sk_buff, not the size of
> > the sock objects themselves. In our production environment, with
> > opt-out kmem accounting, the sock kmem caches (TCP[v6], UDP[v6],
> > RAW[v6], UNIX) are among the most heavily charged kmem caches and
> > consume a significant amount of memory, which cannot be left as system
> > overhead. So this patch converts the kmem caches of the more important
> > sock objects to SLAB_ACCOUNT.
> >
> > Signed-off-by: Shakeel Butt <shakeelb@google.com>
> > ---
> >  net/ipv4/raw.c      | 1 +
> >  net/ipv4/tcp_ipv4.c | 2 +-
> >  net/ipv4/udp.c      | 1 +
> >  net/ipv6/raw.c      | 1 +
> >  net/ipv6/tcp_ipv6.c | 2 +-
> >  net/ipv6/udp.c      | 1 +
> >  net/unix/af_unix.c  | 1 +
> >  7 files changed, 7 insertions(+), 2 deletions(-)
>
>
> Hey, you just disclosed we do not use DCCP ;)
>

Oops.

>
> Joking aside, what about simply factoring this out?
>

This will opt in all the sock kmem_caches, which I think is better and
a much smaller change. Should I resend this, or do you want to send the
patch?

> diff --git a/net/core/sock.c b/net/core/sock.c
> index bcc41829a16d50714bdd3c25c976c0b7296fab84..b6714f8d7e9ba313723a6f619799c56230ff5fd4 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -3243,7 +3243,8 @@ static int req_prot_init(const struct proto *prot)
>
>         rsk_prot->slab = kmem_cache_create(rsk_prot->slab_name,
>                                            rsk_prot->obj_size, 0,
> -                                          prot->slab_flags, NULL);
> +                                          SLAB_ACCOUNT | prot->slab_flags,
> +                                          NULL);
>
>         if (!rsk_prot->slab) {
>                 pr_crit("%s: Can't create request sock SLAB cache!\n",
> @@ -3258,7 +3259,8 @@ int proto_register(struct proto *prot, int alloc_slab)
>         if (alloc_slab) {
>                 prot->slab = kmem_cache_create_usercopy(prot->name,
>                                         prot->obj_size, 0,
> -                                       SLAB_HWCACHE_ALIGN | prot->slab_flags,
> +                                       SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT |
> +                                       prot->slab_flags,
>                                         prot->useroffset, prot->usersize,
>                                         NULL);
>
>

Shakeel


* Re: [RFC PATCH] net, mm: account sock objects to kmemcg
  2018-06-27 22:03   ` Shakeel Butt
@ 2018-06-27 22:06     ` Eric Dumazet
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2018-06-27 22:06 UTC (permalink / raw)
  To: Shakeel Butt, eric.dumazet
  Cc: Andrew Morton, Johannes Weiner, Vladimir Davydov, Greg Thelen,
	Roman Gushchin, davem, Eric Dumazet, Kirill Tkhai, LKML, netdev,
	Linux MM



On 06/27/2018 03:03 PM, Shakeel Butt wrote:

> 
> This will opt in all the sock kmem_caches, which I think is better and
> a much smaller change. Should I resend this, or do you want to send the
> patch?
>

Please send a v2, perhaps with an updated changelog ;)

Thanks!

