From: Aaron Staley
Date: Mon, 1 Jul 2013 11:51:33 -0700
Subject: Re: PROBLEM: Processes writing large files in memory-limited LXC container are killed by OOM
To: Johannes Weiner
Cc: Serge Hallyn, containers@lists.linux-foundation.org, Paul Menage, Li Zefan, Michal Hocko, linux-kernel@vger.kernel.org, linux-mm@kvack.org
In-Reply-To: <20130701184503.GG17812@cmpxchg.org>
References: <20130701180101.GA5460@ac100> <20130701184503.GG17812@cmpxchg.org>

Hi Johannes,

It does appear to still be happening on Linux 3.8. Does it remain an open issue?

Regards,
Aaron

On Mon, Jul 1, 2013 at 11:45 AM, Johannes Weiner wrote:
> On Mon, Jul 01, 2013 at 01:01:01PM -0500, Serge Hallyn wrote:
> > Quoting Aaron Staley (aaron@picloud.com):
> > > This is better explained here:
> > > http://serverfault.com/questions/516074/why-are-applications-in-a-memory-limited-lxc-container-writing-large-files-to-di
> > > (The highest-voted answer believes this to be a kernel bug.)
> >
> > Hi,
> >
> > In IRC it has been suggested that the kernel should indeed be slowing
> > down new page creation while waiting for old page cache entries to be
> > written out to disk, rather than OOMing.
> >
> > With a 3.0.27-1-ac100 kernel, doing dd if=/dev/zero of=xxx bs=1M
> > count=100 is immediately killed. In contrast, doing the same from a
> > 3.0.8 kernel did the right thing for me. But I did reproduce your
> > experiment below on EC2 with the same result.
> >
> > So, cc:ing linux-mm in the hope someone can tell us whether this
> > is expected behavior, known mis-behavior, or an unknown bug.
>
> It's a known issue that was fixed/improved in e62e384 'memcg: prevent
> OOM with too many dirty pages', included in 3.6+.
>
> > > Summary: I have set up a system where I am using LXC to create multiple
> > > virtualized containers on my system with limited resources. Unfortunately,
> > > I'm running into a troublesome scenario where the OOM killer is hard-killing
> > > processes in my LXC container when I write a file whose size exceeds the
> > > memory limit (set to 300 MB). There appears to be some issue with the
> > > file buffering respecting the container's memory limit.
> > >
> > > Reproducing (done on a c1.xlarge instance running on Amazon EC2):
> > >
> > > Create 6 empty LXC containers (in my case I did lxc-create -n testcon -t
> > > ubuntu -- -r precise).
> > >
> > > Modify the configuration of each container to set
> > > lxc.cgroup.memory.limit_in_bytes = 300M.
> > >
> > > Within each container run, in parallel:
> > > dd if=/dev/zero of=test2 bs=100k count=5010
> > >
> > > This will with high probability activate the OOM killer (as seen in dmesg);
> > > often the dd processes themselves will be killed.
> > >
> > > This has been verified to have problems on:
> > > Linux 3.8.0-25-generic #37-Ubuntu SMP and Linux ip-10-8-139-98
> > > 3.2.0-29-virtual #46-Ubuntu SMP Fri Jul 27 17:23:50 UTC 2012 x86_64 x86_64
> > > x86_64 GNU/Linux
> > >
> > > Please let me know your thoughts.
> > >
> > > Regards,
> > > Aaron Staley
> > > _______________________________________________
> > > Containers mailing list
> > > Containers@lists.linux-foundation.org
> > > https://lists.linuxfoundation.org/mailman/listinfo/containers

--
Aaron Staley
PiCloud, Inc.
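
For reference, the reproduction quoted above boils down to roughly the host-side sketch below. The ubuntu/precise template, the 300M limit, and the dd invocation are taken from the report; the per-container names, the /var/lib/lxc config path, and the use of lxc-attach to start the writers in parallel are assumptions about how the steps were carried out, not something the report specifies.

    #!/bin/sh
    # Sketch of the reported reproduction. The memory limit and the dd
    # command come from the report; container names, the config file path
    # and the lxc-attach invocation are assumptions.
    for i in $(seq 1 6); do
        lxc-create -n testcon$i -t ubuntu -- -r precise
        # Cap the container's memory cgroup at 300 MB (typical Ubuntu config path).
        echo "lxc.cgroup.memory.limit_in_bytes = 300M" >> /var/lib/lxc/testcon$i/config
        lxc-start -n testcon$i -d
    done

    # Write a ~500 MB file in every container at once; with the 300 MB
    # limit this is what the report says triggers the memcg OOM killer.
    for i in $(seq 1 6); do
        lxc-attach -n testcon$i -- dd if=/dev/zero of=/root/test2 bs=100k count=5010 &
    done
    wait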
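
Whether it is really page cache (rather than anonymous memory) that pins the group against its limit can be checked from the host while the writes run. A minimal sketch, assuming the usual cgroup-v1 layout where LXC places each container's memory controller under /sys/fs/cgroup/memory/lxc/<name>:

    #!/bin/sh
    # Watch one container's memcg counters once a second. The path below is
    # an assumed cgroup-v1 mount point; adjust it to the host's actual layout.
    CG=/sys/fs/cgroup/memory/lxc/testcon1

    while sleep 1; do
        usage=$(cat "$CG/memory.usage_in_bytes")
        limit=$(cat "$CG/memory.limit_in_bytes")
        fail=$(cat "$CG/memory.failcnt")
        # "cache" in memory.stat counts page cache charged to this group.
        cache=$(awk '/^cache /{print $2}' "$CG/memory.stat")
        echo "usage=$usage limit=$limit failcnt=$fail cache=$cache"
    done

Usage pinned at the limit with cache accounting for most of it, plus a rising failcnt, would match the behavior described: dirty page cache that cannot be written back fast enough under the memcg limit, which is what e62e384 mitigates by making memcg reclaim wait for writeback instead of declaring OOM early.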