On 09/26/2011 07:52 AM, KAMEZAWA Hiroyuki wrote:
> On Sat, 24 Sep 2011 11:45:04 -0300
> Glauber Costa wrote:
>
>> On 09/22/2011 12:09 PM, Balbir Singh wrote:
>>> On Thu, Sep 22, 2011 at 11:30 AM, Greg Thelen wrote:
>>>> On Wed, Sep 21, 2011 at 11:59 AM, Glauber Costa wrote:
>>>>> Right now I am working under the assumption that tasks are long lived inside
>>>>> the cgroup. Migration potentially introduces some nasty locking problems in
>>>>> the mem_schedule path.
>>>>>
>>>>> Also, unless I am missing something, the memcg already has the policy of
>>>>> not carrying charges around, probably because of this very same complexity.
>>>>>
>>>>> True that at least it won't EBUSY you... But I think this is at least a way
>>>>> to guarantee that the cgroup under our nose won't disappear in the middle of
>>>>> our allocations.
>>>>
>>>> Here's the memcg user page behavior using the same pattern:
>>>>
>>>> 1. user page P is allocated by task T in memcg M1
>>>> 2. T is moved to memcg M2. The P charge is left behind still charged
>>>> to M1 if memory.move_charge_at_immigrate=0; or the charge is moved to
>>>> M2 if memory.move_charge_at_immigrate=1.
>>>> 3. rmdir M1 will try to reclaim P (if P was left in M1). If unable to
>>>> reclaim, then P is recharged to parent(M1).
>>>>
>>>
>>> We also have some magic in page_referenced() to remove pages
>>> referenced from different containers. What we do is try not to
>>> penalize a cgroup if another cgroup is referencing this page and the
>>> page under consideration is being reclaimed from the cgroup that
>>> touched it.
>>>
>>> Balbir Singh
>> Do you guys see it as a showstopper for this series to be merged, or can
>> we just TODO it ?
>>
>
> In my experience, 'I can't rmdir cgroup.' is always an important/difficult
> problem. The users cannot know where the accounting is leaking other than
> kmem.usage_in_bytes or memory.usage_in_bytes. and can't fix the issue.
>
> please add EXPERIMENTAL to Kconfig until this is fixed.
>
>> I can push a proposal for it, but it would be done in a separate patch
>> anyway. Also, we may be in better conditions to fix this when the slab
>> part is merged - since it will likely have the same problems...
>>
>
> Yes. considering sockets which can be shared between tasks(cgroups)
> you'll finally need
> - owner task of socket
> - account moving callback
>
> Or disallow task moving once accounted.
>

So, I tried to come up with proper task charge moving here, and the
locking easily gets quite complicated. (But I have the feeling I am
overlooking something...) So I think I'll really need more time for that.

What do you guys think of this following patch, + EXPERIMENTAL ?