Message-ID: <6599ad830810201829o5483ef48g633e920cce9cc015@mail.gmail.com>
Date: Mon, 20 Oct 2008 18:29:28 -0700
From: "Paul Menage"
Subject: Re: [PATCH -mm 1/5] memcg: replace res_counter
In-Reply-To: <20081021101430.d2629a81.kamezawa.hiroyu@jp.fujitsu.com>
References: <20081017194804.fce28258.nishimura@mxp.nes.nec.co.jp>
 <20081017195601.0b9abda1.nishimura@mxp.nes.nec.co.jp>
 <6599ad830810201253u3bca41d4rabe48eb1ec1d529f@mail.gmail.com>
 <20081021101430.d2629a81.kamezawa.hiroyu@jp.fujitsu.com>
Sender: owner-linux-mm@kvack.org
To: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura, linux-mm@kvack.org, balbir@linux.vnet.ibm.com

On Mon, Oct 20, 2008 at 6:14 PM, KAMEZAWA Hiroyuki wrote:
>
> 1. It's harmful to increase size of *generic* res_counter. So, modifing
>    res_counter only for us is not a choice.

Adding an extra pointer to a per-cgroup structure isn't particularly
harmful.

> 2. Operation should be done under a lock. We have to do
>    -page + swap in atomic, at least.

How bad would things really be if you did something like the code below?

    if (charge_swap()) {
        uncharge_mem();
    } else {
        return -ENOMEM;
    }

It's true that this introduces a tiny race whereby a single swap-in
page allocation that might have succeeded could fail, but if you're
that close to the limit your cgroup is heading for an OOM anyway.

> 3. We want to pack all member into a cache-line, multiple res_counter
>    is no good.

As I said previously, if we do a prefetch on the aggregated
res_counter before we touch any fields in the basic counter, then in
theory we should never have to wait on a cache miss on the aggregated
counter - either we have no misses (if both were in cache) or we fetch
both lines concurrently (if neither were in cache). Do you think that
reasoning is invalid?

>
>> Maybe have an "aggregate" pointer in a res_counter that points to
>> another res_counter that sums some number of counters; both the mem
>> and the swap res_counter objects for a cgroup would point to the
>> mem+swap res_counter for their aggregate. Adjusting the usage of a
>> counter would also adjust its aggregate (or fail if adjusting the
>> aggregate failed).
>>
> It's complicated.

Agreed, it's a bit more complicated than defining a new structure and
code that's very reminiscent of res_counter. But it does solve the
problem of aggregating across multiple resource types and multiple
children in a generic way.

Paul
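
For illustration only, the "aggregate" pointer idea discussed above
might be sketched roughly as follows. This is a simplified userspace
model, not the kernel's res_counter and not a patch: the struct layout,
the function names and signatures, and the absence of any locking are
all assumptions made for the example. __builtin_prefetch is the
GCC/Clang builtin, used here as a stand-in for a kernel prefetch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a counter with an optional aggregate. */
struct res_counter {
    unsigned long usage;
    unsigned long limit;
    struct res_counter *aggregate;  /* NULL if there is no aggregate */
};

static void res_counter_init(struct res_counter *rc, unsigned long limit,
                             struct res_counter *aggregate)
{
    rc->usage = 0;
    rc->limit = limit;
    rc->aggregate = aggregate;
}

/*
 * Charge val to rc and to its aggregate, if any.
 * Returns 0 on success or -ENOMEM if either limit would be exceeded,
 * in which case neither counter is modified.
 * No locking: a real implementation would need some (assumption).
 */
static int res_counter_charge(struct res_counter *rc, unsigned long val)
{
    struct res_counter *agg = rc->aggregate;

    /* Request the aggregate's cache line early, before the usage and
     * limit fields of either counter are examined. */
    if (agg)
        __builtin_prefetch(agg, 1);

    /* usage <= limit always holds, so the subtraction cannot wrap. */
    if (val > rc->limit - rc->usage)
        return -ENOMEM;
    if (agg && val > agg->limit - agg->usage)
        return -ENOMEM;

    rc->usage += val;
    if (agg)
        agg->usage += val;
    return 0;
}

/* Uncharge val from rc and from its aggregate, if any. */
static void res_counter_uncharge(struct res_counter *rc, unsigned long val)
{
    assert(val <= rc->usage);
    rc->usage -= val;
    if (rc->aggregate) {
        assert(val <= rc->aggregate->usage);
        rc->aggregate->usage -= val;
    }
}
```

In this model the mem and swap counters of a cgroup would both be
initialised with a pointer to one shared mem+swap counter, so a charge
to either one fails whenever the combined limit would be exceeded.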