From: Hugh Dickins <hugh@veritas.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hirokazu Takahashi <taka@valinux.co.jp>,
linux-mm@kvack.org, balbir@linux.vnet.ibm.com,
yamamoto@valinux.co.jp, riel@redhat.com
Subject: Re: [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races.
Date: Fri, 22 Feb 2008 09:24:36 +0000 (GMT) [thread overview]
Message-ID: <Pine.LNX.4.64.0802220916290.18145@blonde.site> (raw)
In-Reply-To: <20080220155049.094056ac.kamezawa.hiroyu@jp.fujitsu.com>
On Wed, 20 Feb 2008, KAMEZAWA Hiroyuki wrote:
> On Wed, 20 Feb 2008 15:27:53 +0900 (JST)
> Hirokazu Takahashi <taka@valinux.co.jp> wrote:
>
> > > Unlike the unsafeties of force_empty, this is liable to hit anyone
> > > running with MEM_CONT compiled in, they don't have to be consciously
> > > using mem_cgroups at all.
> >
> > As for force_empty, though this may not be the main topic here,
> > mem_cgroup_force_empty_list() can be implemented simpler.
> > It is possible to make the function just call mem_cgroup_uncharge_page()
> > instead of releasing page_cgroups by itself. The tips is to call get_page()
> > before invoking mem_cgroup_uncharge_page() so the page won't be released
> > during this function.
> >
> > Kamezawa-san, you may want look into the attached patch.
> > I think you will be free from the weired complexity here.
> >
> > This code can be optimized but it will be enough since this function
> > isn't critical.
> >
> > Thanks.
> >
> >
> > Signed-off-by: Hirokazu Takahashi <taka@vallinux.co.jp>
Hirokazu-san, may I change that to <taka@valinux.co.jp>?
>...
>
> Seems simple. But isn't there following case ?
>
> ==in force_empty==
>
> pc1 = list_entry(list->prev, struct page_cgroup, lru);
> page = pc1->page;
> get_page(page)
> spin_unlock_irqrestore(&mz->lru_lock, flags)
> mem_cgroup_uncharge_page(page);
> => lock_page_cgroup(page);
> => pc2 = page_get_page_cgroup(page);
>
> Here, pc2 != pc1 and pc2->mem_cgroup != pc1->mem_cgroup.
> maybe need some check.
>
> But maybe yours is good direction.
I like Hirokazu-san's approach very much.
Although I eventually completed the locking for my mem_cgroup_move_lists
(SLAB_DESTROY_BY_RCU didn't help there, actually, because it left a
possibility that the same page_cgroup got reused for the same page
but a different mem_cgroup: in which case we got the wrong spinlock),
his reversal in force_empty lets us use straightforward locking in
mem_cgroup_move_lists (though it still has to try_lock_page_cgroup).
So I want to take Hirokazu-san's patch into my bugfix and cleanup
series, where it's testing out fine so far.
Regarding your point above, Kamezawa-san: you're surely right that can
happen, but is it a case that we actually need to avoid? Aren't we
entitled to take the page out of pc2->mem_cgroup there, because if any
such race occurs, it could easily have happened the other way around,
removing the page from pc1->mem_cgroup just after pc2->mem_cgroup
touched it, so ending up with that page in neither?
I'd just prefer not to handle it as you did in your patch, because
earlier in my series I'd removed the mem_cgroup_uncharge level (which
just gets in the way, requiring a silly lock_page_cgroup at the end
just to match the unlock_page_cgroup at the mem_cgroup_uncharge_page
level), and don't much want to add it back in.
While we're thinking of races...
It seemed to me that mem_cgroup_uncharge should be doing its css_put
after its __mem_cgroup_remove_list: doesn't doing it before leave open
a slight danger that the struct mem_cgroup could be freed before the
remove_list? Perhaps there's some other refcounting that makes that
impossible, but I've felt safer shifting those around.
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-02-22 9:24 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-02-19 12:54 KAMEZAWA Hiroyuki
2008-02-19 15:40 ` Hugh Dickins
2008-02-20 1:03 ` KAMEZAWA Hiroyuki
2008-02-20 4:14 ` Hugh Dickins
2008-02-20 4:37 ` KAMEZAWA Hiroyuki
2008-02-20 4:39 ` Hugh Dickins
2008-02-20 4:41 ` Balbir Singh
2008-02-20 6:40 ` Balbir Singh
2008-02-20 7:23 ` KAMEZAWA Hiroyuki
2008-02-20 3:14 ` KAMEZAWA Hiroyuki
2008-02-20 3:37 ` YAMAMOTO Takashi
2008-02-20 4:13 ` KAMEZAWA Hiroyuki
2008-02-20 4:32 ` Hugh Dickins
2008-02-20 5:57 ` Balbir Singh
2008-02-20 9:58 ` Hirokazu Takahashi
2008-02-20 10:06 ` Paul Menage
2008-02-20 10:11 ` Balbir Singh
2008-02-20 10:18 ` Paul Menage
2008-02-20 10:55 ` Balbir Singh
2008-02-20 11:21 ` KAMEZAWA Hiroyuki
2008-02-20 11:18 ` Balbir Singh
2008-02-20 11:32 ` KAMEZAWA Hiroyuki
2008-02-20 11:34 ` Balbir Singh
2008-02-20 11:44 ` Paul Menage
2008-02-20 11:41 ` Paul Menage
2008-02-20 11:36 ` Balbir Singh
2008-02-20 11:55 ` Paul Menage
2008-02-21 2:49 ` Hirokazu Takahashi
2008-02-21 6:35 ` KAMEZAWA Hiroyuki
2008-02-21 9:07 ` Hirokazu Takahashi
2008-02-21 9:21 ` KAMEZAWA Hiroyuki
2008-02-21 9:28 ` Balbir Singh
2008-02-21 9:44 ` KAMEZAWA Hiroyuki
2008-02-22 3:31 ` [RFC] Block I/O Cgroup Hirokazu Takahashi
2008-02-22 5:05 ` KAMEZAWA Hiroyuki
2008-02-22 5:45 ` Hirokazu Takahashi
2008-02-21 9:25 ` [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races Balbir Singh
2008-02-20 6:27 ` Hirokazu Takahashi
2008-02-20 6:50 ` KAMEZAWA Hiroyuki
2008-02-20 8:32 ` Clean up force_empty (Was Re: [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races.) KAMEZAWA Hiroyuki
2008-02-20 10:07 ` Clean up force_empty Hirokazu Takahashi
2008-02-22 9:24 ` Hugh Dickins [this message]
2008-02-22 10:07 ` [RFC][PATCH] Clarify mem_cgroup lock handling and avoid races KAMEZAWA Hiroyuki
2008-02-22 10:25 ` Hugh Dickins
2008-02-22 10:34 ` KAMEZAWA Hiroyuki
2008-02-22 10:50 ` Hirokazu Takahashi
2008-02-22 11:14 ` Balbir Singh
2008-02-22 12:00 ` Hugh Dickins
2008-02-22 12:28 ` Balbir Singh
2008-02-22 12:53 ` Hugh Dickins
2008-02-25 3:18 ` Balbir Singh
2008-02-19 15:54 ` kamezawa.hiroyu
2008-02-19 16:26 ` Hugh Dickins
2008-02-20 1:55 ` KAMEZAWA Hiroyuki
2008-02-20 2:05 ` YAMAMOTO Takashi
2008-02-20 2:15 ` KAMEZAWA Hiroyuki
2008-02-20 2:32 ` YAMAMOTO Takashi
2008-02-20 4:27 ` Hugh Dickins
2008-02-20 6:38 ` Balbir Singh
2008-02-20 11:00 ` Hugh Dickins
2008-02-20 11:32 ` Balbir Singh
2008-02-20 14:19 ` Hugh Dickins
2008-02-20 5:00 ` Balbir Singh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Pine.LNX.4.64.0802220916290.18145@blonde.site \
--to=hugh@veritas.com \
--cc=balbir@linux.vnet.ibm.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-mm@kvack.org \
--cc=riel@redhat.com \
--cc=taka@valinux.co.jp \
--cc=yamamoto@valinux.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox