From: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Markus Trippelsdorf <markus@trippelsdorf.de>,
akpm@linux-foundation.org,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
khlebnikov@openvz.org, hughd@google.com, stable@vger.kernel.org
Subject: Re: [patch 12/12] mm: correctly synchronize rss-counters at exit/exec
Date: Mon, 11 Jun 2012 19:25:21 +0900
Message-ID: <4FD5C791.9090902@jp.fujitsu.com>
In-Reply-To: <20120608121816.GA23147@redhat.com>
(2012/06/08 21:18), Oleg Nesterov wrote:
> On 06/07, Linus Torvalds wrote:
>>
>> It does totally insane things in xacct_add_tsk(). You can't call
>> "sync_mm_rss(mm)" on somebody elses mm,
>
> Damn, I am stupid. Yes, I forgot about fill_stats_for_pid().
> And I didn't bother to look at get_task_mm() which clearly
> shows that this tsk can be !current.
>
> We can add the "p == current" check as Hugh suggested.
>
> But,
>
>> Doing it
>> *anywhere* where mm is not clearly "current->mm" is wrong.
>
> Agreed.
>
> How about v2? It adds sync_mm_rss() into taskstats_exit(). Note
> that it preserves the "tsk->mm != NULL" check we currently have.
> I think it should be removed (see the changelog), but even if I
> am right I'd prefer to do this in a separate patch.
>
I'm sorry I've been silent. Another fix I can think of is
this kind of change to sync_mm_rss(). What do you think?
==
From be49ed6843b09ae33d758f2a51cf8357f7502512 Mon Sep 17 00:00:00 2001
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Date: Mon, 11 Jun 2012 19:45:09 +0900
Subject: [PATCH] fix sync_mm_rss() leakage.
Any page fault after the last sync_mm_rss() in do_exit() causes a
problem in check_mm(). It happens because the task's rss counters are
not synchronized again after that last sync_mm_rss().

This patch replaces the last sync_mm_rss() with finalize_mm_rss()
and disallows per-task rss count caching after finalization.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 fs/exec.c          |    3 ++-
 include/linux/mm.h |   10 ++++++++++
 kernel/exit.c      |    3 +--
 mm/memory.c        |   21 ++++++++++++++++++---
 4 files changed, 31 insertions(+), 6 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index a79786a..3e47772 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -819,7 +819,7 @@ static int exec_mmap(struct mm_struct *mm)
 	/* Notify parent that we're no longer interested in the old VM */
 	tsk = current;
 	old_mm = current->mm;
-	sync_mm_rss(old_mm);
+	finalize_mm_rss();
 	mm_release(tsk, old_mm);
 	if (old_mm) {
@@ -851,6 +851,7 @@ static int exec_mmap(struct mm_struct *mm)
 		return 0;
 	}
 	mmdrop(active_mm);
+	initialize_mm_rss();
 	return 0;
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b36d08c..995d7ff 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1129,10 +1129,20 @@ static inline void setmax_mm_hiwater_rss(unsigned long *maxrss,
 #if defined(SPLIT_RSS_COUNTING)
 void sync_mm_rss(struct mm_struct *mm);
+void finalize_mm_rss(void);
+void initialize_mm_rss(void);
 #else
+static inline void finalize_mm_rss(void)
+{
+}
+
 static inline void sync_mm_rss(struct mm_struct *mm)
 {
 }
+
+static inline void initialize_mm_rss(void)
+{
+}
 #endif
 int vma_wants_writenotify(struct vm_area_struct *vma);
diff --git a/kernel/exit.c b/kernel/exit.c
index 34867cc..2111879 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -961,8 +961,7 @@ void do_exit(long code)
 	acct_update_integrals(tsk);
 	/* sync mm's RSS info before statistics gathering */
-	if (tsk->mm)
-		sync_mm_rss(tsk->mm);
+	finalize_mm_rss();
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
 		hrtimer_cancel(&tsk->signal->real_timer);
diff --git a/mm/memory.c b/mm/memory.c
index 1b7dc66..07aa887d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -125,6 +125,20 @@ core_initcall(init_zero_pfn);
 #if defined(SPLIT_RSS_COUNTING)
+void initialize_mm_rss(void)
+{
+	current->rss_stat.events = 0;
+}
+
+void finalize_mm_rss(void)
+{
+	current->rss_stat.events = -1;
+	if (current->mm)
+		sync_mm_rss(current->mm);
+}
+
+#define rss_count_finalized(task) ((task)->rss_stat.events < 0)
+
 void sync_mm_rss(struct mm_struct *mm)
 {
 	int i;
@@ -135,14 +149,15 @@ void sync_mm_rss(struct mm_struct *mm)
 			current->rss_stat.count[i] = 0;
 		}
 	}
-	current->rss_stat.events = 0;
+	if (!rss_count_finalized(current))
+		current->rss_stat.events = 0;
 }
 static void add_mm_counter_fast(struct mm_struct *mm, int member, int val)
 {
 	struct task_struct *task = current;
-	if (likely(task->mm == mm))
+	if (likely(task->mm == mm && !rss_count_finalized(task)))
 		task->rss_stat.count[member] += val;
 	else
 		add_mm_counter(mm, member, val);
@@ -154,7 +169,7 @@ static void add_mm_counter_fast(struct mm_struct *mm, int member, int val)
 #define TASK_RSS_EVENTS_THRESH (64)
 static void check_sync_rss_stat(struct task_struct *task)
 {
-	if (unlikely(task != current || rss_count_finalized(task)))
+	if (unlikely(task != current || rss_count_finalized(task)))
 		return;
 	if (unlikely(task->rss_stat.events++ > TASK_RSS_EVENTS_THRESH))
 		sync_mm_rss(task->mm);
--
1.7.4.1
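
For illustration only, the idea of the patch can be sketched in userspace C.
This is not kernel code: struct mm, struct task and the cached[] array are
hypothetical stand-ins for mm_struct, task_struct and rss_stat.count[], the
task is passed explicitly instead of using current, and the shared counters
are plain longs rather than atomics. The sketch only shows that once a task
is finalized, later updates bypass the per-task cache and reach the mm
counters directly, so nothing is left behind in the cache.

```c
#include <assert.h>
#include <stddef.h>

#define NR_MM_COUNTERS 3

/* Hypothetical userspace stand-ins, not the kernel structures. */
struct mm {
	long count[NR_MM_COUNTERS];	/* shared counters (atomic in the kernel) */
};

struct task {
	struct mm *mm;
	int events;			/* negative once finalized */
	long cached[NR_MM_COUNTERS];	/* per-task deltas not yet flushed */
};

int rss_count_finalized(const struct task *t)
{
	return t->events < 0;
}

/* Flush the per-task deltas into the shared counters. */
void sync_mm_rss(struct task *t)
{
	assert(t->mm != NULL);
	for (int i = 0; i < NR_MM_COUNTERS; i++) {
		t->mm->count[i] += t->cached[i];
		t->cached[i] = 0;
	}
	/* Keep the "finalized" mark; only reset the event count otherwise. */
	if (!rss_count_finalized(t))
		t->events = 0;
}

/* Mark the task first, then flush, so nothing is cached afterwards. */
void finalize_mm_rss(struct task *t)
{
	t->events = -1;
	if (t->mm)
		sync_mm_rss(t);
}

/* Cache in the task while allowed; otherwise update the mm directly. */
void add_mm_counter_fast(struct task *t, struct mm *mm, int member, long val)
{
	if (t->mm == mm && !rss_count_finalized(t))
		t->cached[member] += val;
	else
		mm->count[member] += val;
}
```

With a task t attached to an mm m, an update made before finalize_mm_rss(&t)
stays in t.cached[] until the flush, while an update made afterwards (the
"late page fault" case) goes straight into m.count[].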