linux-mm.kvack.org archive mirror
From: Rusty Russell <rusty@rustcorp.com.au>
To: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, kernel-testers@vger.kernel.org,
	linux-mm@kvack.org, sugita <yumiko.sugita.yf@hitachi.com>,
	Satoshi OSHIMA <satoshi.oshima.fk@hitachi.com>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: [BUG][PATCH -mm] avoid BUG() in __stop_machine_run()
Date: Thu, 19 Jun 2008 20:12:43 +1000
Message-ID: <200806192012.44459.rusty@rustcorp.com.au>
In-Reply-To: <485A03E6.2090509@hitachi.com>

On Thursday 19 June 2008 16:59:50 Hidehiro Kawai wrote:
> When a process loads a kernel module, __stop_machine_run() is called, and
> it calls sched_setscheduler() to give newly created kernel threads highest
> priority.  However, the process may lack CAP_SYS_NICE, which is required
> for sched_setscheduler() to increase the priority.  For example, SystemTap
> loads its module with only CAP_SYS_MODULE.  In this case,
> sched_setscheduler() returns -EPERM, then BUG() is called.

Hi Hidehiro,

	Nice catch.  This can happen in the current code too; it just doesn't
BUG().

> Failure of sched_setscheduler() wouldn't be a real problem, so this
> patch just ignores it.

	Well, it can mean that the stop_machine blocks indefinitely.  Better
than a BUG(), but we should aim higher.

> Or, should we give the CAP_SYS_NICE capability temporarily?

        I don't think so.  The temporary capability can be seen from another
thread, which in theory should not see something random.  Worse, another
thread can change it.

How's this?

sched_setscheduler: add a flag to control access checks

Hidehiro Kawai noticed that sched_setscheduler() can fail in
stop_machine: it calls sched_setscheduler() from insmod, which can
have CAP_SYS_MODULE without CAP_SYS_NICE.

This simply introduces a flag to allow us to disable the capability
checks for internal callers (this is simpler than splitting the
sched_setscheduler() function, since it loops checking permissions).

The flag is only "false" (i.e. no check) for the following cases, where
it shouldn't matter:
  drivers/input/touchscreen/ucb1400_ts.c:ucb1400_ts_thread()
	- it's a kthread
  drivers/mmc/core/sdio_irq.c:sdio_irq_thread()
	- also a kthread
  kernel/kthread.c:create_kthread()
	- making a kthread (from kthreadd)
  kernel/softlockup.c:watchdog()
	- also a kthread

And these cases could have failed before:
  kernel/softirq.c:cpu_callback()
	- CPU hotplug callback
  kernel/stop_machine.c:__stop_machine_run()
	- Called from various places, including modprobe()

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff -r 509f0724da6b drivers/input/touchscreen/ucb1400_ts.c
--- a/drivers/input/touchscreen/ucb1400_ts.c	Thu Jun 19 17:06:30 2008 +1000
+++ b/drivers/input/touchscreen/ucb1400_ts.c	Thu Jun 19 19:36:40 2008 +1000
@@ -287,7 +287,7 @@ static int ucb1400_ts_thread(void *_ucb)
 	int valid = 0;
 	struct sched_param param = { .sched_priority = 1 };
 
-	sched_setscheduler(tsk, SCHED_FIFO, &param);
+	sched_setscheduler(tsk, SCHED_FIFO, &param, false);
 
 	set_freezable();
 	while (!kthread_should_stop()) {
diff -r 509f0724da6b drivers/mmc/core/sdio_irq.c
--- a/drivers/mmc/core/sdio_irq.c	Thu Jun 19 17:06:30 2008 +1000
+++ b/drivers/mmc/core/sdio_irq.c	Thu Jun 19 19:36:40 2008 +1000
@@ -70,7 +70,7 @@ static int sdio_irq_thread(void *_host)
 	unsigned long period, idle_period;
 	int ret;
 
-	sched_setscheduler(current, SCHED_FIFO, &param);
+	sched_setscheduler(current, SCHED_FIFO, &param, false);
 
 	/*
 	 * We want to allow for SDIO cards to work even on non SDIO
diff -r 509f0724da6b include/linux/sched.h
--- a/include/linux/sched.h	Thu Jun 19 17:06:30 2008 +1000
+++ b/include/linux/sched.h	Thu Jun 19 19:36:40 2008 +1000
@@ -1654,7 +1654,8 @@ extern int can_nice(const struct task_st
 extern int can_nice(const struct task_struct *p, const int nice);
 extern int task_curr(const struct task_struct *p);
 extern int idle_cpu(int cpu);
-extern int sched_setscheduler(struct task_struct *, int, struct sched_param *);
+extern int sched_setscheduler(struct task_struct *, int, struct sched_param *,
+			      bool);
 extern struct task_struct *idle_task(int cpu);
 extern struct task_struct *curr_task(int cpu);
 extern void set_curr_task(int cpu, struct task_struct *p);
diff -r 509f0724da6b kernel/kthread.c
--- a/kernel/kthread.c	Thu Jun 19 17:06:30 2008 +1000
+++ b/kernel/kthread.c	Thu Jun 19 19:36:40 2008 +1000
@@ -104,7 +104,7 @@ static void create_kthread(struct kthrea
 		 * root may have changed our (kthreadd's) priority or CPU mask.
 		 * The kernel thread should not inherit these properties.
 		 */
-		sched_setscheduler(create->result, SCHED_NORMAL, &param);
+		sched_setscheduler(create->result, SCHED_NORMAL, &param, false);
 		set_user_nice(create->result, KTHREAD_NICE_LEVEL);
 		set_cpus_allowed(create->result, CPU_MASK_ALL);
 	}
diff -r 509f0724da6b kernel/rtmutex-tester.c
--- a/kernel/rtmutex-tester.c	Thu Jun 19 17:06:30 2008 +1000
+++ b/kernel/rtmutex-tester.c	Thu Jun 19 19:36:40 2008 +1000
@@ -327,7 +327,8 @@ static ssize_t sysfs_test_command(struct
 	switch (op) {
 	case RTTEST_SCHEDOT:
 		schedpar.sched_priority = 0;
-		ret = sched_setscheduler(threads[tid], SCHED_NORMAL, &schedpar);
+		ret = sched_setscheduler(threads[tid], SCHED_NORMAL, &schedpar,
+					 true);
 		if (ret)
 			return ret;
 		set_user_nice(current, 0);
@@ -335,7 +336,8 @@ static ssize_t sysfs_test_command(struct
 
 	case RTTEST_SCHEDRT:
 		schedpar.sched_priority = dat;
-		ret = sched_setscheduler(threads[tid], SCHED_FIFO, &schedpar);
+		ret = sched_setscheduler(threads[tid], SCHED_FIFO, &schedpar,
+					 true);
 		if (ret)
 			return ret;
 		break;
diff -r 509f0724da6b kernel/sched.c
--- a/kernel/sched.c	Thu Jun 19 17:06:30 2008 +1000
+++ b/kernel/sched.c	Thu Jun 19 19:36:40 2008 +1000
@@ -4749,11 +4749,12 @@ __setscheduler(struct rq *rq, struct tas
  * @p: the task in question.
  * @policy: new policy.
  * @param: structure containing the new RT priority.
+ * @user: do checks to ensure this thread has permission
  *
  * NOTE that the task may be already dead.
  */
 int sched_setscheduler(struct task_struct *p, int policy,
-		       struct sched_param *param)
+		       struct sched_param *param, bool user)
 {
 	int retval, oldprio, oldpolicy = -1, on_rq, running;
 	unsigned long flags;
@@ -4785,7 +4786,7 @@ recheck:
 	/*
 	 * Allow unprivileged RT tasks to decrease priority:
 	 */
-	if (!capable(CAP_SYS_NICE)) {
+	if (user && !capable(CAP_SYS_NICE)) {
 		if (rt_policy(policy)) {
 			unsigned long rlim_rtprio;
 
@@ -4821,7 +4822,8 @@ recheck:
 	 * Do not allow realtime tasks into groups that have no runtime
 	 * assigned.
 	 */
-	if (rt_policy(policy) && task_group(p)->rt_bandwidth.rt_runtime == 0)
+	if (user
+	    && rt_policy(policy) && task_group(p)->rt_bandwidth.rt_runtime == 0)
 		return -EPERM;
 #endif
 
@@ -4888,7 +4890,7 @@ do_sched_setscheduler(pid_t pid, int pol
 	retval = -ESRCH;
 	p = find_process_by_pid(pid);
 	if (p != NULL)
-		retval = sched_setscheduler(p, policy, &lparam);
+		retval = sched_setscheduler(p, policy, &lparam, true);
 	rcu_read_unlock();
 
 	return retval;
diff -r 509f0724da6b kernel/softirq.c
--- a/kernel/softirq.c	Thu Jun 19 17:06:30 2008 +1000
+++ b/kernel/softirq.c	Thu Jun 19 19:36:40 2008 +1000
@@ -645,7 +645,7 @@ static int __cpuinit cpu_callback(struct
 
 		p = per_cpu(ksoftirqd, hotcpu);
 		per_cpu(ksoftirqd, hotcpu) = NULL;
-		sched_setscheduler(p, SCHED_FIFO, &param);
+		sched_setscheduler(p, SCHED_FIFO, &param, false);
 		kthread_stop(p);
 		takeover_tasklets(hotcpu);
 		break;
diff -r 509f0724da6b kernel/softlockup.c
--- a/kernel/softlockup.c	Thu Jun 19 17:06:30 2008 +1000
+++ b/kernel/softlockup.c	Thu Jun 19 19:36:40 2008 +1000
@@ -211,7 +211,7 @@ static int watchdog(void *__bind_cpu)
 	struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
 	int this_cpu = (long)__bind_cpu;
 
-	sched_setscheduler(current, SCHED_FIFO, &param);
+	sched_setscheduler(current, SCHED_FIFO, &param, false);
 
 	/* initialize timestamp */
 	touch_softlockup_watchdog();
diff -r 509f0724da6b kernel/stop_machine.c
--- a/kernel/stop_machine.c	Thu Jun 19 17:06:30 2008 +1000
+++ b/kernel/stop_machine.c	Thu Jun 19 19:36:40 2008 +1000
@@ -187,7 +187,7 @@ struct task_struct *__stop_machine_run(i
 		struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
 
 		/* One high-prio thread per cpu.  We'll do this one. */
-		sched_setscheduler(p, SCHED_FIFO, &param);
+		sched_setscheduler(p, SCHED_FIFO, &param, false);
 		kthread_bind(p, cpu);
 		wake_up_process(p);
 		wait_for_completion(&smdata.done);

