* [PATCH linux next] delayacct: add delay max to record delay peak
@ 2024-12-03 8:48 jiang.kun2
2024-12-04 1:08 ` Andrew Morton
0 siblings, 1 reply; 4+ messages in thread
From: jiang.kun2 @ 2024-12-03 8:48 UTC (permalink / raw)
To: bsingharora, akpm
Cc: david, linux-kernel, linux-mm, linux-doc, wang.yong12,
wang.yaxin, fan.yu9, he.peilin, tu.qiang35, qiu.yutan,
zhang.yunkai, ye.xingchen, xu.xin16, yang.yang29
From: Wang Yaxin <wang.yaxin@zte.com.cn>
Introduce the use cases of delay max, which can help quickly detect
potential abnormal delays in the system and record the types and
specific details of delay spikes.
Problem
========
Delay accounting can track the average delay of processes to show
system workload. However, when a process experiences a significant
delay, maybe a delay spike, which adversely affects performance,
getdelays can only display the average system delay over a period
of time. Yet, average delay is unhelpful for diagnosing delay peak.
It is not even possible to determine which type of delay has spiked,
as this information might be masked by the average delay.
Solution
=========
the 'delay max' can display delay peak since the system's startup,
which can record potential abnormal delays over time, including
the type of delay and the maximum delay. This is helpful for
quickly identifying crash caused by delay.
Use case
=========
bash# ./getdelays -d -p 244
print delayacct stats ON
PID 244
CPU count real total virtual total delay total delay average delay max
68 192000000 213676651 705643 0.010ms 0.306381ms
IO count delay total delay average delay max
0 0 0.000ms 0.000000ms
SWAP count delay total delay average delay max
0 0 0.000ms 0.000000ms
RECLAIM count delay total delay average delay max
0 0 0.000ms 0.000000ms
THRASHING count delay total delay average delay max
0 0 0.000ms 0.000000ms
COMPACT count delay total delay average delay max
0 0 0.000ms 0.000000ms
WPCOPY count delay total delay average delay max
235 15648284 0.067ms 0.263842ms
IRQ count delay total delay average delay max
0 0 0.000ms 0.000000ms
Co-developed-by: Wang Yong <wang.yong12@zte.com.cn>
Signed-off-by: Wang Yong <wang.yong12@zte.com.cn>
Co-developed-by: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Co-developed-by: Wang Yaxin <wang.yaxin@zte.com.cn>
Signed-off-by: Wang Yaxin <wang.yaxin@zte.com.cn>
Signed-off-by: Kun Jiang <jiang.kun2@zte.com.cn>
---
include/linux/delayacct.h | 7 ++++
include/linux/sched.h | 3 ++
include/uapi/linux/taskstats.h | 9 ++++++
kernel/delayacct.c | 35 ++++++++++++++------
kernel/sched/stats.h | 5 ++-
tools/accounting/getdelays.c | 59 ++++++++++++++++++++--------------
6 files changed, 83 insertions(+), 35 deletions(-)
diff --git a/include/linux/delayacct.h b/include/linux/delayacct.h
index 6639f48dac36..56fbfa2c2ac5 100644
--- a/include/linux/delayacct.h
+++ b/include/linux/delayacct.h
@@ -29,25 +29,32 @@ struct task_delay_info {
* XXX_delay contains the accumulated delay time in nanoseconds.
*/
u64 blkio_start;
+ u64 blkio_delay_max;
u64 blkio_delay; /* wait for sync block io completion */
u64 swapin_start;
+ u64 swapin_delay_max;
u64 swapin_delay; /* wait for swapin */
u32 blkio_count; /* total count of the number of sync block */
/* io operations performed */
u32 swapin_count; /* total count of swapin */
u64 freepages_start;
+ u64 freepages_delay_max;
u64 freepages_delay; /* wait for memory reclaim */
u64 thrashing_start;
+ u64 thrashing_delay_max;
u64 thrashing_delay; /* wait for thrashing page */
u64 compact_start;
+ u64 compact_delay_max;
u64 compact_delay; /* wait for memory compact */
u64 wpcopy_start;
+ u64 wpcopy_delay_max;
u64 wpcopy_delay; /* wait for write-protect copy */
+ u64 irq_delay_max;
u64 irq_delay; /* wait for IRQ/SOFTIRQ */
u32 freepages_count; /* total count of memory reclaim */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index d380bffee2ef..c70cbdc72950 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -398,6 +398,9 @@ struct sched_info {
/* Time spent waiting on a runqueue: */
unsigned long long run_delay;
+ /* Max time spent waiting on a runqueue: */
+ unsigned long long max_run_delay;
+
/* Timestamps: */
/* When did we last run on a CPU? */
diff --git a/include/uapi/linux/taskstats.h b/include/uapi/linux/taskstats.h
index b50b2eb257a0..e0d1c6fc9f3b 100644
--- a/include/uapi/linux/taskstats.h
+++ b/include/uapi/linux/taskstats.h
@@ -72,6 +72,7 @@ struct taskstats {
*/
__u64 cpu_count __attribute__((aligned(8)));
__u64 cpu_delay_total;
+ __u64 cpu_delay_max;
/* Following four fields atomically updated using task->delays->lock */
@@ -80,10 +81,12 @@ struct taskstats {
*/
__u64 blkio_count;
__u64 blkio_delay_total;
+ __u64 blkio_delay_max;
/* Delay waiting for page fault I/O (swap in only) */
__u64 swapin_count;
__u64 swapin_delay_total;
+ __u64 swapin_delay_max;
/* cpu "wall-clock" running time
* On some architectures, value will adjust for cpu time stolen
@@ -166,10 +169,12 @@ struct taskstats {
/* Delay waiting for memory reclaim */
__u64 freepages_count;
__u64 freepages_delay_total;
+ __u64 freepages_delay_max;
/* Delay waiting for thrashing page */
__u64 thrashing_count;
__u64 thrashing_delay_total;
+ __u64 thrashing_delay_max;
/* v10: 64-bit btime to avoid overflow */
__u64 ac_btime64; /* 64-bit begin time */
@@ -177,6 +182,7 @@ struct taskstats {
/* v11: Delay waiting for memory compact */
__u64 compact_count;
__u64 compact_delay_total;
+ __u64 compact_delay_max;
/* v12 begin */
__u32 ac_tgid; /* thread group ID */
@@ -198,10 +204,13 @@ struct taskstats {
/* v13: Delay waiting for write-protect copy */
__u64 wpcopy_count;
__u64 wpcopy_delay_total;
+ __u64 wpcopy_delay_max;
/* v14: Delay waiting for IRQ/SOFTIRQ */
__u64 irq_count;
__u64 irq_delay_total;
+ __u64 irq_delay_max;
+ /* v15: add Delay max */
};
diff --git a/kernel/delayacct.c b/kernel/delayacct.c
index dead51de8eb5..d64ad2a48b4f 100644
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -95,7 +95,7 @@ void __delayacct_tsk_init(struct task_struct *tsk)
* Finish delay accounting for a statistic using its timestamps (@start),
* accumalator (@total) and @count
*/
-static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *count)
+static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *count, u64 *max)
{
s64 ns = local_clock() - *start;
unsigned long flags;
@@ -104,6 +104,8 @@ static void delayacct_end(raw_spinlock_t *lock, u64 *start, u64 *total, u32 *cou
raw_spin_lock_irqsave(lock, flags);
*total += ns;
(*count)++;
+ if (ns > *max)
+ *max = ns;
raw_spin_unlock_irqrestore(lock, flags);
}
}
@@ -122,7 +124,8 @@ void __delayacct_blkio_end(struct task_struct *p)
delayacct_end(&p->delays->lock,
&p->delays->blkio_start,
&p->delays->blkio_delay,
- &p->delays->blkio_count);
+ &p->delays->blkio_count,
+ &p->delays->blkio_delay_max);
}
int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
@@ -153,10 +156,11 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
d->cpu_count += t1;
+ d->cpu_delay_max = tsk->sched_info.max_run_delay;
tmp = (s64)d->cpu_delay_total + t2;
d->cpu_delay_total = (tmp < (s64)d->cpu_delay_total) ? 0 : tmp;
-
tmp = (s64)d->cpu_run_virtual_total + t3;
+
d->cpu_run_virtual_total =
(tmp < (s64)d->cpu_run_virtual_total) ? 0 : tmp;
@@ -164,20 +168,26 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
return 0;
/* zero XXX_total, non-zero XXX_count implies XXX stat overflowed */
-
raw_spin_lock_irqsave(&tsk->delays->lock, flags);
+ d->blkio_delay_max = tsk->delays->blkio_delay_max;
tmp = d->blkio_delay_total + tsk->delays->blkio_delay;
d->blkio_delay_total = (tmp < d->blkio_delay_total) ? 0 : tmp;
+ d->swapin_delay_max = tsk->delays->swapin_delay_max;
tmp = d->swapin_delay_total + tsk->delays->swapin_delay;
d->swapin_delay_total = (tmp < d->swapin_delay_total) ? 0 : tmp;
+ d->freepages_delay_max = tsk->delays->freepages_delay_max;
tmp = d->freepages_delay_total + tsk->delays->freepages_delay;
d->freepages_delay_total = (tmp < d->freepages_delay_total) ? 0 : tmp;
+ d->thrashing_delay_max = tsk->delays->thrashing_delay_max;
tmp = d->thrashing_delay_total + tsk->delays->thrashing_delay;
d->thrashing_delay_total = (tmp < d->thrashing_delay_total) ? 0 : tmp;
+ d->compact_delay_max = tsk->delays->compact_delay_max;
tmp = d->compact_delay_total + tsk->delays->compact_delay;
d->compact_delay_total = (tmp < d->compact_delay_total) ? 0 : tmp;
+ d->wpcopy_delay_max = tsk->delays->wpcopy_delay_max;
tmp = d->wpcopy_delay_total + tsk->delays->wpcopy_delay;
d->wpcopy_delay_total = (tmp < d->wpcopy_delay_total) ? 0 : tmp;
+ d->irq_delay_max = tsk->delays->irq_delay_max;
tmp = d->irq_delay_total + tsk->delays->irq_delay;
d->irq_delay_total = (tmp < d->irq_delay_total) ? 0 : tmp;
d->blkio_count += tsk->delays->blkio_count;
@@ -213,7 +223,8 @@ void __delayacct_freepages_end(void)
delayacct_end(¤t->delays->lock,
¤t->delays->freepages_start,
¤t->delays->freepages_delay,
- ¤t->delays->freepages_count);
+ ¤t->delays->freepages_count,
+ ¤t->delays->freepages_delay_max);
}
void __delayacct_thrashing_start(bool *in_thrashing)
@@ -235,7 +246,8 @@ void __delayacct_thrashing_end(bool *in_thrashing)
delayacct_end(¤t->delays->lock,
¤t->delays->thrashing_start,
¤t->delays->thrashing_delay,
- ¤t->delays->thrashing_count);
+ ¤t->delays->thrashing_count,
+ ¤t->delays->thrashing_delay_max);
}
void __delayacct_swapin_start(void)
@@ -248,7 +260,8 @@ void __delayacct_swapin_end(void)
delayacct_end(¤t->delays->lock,
¤t->delays->swapin_start,
¤t->delays->swapin_delay,
- ¤t->delays->swapin_count);
+ ¤t->delays->swapin_count,
+ ¤t->delays->swapin_delay_max);
}
void __delayacct_compact_start(void)
@@ -261,7 +274,8 @@ void __delayacct_compact_end(void)
delayacct_end(¤t->delays->lock,
¤t->delays->compact_start,
¤t->delays->compact_delay,
- ¤t->delays->compact_count);
+ ¤t->delays->compact_count,
+ ¤t->delays->compact_delay_max);
}
void __delayacct_wpcopy_start(void)
@@ -274,7 +288,8 @@ void __delayacct_wpcopy_end(void)
delayacct_end(¤t->delays->lock,
¤t->delays->wpcopy_start,
¤t->delays->wpcopy_delay,
- ¤t->delays->wpcopy_count);
+ ¤t->delays->wpcopy_count,
+ ¤t->delays->wpcopy_delay_max);
}
void __delayacct_irq(struct task_struct *task, u32 delta)
@@ -284,6 +299,8 @@ void __delayacct_irq(struct task_struct *task, u32 delta)
raw_spin_lock_irqsave(&task->delays->lock, flags);
task->delays->irq_delay += delta;
task->delays->irq_count++;
+ if (delta > task->delays->irq_delay_max)
+ task->delays->irq_delay_max = delta;
raw_spin_unlock_irqrestore(&task->delays->lock, flags);
}
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index 8ee0add5a48a..ed72435aef51 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -244,7 +244,8 @@ static inline void sched_info_dequeue(struct rq *rq, struct task_struct *t)
delta = rq_clock(rq) - t->sched_info.last_queued;
t->sched_info.last_queued = 0;
t->sched_info.run_delay += delta;
-
+ if (delta > t->sched_info.max_run_delay)
+ t->sched_info.max_run_delay = delta;
rq_sched_info_dequeue(rq, delta);
}
@@ -266,6 +267,8 @@ static void sched_info_arrive(struct rq *rq, struct task_struct *t)
t->sched_info.run_delay += delta;
t->sched_info.last_arrival = now;
t->sched_info.pcount++;
+ if (delta > t->sched_info.max_run_delay)
+ t->sched_info.max_run_delay = delta;
rq_sched_info_arrive(rq, delta);
}
diff --git a/tools/accounting/getdelays.c b/tools/accounting/getdelays.c
index 1334214546d7..e570bcad185d 100644
--- a/tools/accounting/getdelays.c
+++ b/tools/accounting/getdelays.c
@@ -192,60 +192,69 @@ static int get_family_id(int sd)
}
#define average_ms(t, c) (t / 1000000ULL / (c ? c : 1))
+#define delay_max_ms(t) (t / 1000000ULL)
static void print_delayacct(struct taskstats *t)
{
- printf("\n\nCPU %15s%15s%15s%15s%15s\n"
- " %15llu%15llu%15llu%15llu%15.3fms\n"
- "IO %15s%15s%15s\n"
- " %15llu%15llu%15.3fms\n"
- "SWAP %15s%15s%15s\n"
- " %15llu%15llu%15.3fms\n"
- "RECLAIM %12s%15s%15s\n"
- " %15llu%15llu%15.3fms\n"
- "THRASHING%12s%15s%15s\n"
- " %15llu%15llu%15.3fms\n"
- "COMPACT %12s%15s%15s\n"
- " %15llu%15llu%15.3fms\n"
- "WPCOPY %12s%15s%15s\n"
- " %15llu%15llu%15.3fms\n"
- "IRQ %15s%15s%15s\n"
- " %15llu%15llu%15.3fms\n",
+ printf("\n\nCPU %15s%15s%15s%15s%15s%15s\n"
+ " %15llu%15llu%15llu%15llu%15.3fms%13.6fms\n"
+ "IO %15s%15s%15s%15s\n"
+ " %15llu%15llu%15.3fms%13.6fms\n"
+ "SWAP %15s%15s%15s%15s\n"
+ " %15llu%15llu%15.3fms%13.6fms\n"
+ "RECLAIM %12s%15s%15s%15s\n"
+ " %15llu%15llu%15.3fms%13.6fms\n"
+ "THRASHING%12s%15s%15s%15s\n"
+ " %15llu%15llu%15.3fms%13.6fms\n"
+ "COMPACT %12s%15s%15s%15s\n"
+ " %15llu%15llu%15.3fms%13.6fms\n"
+ "WPCOPY %12s%15s%15s%15s\n"
+ " %15llu%15llu%15.3fms%13.6fms\n"
+ "IRQ %15s%15s%15s%15s\n"
+ " %15llu%15llu%15.3fms%13.6fms\n",
"count", "real total", "virtual total",
- "delay total", "delay average",
+ "delay total", "delay average", "delay max",
(unsigned long long)t->cpu_count,
(unsigned long long)t->cpu_run_real_total,
(unsigned long long)t->cpu_run_virtual_total,
(unsigned long long)t->cpu_delay_total,
average_ms((double)t->cpu_delay_total, t->cpu_count),
- "count", "delay total", "delay average",
+ delay_max_ms((double)t->cpu_delay_max),
+ "count", "delay total", "delay average", "delay max",
(unsigned long long)t->blkio_count,
(unsigned long long)t->blkio_delay_total,
average_ms((double)t->blkio_delay_total, t->blkio_count),
- "count", "delay total", "delay average",
+ delay_max_ms((double)t->blkio_delay_max),
+ "count", "delay total", "delay average", "delay max",
(unsigned long long)t->swapin_count,
(unsigned long long)t->swapin_delay_total,
average_ms((double)t->swapin_delay_total, t->swapin_count),
- "count", "delay total", "delay average",
+ delay_max_ms((double)t->swapin_delay_max),
+ "count", "delay total", "delay average", "delay max",
(unsigned long long)t->freepages_count,
(unsigned long long)t->freepages_delay_total,
average_ms((double)t->freepages_delay_total, t->freepages_count),
- "count", "delay total", "delay average",
+ delay_max_ms((double)t->freepages_delay_max),
+ "count", "delay total", "delay average", "delay max",
(unsigned long long)t->thrashing_count,
(unsigned long long)t->thrashing_delay_total,
average_ms((double)t->thrashing_delay_total, t->thrashing_count),
- "count", "delay total", "delay average",
+ delay_max_ms((double)t->thrashing_delay_max),
+ "count", "delay total", "delay average", "delay max",
(unsigned long long)t->compact_count,
(unsigned long long)t->compact_delay_total,
average_ms((double)t->compact_delay_total, t->compact_count),
- "count", "delay total", "delay average",
+ delay_max_ms((double)t->compact_delay_max),
+ "count", "delay total", "delay average", "delay max",
(unsigned long long)t->wpcopy_count,
(unsigned long long)t->wpcopy_delay_total,
average_ms((double)t->wpcopy_delay_total, t->wpcopy_count),
- "count", "delay total", "delay average",
+ delay_max_ms((double)t->wpcopy_delay_max),
+ "count", "delay total", "delay average", "delay max",
(unsigned long long)t->irq_count,
(unsigned long long)t->irq_delay_total,
- average_ms((double)t->irq_delay_total, t->irq_count));
+ average_ms((double)t->irq_delay_total, t->irq_count),
+ delay_max_ms((double)t->irq_delay_max));
}
static void task_context_switch_counts(struct taskstats *t)
--
2.25.1
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH linux next] delayacct: add delay max to record delay peak
2024-12-03 8:48 [PATCH linux next] delayacct: add delay max to record delay peak jiang.kun2
@ 2024-12-04 1:08 ` Andrew Morton
2024-12-04 1:43 ` yang.yang29
0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2024-12-04 1:08 UTC (permalink / raw)
To: jiang.kun2
Cc: bsingharora, david, linux-kernel, linux-mm, linux-doc,
wang.yong12, wang.yaxin, fan.yu9, he.peilin, tu.qiang35,
qiu.yutan, zhang.yunkai, ye.xingchen, xu.xin16, yang.yang29
On Tue, 3 Dec 2024 16:48:48 +0800 (CST) <jiang.kun2@zte.com.cn> wrote:
> From: Wang Yaxin <wang.yaxin@zte.com.cn>
>
> Introduce the use cases of delay max, which can help quickly detect
> potential abnormal delays in the system and record the types and
> specific details of delay spikes.
>
> Problem
> ========
> Delay accounting can track the average delay of processes to show
> system workload. However, when a process experiences a significant
> delay, maybe a delay spike, which adversely affects performance,
> getdelays can only display the average system delay over a period
> of time. Yet, average delay is unhelpful for diagnosing delay peak.
> It is not even possible to determine which type of delay has spiked,
> as this information might be masked by the average delay.
>
> Solution
> =========
> the 'delay max' can display delay peak since the system's startup,
> which can record potential abnormal delays over time, including
> the type of delay and the maximum delay. This is helpful for
> quickly identifying crash caused by delay.
Seems sensible, and the implementation is straightforward. I'll queue
it for testing and review.
> include/linux/delayacct.h | 7 ++++
> include/linux/sched.h | 3 ++
> include/uapi/linux/taskstats.h | 9 ++++++
> kernel/delayacct.c | 35 ++++++++++++++------
> kernel/sched/stats.h | 5 ++-
> tools/accounting/getdelays.c | 59 ++++++++++++++++++++--------------
Please prepare a v2 which includes an update to
Documentation/accounting/delay-accounting.rst.
>
> ...
>
> --- a/kernel/delayacct.c
> +++ b/kernel/delayacct.c
> @@ -95,7 +95,7 @@ void __delayacct_tsk_init(struct task_struct *tsk)
> * Finish delay accounting for a statistic using its timestamps (@start),
> * accumalator (@total) and @count
And I wouldn't be offended if v2 fixed that spelling error! "accumulator".
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH linux next] delayacct: add delay max to record delay peak
2024-12-04 1:08 ` Andrew Morton
@ 2024-12-04 1:43 ` yang.yang29
0 siblings, 0 replies; 4+ messages in thread
From: yang.yang29 @ 2024-12-04 1:43 UTC (permalink / raw)
To: jiang.kun2
Cc: bsingharora, david, linux-kernel, linux-mm, linux-doc,
wang.yong12, wang.yaxin, fan.yu9, he.peilin, tu.qiang35,
qiu.yutan, zhang.yunkai, ye.xingchen, xu.xin16, akpm,
Liu.Jianjun3, tan.hu, jiang.xuexin
On Tue, 3 Dec 2024 16:48:48 +0800 (CST) <jiang.kun2@zte.com.cn> wrote:
> From: Wang Yaxin <wang.yaxin@zte.com.cn>
> the 'delay max' can display delay peak since the system's startup
What about also add 'delay min' ? This could help us get to know how
large the difference is between min and max, provide clues for optimizing
potential. This is also some benchmark tools do, like stream which print:
precision of your system timer.
-------------------------------------------------------------
Function Best Rate MB/s Avg time Min time Max time
Copy: 234195.8 0.073357 0.073357 0.073357
Scale: 163856.6 0.104847 0.104847 0.104847
Add: 177791.5 0.144944 0.144944 0.144944
And please also update Documentation/accounting/delay-accounting.rst.
Other parts LGTM, thanks!
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH linux next] delayacct: add delay max to record delay peak
@ 2024-12-05 11:10 yaxin_wang
0 siblings, 0 replies; 4+ messages in thread
From: yaxin_wang @ 2024-12-05 11:10 UTC (permalink / raw)
To: yang.yang29
Cc: bsingharora, akpm, david, linux-kernel, linux-mm, linux-doc,
wang.yong12, wang.yaxin, fan.yu9, he.peilin, tu.qiang35,
qiu.yutan, zhang.yunkai, ye.xingchen, xu.xin16
>> From: Wang Yaxin <wang.yaxin@zte.com.cn>
>> the 'delay max' can display delay peak since the system's startup
>
>What about also add 'delay min' ? This could help us get to know how
>large the difference is between min and max, provide clues for optimizing
>potential. This is also some benchmark tools do, like stream which print:
>precision of your system timer.
I also think that 'delay min' would be useful. I will submit a new patch
to implement the 'delay min' later.
Yaxin
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-12-05 11:10 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-12-03 8:48 [PATCH linux next] delayacct: add delay max to record delay peak jiang.kun2
2024-12-04 1:08 ` Andrew Morton
2024-12-04 1:43 ` yang.yang29
2024-12-05 11:10 yaxin_wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox