* [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats
@ 2024-11-21 4:55 Tomohiro Misono
2024-11-26 3:09 ` Miaohe Lin
0 siblings, 1 reply; 10+ messages in thread
From: Tomohiro Misono @ 2024-11-21 4:55 UTC (permalink / raw)
To: Andrew Morton, Miaohe Lin, Naoya Horiguchi
Cc: jiaqiyan, misono.tomohiro, linux-mm, linux-kernel
Commit 44b8f8bf2438 ("mm: memory-failure: add memory failure stats
to sysfs") introduced per-NUMA-node memory error stats that break
down the HardwareCorrupted value of /proc/meminfo under
/sys/devices/system/node/nodeX/memory_failure.
However, HardwareCorrupted also counts soft-offlined pages. So, add a
soft-offline stat to mf_stats as well so that the per-node breakdown
is more accurate.
This updates total count as:
total = recovered + ignored + failed + delayed + soft_offline
Test example:
1) # grep HardwareCorrupted /proc/meminfo
HardwareCorrupted: 0 kB
2) soft-offline 1 page by madvise(MADV_SOFT_OFFLINE)
3) # grep HardwareCorrupted /proc/meminfo
HardwareCorrupted: 4 kB
# grep -r "" /sys/devices/system/node/node0/memory_failure
/sys/devices/system/node/node0/memory_failure/total:1
/sys/devices/system/node/node0/memory_failure/soft_offline:1
/sys/devices/system/node/node0/memory_failure/recovered:0
/sys/devices/system/node/node0/memory_failure/ignored:0
/sys/devices/system/node/node0/memory_failure/failed:0
/sys/devices/system/node/node0/memory_failure/delayed:0
Signed-off-by: Tomohiro Misono <misono.tomohiro@fujitsu.com>
---
Hello
This is an RFC because I'm not sure that adding SOFT_OFFLINE to enum
mf_result is the right approach. Also, would it be better to move
update_per_node_mf_stats() into num_poisoned_pages_inc()?
I omitted some cleanups and the sysfs doc update in this version to
highlight the changes. I'd appreciate any suggestions.
Regards,
Tomohiro Misono
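Step 2 of the test example can be reproduced from userspace. The
following is a minimal sketch and not part of the patch: the helper name
soft_offline_one_page() is made up for illustration, and it assumes a
kernel built with CONFIG_MEMORY_FAILURE and a caller with CAP_SYS_ADMIN.
Without those, madvise() is expected to fail and the helper returns a
negative errno value.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101	/* assumed fallback: asm-generic uapi value */
#endif

/*
 * Map one anonymous page, touch it so that it is backed by a physical
 * page, then ask the kernel to soft-offline it.
 * Returns 0 on success or a negative errno value on failure.
 */
static int soft_offline_one_page(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	void *addr;
	int ret = 0;

	assert(page_size > 0);

	addr = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return -errno;

	/* Fault the page in; an untouched mapping has no page to offline. */
	memset(addr, 0xa5, (size_t)page_size);

	if (madvise(addr, (size_t)page_size, MADV_SOFT_OFFLINE) != 0)
		ret = -errno;

	munmap(addr, (size_t)page_size);
	return ret;
}
```

When the call succeeds, HardwareCorrupted in /proc/meminfo should grow by
one page, as in step 3 of the test example.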
include/linux/mm.h | 2 ++
include/linux/mmzone.h | 4 +++-
mm/memory-failure.c | 9 +++++++++
3 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5d6cd523c7c0..7f93f6883760 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3991,6 +3991,8 @@ enum mf_result {
MF_FAILED, /* Error: handling failed */
MF_DELAYED, /* Will be handled later */
MF_RECOVERED, /* Successfully recovered */
+
+ MF_RES_SOFT_OFFLINE, /* Soft-offline */
};
enum mf_action_page_type {
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index b36124145a16..6a030610cba3 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1282,13 +1282,15 @@ struct memory_failure_stats {
/*
* Recovery results of poisoned raw pages handled by memory_failure,
* in sync with mf_result.
- * total = ignored + failed + delayed + recovered.
+ * total = ignored + failed + delayed + recovered + soft_offline.
* total * PAGE_SIZE * #nodes = /proc/meminfo/HardwareCorrupted.
*/
unsigned long ignored;
unsigned long failed;
unsigned long delayed;
unsigned long recovered;
+
+ unsigned long soft_offline;
};
#endif
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index a7b8ccd29b6f..02f845a222cc 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -109,6 +109,7 @@ MF_ATTR_RO(ignored);
MF_ATTR_RO(failed);
MF_ATTR_RO(delayed);
MF_ATTR_RO(recovered);
+MF_ATTR_RO(soft_offline);
static struct attribute *memory_failure_attr[] = {
&dev_attr_total.attr,
@@ -116,6 +117,7 @@ static struct attribute *memory_failure_attr[] = {
&dev_attr_failed.attr,
&dev_attr_delayed.attr,
&dev_attr_recovered.attr,
+ &dev_attr_soft_offline.attr,
NULL,
};
@@ -185,6 +187,9 @@ static int __page_handle_poison(struct page *page)
return ret;
}
+static void update_per_node_mf_stats(unsigned long pfn,
+ enum mf_result result);
+
static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
{
if (hugepage_or_freepage) {
@@ -208,6 +213,7 @@ static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, boo
put_page(page);
page_ref_inc(page);
num_poisoned_pages_inc(page_to_pfn(page));
+ update_per_node_mf_stats(page_to_pfn(page), MF_RES_SOFT_OFFLINE);
return true;
}
@@ -1314,6 +1320,9 @@ static void update_per_node_mf_stats(unsigned long pfn,
case MF_RECOVERED:
++mf_stats->recovered;
break;
+ case MF_RES_SOFT_OFFLINE:
+ ++mf_stats->soft_offline;
+ break;
default:
WARN_ONCE(1, "Memory failure: mf_result=%d is not properly handled", result);
break;
--
2.34.1
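
The accounting proposed by this patch can be modelled in userspace to
check the arithmetic. This is only a sketch: the model_* and MODEL_*
names are invented stand-ins for the kernel definitions, and rejecting
an unknown result without touching total is a choice made for the model.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for enum mf_result plus the value this RFC adds. */
enum model_mf_result {
	MODEL_MF_IGNORED,
	MODEL_MF_FAILED,
	MODEL_MF_DELAYED,
	MODEL_MF_RECOVERED,
	MODEL_MF_RES_SOFT_OFFLINE,
};

/* Userspace stand-in for struct memory_failure_stats. */
struct model_mf_stats {
	unsigned long total;
	unsigned long ignored;
	unsigned long failed;
	unsigned long delayed;
	unsigned long recovered;
	unsigned long soft_offline;
};

/* Bump the matching counter and total. Returns 0, or -1 if result is unknown. */
static int model_update_mf_stats(struct model_mf_stats *s,
				 enum model_mf_result result)
{
	assert(s != NULL);

	switch (result) {
	case MODEL_MF_IGNORED:
		++s->ignored;
		break;
	case MODEL_MF_FAILED:
		++s->failed;
		break;
	case MODEL_MF_DELAYED:
		++s->delayed;
		break;
	case MODEL_MF_RECOVERED:
		++s->recovered;
		break;
	case MODEL_MF_RES_SOFT_OFFLINE:
		++s->soft_offline;
		break;
	default:
		return -1;
	}
	++s->total;
	return 0;
}

/* Right-hand side of: total = recovered + ignored + failed + delayed + soft_offline */
static unsigned long model_sum(const struct model_mf_stats *s)
{
	assert(s != NULL);
	return s->recovered + s->ignored + s->failed + s->delayed +
	       s->soft_offline;
}
```

After any sequence of successful updates, total and model_sum() are
expected to agree, which is the invariant stated in the commit message.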
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats
2024-11-21 4:55 [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats Tomohiro Misono
@ 2024-11-26 3:09 ` Miaohe Lin
2024-11-27 2:32 ` Tomohiro Misono (Fujitsu)
0 siblings, 1 reply; 10+ messages in thread
From: Miaohe Lin @ 2024-11-26 3:09 UTC (permalink / raw)
To: Tomohiro Misono
Cc: jiaqiyan, linux-mm, linux-kernel, Andrew Morton, Naoya Horiguchi
On 2024/11/21 12:55, Tomohiro Misono wrote:
> commit 44b8f8bf2438 ("mm: memory-failure: add memory failure stats
Sorry for the late reply, I've been swamped recently.
> to sysfs") introduces per NUMA memory error stats which show
> breakdown of HardwareCorrupted of /proc/meminfo in
> /sys/devices/system/node/nodeX/memory_failure.
Thanks for your patch.
>
> However, HardwareCorrupted also counts soft-offline pages. So, add
> soft-offline stats in mf_stats too to represent more accurate status.
Adding soft-offline stats makes sense to me.
>
> This updates total count as:
> total = recovered + ignored + failed + delayed + soft_offline
>
> Test example:
> 1) # grep HardwareCorrupted /proc/meminfo
> HardwareCorrupted: 0 kB
> 2) soft-offline 1 page by madvise(MADV_SOFT_OFFLINE)
> 3) # grep HardwareCorrupted /proc/meminfo
> HardwareCorrupted: 4 kB
> # grep -r "" /sys/devices/system/node/node0/memory_failure
> /sys/devices/system/node/node0/memory_failure/total:1
> /sys/devices/system/node/node0/memory_failure/soft_offline:1
> /sys/devices/system/node/node0/memory_failure/recovered:0
> /sys/devices/system/node/node0/memory_failure/ignored:0
> /sys/devices/system/node/node0/memory_failure/failed:0
> /sys/devices/system/node/node0/memory_failure/delayed:0
>
> Signed-off-by: Tomohiro Misono <misono.tomohiro@fujitsu.com>
> ---
> Hello
>
> This is RFC because I'm not sure adding SOFT_OFFLINE in enum
> mf_result is a right approach. Also, maybe is it better to move
> update_per_node_mf_stats() into num_poisoned_pages_inc()?
>
> I omitted some cleanups and sysfs doc update in this version to
> highlight changes. I'd appreciate any suggestions.
>
> Regards,
> Tomohiro Misono
>
> include/linux/mm.h | 2 ++
> include/linux/mmzone.h | 4 +++-
> mm/memory-failure.c | 9 +++++++++
> 3 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 5d6cd523c7c0..7f93f6883760 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3991,6 +3991,8 @@ enum mf_result {
> MF_FAILED, /* Error: handling failed */
> MF_DELAYED, /* Will be handled later */
> MF_RECOVERED, /* Successfully recovered */
> +
> + MF_RES_SOFT_OFFLINE, /* Soft-offline */
It might not be a good idea to add MF_RES_SOFT_OFFLINE here. 'mf_result' is used to record
the result of the memory failure handler, so MF_RES_SOFT_OFFLINE does not fit in this enum.
Thanks.
.
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats
2024-11-26 3:09 ` Miaohe Lin
@ 2024-11-27 2:32 ` Tomohiro Misono (Fujitsu)
2024-11-27 7:06 ` Jiaqi Yan
0 siblings, 1 reply; 10+ messages in thread
From: Tomohiro Misono (Fujitsu) @ 2024-11-27 2:32 UTC (permalink / raw)
To: 'Miaohe Lin'
Cc: 'jiaqiyan@google.com', 'linux-mm@kvack.org',
'linux-kernel@vger.kernel.org', 'Andrew Morton',
'Naoya Horiguchi'
> On 2024/11/21 12:55, Tomohiro Misono wrote:
> > commit 44b8f8bf2438 ("mm: memory-failure: add memory failure stats
>
> Sorry for late, I've been swamped recently.
Hi,
Thanks for your comments.
>
> > to sysfs") introduces per NUMA memory error stats which show
> > breakdown of HardwareCorrupted of /proc/meminfo in
> > /sys/devices/system/node/nodeX/memory_failure.
>
> Thanks for your patch.
>
> >
> > However, HardwareCorrupted also counts soft-offline pages. So, add
> > soft-offline stats in mf_stats too to represent more accurate status.
>
> Adding soft-offline stats makes sense to me.
Thanks for confirming.
>
> >
> > This updates total count as:
> > total = recovered + ignored + failed + delayed + soft_offline
> >
> > Test example:
> > 1) # grep HardwareCorrupted /proc/meminfo
> > HardwareCorrupted: 0 kB
> > 2) soft-offline 1 page by madvise(MADV_SOFT_OFFLINE)
> > 3) # grep HardwareCorrupted /proc/meminfo
> > HardwareCorrupted: 4 kB
> > # grep -r "" /sys/devices/system/node/node0/memory_failure
> > /sys/devices/system/node/node0/memory_failure/total:1
> > /sys/devices/system/node/node0/memory_failure/soft_offline:1
> > /sys/devices/system/node/node0/memory_failure/recovered:0
> > /sys/devices/system/node/node0/memory_failure/ignored:0
> > /sys/devices/system/node/node0/memory_failure/failed:0
> > /sys/devices/system/node/node0/memory_failure/delayed:0
> >
> > Signed-off-by: Tomohiro Misono <misono.tomohiro@fujitsu.com>
> > ---
> > Hello
> >
> > This is RFC because I'm not sure adding SOFT_OFFLINE in enum
> > mf_result is a right approach. Also, maybe is it better to move
> > update_per_node_mf_stats() into num_poisoned_pages_inc()?
> >
> > I omitted some cleanups and sysfs doc update in this version to
> > highlight changes. I'd appreciate any suggestions.
> >
> > Regards,
> > Tomohiro Misono
> >
> > include/linux/mm.h | 2 ++
> > include/linux/mmzone.h | 4 +++-
> > mm/memory-failure.c | 9 +++++++++
> > 3 files changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 5d6cd523c7c0..7f93f6883760 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -3991,6 +3991,8 @@ enum mf_result {
> > MF_FAILED, /* Error: handling failed */
> > MF_DELAYED, /* Will be handled later */
> > MF_RECOVERED, /* Successfully recovered */
> > +
> > + MF_RES_SOFT_OFFLINE, /* Soft-offline */
>
> It might not be a good idea to add MF_RES_SOFT_OFFLINE here. 'mf_result' is used to record
> the result of memory failure handler. So it might be inappropriate to add MF_RES_SOFT_OFFLINE here.
Understood. As I don't see another suitable place to put the enum value, how about a change like the one below?
Or would you prefer adding another enum type instead?
```
static void update_per_node_mf_stats(unsigned long pfn,
- enum mf_result result)
+ enum mf_result result, bool is_soft_offline)
{
int nid = MAX_NUMNODES;
struct memory_failure_stats *mf_stats = NULL;
@@ -1299,6 +1299,12 @@ static void update_per_node_mf_stats(unsigned long pfn,
}
mf_stats = &NODE_DATA(nid)->mf_stats;
+ if (is_soft_offline) {
+ ++mf_stats->soft_offline;
+ ++mf_stats->total;
+ return;
+ }
+
switch (result) {
case MF_IGNORED:
++mf_stats->ignored;
```
Regards,
Tomohiro Misono
>
>
> Thanks.
> .
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats
2024-11-27 2:32 ` Tomohiro Misono (Fujitsu)
@ 2024-11-27 7:06 ` Jiaqi Yan
2024-11-28 5:46 ` Tomohiro Misono (Fujitsu)
0 siblings, 1 reply; 10+ messages in thread
From: Jiaqi Yan @ 2024-11-27 7:06 UTC (permalink / raw)
To: Tomohiro Misono (Fujitsu)
Cc: Miaohe Lin, linux-mm, linux-kernel, Andrew Morton, Naoya Horiguchi
On Tue, Nov 26, 2024 at 6:32 PM Tomohiro Misono (Fujitsu)
<misono.tomohiro@fujitsu.com> wrote:
>
> > On 2024/11/21 12:55, Tomohiro Misono wrote:
> > > commit 44b8f8bf2438 ("mm: memory-failure: add memory failure stats
> >
> > Sorry for late, I've been swamped recently.
>
> Hi,
> Thanks for your comments.
>
> >
> > > to sysfs") introduces per NUMA memory error stats which show
> > > breakdown of HardwareCorrupted of /proc/meminfo in
> > > /sys/devices/system/node/nodeX/memory_failure.
> >
> > Thanks for your patch.
> >
> > >
> > > However, HardwareCorrupted also counts soft-offline pages. So, add
> > > soft-offline stats in mf_stats too to represent more accurate status.
> >
> > Adding soft-offline stats makes sense to me.
>
> Thanks for confirming.
Agreed with Miaohe.
>
> >
> > >
> > > This updates total count as:
> > > total = recovered + ignored + failed + delayed + soft_offline
> > >
> > > Test example:
> > > 1) # grep HardwareCorrupted /proc/meminfo
> > > HardwareCorrupted: 0 kB
> > > 2) soft-offline 1 page by madvise(MADV_SOFT_OFFLINE)
> > > 3) # grep HardwareCorrupted /proc/meminfo
> > > HardwareCorrupted: 4 kB
> > > # grep -r "" /sys/devices/system/node/node0/memory_failure
> > > /sys/devices/system/node/node0/memory_failure/total:1
> > > /sys/devices/system/node/node0/memory_failure/soft_offline:1
> > > /sys/devices/system/node/node0/memory_failure/recovered:0
> > > /sys/devices/system/node/node0/memory_failure/ignored:0
> > > /sys/devices/system/node/node0/memory_failure/failed:0
> > > /sys/devices/system/node/node0/memory_failure/delayed:0
> > >
> > > Signed-off-by: Tomohiro Misono <misono.tomohiro@fujitsu.com>
> > > ---
> > > Hello
> > >
> > > This is RFC because I'm not sure adding SOFT_OFFLINE in enum
> > > mf_result is a right approach. Also, maybe is it better to move
> > > update_per_node_mf_stats() into num_poisoned_pages_inc()?
> > >
> > > I omitted some cleanups and sysfs doc update in this version to
> > > highlight changes. I'd appreciate any suggestions.
> > >
> > > Regards,
> > > Tomohiro Misono
> > >
> > > include/linux/mm.h | 2 ++
> > > include/linux/mmzone.h | 4 +++-
> > > mm/memory-failure.c | 9 +++++++++
> > > 3 files changed, 14 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > index 5d6cd523c7c0..7f93f6883760 100644
> > > --- a/include/linux/mm.h
> > > +++ b/include/linux/mm.h
> > > @@ -3991,6 +3991,8 @@ enum mf_result {
> > > MF_FAILED, /* Error: handling failed */
> > > MF_DELAYED, /* Will be handled later */
> > > MF_RECOVERED, /* Successfully recovered */
> > > +
> > > + MF_RES_SOFT_OFFLINE, /* Soft-offline */
> >
> > It might not be a good idea to add MF_RES_SOFT_OFFLINE here. 'mf_result' is used to record
> > the result of memory failure handler. So it might be inappropriate to add MF_RES_SOFT_OFFLINE here.
>
> Understood. As I don't see other suitable place to put ENUM value, how about changing like below?
> Or, do you prefer adding another ENUM type instead of this?
I think being soft-offlined is one of the outcomes of a successful
recovery, and the other is being hard-offlined. So how about making a
separate sub-enum for MF_RECOVERED? Something like:
enum mf_recovered_result {
MF_RECOVERED_SOFT_OFFLINE,
MF_RECOVERED_HARD_OFFLINE,
};
And
1. total = recovered + ignored + failed + delayed
2. recovered = soft_offline + hard_offline
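
The two equations above can be checked with a small userspace model.
This is a rough sketch only: enum mf_recovered_result is the one
suggested above, while struct mf_stats_sketch and the sketch_* helpers
are hypothetical names, not existing kernel code.

```c
#include <assert.h>
#include <stddef.h>

enum mf_recovered_result {
	MF_RECOVERED_SOFT_OFFLINE,
	MF_RECOVERED_HARD_OFFLINE,
};

/* Hypothetical per-node counters with recovered split into two parts. */
struct mf_stats_sketch {
	unsigned long total;
	unsigned long ignored;
	unsigned long failed;
	unsigned long delayed;
	unsigned long recovered;
	unsigned long soft_offline;
	unsigned long hard_offline;
};

/* Account one recovered page, recording how it was taken offline. */
static void sketch_account_recovered(struct mf_stats_sketch *s,
				     enum mf_recovered_result how)
{
	assert(s != NULL);

	if (how == MF_RECOVERED_SOFT_OFFLINE)
		++s->soft_offline;
	else
		++s->hard_offline;
	++s->recovered;
	++s->total;
}

/* Returns nonzero when both proposed equations hold. */
static int sketch_invariants_hold(const struct mf_stats_sketch *s)
{
	assert(s != NULL);
	return s->total == s->recovered + s->ignored + s->failed + s->delayed &&
	       s->recovered == s->soft_offline + s->hard_offline;
}
```

With this split, total keeps its current meaning and soft_offline only
refines recovered.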
>
> ```
> static void update_per_node_mf_stats(unsigned long pfn,
> - enum mf_result result)
> + enum mf_result result, bool is_soft_offline)
> {
> int nid = MAX_NUMNODES;
> struct memory_failure_stats *mf_stats = NULL;
> @@ -1299,6 +1299,12 @@ static void update_per_node_mf_stats(unsigned long pfn,
> }
>
> mf_stats = &NODE_DATA(nid)->mf_stats;
> + if (is_soft_offline) {
> + ++mf_stats->soft_offline;
> + ++mf_stats->total;
> + return;
> + }
> +
> switch (result) {
> case MF_IGNORED:
> ++mf_stats->ignored;
> ```
>
> Regards,
> Tomohiro Misono
>
> >
> >
> > Thanks.
> > .
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats
2024-11-27 7:06 ` Jiaqi Yan
@ 2024-11-28 5:46 ` Tomohiro Misono (Fujitsu)
2024-11-29 7:07 ` Miaohe Lin
0 siblings, 1 reply; 10+ messages in thread
From: Tomohiro Misono (Fujitsu) @ 2024-11-28 5:46 UTC (permalink / raw)
To: 'Jiaqi Yan'
Cc: Miaohe Lin, linux-mm, linux-kernel, Andrew Morton, Naoya Horiguchi
> > > On 2024/11/21 12:55, Tomohiro Misono wrote:
> > > > commit 44b8f8bf2438 ("mm: memory-failure: add memory failure stats
> > >
> > > Sorry for late, I've been swamped recently.
> >
> > Hi,
> > Thanks for your comments.
> >
> > >
> > > > to sysfs") introduces per NUMA memory error stats which show
> > > > breakdown of HardwareCorrupted of /proc/meminfo in
> > > > /sys/devices/system/node/nodeX/memory_failure.
> > >
> > > Thanks for your patch.
> > >
> > > >
> > > > However, HardwareCorrupted also counts soft-offline pages. So, add
> > > > soft-offline stats in mf_stats too to represent more accurate status.
> > >
> > > Adding soft-offline stats makes sense to me.
> >
> > Thanks for confirming.
>
> Agreed with Miaohe.
>
> >
> > >
> > > >
> > > > This updates total count as:
> > > > total = recovered + ignored + failed + delayed + soft_offline
> > > >
> > > > Test example:
> > > > 1) # grep HardwareCorrupted /proc/meminfo
> > > > HardwareCorrupted: 0 kB
> > > > 2) soft-offline 1 page by madvise(MADV_SOFT_OFFLINE)
> > > > 3) # grep HardwareCorrupted /proc/meminfo
> > > > HardwareCorrupted: 4 kB
> > > > # grep -r "" /sys/devices/system/node/node0/memory_failure
> > > > /sys/devices/system/node/node0/memory_failure/total:1
> > > > /sys/devices/system/node/node0/memory_failure/soft_offline:1
> > > > /sys/devices/system/node/node0/memory_failure/recovered:0
> > > > /sys/devices/system/node/node0/memory_failure/ignored:0
> > > > /sys/devices/system/node/node0/memory_failure/failed:0
> > > > /sys/devices/system/node/node0/memory_failure/delayed:0
> > > >
> > > > Signed-off-by: Tomohiro Misono <misono.tomohiro@fujitsu.com>
> > > > ---
> > > > Hello
> > > >
> > > > This is RFC because I'm not sure adding SOFT_OFFLINE in enum
> > > > mf_result is a right approach. Also, maybe is it better to move
> > > > update_per_node_mf_stats() into num_poisoned_pages_inc()?
> > > >
> > > > I omitted some cleanups and sysfs doc update in this version to
> > > > highlight changes. I'd appreciate any suggestions.
> > > >
> > > > Regards,
> > > > Tomohiro Misono
> > > >
> > > > include/linux/mm.h | 2 ++
> > > > include/linux/mmzone.h | 4 +++-
> > > > mm/memory-failure.c | 9 +++++++++
> > > > 3 files changed, 14 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > > index 5d6cd523c7c0..7f93f6883760 100644
> > > > --- a/include/linux/mm.h
> > > > +++ b/include/linux/mm.h
> > > > @@ -3991,6 +3991,8 @@ enum mf_result {
> > > > MF_FAILED, /* Error: handling failed */
> > > > MF_DELAYED, /* Will be handled later */
> > > > MF_RECOVERED, /* Successfully recovered */
> > > > +
> > > > + MF_RES_SOFT_OFFLINE, /* Soft-offline */
> > >
> > > It might not be a good idea to add MF_RES_SOFT_OFFLINE here. 'mf_result' is used to record
> > > the result of memory failure handler. So it might be inappropriate to add MF_RES_SOFT_OFFLINE here.
> >
> > Understood. As I don't see other suitable place to put ENUM value, how about changing like below?
> > Or, do you prefer adding another ENUM type instead of this?
>
> I think SOFT_OFFLINE-ed is one of the results of successfully
> recovered, and the other one is HARD_OFFLINE-ed. So how about make a
> separate sub-ENUM for MF_RECOVERED? Something like:
Thanks for the suggestion.
>
> enum mf_recovered_result {
> MF_RECOVERED_SOFT_OFFLINE,
> MF_RECOVERED_HARD_OFFLINE,
> };
Ok.
>
> And
> 1. total = recovered + ignored + failed + delayed
> 2. recovered = soft_offline + hard_offline
Do you mean mf_stats would now have 7 entries in sysfs?
(total, ignored, failed, delayed, recovered, hard_offline, soft_offline, with recovered = hard_offline + soft_offline)
Or 6 entries? (In that case, hard_offline = recovered - soft_offline.)
It might be simpler for users to understand if total is just the sum of the other entries, as in this RFC,
but I'd like to hear other opinions.
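
For the 6-entry case, userspace would have to derive hard_offline
itself. A minimal sketch of that step, assuming the six values have
already been read from sysfs. The struct and function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The six values a tool would read in the 6-entry layout. */
struct mf_sysfs_six {
	unsigned long total;
	unsigned long ignored;
	unsigned long failed;
	unsigned long delayed;
	unsigned long recovered;
	unsigned long soft_offline;
};

/*
 * hard_offline = recovered - soft_offline.
 * Returns 0 on success, or -1 if the snapshot is inconsistent
 * (soft_offline larger than recovered), so the subtraction never wraps.
 */
static int derive_hard_offline(const struct mf_sysfs_six *v,
			       unsigned long *hard_offline)
{
	assert(v != NULL && hard_offline != NULL);

	if (v->soft_offline > v->recovered)
		return -1;
	*hard_offline = v->recovered - v->soft_offline;
	return 0;
}
```

The consistency check matters because the files are read one by one, so
a snapshot taken while counters change may not add up.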
Regards,
Tomohiro Misono
>
> >
> > ```
> > static void update_per_node_mf_stats(unsigned long pfn,
> > - enum mf_result result)
> > + enum mf_result result, bool is_soft_offline)
> > {
> > int nid = MAX_NUMNODES;
> > struct memory_failure_stats *mf_stats = NULL;
> > @@ -1299,6 +1299,12 @@ static void update_per_node_mf_stats(unsigned long pfn,
> > }
> >
> > mf_stats = &NODE_DATA(nid)->mf_stats;
> > + if (is_soft_offline) {
> > + ++mf_stats->soft_offline;
> > + ++mf_stats->total;
> > + return;
> > + }
> > +
> > switch (result) {
> > case MF_IGNORED:
> > ++mf_stats->ignored;
> > ```
> >
> > Regards,
> > Tomohiro Misono
> >
> > >
> > >
> > > Thanks.
> > > .
> >
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats
2024-11-28 5:46 ` Tomohiro Misono (Fujitsu)
@ 2024-11-29 7:07 ` Miaohe Lin
2024-11-29 8:26 ` Tomohiro Misono (Fujitsu)
2024-12-07 0:17 ` jane.chu
0 siblings, 2 replies; 10+ messages in thread
From: Miaohe Lin @ 2024-11-29 7:07 UTC (permalink / raw)
To: Tomohiro Misono (Fujitsu), 'Jiaqi Yan'
Cc: linux-mm, linux-kernel, Andrew Morton, Naoya Horiguchi
On 2024/11/28 13:46, Tomohiro Misono (Fujitsu) wrote:
>>>> On 2024/11/21 12:55, Tomohiro Misono wrote:
>>>>> commit 44b8f8bf2438 ("mm: memory-failure: add memory failure stats
>>>>
>>>> Sorry for late, I've been swamped recently.
>>>
>>> Hi,
>>> Thanks for your comments.
>>>
>>>>
>>>>> to sysfs") introduces per NUMA memory error stats which show
>>>>> breakdown of HardwareCorrupted of /proc/meminfo in
>>>>> /sys/devices/system/node/nodeX/memory_failure.
>>>>
>>>> Thanks for your patch.
>>>>
>>>>>
>>>>> However, HardwareCorrupted also counts soft-offline pages. So, add
>>>>> soft-offline stats in mf_stats too to represent more accurate status.
>>>>
>>>> Adding soft-offline stats makes sense to me.
>>>
>>> Thanks for confirming.
>>
>> Agreed with Miaohe.
>>
>>>
>>>>
>>>>>
>>>>> This updates total count as:
>>>>> total = recovered + ignored + failed + delayed + soft_offline
>>>>>
>>>>> Test example:
>>>>> 1) # grep HardwareCorrupted /proc/meminfo
>>>>> HardwareCorrupted: 0 kB
>>>>> 2) soft-offline 1 page by madvise(MADV_SOFT_OFFLINE)
>>>>> 3) # grep HardwareCorrupted /proc/meminfo
>>>>> HardwareCorrupted: 4 kB
>>>>> # grep -r "" /sys/devices/system/node/node0/memory_failure
>>>>> /sys/devices/system/node/node0/memory_failure/total:1
>>>>> /sys/devices/system/node/node0/memory_failure/soft_offline:1
>>>>> /sys/devices/system/node/node0/memory_failure/recovered:0
>>>>> /sys/devices/system/node/node0/memory_failure/ignored:0
>>>>> /sys/devices/system/node/node0/memory_failure/failed:0
>>>>> /sys/devices/system/node/node0/memory_failure/delayed:0
>>>>>
>>>>> Signed-off-by: Tomohiro Misono <misono.tomohiro@fujitsu.com>
>>>>> ---
>>>>> Hello
>>>>>
>>>>> This is RFC because I'm not sure adding SOFT_OFFLINE in enum
>>>>> mf_result is a right approach. Also, maybe is it better to move
>>>>> update_per_node_mf_stats() into num_poisoned_pages_inc()?
>>>>>
>>>>> I omitted some cleanups and sysfs doc update in this version to
>>>>> highlight changes. I'd appreciate any suggestions.
>>>>>
>>>>> Regards,
>>>>> Tomohiro Misono
>>>>>
>>>>> include/linux/mm.h | 2 ++
>>>>> include/linux/mmzone.h | 4 +++-
>>>>> mm/memory-failure.c | 9 +++++++++
>>>>> 3 files changed, 14 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>>> index 5d6cd523c7c0..7f93f6883760 100644
>>>>> --- a/include/linux/mm.h
>>>>> +++ b/include/linux/mm.h
>>>>> @@ -3991,6 +3991,8 @@ enum mf_result {
>>>>> MF_FAILED, /* Error: handling failed */
>>>>> MF_DELAYED, /* Will be handled later */
>>>>> MF_RECOVERED, /* Successfully recovered */
>>>>> +
>>>>> + MF_RES_SOFT_OFFLINE, /* Soft-offline */
>>>>
>>>> It might not be a good idea to add MF_RES_SOFT_OFFLINE here. 'mf_result' is used to record
>>>> the result of memory failure handler. So it might be inappropriate to add MF_RES_SOFT_OFFLINE here.
>>>
>>> Understood. As I don't see other suitable place to put ENUM value, how about changing like below?
>>> Or, do you prefer adding another ENUM type instead of this?
>>
>> I think SOFT_OFFLINE-ed is one of the results of successfully
>> recovered, and the other one is HARD_OFFLINE-ed. So how about make a
>> separate sub-ENUM for MF_RECOVERED? Something like:
>
> Thanks for the suggestion.
>
>>
>> enum mf_recovered_result {
>> MF_RECOVERED_SOFT_OFFLINE,
>> MF_RECOVERED_HARD_OFFLINE,
>> };
>
> Ok.
>
>>
>> And
>> 1. total = recovered + ignored + failed + delayed
>> 2. recovered = soft_offline + hard_offline
>
> Do you mean mf_stats now have 7 entries in sysfs?
> (total, ignored, failed, delayed, recovered, hard_offline, soft_offline, then recovered = hard_offline + soft_offline)
> Or 6 entries ? (in that case, hard_offline = recovered - soft_offline)
> It might be simpler to understand for user if total is just the sum of other entries like this RFC,
> but I'd like to know other opinions.
Would it be better to have the items below?
"
total
ignored
failed
delayed
hard_offline
soft_offline
"
though this would break the existing interface.
Any thoughts?
Thanks.
.
^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats
2024-11-29 7:07 ` Miaohe Lin
@ 2024-11-29 8:26 ` Tomohiro Misono (Fujitsu)
2024-11-29 9:07 ` Miaohe Lin
2024-12-07 0:17 ` jane.chu
1 sibling, 1 reply; 10+ messages in thread
From: Tomohiro Misono (Fujitsu) @ 2024-11-29 8:26 UTC (permalink / raw)
To: 'Miaohe Lin', 'Jiaqi Yan'
Cc: linux-mm, linux-kernel, Andrew Morton, Naoya Horiguchi
> On 2024/11/28 13:46, Tomohiro Misono (Fujitsu) wrote:
> >>>> On 2024/11/21 12:55, Tomohiro Misono wrote:
> >>>>> commit 44b8f8bf2438 ("mm: memory-failure: add memory failure stats
> >>>>
> >>>> Sorry for late, I've been swamped recently.
> >>>
> >>> Hi,
> >>> Thanks for your comments.
> >>>
> >>>>
> >>>>> to sysfs") introduces per NUMA memory error stats which show
> >>>>> breakdown of HardwareCorrupted of /proc/meminfo in
> >>>>> /sys/devices/system/node/nodeX/memory_failure.
> >>>>
> >>>> Thanks for your patch.
> >>>>
> >>>>>
> >>>>> However, HardwareCorrupted also counts soft-offline pages. So, add
> >>>>> soft-offline stats in mf_stats too to represent more accurate status.
> >>>>
> >>>> Adding soft-offline stats makes sense to me.
> >>>
> >>> Thanks for confirming.
> >>
> >> Agreed with Miaohe.
> >>
> >>>
> >>>>
> >>>>>
> >>>>> This updates total count as:
> >>>>> total = recovered + ignored + failed + delayed + soft_offline
> >>>>>
> >>>>> Test example:
> >>>>> 1) # grep HardwareCorrupted /proc/meminfo
> >>>>> HardwareCorrupted: 0 kB
> >>>>> 2) soft-offline 1 page by madvise(MADV_SOFT_OFFLINE)
> >>>>> 3) # grep HardwareCorrupted /proc/meminfo
> >>>>> HardwareCorrupted: 4 kB
> >>>>> # grep -r "" /sys/devices/system/node/node0/memory_failure
> >>>>> /sys/devices/system/node/node0/memory_failure/total:1
> >>>>> /sys/devices/system/node/node0/memory_failure/soft_offline:1
> >>>>> /sys/devices/system/node/node0/memory_failure/recovered:0
> >>>>> /sys/devices/system/node/node0/memory_failure/ignored:0
> >>>>> /sys/devices/system/node/node0/memory_failure/failed:0
> >>>>> /sys/devices/system/node/node0/memory_failure/delayed:0
> >>>>>
> >>>>> Signed-off-by: Tomohiro Misono <misono.tomohiro@fujitsu.com>
> >>>>> ---
> >>>>> Hello
> >>>>>
> >>>>> This is RFC because I'm not sure adding SOFT_OFFLINE in enum
> >>>>> mf_result is a right approach. Also, maybe is it better to move
> >>>>> update_per_node_mf_stats() into num_poisoned_pages_inc()?
> >>>>>
> >>>>> I omitted some cleanups and sysfs doc update in this version to
> >>>>> highlight changes. I'd appreciate any suggestions.
> >>>>>
> >>>>> Regards,
> >>>>> Tomohiro Misono
> >>>>>
> >>>>> include/linux/mm.h | 2 ++
> >>>>> include/linux/mmzone.h | 4 +++-
> >>>>> mm/memory-failure.c | 9 +++++++++
> >>>>> 3 files changed, 14 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
> >>>>> index 5d6cd523c7c0..7f93f6883760 100644
> >>>>> --- a/include/linux/mm.h
> >>>>> +++ b/include/linux/mm.h
> >>>>> @@ -3991,6 +3991,8 @@ enum mf_result {
> >>>>> MF_FAILED, /* Error: handling failed */
> >>>>> MF_DELAYED, /* Will be handled later */
> >>>>> MF_RECOVERED, /* Successfully recovered */
> >>>>> +
> >>>>> + MF_RES_SOFT_OFFLINE, /* Soft-offline */
> >>>>
> >>>> It might not be a good idea to add MF_RES_SOFT_OFFLINE here. 'mf_result' is used to record
> >>>> the result of memory failure handler. So it might be inappropriate to add MF_RES_SOFT_OFFLINE here.
> >>>
> >>> Understood. As I don't see other suitable place to put ENUM value, how about changing like below?
> >>> Or, do you prefer adding another ENUM type instead of this?
> >>
> >> I think SOFT_OFFLINE-ed is one of the results of successfully
> >> recovered, and the other one is HARD_OFFLINE-ed. So how about make a
> >> separate sub-ENUM for MF_RECOVERED? Something like:
> >
> > Thanks for the suggestion.
> >
> >>
> >> enum mf_recovered_result {
> >> MF_RECOVERED_SOFT_OFFLINE,
> >> MF_RECOVERED_HARD_OFFLINE,
> >> };
> >
> > Ok.
> >
> >>
> >> And
> >> 1. total = recovered + ignored + failed + delayed
> >> 2. recovered = soft_offline + hard_offline
> >
> > Do you mean mf_stats now have 7 entries in sysfs?
> > (total, ignored, failed, delayed, recovered, hard_offline, soft_offline, then recovered = hard_offline + soft_offline)
> > Or 6 entries ? (in that case, hard_offline = recovered - soft_offline)
> > It might be simpler to understand for user if total is just the sum of other entries like this RFC,
> > but I'd like to know other opinions.
>
> Will it be better to have below items?
> "
> total
> ignored
> failed
> delayed
> hard_offline
> soft_offline
> "
>
> though this will break the previous interface.
> Any thoughts?
That would be great, but these files are under the stable ABI and
I don't think we can change them, right?
https://docs.kernel.org/admin-guide/abi-stable.html
"Userspace programs are free to use these interfaces with no restrictions, and backward
compatibility for them will be guaranteed for at least 2 years.
Most interfaces (like syscalls) are expected to never change and always be available."
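
Whichever layout is chosen, a userspace reader can be written to
tolerate a counter file that does not exist, for example soft_offline on
a kernel without this patch. A minimal sketch, assuming the hypothetical
helper names read_mf_counter() and read_mf_counter_or_zero(). Only the
sysfs directory itself comes from this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Read one counter file such as
 * /sys/devices/system/node/node0/memory_failure/total.
 * Returns 0 and stores the value on success, -ENOENT (or another
 * negative errno) if the file cannot be opened, -EINVAL if it does
 * not start with an unsigned number.
 */
static int read_mf_counter(const char *path, unsigned long *value)
{
	FILE *f;
	int ret = 0;

	assert(path != NULL && value != NULL);

	f = fopen(path, "r");
	if (!f)
		return -errno;
	if (fscanf(f, "%lu", value) != 1)
		ret = -EINVAL;
	fclose(f);
	return ret;
}

/* Treat a missing or unreadable counter as 0 so older kernels still work. */
static unsigned long read_mf_counter_or_zero(const char *path)
{
	unsigned long value = 0;

	if (read_mf_counter(path, &value) != 0)
		return 0;
	return value;
}
```

A tool built this way keeps working on both old and new kernels as long
as existing files are only added to, never renamed or removed.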
Regards,
Tomohiro Misono
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats
2024-11-29 8:26 ` Tomohiro Misono (Fujitsu)
@ 2024-11-29 9:07 ` Miaohe Lin
0 siblings, 0 replies; 10+ messages in thread
From: Miaohe Lin @ 2024-11-29 9:07 UTC (permalink / raw)
To: Tomohiro Misono (Fujitsu), 'Jiaqi Yan'
Cc: linux-mm, linux-kernel, Andrew Morton, Naoya Horiguchi
On 2024/11/29 16:26, Tomohiro Misono (Fujitsu) wrote:
>> On 2024/11/28 13:46, Tomohiro Misono (Fujitsu) wrote:
>>>>>> On 2024/11/21 12:55, Tomohiro Misono wrote:
>>>>>>> commit 44b8f8bf2438 ("mm: memory-failure: add memory failure stats
>>>>>>
>>>>>> Sorry for late, I've been swamped recently.
>>>>>
>>>>> Hi,
>>>>> Thanks for your comments.
>>>>>
>>>>>>
>>>>>>> to sysfs") introduces per NUMA memory error stats which show
>>>>>>> breakdown of HardwareCorrupted of /proc/meminfo in
>>>>>>> /sys/devices/system/node/nodeX/memory_failure.
>>>>>>
>>>>>> Thanks for your patch.
>>>>>>
>>>>>>>
>>>>>>> However, HardwareCorrupted also counts soft-offline pages. So, add
>>>>>>> soft-offline stats in mf_stats too to represent more accurate status.
>>>>>>
>>>>>> Adding soft-offline stats makes sense to me.
>>>>>
>>>>> Thanks for confirming.
>>>>
>>>> Agreed with Miaohe.
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> This updates total count as:
>>>>>>> total = recovered + ignored + failed + delayed + soft_offline
>>>>>>>
>>>>>>> Test example:
>>>>>>> 1) # grep HardwareCorrupted /proc/meminfo
>>>>>>> HardwareCorrupted: 0 kB
>>>>>>> 2) soft-offline 1 page by madvise(MADV_SOFT_OFFLINE)
>>>>>>> 3) # grep HardwareCorrupted /proc/meminfo
>>>>>>> HardwareCorrupted: 4 kB
>>>>>>> # grep -r "" /sys/devices/system/node/node0/memory_failure
>>>>>>> /sys/devices/system/node/node0/memory_failure/total:1
>>>>>>> /sys/devices/system/node/node0/memory_failure/soft_offline:1
>>>>>>> /sys/devices/system/node/node0/memory_failure/recovered:0
>>>>>>> /sys/devices/system/node/node0/memory_failure/ignored:0
>>>>>>> /sys/devices/system/node/node0/memory_failure/failed:0
>>>>>>> /sys/devices/system/node/node0/memory_failure/delayed:0
>>>>>>>
>>>>>>> Signed-off-by: Tomohiro Misono <misono.tomohiro@fujitsu.com>
>>>>>>> ---
>>>>>>> Hello
>>>>>>>
>>>>>>> This is RFC because I'm not sure adding SOFT_OFFLINE in enum
>>>>>>> mf_result is a right approach. Also, maybe is it better to move
>>>>>>> update_per_node_mf_stats() into num_poisoned_pages_inc()?
>>>>>>>
>>>>>>> I omitted some cleanups and sysfs doc update in this version to
>>>>>>> highlight changes. I'd appreciate any suggestions.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Tomohiro Misono
>>>>>>>
>>>>>>> include/linux/mm.h | 2 ++
>>>>>>> include/linux/mmzone.h | 4 +++-
>>>>>>> mm/memory-failure.c | 9 +++++++++
>>>>>>> 3 files changed, 14 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>>>>> index 5d6cd523c7c0..7f93f6883760 100644
>>>>>>> --- a/include/linux/mm.h
>>>>>>> +++ b/include/linux/mm.h
>>>>>>> @@ -3991,6 +3991,8 @@ enum mf_result {
>>>>>>> MF_FAILED, /* Error: handling failed */
>>>>>>> MF_DELAYED, /* Will be handled later */
>>>>>>> MF_RECOVERED, /* Successfully recovered */
>>>>>>> +
>>>>>>> + MF_RES_SOFT_OFFLINE, /* Soft-offline */
>>>>>>
>>>>>> It might not be a good idea to add MF_RES_SOFT_OFFLINE here. 'mf_result' is used to record
>>>>>> the result of memory failure handler. So it might be inappropriate to add MF_RES_SOFT_OFFLINE
>>>>>> here.
>>>>>
>>>>> Understood. As I don't see other suitable place to put ENUM value, how about changing like below?
>>>>> Or, do you prefer adding another ENUM type instead of this?
>>>>
>>>> I think SOFT_OFFLINE-ed is one of the results of successfully
>>>> recovered, and the other one is HARD_OFFLINE-ed. So how about make a
>>>> separate sub-ENUM for MF_RECOVERED? Something like:
>>>
>>> Thanks for the suggestion.
>>>
>>>>
>>>> enum mf_recovered_result {
>>>> MF_RECOVERED_SOFT_OFFLINE,
>>>> MF_RECOVERED_HARD_OFFLINE,
>>>> };
>>>
>>> Ok.
>>>
>>>>
>>>> And
>>>> 1. total = recovered + ignored + failed + delayed
>>>> 2. recovered = soft_offline + hard_offline
>>>
>>> Do you mean mf_stats now have 7 entries in sysfs?
>>> (total, ignored, failed, delayed, recovered, hard_offline, soft_offline, then recovered = hard_offline + soft_offline)
>>> Or 6 entries ? (in that case, hard_offline = recovered - soft_offline)
>>> It might be simpler to understand for user if total is just the sum of other entries like this RFC,
>>> but I'd like to know other opinions.
>>
>> Will it be better to have below items?
>> "
>> total
>> ignored
>> failed
>> delayed
>> hard_offline
>> soft_offline
>> "
>>
>> though this will break the previous interface.
>> Any thoughts?
>
> That would be great, but these files are under stable ABI and
> I don't think we can change them, right?
>
> https://docs.kernel.org/admin-guide/abi-stable.html
> Userspace programs are free to use these interfaces with no restrictions, and backward
> compatibility for them will be guaranteed for at least 2 years.
> Most interfaces (like syscalls) are expected to never change and always be available.
Thanks for your information. So we need to propose a better solution. Looking forward to hearing more suggestions.
Thanks.
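
For reference, the two accounting schemes discussed above can be compared with a small
user-space sketch. This is only an illustration under my reading of the thread: the struct
and function names are hypothetical and are not kernel code, and "hard_offline" here stands
for what the existing "recovered" file counts today. The point it shows is that both schemes
report the same total, and only the meaning of "recovered" changes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space mirror of the per-node counters (not the kernel's struct). */
struct mf_counts {
	unsigned long ignored;
	unsigned long failed;
	unsigned long delayed;
	unsigned long hard_offline;	/* what "recovered" counts today (assumption) */
	unsigned long soft_offline;
};

/* The two values whose meaning differs between the schemes. */
struct mf_view {
	unsigned long total;
	unsigned long recovered;
};

/* RFC as posted: total = recovered + ignored + failed + delayed + soft_offline */
struct mf_view view_rfc(const struct mf_counts *c)
{
	struct mf_view v;

	assert(c != NULL);
	v.recovered = c->hard_offline;
	v.total = v.recovered + c->ignored + c->failed + c->delayed + c->soft_offline;
	return v;
}

/* Nested variant: recovered = soft_offline + hard_offline,
 * total = recovered + ignored + failed + delayed */
struct mf_view view_nested(const struct mf_counts *c)
{
	struct mf_view v;

	assert(c != NULL);
	v.recovered = c->soft_offline + c->hard_offline;
	v.total = v.recovered + c->ignored + c->failed + c->delayed;
	return v;
}
```

With ignored=1, failed=2, delayed=0, hard_offline=3, soft_offline=1, both views give
total=7, but "recovered" reads 3 in the RFC scheme and 4 in the nested one, so an existing
reader of "recovered" would see a different number under the nested scheme.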
.
>
> Regards,
> Tomohiro Misono
>
* Re: [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats
2024-11-29 7:07 ` Miaohe Lin
2024-11-29 8:26 ` Tomohiro Misono (Fujitsu)
@ 2024-12-07 0:17 ` jane.chu
2024-12-10 8:46 ` Tomohiro Misono (Fujitsu)
1 sibling, 1 reply; 10+ messages in thread
From: jane.chu @ 2024-12-07 0:17 UTC (permalink / raw)
To: Miaohe Lin, Tomohiro Misono (Fujitsu), 'Jiaqi Yan'
Cc: linux-mm, linux-kernel, Andrew Morton, Naoya Horiguchi
>>> And
>>> 1. total = recovered + ignored + failed + delayed
>>> 2. recovered = soft_offline + hard_offline
>> Do you mean mf_stats now have 7 entries in sysfs?
>> (total, ignored, failed, delayed, recovered, hard_offline, soft_offline, then recovered = hard_offline + soft_offline)
>> Or 6 entries ? (in that case, hard_offline = recovered - soft_offline)
>> It might be simpler to understand for user if total is just the sum of other entries like this RFC,
>> but I'd like to know other opinions.
> Will it be better to have below items?
> "
> total
> ignored
> failed
> delayed
> hard_offline
> soft_offline
> "
The existing "ignored, failed, delayed, recovered" apply to UEs while
"soft_offline" applies to CE. The difference between UE and CE is that
even a recovered UE page has PG_hwpoison set, but a soft offlined page
does not and thus could be re-deployed.
So if we want to flag CE pages, they seem to belong to a different
category, something like -
/sys/devices/system/node/node0/memory_failure/Uncorrected/{ignored, delayed, failed, recovered}
/sys/devices/system/node/node0/memory_failure/Corrected/{offlined}
Thanks,
-jane
>
> though this will break the previous interface.
> Any thoughts?
>
> Thanks.
> .
>
* RE: [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats
2024-12-07 0:17 ` jane.chu
@ 2024-12-10 8:46 ` Tomohiro Misono (Fujitsu)
0 siblings, 0 replies; 10+ messages in thread
From: Tomohiro Misono (Fujitsu) @ 2024-12-10 8:46 UTC (permalink / raw)
To: 'jane.chu@oracle.com', 'Miaohe Lin', 'Jiaqi Yan'
Cc: 'linux-mm@kvack.org',
'linux-kernel@vger.kernel.org', 'Andrew Morton',
'Naoya Horiguchi'
> >>> And
> >>> 1. total = recovered + ignored + failed + delayed
> >>> 2. recovered = soft_offline + hard_offline
> >> Do you mean mf_stats now have 7 entries in sysfs?
> >> (total, ignored, failed, delayed, recovered, hard_offline, soft_offline, then recovered = hard_offline + soft_offline)
> >> Or 6 entries ? (in that case, hard_offline = recovered - soft_offline)
> >> It might be simpler to understand for user if total is just the sum of other entries like this RFC,
> >> but I'd like to know other opinions.
> > Will it be better to have below items?
> > "
> > total
> > ignored
> > failed
> > delayed
> > hard_offline
> > soft_offline
> > "
>
> The existing "ignored, failed, delayed, recovered" apply to UEs while
> "soft_offline" applies to CE. The difference between UE and CE is that
> even a recovered UE page has PG_hwpoison set, but a soft offlined page
> does not and thus could be re-deployed.
Hi, thanks for your comments.
If I understand correctly, PG_hwpoison is also set on a soft-offlined page (and the page is
therefore counted in HardwareCorrupted too):
https://github.com/torvalds/linux/blob/v6.13-rc2/mm/memory-failure.c#L206
Also, unpoisoning works, but it is only available via debugfs through the hwpoison-inject module.
Is this correct?
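
One way to check the HardwareCorrupted side of this from user space is to read the value
before and after soft-offlining a page. Below is a minimal sketch of the reading part only;
the function names are hypothetical, and it assumes the "HardwareCorrupted: N kB" line
format shown in the test example of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one meminfo line. Returns 1 and stores the kB value if the line is
 * the HardwareCorrupted line, 0 otherwise. */
int parse_hwcorrupted_kb(const char *line, unsigned long *kb)
{
	assert(line != NULL && kb != NULL);
	return sscanf(line, "HardwareCorrupted: %lu kB", kb) == 1;
}

/* Convert a kB value to a page count for a given page size in bytes. */
unsigned long kb_to_pages(unsigned long kb, unsigned long page_size_bytes)
{
	assert(page_size_bytes >= 1024);
	return kb / (page_size_bytes / 1024);
}

/* Scan a meminfo-style file (normally /proc/meminfo) for HardwareCorrupted.
 * Returns the value in kB, or -1 if the file cannot be opened or the line
 * is not present. */
long read_hwcorrupted_kb(const char *path)
{
	char line[256];
	unsigned long kb;
	long result = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (parse_hwcorrupted_kb(line, &kb)) {
			result = (long)kb;
			break;
		}
	}
	fclose(f);
	return result;
}
```

Calling read_hwcorrupted_kb("/proc/meminfo") before and after
madvise(MADV_SOFT_OFFLINE) on one page should show the value grow by one page worth of kB
(4 kB with 4 KiB pages, as in the test example of the patch).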
>
> So if we want to flag CE pages, they seem to belong to a different
> category, something like -
>
> /sys/devices/system/node/node0/memory_failure/Uncorrected/{ignored, delayed, failed, recovered}
> /sys/devices/system/node/node0/memory_failure/Corrected/{offlined}
This makes sense. But as I stated in my other reply, I don't think we can change the
current I/F for "Uncorrected". Would it be worth creating only the "Corrected" directory?
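
To make the "Corrected dir only" idea concrete, here is a rough user-space sketch of how
the paths would look. The existing flat files stay where they are, and only a new
subdirectory is added. The "Corrected/offlined" name is taken from the proposal quoted
above and does not exist in any kernel; the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MF_SYSFS_FMT "/sys/devices/system/node/node%d/memory_failure"

/* Files that exist today under the stable ABI. */
static const char *const existing_stats[] = {
	"total", "ignored", "failed", "delayed", "recovered",
};

/* Returns 1 if name is one of the existing per-node stat files. */
int mf_is_existing_stat(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(existing_stats) / sizeof(existing_stats[0]); i++)
		if (strcmp(name, existing_stats[i]) == 0)
			return 1;
	return 0;
}

/* Build the path of an existing stat file. Returns 0 on success, -1 if the
 * name is not an existing file or the buffer is too small. */
int mf_existing_path(char *buf, size_t len, int node, const char *name)
{
	int n;

	assert(buf != NULL);
	if (!mf_is_existing_stat(name))
		return -1;
	n = snprintf(buf, len, MF_SYSFS_FMT "/%s", node, name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Build a path under the proposed (hypothetical) "Corrected" directory.
 * Returns 0 on success, -1 if the buffer is too small. */
int mf_corrected_path(char *buf, size_t len, int node, const char *name)
{
	int n;

	assert(buf != NULL && name != NULL);
	n = snprintf(buf, len, MF_SYSFS_FMT "/Corrected/%s", node, name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With this layout, a tool that reads the existing files keeps working unchanged, and only
tools that know about the new directory would look at Corrected/offlined.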
Regards
Tomohiro Misono
Thread overview: 10+ messages
2024-11-21 4:55 [RFC PATCH] mm: memory-failure: add soft-offline stat in mf_stats Tomohiro Misono
2024-11-26 3:09 ` Miaohe Lin
2024-11-27 2:32 ` Tomohiro Misono (Fujitsu)
2024-11-27 7:06 ` Jiaqi Yan
2024-11-28 5:46 ` Tomohiro Misono (Fujitsu)
2024-11-29 7:07 ` Miaohe Lin
2024-11-29 8:26 ` Tomohiro Misono (Fujitsu)
2024-11-29 9:07 ` Miaohe Lin
2024-12-07 0:17 ` jane.chu
2024-12-10 8:46 ` Tomohiro Misono (Fujitsu)