* Re: [PATCH v3] mm: release private data before split THP
2022-08-10 6:49 [PATCH v3] mm: release private data before split THP Yin Fengwei
@ 2022-08-10 17:09 ` Yang Shi
2022-08-11 7:26 ` Yin Fengwei
2022-08-11 1:54 ` Miaohe Lin
` (3 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Yang Shi @ 2022-08-10 17:09 UTC
To: Yin Fengwei
Cc: linux-mm, naoya.horiguchi, linmiaohe, willy, aaron.lu, tony.luck,
qiuxu.zhuo
On Tue, Aug 9, 2022 at 11:50 PM Yin Fengwei <fengwei.yin@intel.com> wrote:
>
> If there is private data attached to THP, the refcount of
> THP will be increased and block the THP split. Release
> private data attached to THP before split it to increase
> the chance of splitting THP successfully.
>
> There was a memory failure issue hit during HW error
> injection testing with 5.18 kernel + xfs as rootfs. Test
> got killed and system reboot was required to re-run the
> test.
>
> The issue was tracked down to THP split failure caused the
> memory failure not being handled. The page dump showed:
>
> [ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
> [ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
> [ 1785.452408] memcg:ff4247f2d28e9000
> [ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
> [ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
> [ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
> [ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000
>
> It was like the error was injected to a large folio for xfs
> with private data attached.
>
> With private data released before split THP, the test case
> could be run successfully many times without reboot system.
>
> Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> ---
> Changelog from v2:
> - Use safe gfp flags for different callsite of split_huge_page_to_list
> per Yang's comment.
> - Remove reviewed-by tag from Aaron which was only valid for RFC patch
> but keep it by mistake.
Reviewed-by: Yang Shi <shy828301@gmail.com>
>
> Changelog from v1:
> - Move private release to split_huge_page_to_list
> to cover wider path per Yang's comment.
> - Update to commit message.
>
> Changelog from RFC:
> - Use new folio API per Matthew Wilcox's suggestion.
> - Add one line comment before re-get folio of page per
> Miaohe's comment.
> - Remove RFC tag
> - Add Co-developed-by of Qiuxu who did a lot of debugging
> work to locate where the real issue is.
> mm/huge_memory.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8a7c1b344abe..ae8c4e209e58 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2627,6 +2627,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> mapping = NULL;
> anon_vma_lock_write(anon_vma);
> } else {
> + gfp_t gfp;
> +
> mapping = head->mapping;
>
> /* Truncated ? */
> @@ -2635,8 +2637,16 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> goto out;
> }
>
> - xas_split_alloc(&xas, head, compound_order(head),
> - mapping_gfp_mask(mapping) & GFP_RECLAIM_MASK);
> + gfp = current_gfp_context(mapping_gfp_mask(mapping) &
> + GFP_RECLAIM_MASK);
> +
> + if (folio_test_private(folio) &&
> + !filemap_release_folio(folio, gfp)) {
> + ret = -EBUSY;
> + goto out;
> + }
> +
> + xas_split_alloc(&xas, head, compound_order(head), gfp);
> if (xas_error(&xas)) {
> ret = xas_error(&xas);
> goto out;
>
> base-commit: d4252071b97d2027d246f6a82cbee4d52f618b47
> --
> 2.25.1
>
* Re: [PATCH v3] mm: release private data before split THP
2022-08-10 17:09 ` Yang Shi
@ 2022-08-11 7:26 ` Yin Fengwei
0 siblings, 0 replies; 9+ messages in thread
From: Yin Fengwei @ 2022-08-11 7:26 UTC
To: Yang Shi, Yin Fengwei
Cc: linux-mm, naoya.horiguchi, linmiaohe, willy, aaron.lu, tony.luck,
qiuxu.zhuo
On 2022/8/11 01:09, Yang Shi wrote:
> On Tue, Aug 9, 2022 at 11:50 PM Yin Fengwei <fengwei.yin@intel.com> wrote:
>>
>> If there is private data attached to THP, the refcount of
>> THP will be increased and block the THP split. Release
>> private data attached to THP before split it to increase
>> the chance of splitting THP successfully.
>>
>> There was a memory failure issue hit during HW error
>> injection testing with 5.18 kernel + xfs as rootfs. Test
>> got killed and system reboot was required to re-run the
>> test.
>>
>> The issue was tracked down to THP split failure caused the
>> memory failure not being handled. The page dump showed:
>>
>> [ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
>> [ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
>> [ 1785.452408] memcg:ff4247f2d28e9000
>> [ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
>> [ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
>> [ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
>> [ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000
>>
>> It was like the error was injected to a large folio for xfs
>> with private data attached.
>>
>> With private data released before split THP, the test case
>> could be run successfully many times without reboot system.
>>
>> Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
>> Suggested-by: Matthew Wilcox <willy@infradead.org>
>> ---
>> Changelog from v2:
>> - Use safe gfp flags for different callsite of split_huge_page_to_list
>> per Yang's comment.
>> - Remove reviewed-by tag from Aaron which was only valid for RFC patch
>> but keep it by mistake.
>
> Reviewed-by: Yang Shi <shy828301@gmail.com>
Thanks for your review.
Regards
Yin, Fengwei
>
>>
>> Changelog from v1:
>> - Move private release to split_huge_page_to_list
>> to cover wider path per Yang's comment.
>> - Update to commit message.
>>
>> Changelog from RFC:
>> - Use new folio API per Matthew Wilcox's suggestion.
>> - Add one line comment before re-get folio of page per
>> Miaohe's comment.
>> - Remove RFC tag
>> - Add Co-developed-by of Qiuxu who did a lot of debugging
>> work to locate where the real issue is.
>> mm/huge_memory.c | 14 ++++++++++++--
>> 1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 8a7c1b344abe..ae8c4e209e58 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -2627,6 +2627,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>> mapping = NULL;
>> anon_vma_lock_write(anon_vma);
>> } else {
>> + gfp_t gfp;
>> +
>> mapping = head->mapping;
>>
>> /* Truncated ? */
>> @@ -2635,8 +2637,16 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>> goto out;
>> }
>>
>> - xas_split_alloc(&xas, head, compound_order(head),
>> - mapping_gfp_mask(mapping) & GFP_RECLAIM_MASK);
>> + gfp = current_gfp_context(mapping_gfp_mask(mapping) &
>> + GFP_RECLAIM_MASK);
>> +
>> + if (folio_test_private(folio) &&
>> + !filemap_release_folio(folio, gfp)) {
>> + ret = -EBUSY;
>> + goto out;
>> + }
>> +
>> + xas_split_alloc(&xas, head, compound_order(head), gfp);
>> if (xas_error(&xas)) {
>> ret = xas_error(&xas);
>> goto out;
>>
>> base-commit: d4252071b97d2027d246f6a82cbee4d52f618b47
>> --
>> 2.25.1
>>
>
* Re: [PATCH v3] mm: release private data before split THP
2022-08-10 6:49 [PATCH v3] mm: release private data before split THP Yin Fengwei
2022-08-10 17:09 ` Yang Shi
@ 2022-08-11 1:54 ` Miaohe Lin
2022-08-11 7:27 ` Yin Fengwei
2022-08-11 8:18 ` Lu, Aaron
` (2 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Miaohe Lin @ 2022-08-11 1:54 UTC
To: Yin Fengwei
Cc: aaron.lu, tony.luck, qiuxu.zhuo, linux-mm, HORIGUCHI NAOYA,
Matthew Wilcox, Yang Shi
On 2022/8/10 14:49, Yin Fengwei wrote:
> If there is private data attached to THP, the refcount of
> THP will be increased and block the THP split. Release
> private data attached to THP before split it to increase
> the chance of splitting THP successfully.
>
> There was a memory failure issue hit during HW error
> injection testing with 5.18 kernel + xfs as rootfs. Test
> got killed and system reboot was required to re-run the
> test.
>
> The issue was tracked down to THP split failure caused the
> memory failure not being handled. The page dump showed:
>
> [ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
> [ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
> [ 1785.452408] memcg:ff4247f2d28e9000
> [ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
> [ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
> [ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
> [ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000
>
> It was like the error was injected to a large folio for xfs
> with private data attached.
>
> With private data released before split THP, the test case
> could be run successfully many times without reboot system.
>
> Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
Thanks for your work.
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
* Re: [PATCH v3] mm: release private data before split THP
2022-08-11 1:54 ` Miaohe Lin
@ 2022-08-11 7:27 ` Yin Fengwei
0 siblings, 0 replies; 9+ messages in thread
From: Yin Fengwei @ 2022-08-11 7:27 UTC
To: Miaohe Lin
Cc: aaron.lu, tony.luck, qiuxu.zhuo, linux-mm, HORIGUCHI NAOYA,
Matthew Wilcox, Yang Shi
On 2022/8/11 09:54, Miaohe Lin wrote:
> On 2022/8/10 14:49, Yin Fengwei wrote:
>> If there is private data attached to THP, the refcount of
>> THP will be increased and block the THP split. Release
>> private data attached to THP before split it to increase
>> the chance of splitting THP successfully.
>>
>> There was a memory failure issue hit during HW error
>> injection testing with 5.18 kernel + xfs as rootfs. Test
>> got killed and system reboot was required to re-run the
>> test.
>>
>> The issue was tracked down to THP split failure caused the
>> memory failure not being handled. The page dump showed:
>>
>> [ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
>> [ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
>> [ 1785.452408] memcg:ff4247f2d28e9000
>> [ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
>> [ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
>> [ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
>> [ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000
>>
>> It was like the error was injected to a large folio for xfs
>> with private data attached.
>>
>> With private data released before split THP, the test case
>> could be run successfully many times without reboot system.
>>
>> Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
>> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
>> Suggested-by: Matthew Wilcox <willy@infradead.org>
>
> Thanks for your work.
>
> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Thanks for your review.
Regards
Yin, Fengwei
>
* Re: [PATCH v3] mm: release private data before split THP
2022-08-10 6:49 [PATCH v3] mm: release private data before split THP Yin Fengwei
2022-08-10 17:09 ` Yang Shi
2022-08-11 1:54 ` Miaohe Lin
@ 2022-08-11 8:18 ` Lu, Aaron
2022-08-19 5:04 ` Yin, Fengwei
2022-08-20 0:45 ` Andrew Morton
4 siblings, 0 replies; 9+ messages in thread
From: Lu, Aaron @ 2022-08-11 8:18 UTC
To: linux-mm, linmiaohe, naoya.horiguchi, willy, shy828301, Yin, Fengwei
Cc: Luck, Tony, Zhuo, Qiuxu
On Wed, 2022-08-10 at 14:49 +0800, Yin Fengwei wrote:
> If there is private data attached to THP, the refcount of
> THP will be increased and block the THP split. Release
> private data attached to THP before split it to increase
> the chance of splitting THP successfully.
>
> There was a memory failure issue hit during HW error
> injection testing with 5.18 kernel + xfs as rootfs. Test
> got killed and system reboot was required to re-run the
> test.
>
> The issue was tracked down to THP split failure caused the
> memory failure not being handled. The page dump showed:
>
> [ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
> [ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
> [ 1785.452408] memcg:ff4247f2d28e9000
> [ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
> [ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
> [ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
> [ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000
>
> It was like the error was injected to a large folio for xfs
> with private data attached.
>
> With private data released before split THP, the test case
> could be run successfully many times without reboot system.
>
> Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Aaron Lu <aaron.lu@intel.com>
> ---
> Changelog from v2:
> - Use safe gfp flags for different callsite of split_huge_page_to_list
> per Yang's comment.
> - Remove reviewed-by tag from Aaron which was only valid for RFC patch
> but keep it by mistake.
>
> Changelog from v1:
> - Move private release to split_huge_page_to_list
> to cover wider path per Yang's comment.
> - Update to commit message.
>
> Changelog from RFC:
> - Use new folio API per Matthew Wilcox's suggestion.
> - Add one line comment before re-get folio of page per
> Miaohe's comment.
> - Remove RFC tag
> - Add Co-developed-by of Qiuxu who did a lot of debugging
> work to locate where the real issue is.
> mm/huge_memory.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8a7c1b344abe..ae8c4e209e58 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2627,6 +2627,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> mapping = NULL;
> anon_vma_lock_write(anon_vma);
> } else {
> + gfp_t gfp;
> +
> mapping = head->mapping;
>
> /* Truncated ? */
> @@ -2635,8 +2637,16 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> goto out;
> }
>
> - xas_split_alloc(&xas, head, compound_order(head),
> - mapping_gfp_mask(mapping) & GFP_RECLAIM_MASK);
> + gfp = current_gfp_context(mapping_gfp_mask(mapping) &
> + GFP_RECLAIM_MASK);
> +
> + if (folio_test_private(folio) &&
> + !filemap_release_folio(folio, gfp)) {
> + ret = -EBUSY;
> + goto out;
> + }
> +
> + xas_split_alloc(&xas, head, compound_order(head), gfp);
> if (xas_error(&xas)) {
> ret = xas_error(&xas);
> goto out;
>
> base-commit: d4252071b97d2027d246f6a82cbee4d52f618b47
* Re: [PATCH v3] mm: release private data before split THP
2022-08-10 6:49 [PATCH v3] mm: release private data before split THP Yin Fengwei
` (2 preceding siblings ...)
2022-08-11 8:18 ` Lu, Aaron
@ 2022-08-19 5:04 ` Yin, Fengwei
2022-08-20 0:45 ` Andrew Morton
4 siblings, 0 replies; 9+ messages in thread
From: Yin, Fengwei @ 2022-08-19 5:04 UTC
To: linux-mm, naoya.horiguchi, linmiaohe, willy, shy828301, Andrew Morton
Cc: aaron.lu, tony.luck, qiuxu.zhuo
Hi Andrew,
> On 8/10/2022 2:49 PM, Yin Fengwei wrote:
Sorry for the ping. I believe all review comments have been
addressed so far. Is there any other action I need to take to get
this patch merged? Thanks.
v1 is on:
https://lore.kernel.org/linux-mm/20220804025121.4001361-1-fengwei.yin@intel.com/
v2 is on:
https://lore.kernel.org/linux-mm/20220805062844.439152-1-fengwei.yin@intel.com/
v3 is on:
https://lore.kernel.org/linux-mm/20220810064907.582899-1-fengwei.yin@intel.com/
Regards
Yin, Fengwei
> If there is private data attached to THP, the refcount of
> THP will be increased and block the THP split. Release
> private data attached to THP before split it to increase
> the chance of splitting THP successfully.
>
> There was a memory failure issue hit during HW error
> injection testing with 5.18 kernel + xfs as rootfs. Test
> got killed and system reboot was required to re-run the
> test.
>
> The issue was tracked down to THP split failure caused the
> memory failure not being handled. The page dump showed:
>
> [ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
> [ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
> [ 1785.452408] memcg:ff4247f2d28e9000
> [ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
> [ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
> [ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
> [ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000
>
> It was like the error was injected to a large folio for xfs
> with private data attached.
>
> With private data released before split THP, the test case
> could be run successfully many times without reboot system.
>
> Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> ---
> Changelog from v2:
> - Use safe gfp flags for different callsite of split_huge_page_to_list
> per Yang's comment.
> - Remove reviewed-by tag from Aaron which was only valid for RFC patch
> but keep it by mistake.
>
> Changelog from v1:
> - Move private release to split_huge_page_to_list
> to cover wider path per Yang's comment.
> - Update to commit message.
>
> Changelog from RFC:
> - Use new folio API per Matthew Wilcox's suggestion.
> - Add one line comment before re-get folio of page per
> Miaohe's comment.
> - Remove RFC tag
> - Add Co-developed-by of Qiuxu who did a lot of debugging
> work to locate where the real issue is.
> mm/huge_memory.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8a7c1b344abe..ae8c4e209e58 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2627,6 +2627,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> mapping = NULL;
> anon_vma_lock_write(anon_vma);
> } else {
> + gfp_t gfp;
> +
> mapping = head->mapping;
>
> /* Truncated ? */
> @@ -2635,8 +2637,16 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> goto out;
> }
>
> - xas_split_alloc(&xas, head, compound_order(head),
> - mapping_gfp_mask(mapping) & GFP_RECLAIM_MASK);
> + gfp = current_gfp_context(mapping_gfp_mask(mapping) &
> + GFP_RECLAIM_MASK);
> +
> + if (folio_test_private(folio) &&
> + !filemap_release_folio(folio, gfp)) {
> + ret = -EBUSY;
> + goto out;
> + }
> +
> + xas_split_alloc(&xas, head, compound_order(head), gfp);
> if (xas_error(&xas)) {
> ret = xas_error(&xas);
> goto out;
>
> base-commit: d4252071b97d2027d246f6a82cbee4d52f618b47
* Re: [PATCH v3] mm: release private data before split THP
2022-08-10 6:49 [PATCH v3] mm: release private data before split THP Yin Fengwei
` (3 preceding siblings ...)
2022-08-19 5:04 ` Yin, Fengwei
@ 2022-08-20 0:45 ` Andrew Morton
2022-08-20 1:48 ` Yin, Fengwei
4 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2022-08-20 0:45 UTC
To: Yin Fengwei
Cc: linux-mm, naoya.horiguchi, linmiaohe, willy, shy828301, aaron.lu,
tony.luck, qiuxu.zhuo
On Wed, 10 Aug 2022 14:49:07 +0800 Yin Fengwei <fengwei.yin@intel.com> wrote:
> If there is private data attached to THP, the refcount of
> THP will be increased and block the THP split. Release
> private data attached to THP before split it to increase
> the chance of splitting THP successfully.
>
> There was a memory failure issue hit during HW error
> injection testing with 5.18 kernel + xfs as rootfs. Test
> got killed and system reboot was required to re-run the
> test.
>
> The issue was tracked down to THP split failure caused the
> memory failure not being handled. The page dump showed:
>
> [ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
> [ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
> [ 1785.452408] memcg:ff4247f2d28e9000
> [ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
> [ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
> [ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
> [ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000
>
> It was like the error was injected to a large folio for xfs
> with private data attached.
>
> With private data released before split THP, the test case
> could be run successfully many times without reboot system.
I did a bit of editorial work on the changelog. Please check. Note my
addition of "attempt to" to the second sentence.
: If there is private data attached to a THP, the refcount of THP will be
: increased and will prevent the THP from being split. Attempt to release
: any private data attached to the THP before attempting the split to
: increase the chance of splitting successfully.
:
: There was a memory failure issue hit during HW error injection testing
: with 5.18 kernel + xfs as rootfs. The test was killed and a system reboot
: was required to re-run the test.
:
: The issue was tracked down to a THP split failure caused by the memory
: failure not being handled. The page dump showed:
:
: [ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
: [ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
: [ 1785.452408] memcg:ff4247f2d28e9000
: [ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
: [ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
: [ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
: [ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000
:
: It was like the error was injected to a large folio for xfs with private
: data attached.
:
: With private data released before splitting the THP, the test case could
: be run successfully many times without rebooting the system.
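To make the numbers in the page dump concrete, here is a minimal userspace
sketch of the reference accounting. It assumes the split check works like
the kernel's can_split_folio() for a file-backed folio (one reference per
base page held by the page cache, plus one pin held by the caller), and
that attached private data holds one further reference. struct model_folio
and the model_* helpers are hypothetical names invented for this
illustration; none of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a page-cache folio; not the kernel struct. */
struct model_folio {
	unsigned int order;	/* log2 of the number of base pages */
	int refcount;		/* total references currently held */
	int mapcount;		/* number of page-table mappings */
	bool has_private;	/* filesystem private data attached */
};

static int model_nr_pages(const struct model_folio *f)
{
	return 1 << f->order;
}

/*
 * Assumed split condition for a file folio: every reference must be
 * explained by the page cache (one per base page), the caller's own pin,
 * and the mappings. Any reference beyond that blocks the split.
 */
static bool model_can_split(const struct model_folio *f)
{
	int extra_pins = model_nr_pages(f);

	return f->mapcount == f->refcount - extra_pins - 1;
}

/* Models a successful release: drop the private data and its reference. */
static void model_release_private(struct model_folio *f)
{
	if (!f->has_private)
		return;
	f->has_private = false;
	f->refcount--;
}

/* Walk through the values from the dump: order 4, refcount 18, unmapped. */
static void model_check_dump(void)
{
	struct model_folio f = {
		.order = 4, .refcount = 18, .mapcount = 0, .has_private = true,
	};

	assert(model_nr_pages(&f) == 16);
	assert(!model_can_split(&f));	/* 18 - 16 - 1 = 1, not 0 */
	model_release_private(&f);
	assert(f.refcount == 17);
	assert(model_can_split(&f));	/* 17 - 16 - 1 = 0 */
}
```

Under these assumptions an order-4 folio has 16 base pages, so an unmapped
folio is expected to hold 16 + 1 = 17 references at split time. The dump
shows 18, and dropping the private-data reference brings it back to 17.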
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
>
> ...
>
> @@ -2635,8 +2637,16 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> goto out;
> }
>
> - xas_split_alloc(&xas, head, compound_order(head),
> - mapping_gfp_mask(mapping) & GFP_RECLAIM_MASK);
> + gfp = current_gfp_context(mapping_gfp_mask(mapping) &
> + GFP_RECLAIM_MASK);
> +
> + if (folio_test_private(folio) &&
> + !filemap_release_folio(folio, gfp)) {
> + ret = -EBUSY;
> + goto out;
> + }
> +
> + xas_split_alloc(&xas, head, compound_order(head), gfp);
Because I assume we run into the same problem if
filemap_release_folio() fails?
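The ordering in the hunk above can be sketched in userspace C as follows.
This is only an illustration under stated assumptions: struct sketch_folio,
the release_refused flag and the sketch_* helpers are hypothetical, and the
refcount test is a simplified stand-in for the kernel's real check on an
unmapped file folio.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a file-backed folio; not the kernel struct. */
struct sketch_folio {
	int refcount;		/* total references held */
	int nr_pages;		/* base pages in the folio */
	bool has_private;	/* private data attached */
	bool release_refused;	/* assumption: the filesystem declines to release */
};

/* Stand-in for the release attempt: false means the data stays attached. */
static bool sketch_release(struct sketch_folio *f)
{
	if (f->release_refused)
		return false;
	f->has_private = false;
	f->refcount--;		/* the private data held one reference */
	return true;
}

/* Try to release private data first, then apply the simplified ref check. */
static int sketch_split(struct sketch_folio *f)
{
	if (f->has_private && !sketch_release(f))
		return -EBUSY;	/* bail out early, as the patch does */

	/* Expected: one ref per base page (page cache) plus the caller's pin. */
	if (f->refcount != f->nr_pages + 1)
		return -EBUSY;	/* an unexpected reference still blocks the split */

	return 0;		/* the real split would proceed from here */
}

/* Exercise both outcomes with the values from the page dump. */
static void sketch_demo(void)
{
	struct sketch_folio stuck = { 18, 16, true, true };
	struct sketch_folio freed = { 18, 16, true, false };

	assert(sketch_split(&stuck) == -EBUSY);	/* release refused */
	assert(sketch_split(&freed) == 0);	/* release worked, 17 refs left */
}
```

If the early return were removed, the folio with the refused release would
still be rejected by the refcount test, so in this model bailing out early
does not change the outcome.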
* Re: [PATCH v3] mm: release private data before split THP
2022-08-20 0:45 ` Andrew Morton
@ 2022-08-20 1:48 ` Yin, Fengwei
0 siblings, 0 replies; 9+ messages in thread
From: Yin, Fengwei @ 2022-08-20 1:48 UTC
To: Andrew Morton
Cc: linux-mm, naoya.horiguchi, linmiaohe, willy, shy828301, aaron.lu,
tony.luck, qiuxu.zhuo
Hi Andrew,
On 8/20/2022 8:45 AM, Andrew Morton wrote:
> On Wed, 10 Aug 2022 14:49:07 +0800 Yin Fengwei <fengwei.yin@intel.com> wrote:
>
>> If there is private data attached to THP, the refcount of
>> THP will be increased and block the THP split. Release
>> private data attached to THP before split it to increase
>> the chance of splitting THP successfully.
>>
>> There was a memory failure issue hit during HW error
>> injection testing with 5.18 kernel + xfs as rootfs. Test
>> got killed and system reboot was required to re-run the
>> test.
>>
>> The issue was tracked down to THP split failure caused the
>> memory failure not being handled. The page dump showed:
>>
>> [ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
>> [ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
>> [ 1785.452408] memcg:ff4247f2d28e9000
>> [ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
>> [ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
>> [ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
>> [ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000
>>
>> It was like the error was injected to a large folio for xfs
>> with private data attached.
>>
>> With private data released before split THP, the test case
>> could be run successfully many times without reboot system.
>
> I did a bit of editorial work on the changelog. Please check, Note my
> addition of "attempt to" to the second sentence.
Thanks a lot for the update. Looks good to me.
>
> : If there is private data attached to a THP, the refcount of THP will be
> : increased and will prevent the THP from being split. Attempt to release
> : any private data attached to the THP before attempting the split to
> : increase the chance of splitting successfully.
> :
> : There was a memory failure issue hit during HW error injection testing
> : with 5.18 kernel + xfs as rootfs. The test was killed and a system reboot
> : was required to re-run the test.
> :
> : The issue was tracked down to a THP split failure caused by the memory
> : failure not being handled. The page dump showed:
> :
> : [ 1785.433075] page:0000000025f9530b refcount:18 mapcount:0 mapping:000000008162eea7 index:0xa10 pfn:0x2f0200
> : [ 1785.443954] head:0000000025f9530b order:4 compound_mapcount:0 compound_pincount:0
> : [ 1785.452408] memcg:ff4247f2d28e9000
> : [ 1785.456304] aops:xfs_address_space_operations ino:8555182 dentry name:"baseos-filenames.solvx"
> : [ 1785.466612] flags: 0x1000000000012036(referenced|uptodate|lru|active|private|head|node=0|zone=2)
> : [ 1785.476514] raw: 1000000000012036 ffb9460f8bc07c08 ffb9460f8bc08408 ff4247f22e6299f8
> : [ 1785.485268] raw: 0000000000000a10 ff4247f194ade900 00000012ffffffff ff4247f2d28e9000
> :
> : It was like the error was injected to a large folio for xfs with private
> : data attached.
> :
> : With private data released before splitting the THP, the test case could
> : be run successfully many times without rebooting the system.
>
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>>
>> ...
>>
>> @@ -2635,8 +2637,16 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>> goto out;
>> }
>>
>> - xas_split_alloc(&xas, head, compound_order(head),
>> - mapping_gfp_mask(mapping) & GFP_RECLAIM_MASK);
>> + gfp = current_gfp_context(mapping_gfp_mask(mapping) &
>> + GFP_RECLAIM_MASK);
>> +
>> + if (folio_test_private(folio) &&
>> + !filemap_release_folio(folio, gfp)) {
>> + ret = -EBUSY;
>> + goto out;
>> + }
>> +
>> + xas_split_alloc(&xas, head, compound_order(head), gfp);
>
> Because I assume we run into the same problem if
> filemap_release_folio() fails?
Yes. You are right. Thanks.
Regards
Yin, Fengwei