* Re: [PATCH 4/6] mm: move flush in madvise_free_pte_range()
[not found] <016001d1d36e$ef1db5a0$cd5920e0$@alibaba-inc.com>
@ 2016-07-01 8:30 ` Hillf Danton
0 siblings, 0 replies; 2+ messages in thread
From: Hillf Danton @ 2016-07-01 8:30 UTC (permalink / raw)
To: 'Dave Hansen', Dave Hansen; +Cc: Minchan Kim, linux-kernel, linux-mm
>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> I think this code is OK and does not *need* to be patched. We
> are just rewriting the PTE without the Accessed and Dirty bits.
> The hardware could come along and set them at any time with or
> without the erratum that this series addresses.
>
> But this does make the ptep_get_and_clear_full() and
> tlb_remove_tlb_entry() calls here more consistent with the other
> places they are used together and look *obviously* the same
> between call-sites.
>
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Minchan Kim <minchan@kernel.org>
> ---
>
> b/mm/madvise.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff -puN mm/madvise.c~knl-leak-40-madvise_free_pte_range-move-flush mm/madvise.c
> --- a/mm/madvise.c~knl-leak-40-madvise_free_pte_range-move-flush 2016-06-30 17:10:42.557246755 -0700
> +++ b/mm/madvise.c 2016-06-30 17:10:42.561246936 -0700
> @@ -369,13 +369,13 @@ static int madvise_free_pte_range(pmd_t
> */
> ptent = ptep_get_and_clear_full(mm, addr, pte,
> tlb->fullmm);
> + tlb_remove_tlb_entry(tlb, pte, addr);
>
Then the current comment has to be updated, no?
thanks
Hillf
> ptent = pte_mkold(ptent);
> ptent = pte_mkclean(ptent);
> set_pte_at(mm, addr, pte, ptent);
> if (PageActive(page))
> deactivate_page(page);
> - tlb_remove_tlb_entry(tlb, pte, addr);
> }
> }
> out:
> _
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH 0/6] [v3] Workaround for Xeon Phi PTE A/D bits erratum
@ 2016-07-01 0:12 Dave Hansen
2016-07-01 0:12 ` [PATCH 4/6] mm: move flush in madvise_free_pte_range() Dave Hansen
0 siblings, 1 reply; 2+ messages in thread
From: Dave Hansen @ 2016-07-01 0:12 UTC (permalink / raw)
To: linux-kernel
Cc: x86, linux-mm, torvalds, akpm, bp, ak, mhocko, Dave Hansen, minchan

The Intel(R) Xeon Phi(TM) Processor x200 Family (codename: Knights
Landing) has an erratum where a processor thread setting the Accessed
or Dirty bits may not do so atomically against its checks for the
Present bit. This may cause a thread (which is about to page fault)
to set A and/or D, even though the Present bit had already been
atomically cleared.

If the PTE is used for storing a swap index or a NUMA migration index,
the A bit could be misinterpreted as part of the swap type. The stray
bits being set cause a software-cleared PTE to be interpreted as a
swap entry. In some cases (like when the swap index ends up being
for a non-existent swapfile), the kernel detects the stray value
and WARN()s about it, but there is no guarantee that the kernel can
always detect it.
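
The misinterpretation described above can be illustrated with a toy
model. This is a minimal sketch, not kernel code: every toy_ name is
hypothetical, and the bit layout (Present in bit 0, Accessed in bit 5,
Dirty in bit 6, a 5-bit swap type starting at bit 1) is an assumption
chosen to resemble x86-64 rather than a copy of the kernel's macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed, simplified PTE bit layout (illustration only). */
#define TOY_PTE_PRESENT    (1ULL << 0)
#define TOY_PTE_ACCESSED   (1ULL << 5)
#define TOY_PTE_DIRTY      (1ULL << 6)

/* Assumed swap-type field: 5 bits starting just above the Present bit. */
#define TOY_SWP_TYPE_SHIFT 1
#define TOY_SWP_TYPE_BITS  5

/* A PTE that software cleared completely. */
static bool toy_pte_none(uint64_t pte)
{
	return pte == 0;
}

static bool toy_pte_present(uint64_t pte)
{
	return (pte & TOY_PTE_PRESENT) != 0;
}

/* Anything that is neither empty nor present is treated as a swap entry. */
static bool toy_is_swap_pte(uint64_t pte)
{
	return !toy_pte_none(pte) && !toy_pte_present(pte);
}

/* Decode the swap type from a non-present PTE. */
static unsigned int toy_swp_type(uint64_t pte)
{
	return (unsigned int)(pte >> TOY_SWP_TYPE_SHIFT) &
	       ((1U << TOY_SWP_TYPE_BITS) - 1);
}
```

Under these assumptions, a PTE that software cleared to zero and that
then picks up a stray Accessed bit is no longer "none", is not present,
and therefore decodes as a swap entry whose type has its top bit set.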

This patch causes the page unmap path in vmscan/direct reclaim to
flush remote TLBs after clearing each page, and also clears the PTE
again after the flush. For reclaim, this brings the behavior (and
associated reclaim performance) back to what it was before Mel's
changes that increased TLB flush batching.

For the unmap path, this patch may force some additional flushes, but
they are limited to a maximum of one per PTE page. This patch clears
these stray A/D bits before releasing the pagetable lock which
prevents other parts of the kernel from observing the stray bits.
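
The clear, flush, clear-again sequence described above can be sketched
as a small single-threaded simulation. This is only a toy under stated
assumptions: the toy_ names are hypothetical, the "remote CPU" is a
flag rather than real hardware, and the flush is modelled as simply
ending any access that was already in flight when the PTE was cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PTE_PRESENT   (1ULL << 0)
#define TOY_PTE_ACCESSED  (1ULL << 5)
#define TOY_PTE_DIRTY     (1ULL << 6)

/* Hypothetical remote CPU that began an access before the PTE was cleared. */
struct toy_remote_cpu {
	bool access_in_flight;
};

/* Model of the erratum: an in-flight access may set A/D on a cleared PTE. */
static void toy_remote_access(const struct toy_remote_cpu *cpu, uint64_t *pte)
{
	if (cpu->access_in_flight)
		*pte |= TOY_PTE_ACCESSED | TOY_PTE_DIRTY;
}

/* Read the old value and clear the PTE. */
static uint64_t toy_get_and_clear(uint64_t *pte)
{
	uint64_t old = *pte;

	*pte = 0;
	return old;
}

/* Assumption: after the flush, no earlier access can still be pending. */
static void toy_flush_remote_tlb(struct toy_remote_cpu *cpu)
{
	cpu->access_in_flight = false;
}

/* Unmap without the workaround: stray bits can survive in the PTE. */
static uint64_t toy_unmap_plain(uint64_t initial_pte)
{
	struct toy_remote_cpu cpu = { .access_in_flight = true };
	uint64_t pte = initial_pte;

	toy_get_and_clear(&pte);
	toy_remote_access(&cpu, &pte);	/* racing access leaves stray A/D */
	return pte;
}

/* Unmap with the workaround: clear, flush, then clear again. */
static uint64_t toy_unmap_with_workaround(uint64_t initial_pte)
{
	struct toy_remote_cpu cpu = { .access_in_flight = true };
	uint64_t pte = initial_pte;

	toy_get_and_clear(&pte);
	toy_remote_access(&cpu, &pte);	/* racing access leaves stray A/D */
	toy_flush_remote_tlb(&cpu);
	toy_get_and_clear(&pte);	/* second clear wipes the stray bits */
	toy_remote_access(&cpu, &pte);	/* nothing in flight: no effect */
	return pte;
}
```

In this model the plain unmap ends with the stray A/D bits still in
the PTE, while the workaround ends with a PTE of zero.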

Andi Kleen wrote the original version of this patch, and Dave Hansen
added the batching. The original version was much simpler but it
did too many extra TLB flushes which killed performance.

v3: huge rework to keep batching working in unmap case
v2: out of line. avoid single thread flush. cover more clear cases

Cc: Minchan Kim <minchan@kernel.org>
* [PATCH 4/6] mm: move flush in madvise_free_pte_range()
2016-07-01 0:12 [PATCH 0/6] [v3] Workaround for Xeon Phi PTE A/D bits erratum Dave Hansen
@ 2016-07-01 0:12 ` Dave Hansen
0 siblings, 0 replies; 2+ messages in thread
From: Dave Hansen @ 2016-07-01 0:12 UTC (permalink / raw)
To: linux-kernel
Cc: x86, linux-mm, torvalds, akpm, bp, ak, mhocko, Dave Hansen,
dave.hansen, minchan

From: Dave Hansen <dave.hansen@linux.intel.com>

I think this code is OK and does not *need* to be patched. We
are just rewriting the PTE without the Accessed and Dirty bits.
The hardware could come along and set them at any time with or
without the erratum that this series addresses.

But this does make the ptep_get_and_clear_full() and
tlb_remove_tlb_entry() calls here more consistent with the other
places they are used together and look *obviously* the same
between call-sites.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Minchan Kim <minchan@kernel.org>
---

 b/mm/madvise.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN mm/madvise.c~knl-leak-40-madvise_free_pte_range-move-flush mm/madvise.c
--- a/mm/madvise.c~knl-leak-40-madvise_free_pte_range-move-flush	2016-06-30 17:10:42.557246755 -0700
+++ b/mm/madvise.c	2016-06-30 17:10:42.561246936 -0700
@@ -369,13 +369,13 @@ static int madvise_free_pte_range(pmd_t
 			 */
 			ptent = ptep_get_and_clear_full(mm, addr, pte,
 							tlb->fullmm);
+			tlb_remove_tlb_entry(tlb, pte, addr);
 
 			ptent = pte_mkold(ptent);
 			ptent = pte_mkclean(ptent);
 			set_pte_at(mm, addr, pte, ptent);
 			if (PageActive(page))
 				deactivate_page(page);
-			tlb_remove_tlb_entry(tlb, pte, addr);
 		}
 	}
 out:
_
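
The ordering this patch settles on (clear the PTE, queue the TLB entry
for flushing right away, then write back an old and clean PTE) can be
sketched outside the kernel as follows. This is a hedged illustration
only: the toy_ types and helpers are hypothetical stand-ins for the
mmu_gather machinery, and the bit positions are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PTE_PRESENT   (1ULL << 0)
#define TOY_PTE_ACCESSED  (1ULL << 5)
#define TOY_PTE_DIRTY     (1ULL << 6)
#define TOY_MAX_BATCH     8

/* Hypothetical stand-in for a batch of addresses awaiting a TLB flush. */
struct toy_gather {
	unsigned long addrs[TOY_MAX_BATCH];
	int nr;
};

/* Queue one address for a later, batched flush. */
static void toy_remove_tlb_entry(struct toy_gather *tlb, unsigned long addr)
{
	if (tlb->nr < TOY_MAX_BATCH)
		tlb->addrs[tlb->nr++] = addr;
}

/* Clear the PTE, queue the flush immediately, then remap it old and clean. */
static void toy_madvise_free_one(uint64_t *pte, unsigned long addr,
				 struct toy_gather *tlb)
{
	uint64_t ptent = *pte;

	*pte = 0;				/* get-and-clear step */
	toy_remove_tlb_entry(tlb, addr);	/* flush queued next to the clear */

	ptent &= ~TOY_PTE_ACCESSED;		/* "mkold" step */
	ptent &= ~TOY_PTE_DIRTY;		/* "mkclean" step */
	*pte = ptent;				/* write the PTE back */
}

/* Demo helper: final PTE value after processing one young, dirty page. */
static uint64_t toy_demo_final_pte(void)
{
	struct toy_gather tlb = { .nr = 0 };
	uint64_t pte = TOY_PTE_PRESENT | TOY_PTE_ACCESSED | TOY_PTE_DIRTY;

	toy_madvise_free_one(&pte, 0x1000UL, &tlb);
	return pte;
}

/* Demo helper: number of addresses queued for flushing by the same call. */
static int toy_demo_queued(void)
{
	struct toy_gather tlb = { .nr = 0 };
	uint64_t pte = TOY_PTE_PRESENT | TOY_PTE_ACCESSED | TOY_PTE_DIRTY;

	toy_madvise_free_one(&pte, 0x1000UL, &tlb);
	return tlb.nr;
}
```

In this sketch the result does not depend on whether the queueing call
sits directly after the clear or at the end of the function, which
matches the changelog's point that the move is about consistency
between call-sites rather than a functional fix.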