* PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)
From: Kirill A. Shutemov @ 2015-04-28 22:15 UTC (permalink / raw)
To: Andy Lutomirski, Dave Hansen
Cc: Linus Torvalds, Andrew Morton, Mel Gorman, Rik van Riel,
linux-kernel, linux-mm, x86
On Tue, Apr 28, 2015 at 01:42:10PM -0700, Andy Lutomirski wrote:
> At some point, I'd like to implement PCID on x86 (if no one beats me
> to it, and this is a low priority for me), which will allow us to skip
> expensive TLB flushes while context switching. I have no idea whether
> ARM can do something similar.
I talked with Dave about implementing PCID and he thinks that it will be
a net loss. TLB entries will live longer and it means we would need to trigger
more IPIs to flush them out when we have to. Cost of IPIs will be higher
than benefit from hot TLB after context switch.
Do you have different expectations?
--
Kirill A. Shutemov
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)
From: Dave Hansen @ 2015-04-28 22:38 UTC
To: Kirill A. Shutemov, Andy Lutomirski
Cc: Linus Torvalds, Andrew Morton, Mel Gorman, Rik van Riel,
linux-kernel, linux-mm, x86

On 04/28/2015 03:15 PM, Kirill A. Shutemov wrote:
> On Tue, Apr 28, 2015 at 01:42:10PM -0700, Andy Lutomirski wrote:
>> At some point, I'd like to implement PCID on x86 (if no one beats me
>> to it, and this is a low priority for me), which will allow us to skip
>> expensive TLB flushes while context switching. I have no idea whether
>> ARM can do something similar.
>
> I talked with Dave about implementing PCID and he thinks that it will be
> a net loss. TLB entries will live longer and it means we would need to trigger
> more IPIs to flush them out when we have to. Cost of IPIs will be higher
> than benefit from hot TLB after context switch.
>
> Do you have different expectations?

Kirill, I think Andy is asking about something different than what you
and I talked about.

My point to you was that PCIDs cannot be used to replace, or in lieu
of, TLB shootdowns, because they *only* make TLB entries live longer.
Their entire purpose is to make things live longer and to reduce the
cost of the implicit TLB shootdowns that we do as a part of a context
switch.

I'm not sure if it will have a benefit overall. It depends on the
increase in shootdown cost vs. the decrease in TLB refill cost at
context switch.

I think someone hacked up some code to do it (maybe just internally to
Intel), so if anyone is seriously interested in implementing it, let me
know and I'll see if I can dig it up.

* Re: PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)
From: Rik van Riel @ 2015-04-28 22:41 UTC
To: Kirill A. Shutemov, Andy Lutomirski, Dave Hansen
Cc: Linus Torvalds, Andrew Morton, Mel Gorman, linux-kernel, linux-mm, x86

On 04/28/2015 06:15 PM, Kirill A. Shutemov wrote:
> I talked with Dave about implementing PCID and he thinks that it will be
> a net loss. TLB entries will live longer and it means we would need to trigger
> more IPIs to flush them out when we have to. Cost of IPIs will be higher
> than benefit from hot TLB after context switch.

I suspect that may depend on how you do the shootdown.

If, when receiving a TLB shootdown for a non-current PCID, we just flush
all the entries for that PCID and remove the CPU from the mm's
cpu_vm_mask_var, we will never receive more than one shootdown IPI for
a non-current mm, but we will still get the benefits of TLB longevity
when dealing with e.g. pipe workloads where tasks take turns running on
the same CPU.

--
All rights reversed
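
The shootdown handling Rik describes can be modelled in a few lines of userspace C. This is only a toy sketch under stated assumptions, not kernel code: `struct mm`, `struct cpu`, `switch_to()`, `handle_shootdown()` and `shootdown()` are hypothetical names, the `cpu_mask` field stands in for the mm's cpu_vm_mask_var, and the actual TLB flushes are represented by counters. The sender's own local flush is left out.

```c
#include <stdint.h>

#define NR_CPUS 4

/* Hypothetical stand-in for mm_struct: only the CPU mask matters here. */
struct mm {
    uint32_t cpu_mask;              /* bit n set: CPU n may hold TLB entries */
};

/* Hypothetical per-CPU bookkeeping; flushes are only counted. */
struct cpu {
    const struct mm *current_mm;    /* mm currently loaded on this CPU */
    unsigned ipis_received;
    unsigned full_pcid_flushes;     /* "flush everything for that PCID" */
    unsigned targeted_flushes;      /* ordinary flush for the running mm */
};

static struct cpu cpus[NR_CPUS];

/* Context switch: from now on this CPU may cache translations for mm. */
static void switch_to(int cpu, struct mm *mm)
{
    cpus[cpu].current_mm = mm;
    mm->cpu_mask |= 1u << cpu;
}

/* Receiving side of a shootdown IPI. */
static void handle_shootdown(int cpu, struct mm *mm)
{
    cpus[cpu].ipis_received++;
    if (cpus[cpu].current_mm == mm) {
        /* mm is running here: flush what was asked, stay in the mask. */
        cpus[cpu].targeted_flushes++;
    } else {
        /* Non-current PCID: drop all of its entries and leave the mask,
         * so no further IPIs arrive until mm runs here again. */
        cpus[cpu].full_pcid_flushes++;
        mm->cpu_mask &= ~(1u << cpu);
    }
}

/* Sending side: "IPI" every other CPU still in the mask; returns the count. */
static unsigned shootdown(struct mm *mm, int sender)
{
    uint32_t targets = mm->cpu_mask;    /* snapshot: handlers edit the mask */
    unsigned sent = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu == sender || !(targets & (1u << cpu)))
            continue;
        handle_shootdown(cpu, mm);
        sent++;
    }
    return sent;
}
```

In this model a CPU that has switched away from an mm costs at most one IPI per residency: the first shootdown empties its PCID and clears its mask bit, and later shootdowns skip it.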

* Re: PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)
From: Andy Lutomirski @ 2015-04-28 22:54 UTC
To: Rik van Riel
Cc: Kirill A. Shutemov, Dave Hansen, Linus Torvalds, Andrew Morton,
Mel Gorman, linux-kernel, linux-mm, X86 ML

On Tue, Apr 28, 2015 at 3:41 PM, Rik van Riel <riel@redhat.com> wrote:
> I suspect that may depend on how you do the shootdown.
>
> If, when receiving a TLB shootdown for a non-current PCID, we just flush
> all the entries for that PCID and remove the CPU from the mm's
> cpu_vm_mask_var, we will never receive more than one shootdown IPI for
> a non-current mm, but we will still get the benefits of TLB longevity
> when dealing with e.g. pipe workloads where tasks take turns running on
> the same CPU.

I had a totally different implementation idea in mind. It goes
something like this:

For each CPU, we allocate a fixed number of PCIDs, e.g. 0-7. We have
a per-cpu array of the mm [1] that owns each PCID. On context switch,
we look up the new mm in the array and, if there's a PCID mapped, we
switch cr3 and select that PCID. If there is no PCID mapped, we
choose one (LRU? clock replacement?), switch cr3 and select and
invalidate that PCID.

When it's time to invalidate a TLB entry on an mm that's active
remotely, we really don't want to send an IPI to a CPU that doesn't
actually have that mm active. Instead we bump some kind of generation
counter in the mm_struct that will cause the next switch to that mm
not to match the PCID list. To keep this working, I think we also
need to update the per-cpu PCID list with our generation counter
either when we context switch out or when we process a TLB shootdown
IPI.

This could be a bit tricky to get right, but I think it can be done
without adding more than a cacheline or two to the context switch
overhead and without any extra IPIs at all.

[1] It shouldn't be just an mm_struct pointer, because then we have to
invalidate it somehow when we recycle an mm_struct. Maybe we'd use
some kind of counter. We also need a TLB shootdown generation counter
of some sort as described.

--Andy
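
A rough userspace sketch of the per-CPU PCID table outlined above may help make the moving parts concrete. Everything named here is an assumption for illustration: `ctx_id` plays the role of the "some kind of counter" from footnote [1], `tlb_gen` is the shootdown generation counter, the victim is picked round-robin because the message leaves the replacement policy open, and `choose_pcid()` / `invalidate_inactive()` are hypothetical helpers. The cr3 write and the flush itself are reduced to a returned flag, and the IPI path for CPUs where the mm is currently running is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_PCIDS 8                  /* "a fixed number of PCIDs, e.g. 0-7" */

/* Hypothetical per-mm state. */
struct mm {
    uint64_t ctx_id;                /* unique and never reused; must be nonzero */
    uint64_t tlb_gen;               /* bumped instead of sending an IPI */
};

/* One per-CPU table entry: which mm owns this PCID, and how fresh it is. */
struct pcid_slot {
    uint64_t ctx_id;                /* 0 means the slot is free */
    uint64_t tlb_gen;               /* generation the cached entries match */
};

struct cpu_tlb_state {
    struct pcid_slot slots[NR_PCIDS];
    unsigned next_victim;           /* round-robin replacement cursor */
};

struct switch_result {
    unsigned pcid;                  /* PCID to load along with the new cr3 */
    bool need_flush;                /* true: invalidate that PCID first */
};

/* Context switch: find or assign a PCID for the incoming mm. */
static struct switch_result choose_pcid(struct cpu_tlb_state *cpu,
                                        const struct mm *next)
{
    struct switch_result r;

    for (unsigned i = 0; i < NR_PCIDS; i++) {
        struct pcid_slot *s = &cpu->slots[i];

        if (s->ctx_id != next->ctx_id)
            continue;
        /* Hit: the entries are usable only if no invalidation was missed. */
        r.pcid = i;
        r.need_flush = (s->tlb_gen != next->tlb_gen);
        s->tlb_gen = next->tlb_gen;
        return r;
    }

    /* Miss: recycle a slot; its old entries belong to another mm. */
    r.pcid = cpu->next_victim;
    r.need_flush = true;
    cpu->next_victim = (cpu->next_victim + 1) % NR_PCIDS;
    cpu->slots[r.pcid].ctx_id = next->ctx_id;
    cpu->slots[r.pcid].tlb_gen = next->tlb_gen;
    return r;
}

/* Invalidation for CPUs where mm is not running: a single write, no IPI. */
static void invalidate_inactive(struct mm *mm)
{
    mm->tlb_gen++;
}
```

The point of the generation comparison is that a stale PCID is detected lazily at the next switch to that mm, which is where the "without any extra IPIs" claim comes from.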

* Re: PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)
From: Rik van Riel @ 2015-04-28 22:56 UTC
To: Andy Lutomirski
Cc: Kirill A. Shutemov, Dave Hansen, Linus Torvalds, Andrew Morton,
Mel Gorman, linux-kernel, linux-mm, X86 ML

On 04/28/2015 06:54 PM, Andy Lutomirski wrote:
> When it's time to invalidate a TLB entry on an mm that's active
> remotely, we really don't want to send an IPI to a CPU that doesn't
> actually have that mm active. Instead we bump some kind of generation
> counter in the mm_struct that will cause the next switch to that mm
> not to match the PCID list. To keep this working, I think we also
> need to update the per-cpu PCID list with our generation counter
> either when we context switch out or when we process a TLB shootdown
> IPI.

If we do that, we can also get rid of TLB shootdowns for
idle CPUs in lazy TLB mode.

Very nice, if the details work out.

--
All rights reversed

* Re: PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)
From: Andy Lutomirski @ 2015-04-28 23:01 UTC
To: Rik van Riel
Cc: Kirill A. Shutemov, Dave Hansen, Linus Torvalds, Andrew Morton,
Mel Gorman, linux-kernel, linux-mm, X86 ML

On Tue, Apr 28, 2015 at 3:56 PM, Rik van Riel <riel@redhat.com> wrote:
> If we do that, we can also get rid of TLB shootdowns for
> idle CPUs in lazy TLB mode.
>
> Very nice, if the details work out.

I wonder if we could treat the non-PCID case just like the PCID case
but with only one PCID. Maybe get rid of the mm vs active_mm
distinction. Maybe not, though -- if nothing else, we still need to
kick our pgd out from idle or kthread CPUs before we free it.

The reason I thought of PCIDs this way is that 12 bits isn't nearly
enough to get away with allocating each mm its own PCID. Rather than
trying to shoehorn them in, it seemed like a better approach would be
to only use a very small number, since keeping around TLB entries that
are more than a few context switches old seems mostly useless.

--Andy

* Re: PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)
From: Linus Torvalds @ 2015-04-28 23:19 UTC
To: Andy Lutomirski
Cc: Rik van Riel, Kirill A. Shutemov, Dave Hansen, Andrew Morton,
Mel Gorman, linux-kernel, linux-mm, X86 ML

On Tue, Apr 28, 2015 at 4:01 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>
> The reason I thought of PCIDs this way is that 12 bits isn't nearly
> enough to get away with allocating each mm its own PCID.

Not even close. And really, we've already done this for other
architectures. On alpha, the number of bits in the pcid is
model-specific, but it was something like 6 for the ones I used.
That's plenty.

Also, I don't think Intel actually does 12 bits of pcid. What they do
is to hash the 12 bits down to something smaller (like two or three
bits in the actual TLB data structure), and then the CPU basically
invalidates any pcid's that alias (have a small 4- or 8-entry array
saying that "this hash was used for this 12-bit pcid").

So there's actually *another* level of dynamic mapping going on below
the software interface.

Linus

* Re: PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)
From: Linus Torvalds @ 2015-04-28 23:16 UTC
To: Andy Lutomirski
Cc: Rik van Riel, Kirill A. Shutemov, Dave Hansen, Andrew Morton,
Mel Gorman, linux-kernel, linux-mm, X86 ML

On Tue, Apr 28, 2015 at 3:54 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>
> I had a totally different implementation idea in mind. It goes
> something like this:
>
> For each CPU, we allocate a fixed number of PCIDs, e.g. 0-7. We have
> a per-cpu array of the mm [1] that owns each PCID. [...]

We've done this before on other architectures. See for example alpha.
Look up "__get_new_mm_context()" and friends. I think sparc does the
same (and I think sparc copied a lot of it from the alpha
implementation).

Iirc, the alpha version just generates a (per-cpu) asid one at a time,
and has a generation counter so that when you run out of ASID's you do
a global TLB invalidate on that CPU and start from 0 again. Actually,
I think the generation number is just the high bits of the asid
counter (alpha calls them "asn", intel calls them "pcid", and I tend
to prefer "asid", but it's all the same thing).

Then each thread just has a per-thread ASID. We don't try to make that
be per-thread and per-cpu, but instead just force a new allocation
when a thread moves to another CPU.

It's not obvious what alpha does, because we end up hiding the
per-thread ASN in the "struct pcb_struct" (in 'struct thread_info')
which is part of the alpha pal-code interface.

But it seemed to work and is fairly simple. I think something very
similar should work with intel pcid's.

Linus
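
The generation-in-the-high-bits allocation described above can be sketched as follows. This is a simplified userspace model written for illustration and is not the alpha source: `ASID_BITS`, `struct cpu_asid_state`, `struct mm_ctx` and `activate_mm()` are hypothetical names, the 6-bit width is taken from the "something like 6" remark elsewhere in the thread, and the full TLB invalidate is represented by a counter. Only one CPU is modelled; forcing a new allocation when a thread migrates is not shown.

```c
#define ASID_BITS          6
#define ASID_MASK          ((1UL << ASID_BITS) - 1)
/* Start in generation 1 so that a zeroed per-mm value never matches. */
#define ASID_FIRST_VERSION (1UL << ASID_BITS)

/* Hypothetical per-CPU allocator state. */
struct cpu_asid_state {
    unsigned long last_asid;    /* high bits: generation; low bits: last ASID */
    unsigned flush_all_count;   /* stands in for the full TLB invalidate */
};

/* Hypothetical per-mm (or per-thread) context. */
struct mm_ctx {
    unsigned long asid;         /* generation plus ASID; 0 = never allocated */
};

/* Returns the hardware ASID to use for mm on this CPU. */
static unsigned long activate_mm(struct cpu_asid_state *cpu, struct mm_ctx *mm)
{
    /* Different generation: the old ASID may have been handed to someone
     * else since, so allocate a fresh one. */
    if (((mm->asid ^ cpu->last_asid) & ~ASID_MASK) != 0) {
        unsigned long next = cpu->last_asid + 1;

        /* The low bits wrapped: the carry bumped the generation, so every
         * previously issued ASID is stale. Flush everything once. */
        if ((next & ASID_MASK) == 0)
            cpu->flush_all_count++;

        cpu->last_asid = next;
        mm->asid = next;
    }
    return mm->asid & ASID_MASK;
}
```

Because the generation lives in the same counter as the ASID, running out of ASIDs needs no separate bookkeeping: the increment carries into the generation bits and every older value stops matching.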

* Re: PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)
From: Andy Lutomirski @ 2015-04-28 23:23 UTC
To: Linus Torvalds
Cc: Rik van Riel, Kirill A. Shutemov, Dave Hansen, Andrew Morton,
Mel Gorman, linux-kernel, linux-mm, X86 ML

On Tue, Apr 28, 2015 at 4:16 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Iirc, the alpha version just generates a (per-cpu) asid one at a time,
> and has a generation counter so that when you run out of ASID's you do
> a global TLB invalidate on that CPU and start from 0 again. Actually,
> I think the generation number is just the high bits of the asid
> counter (alpha calls them "asn", intel calls them "pcid", and I tend
> to prefer "asid", but it's all the same thing).
>
> Then each thread just has a per-thread ASID. We don't try to make that
> be per-thread and per-cpu, but instead just force a new allocation
> when a thread moves to another CPU.

Alpha appears to have a per-thread per-cpu id of some sort:

    /* The alpha MMU context is one "unsigned long" bitmap per CPU */
    typedef unsigned long mm_context_t[NR_CPUS];

I think we can do it without that by keeping the mapping in reverse as
I sort of outlined -- for each cpu, store a mapping from mm to pcid.
When things fall out of the list, no big deal.

--Andy

* Re: PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)
From: Linus Torvalds @ 2015-04-28 23:38 UTC
To: Andy Lutomirski
Cc: Rik van Riel, Kirill A. Shutemov, Dave Hansen, Andrew Morton,
Mel Gorman, linux-kernel, linux-mm, X86 ML

On Tue, Apr 28, 2015 at 4:23 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>
> I think we can do it without that by keeping the mapping in reverse as
> I sort of outlined -- for each cpu, store a mapping from mm to pcid.
> When things fall out of the list, no big deal.

So if you do it by just having a per-cpu array of (say) 64 entries, you
now end up having to search that every time you do a task switch to
find the asid for the mm. And even then you've limited yourself to
just six bits, because doing the same for a possible full 12-bit asid
would not be possible.

It's actually much simpler if you just do it the other way.

But hey, maybe you do something clever and can figure out a good way
to do it. I'm just saying that we *have* done this before on other
architectures, and it has worked. I think ARM has another asid
implementation in arch/arm/mm/context.c.

I really think it would be a good idea to copy some existing case
rather than make up a new one. It's not like asid's are unusual. It's
arguably x86 that was unusual in _not_ having them.

Linus

* Re: PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)
From: Andy Lutomirski @ 2015-04-28 23:49 UTC
To: Linus Torvalds
Cc: Rik van Riel, Kirill A. Shutemov, Dave Hansen, Andrew Morton,
Mel Gorman, linux-kernel, linux-mm, X86 ML

On Tue, Apr 28, 2015 at 4:38 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Apr 28, 2015 at 4:23 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>
>> I think we can do it without that by keeping the mapping in reverse as
>> I sort of outlined -- for each cpu, store a mapping from mm to pcid.
>> When things fall out of the list, no big deal.
>
> So you do it by just having a per-cpu array of (say, 64 entries), you
> now end up having to search that every time you do a task switch to
> find the asid for the mm. And even then you've limited yourself to
> just six bits, because doing the same for a possible full 12-bit asid
> would not be possible.
>
> It's actually much simpler if you just do it the other way.

I'm unconvinced. I doubt that trying to keep more than 4-8 PCIDs
alive in a cpu's TLB is ever a win. After all, the TLB isn't that
big, and, if we're only the 7th most recent mm to have been loaded on
a cpu, I doubt any of our TLB entries are still likely to be there.

Given that, even if we need 16 bytes of generation counter and such in
the per-cpu array, that's at most 128 bytes. In practice, we really
ought to be able to get it down to closer to 8 bytes with some care or
we could only use 4 PCIDs, at which point the whole per-cpu structure
fits in a single cache line. We can search it with 4-8 branches and
no additional L1 misses. Sure, with 64 entries this would be
expensive, but I think that's excessive.

Also, this approach keeps the cost of blowing away stale PCIDs when we
need to invalidate a TLB entry on an inactive PCID down to a single
write as opposed to digging through the per-mm array to poke at the
state for each cpu it might be cached in. But maybe I missed some
trick that avoids needing to do that.

--Andy
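
The size arithmetic in this message can be checked mechanically. The sketch below is an illustration only: the two slot layouts (`wide_slot` at 16 bytes, `narrow_slot` at 8 bytes) and the `find_pcid()` helper are hypothetical, and the 64-byte cache line is an assumption about typical x86 parts rather than something stated in the thread.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64     /* assumption: common x86 L1 line size */

/* "16 bytes of generation counter and such" per entry. */
struct wide_slot {
    uint64_t ctx_id;            /* which mm owns this PCID */
    uint64_t tlb_gen;           /* generation its cached entries match */
};

/* A hypothetical packed variant, "closer to 8 bytes" per entry. */
struct narrow_slot {
    uint32_t ctx_id;
    uint32_t tlb_gen;
};

/* 8 PCIDs at 16 bytes each is the "at most 128 bytes" case. */
_Static_assert(8 * sizeof(struct wide_slot) == 128,
               "eight wide slots take 128 bytes");
/* Either shrinking the entries or using only 4 PCIDs fits one line. */
_Static_assert(4 * sizeof(struct wide_slot) <= CACHE_LINE_BYTES,
               "four wide slots fit in one cache line");
_Static_assert(8 * sizeof(struct narrow_slot) <= CACHE_LINE_BYTES,
               "eight narrow slots fit in one cache line");

/* Linear search, one compare per slot; returns the slot index (the PCID)
 * or -1 if this mm currently owns no PCID on this CPU. */
static int find_pcid(const struct wide_slot *slots, size_t nr, uint64_t ctx_id)
{
    for (size_t i = 0; i < nr; i++)
        if (slots[i].ctx_id == ctx_id)
            return (int)i;
    return -1;
}
```

With 4 to 8 slots the search touches at most two cache lines, which is the basis for the "4-8 branches and no additional L1 misses" estimate.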

* Re: PCID and TLB flushes (was: [GIT PULL] kdbus for 4.1-rc1)
From: Linus Torvalds @ 2015-04-28 22:56 UTC
To: Kirill A. Shutemov
Cc: Andy Lutomirski, Dave Hansen, Andrew Morton, Mel Gorman, Rik van Riel,
Linux Kernel Mailing List, linux-mm, the arch/x86 maintainers

On Tue, Apr 28, 2015 at 3:15 PM, Kirill A. Shutemov
<kirill@shutemov.name> wrote:
>
> I talked with Dave about implementing PCID and he thinks that it will be
> net loss.

So I'm told that Suresh Siddha actually had a patch inside Intel to
use PCID (back when he worked for Intel, I think he left), and that it
was a wash in their testing.

I never saw the patch, and it might be interesting to try it again,
but there is some reason to believe that it doesn't make much of a
difference.

Unlike most of the traditional RISC machines that got big speedups,
Intel TLB walking is so good that it likely isn't nearly as
noticeable, and it likely *does* result in more IPI's etc. Possibly
not a lot more, but if the win isn't big...

So I don't want to discourage you, because I'd love to see what the
patch looks like and if we can find cases where it matters, but I do
want to set expectations right. It's unlikely to be a big issue.

Linus