On Wed, Oct 26, 2016 at 9:32 AM, Linus Torvalds wrote:
>
> Quite frankly, I think the solution is to just rip out all the insane
> zone crap.

IOW, something like the attached.

Advantage:

 - just look at the number of garbage lines removed!

      21 insertions(+), 182 deletions(-)

 - it will actually speed up even the current case for all common
   situations: no idiotic extra indirections that will take extra cache
   misses

 - because the bit_wait_table array is now denser (256 entries is
   about 6kB of data on 64-bit with no spinlock debugging, so ~100
   cachelines), maybe it gets fewer cache misses too

 - we know how to handle the page_waitqueue contention issue, and it
   has nothing to do with the stupid NUMA zones

The only case you actually get real page wait activity is IO, and I
suspect that hashing it out over ~100 cachelines will be more than
sufficient to avoid excessive contention, plus it's a cache-miss vs an
IO, so nobody sane cares.

The only reason it did that insane per-zone thing in the first place is
that right now we access those wait-queues even when we damn well
shouldn't, and we have the solution for that.

Guys, holler if you hate this, but I think it's realistically the only
sane solution to the "wait queue on stack" issue.

Oh, and the patch is obviously entirely untested. I wouldn't want to
ruin my reputation by *testing* the patches I send out. What would be
the fun in that?

               Linus
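
For illustration only, and not taken from the attached patch: a minimal
userspace sketch of the idea described above, where every page hashes
into one shared 256-entry wait table instead of a per-zone table. All
names ending in _sketch, the struct layout, and the hash constant are
assumptions made for this example. The kernel's real wait queue head
and hash helpers may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 8 bits gives the 256 entries mentioned in the mail. */
#define WAIT_TABLE_BITS 8
#define WAIT_TABLE_SIZE (1u << WAIT_TABLE_BITS)

/*
 * Hypothetical stand-in for a wait queue head: a 4-byte lock plus a
 * two-pointer list head. On a typical LP64 target this pads to 24 bytes.
 */
struct wait_head_sketch {
    uint32_t lock;
    struct wait_head_sketch *next, *prev;
};

/* One global table shared by all pages, with no per-zone indirection. */
static struct wait_head_sketch wait_table_sketch[WAIT_TABLE_SIZE];

/*
 * Multiplicative hash: multiply by a 64-bit odd constant and keep the
 * top WAIT_TABLE_BITS bits. The constant is a common golden-ratio
 * value, chosen here as an assumption.
 */
static unsigned int hash_ptr_sketch(const void *ptr)
{
    uint64_t v = (uint64_t)(uintptr_t)ptr;
    return (unsigned int)((v * 0x9E3779B97F4A7C15ull) >> (64 - WAIT_TABLE_BITS));
}

/* Map a page address straight to its wait queue bucket. */
static struct wait_head_sketch *page_waitqueue_sketch(const void *page)
{
    return &wait_table_sketch[hash_ptr_sketch(page)];
}

/* Number of cachelines a table of the given geometry spans (rounded up). */
static size_t table_cachelines(size_t entries, size_t entry_bytes, size_t line_bytes)
{
    return (entries * entry_bytes + line_bytes - 1) / line_bytes;
}
```

Assuming a 24-byte entry and 64-byte cachelines, table_cachelines(256,
24, 64) covers 256 * 24 = 6144 bytes, i.e. 96 cachelines, which is in
line with the "about 6kB ... ~100 cachelines" estimate above.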