On Sat, Jan 9, 2021 at 4:55 PM Linus Torvalds wrote:
>
> What part of "clear_refs is the _least_ important of the three cases"
> are you not willing to understand?

In fact, I couldn't even turn on that code with my normal config,
because it depends on CONFIG_CHECKPOINT_RESTORE that I didn't even
have enabled.

IOW, that code is some special-case stuff, and instead of messing up
the rest of the VM, it should be made to conform to all the normal VM
rules and requirements.

Here's two patches to basically start doing that.

The first one is the same one I already sent out earlier, fixing the
locking. And yes, it can be improved upon, but before improving on it,
let's _fix_ the code.

The second is a trivial "oh, look, I can see that the page is pinned,
soft-dirty cannot work so don't do it then". Again, it can be improved
upon, most particularly by doing the same (simple) tests for the
hugepage case too, which I didn't do.

Note: I have not a single actual user of this code that I can test
with, so this is all ENTIRELY untested.

IOW, I am in no way claiming that these patches are perfect and
correct, and the only way to do things. But what I _am_ claiming is
that this clear_refs code (and the UFFD code) is of secondary
importance, and instead of messing up the core VM, we should fix these
special cases to not do bad things. It really is that simple.

And no, I didn't make the UFFDIO_WRITEPROTECT code take the mmap_sem
for writing. For whoever wants to look at that, it's
mwriteprotect_range() in mm/userfaultfd.c and the fix is literally to
turn the read-lock (and unlock) into a write-lock (and unlock).

               Linus
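
For readers following along, the idea behind the second patch ("the page is pinned, soft-dirty cannot work so don't do it then") can be illustrated with a small userspace toy model. This is only a sketch and not the patch itself: `struct toy_pte`, `toy_page_maybe_pinned()`, `toy_clear_soft_dirty()` and `TOY_PIN_BIAS` are hypothetical names invented for the illustration. The threshold is an assumption, loosely modelled on the kernel's refcount-based "maybe pinned" heuristic.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy userspace model, NOT kernel code. All names here are hypothetical.
 * Assumption: a page counts as "maybe pinned" once its reference count
 * reaches a large bias value, loosely modelled on the kernel's heuristic.
 */
#define TOY_PIN_BIAS 1024u

struct toy_pte {
    bool writable;              /* mapping currently allows writes */
    bool soft_dirty;            /* soft-dirty tracking bit */
    unsigned int page_refcount; /* reference count of the backing page */
};

/* Heuristic check: may report false positives, like any "maybe" test. */
bool toy_page_maybe_pinned(const struct toy_pte *pte)
{
    assert(pte);
    return pte->page_refcount >= TOY_PIN_BIAS;
}

/*
 * Reset soft-dirty tracking for one entry by clearing the bit and
 * write-protecting the mapping, unless the entry is writable and the
 * page looks pinned. In that case the entry is left untouched.
 *
 * Returns true if the entry was reset, false if it was skipped.
 */
bool toy_clear_soft_dirty(struct toy_pte *pte)
{
    assert(pte);
    if (pte->writable && toy_page_maybe_pinned(pte))
        return false; /* pinned: soft-dirty cannot work, so don't try */

    pte->writable = false;
    pte->soft_dirty = false;
    return true;
}
```

In this model, an entry with `page_refcount` at or above `TOY_PIN_BIAS` keeps both its write permission and its soft-dirty bit, while an ordinary entry is write-protected so that the next write can be noticed.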