On Tue, 14 Oct 2025, Kalesh Singh wrote: > On Mon, Oct 13, 2025 at 11:28 PM Hugh Dickins wrote: > > > > Sorry for letting you go so far before speaking up (I had to test what > > I believed to be true, and had hoped that meanwhile one of your many > > illustrious reviewers would say so first, but no): it's a NAK from me. > > > > These are not off-by-ones: at the point of these checks, it is not > > known whether an additional map/vma will have to be added, or the > > addition will be merged into an existing map/vma. So the checks > > err on the lenient side, letting you get perhaps one more than the > > sysctl said, but not allowing any more than that. > > > > Which is all that matters, isn't it? Limiting unrestrained growth. > > > > In this patch you're proposing to change it from erring on the > > lenient side to erring on the strict side - prohibiting merges > > at the limit which have been allowed for many years. > > > > Whatever one thinks about the merits of erring on the lenient versus > > erring on the strict side, I see no reason to make this change now, > > and most certainly not with a Fixes Cc: stable. There is no danger > > in the current behaviour; there is danger in prohibiting what was > > allowed before. > > > > As to the remainder of your series: I have to commend you for doing > > a thorough and well-presented job, but I cannot myself see the point in > > changing 21 files for what almost amounts to a max_map_count subsystem. > > I call it misdirected effort, not at all to my taste, which prefers the > > straightforward checks already there; but accept that my taste may be > > out of fashion, so won't stand in the way if others think it worthwhile. > > Hi Hugh, > > Thanks for the detailed review and for taking the time to test the behavior. > > You've raised a valid point. I wasn't aware of the history behind the > lenient check for merges. The lack of a comment, like the one that > exists for exceeding the limit in munmap(), led me to misinterpret > this as an off-by-one bug. The convention makes sense if we consider > potential merges. Yes, a comment there would be helpful (and I doubt it's worth more than adding a comment); but I did not understand at all, Liam's suggestion for the comment "to state that the count may not change". > > If it was in-fact the intended behavior, then I agree we should keep > it lenient. It would mean though, that munmap() being able to free a > VMA if a split is required (by permitting exceeding the limit by 1) > would not work in the case where we have already exceeded the limit. I > find this to be inconsistent but this is also the current behavior ... You're saying that once we go one over the limit, say with a new mmap, an munmap check makes it impossible to munmap that or any other vma? If that's so, I do agree with you, that's nasty, and I would hate any new code to behave that way. In code that's survived as long as this without troubling anyone, I'm not so sure: but if it's easily fixed (a more lenient check at the munmap end?) that would seem worthwhile. Ah, but reading again, you say "if a split is required": I guess munmapping the whole vma has no problem; and it's fine for a middle munmap, splitting into three before munmapping the middle, to fail. I suppose it would be nicer if munmaping start or end succeeeded, but I don't think that matters very much in this case. > > I will drop this patch and the patch that introduces the > vma_count_remaining() helper, as I see your point about it potentially > being unnecessary overhead. > > Regarding your feedback on the rest of the series, I believe the 3 > remaining patches are still valuable on their own. > > - The selftest adds a comprehensive tests for VMA operations at the > sysctl_max_map_count limit. This will self-document the exact behavior > expected, including the leniency for potential merges that you > highlighted, preventing the kind of misunderstanding that led to my > initial patch. > > - The rename of mm_struct->map_count to vma_count, is a > straightforward cleanup for code clarity that makes the purpose of the > field more explicit. > > - The tracepoint adds needed observability for telemetry, allowing us > to see when processes are failing in the field due to VMA count limit. > > The selftest, is what makes up a large portion of the diff you > sited, and with vma_count_remaining() gone the series will not touch > nearly as many files. > > Would this be an acceptable path forward? Possibly, if others like it: my concern was to end a misunderstanding (I'm generally much too slow to get involved in cleanups). Though given that the sysctl is named "max_map_count", I'm not very keen on renaming everything else from map_count to vma_count (and of course I'm not suggesting to rename the sysctl). Hugh