On Wed, Mar 10, 2021 at 1:31 PM Sean Christopherson wrote: > Invoke the MMU notifier's .invalidate_range_end() callbacks even if one > of the .invalidate_range_start() callbacks failed. If there are multiple > notifiers, the notifier that did not fail may have performed actions in > its ...start() that it expects to unwind via ...end(). Per the > mmu_notifier_ops documentation, ...start() and ...end() must be paired. > > The only in-kernel usage that is fatally broken is the SGI UV GRU driver, > which effectively blocks and sleeps fault handlers during ...start(), and > unblocks/wakes the handlers during ...end(). But, the only users that > can fail ...start() are the i915 and Nouveau drivers, which are unlikely > to collide with the SGI driver. > > KVM is the only other user of ...end(), and while KVM also blocks fault > handlers in ...start(), the fault handlers do not sleep and originate in > killable ioctl() calls. So while it's possible for the i915 and Nouveau > drivers to collide with KVM, the bug is benign for KVM since the process > is dying and KVM's guest is about to be terminated. > > So, as of today, the bug is likely benign. But, that may not always be > true, e.g. there is a potential use case for blocking memslot updates in > KVM while an invalidation is in-progress, and failure to unblock would > result in said updates being blocked indefinitely and hanging. > > Found by inspection. Verified by adding a second notifier in KVM that > periodically returns -EAGAIN on non-blockable ranges, triggering OOM, > and observing that KVM exits with an elevated notifier count. > > Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu > notifiers") > Cc: stable@vger.kernel.org > Cc: David Rientjes > Cc: Ben Gardon > Cc: Jason Gunthorpe > Cc: Michal Hocko > Cc: "Jérôme Glisse" > Cc: Andrea Arcangeli > Cc: Johannes Weiner > Cc: Dimitri Sivanich > Signed-off-by: Sean Christopherson > Reviewed-by: Ben Gardon Thanks for catching and fixing this. > --- > mm/oom_kill.c | 8 +++----- > 1 file changed, 3 insertions(+), 5 deletions(-) > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index bc65ba4f5192..acc3ba8b2ed7 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -546,12 +546,10 @@ bool __oom_reap_task_mm(struct mm_struct *mm) > vma, mm, vma->vm_start, > vma->vm_end); > tlb_gather_mmu(&tlb, mm); > - if > (mmu_notifier_invalidate_range_start_nonblock(&range)) { > - tlb_finish_mmu(&tlb); > + if > (!mmu_notifier_invalidate_range_start_nonblock(&range)) > + unmap_page_range(&tlb, vma, range.start, > range.end, NULL); > + else > ret = false; > - continue; > - } > - unmap_page_range(&tlb, vma, range.start, > range.end, NULL); > mmu_notifier_invalidate_range_end(&range); > tlb_finish_mmu(&tlb); > } > -- > 2.30.1.766.gb4fecdf3b7-goog > >