Date: Fri, 16 Sep 2022 15:15:05 +0100
From: Matthew Wilcox
To: Kees Cook
Cc: Yu Zhao, Andrew Morton, dev@der-flo.net, Linux-MM, Uladzislau Rezki, bugzilla-daemon@kernel.org
Subject: Re: [Bug 216489] New: Machine freezes due to memory lock
In-Reply-To: <202209160230.CE9E0E51@keescook>

On Fri, Sep 16, 2022 at 02:46:39AM -0700, Kees Cook wrote:
> On Fri, Sep 16, 2022 at 09:38:33AM +0100, Matthew Wilcox wrote:
> > On Thu, Sep 15, 2022 at 05:59:56PM -0600, Yu Zhao wrote:
> > > I think this is a manifest of the lockdep warning I reported a couple
> > > of weeks ago:
> > > https://lore.kernel.org/r/CAOUHufaPshtKrTWOz7T7QFYUNVGFm0JBjvM700Nhf9qEL9b3EQ@mail.gmail.com/
> >
> > That would certainly match the symptoms.
> >
> > Turning vmap_lock into an NMI-safe lock would be bad.  I don't even know
> > if we have primitives for that (it's not like you can disable an NMI
> > ...)
> >
> > I don't quite have time to write a patch right now.  Perhaps something
> > like:
> >
> > struct vmap_area *find_vmap_area_nmi(unsigned long addr)
> > {
> >         struct vmap_area *va;
> >
> >         if (spin_trylock(&vmap_area_lock))
> >                 return NULL;
> >         va = __find_vmap_area(addr, &vmap_area_root);
> >         spin_unlock(&vmap_area_lock);
> >
> >         return va;
> > }
> >
> > and then call find_vmap_area_nmi() in check_heap_object().  I may have
> > the polarity of the return value of spin_trylock() incorrect.
>
> I think we'll need something slightly tweaked, since this would
> return NULL under any contention (and a NULL return is fatal in
> check_heap_object()). It seems like we need to explicitly check
> for being in nmi context in check_heap_object() to deal with it?
> Like this (only build tested):

Right, and Ulad is right about it being callable from any context.  I think
the long-term solution is to make the vmap_area_root tree walkable under
RCU protection.

For now, let's have a distinct return code (ERR_PTR(-EAGAIN), perhaps?)
to indicate that we've hit contention.  It generally won't matter if
we hit it in process context because hardening doesn't have to be 100%
reliable to be useful.

Erm ... so what prevents this race:

CPU 0                                   CPU 1
copy_to_user()
check_heap_object()
area = find_vmap_area(addr)
                                        __purge_vmap_area_lazy()
                                        merge_or_add_vmap_area_augment()
                                        __merge_or_add_vmap_area()
                                        kmem_cache_free(vmap_area_cachep, va);
if (n > area->va_end - addr) {

Yes, it's a race in the code that allocated this memory; they're
simultaneously calling copy_to_user() and __vunmap().  We'll catch this
bad behaviour sooner rather than later, but sometimes in trying to catch
this bug, we'll get caught by the bug and go splat.

I don't know that we need to go through heroics to be sure we don't get
caught by this bug.  It already has to run a workqueue to do the freeing.
We could delay it even further with RCU or something, but we're only
trading off one kind of badness for another.
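
To make the distinct-return-code idea concrete, here is a rough userspace
model of it, not kernel code and not a proposed patch.  Everything in it
is an assumption for illustration: a pthread mutex stands in for
vmap_area_lock, a fixed array stands in for the vmap_area_root tree, and
find_vmap_area_try(), find_vmap_area_locked() and check_vmalloc_copy()
are made-up names.  ERR_PTR()/PTR_ERR()/IS_ERR() are simplified local
copies of the kernel helpers.  It does nothing about the use-after-free
race shown above.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct vmap_area {
	unsigned long va_start;
	unsigned long va_end;
};

/* Hypothetical model: one lock protecting a tiny fixed table of areas. */
static pthread_mutex_t vmap_area_lock = PTHREAD_MUTEX_INITIALIZER;
static struct vmap_area areas[] = {
	{ 0x1000, 0x3000 },
	{ 0x8000, 0x9000 },
};

/* Linear lookup; the caller must hold vmap_area_lock. */
static struct vmap_area *find_vmap_area_locked(unsigned long addr)
{
	for (size_t i = 0; i < sizeof(areas) / sizeof(areas[0]); i++)
		if (addr >= areas[i].va_start && addr < areas[i].va_end)
			return &areas[i];
	return NULL;
}

/*
 * Never blocks.  Returns the area, NULL if addr is in no area, or
 * ERR_PTR(-EAGAIN) if the lock is contended.
 */
static struct vmap_area *find_vmap_area_try(unsigned long addr)
{
	struct vmap_area *va;

	/* pthread_mutex_trylock() returns 0 on success; the kernel's
	 * spin_trylock() returns nonzero on success. */
	if (pthread_mutex_trylock(&vmap_area_lock) != 0)
		return ERR_PTR(-EAGAIN);
	va = find_vmap_area_locked(addr);
	pthread_mutex_unlock(&vmap_area_lock);

	return va;
}

/* Returns 0 if the copy looks fine or could not be checked, -1 if bad. */
static int check_vmalloc_copy(unsigned long addr, unsigned long n)
{
	struct vmap_area *area = find_vmap_area_try(addr);

	if (IS_ERR(area))
		return 0;	/* contended: skip the check, don't fail */
	if (!area)
		return -1;	/* address is in no known area */
	if (n > area->va_end - addr)
		return -1;	/* copy would run past the end of the area */
	return 0;
}
```

The point of the third return value is that the caller can tell "could
not look" apart from "looked and found nothing", so contention turns
into a skipped check instead of a fatal one.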