Date: Sat, 22 May 2021 14:19:46 -0700
From: Andrew Morton
To: Mina Almasry
Cc: Axel Rasmussen, Peter Xu, linux-mm@kvack.org, Mike Kravetz, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] mm, hugetlb: fix resv_huge_pages underflow on UFFDIO_COPY
Message-Id: <20210522141946.f8a62010350a76302b9508fb@linux-foundation.org>
In-Reply-To: <20210521074433.931380-1-almasrymina@google.com>
References: <20210521074433.931380-1-almasrymina@google.com>

On Fri, 21 May 2021 00:44:33 -0700 Mina Almasry wrote:

> The userfaultfd hugetlb tests detect a resv_huge_pages underflow. This
> happens when hugetlb_mcopy_atomic_pte() is called with !is_continue on
> an index for which we already have a page in the cache.
> When this happens, we allocate a second page, double consuming the
> reservation, and then fail to insert the page into the cache and
> return -EEXIST.
>
> To fix this, we first if there exists a page in the cache which already
                       ^ check
> consumed the reservation, and return -EEXIST immediately if so.
>
> Secondly, if we fail to copy the page contents while holding the
> hugetlb_fault_mutex, we will drop the mutex and return to the caller
> after allocating a page that consumed a reservation. In this case there
> may be a fault that double consumes the reservation. To handle this, we
> free the allocated page, fix the reservations, and allocate a temporary
> hugetlb page and return that to the caller. When the caller does the
> copy outside of the lock, we again check the cache, and allocate a page
> consuming the reservation, and copy over the contents.
>
> Test:
> Hacked the code locally such that resv_huge_pages underflows produce
> a warning and the copy_huge_page_from_user() always fails, then:
>
> ./tools/testing/selftests/vm/userfaultfd hugetlb_shared 10
>         2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success
> ./tools/testing/selftests/vm/userfaultfd hugetlb 10
>         2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success
>
> Both tests succeed and produce no warnings. After the test runs
> number of free/resv hugepages is correct.
>
> ...
>
>  include/linux/hugetlb.h |   4 ++
>  mm/hugetlb.c            | 103 ++++++++++++++++++++++++++++++++++++----
>  mm/migrate.c            |  39 +++------------
>  3 files changed, 103 insertions(+), 43 deletions(-)

I'm assuming we want this in -stable? Are we able to identify a Fixes:
for this?

It's a large change. Can we come up with some smaller and easier to
review and integrate version which we can feed into 5.13 and -stable and
do the fancier version for 5.14?

If you don't think -stable needs this then this version will be OK
as-is.