Date: Tue, 26 Sep 2023 14:15:30 -0700
From: Andrew Morton
To: riel@surriel.com
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com, leit@meta.com, willy@infradead.org
Subject: Re: [PATCH 2/3] hugetlbfs: close race between MADV_DONTNEED and page fault
Message-Id: <20230926141530.26bc8550f2f2411945b566f1@linux-foundation.org>
In-Reply-To: <20230926031245.795759-3-riel@surriel.com>
References: <20230926031245.795759-1-riel@surriel.com> <20230926031245.795759-3-riel@surriel.com>

On Mon, 25 Sep 2023 23:10:51 -0400 riel@surriel.com wrote:

> From: Rik van Riel
>
> Malloc libraries, like jemalloc and tcmalloc, take decisions on when
> to call madvise independently from the code in the main application.
>
> This sometimes results in the application page faulting on an address,
> right after the malloc library has shot down the backing memory with
> MADV_DONTNEED.
>
> Usually this is harmless, because we always have some 4kB pages
> sitting around to satisfy a page fault. However, with hugetlbfs,
> systems often allocate only the exact number of huge pages that
> the application wants.
>
> Due to TLB batching, hugetlbfs MADV_DONTNEED will free pages outside of
> any lock taken on the page fault path, which can open up the following
> race condition:
>
>        CPU 1                            CPU 2
>
>        MADV_DONTNEED
>        unmap page
>        shoot down TLB entry
>                                         page fault
>                                         fail to allocate a huge page
>                                         killed with SIGBUS
>        free page
>
> Fix that race by pulling the locking from __unmap_hugepage_range_final
> into helper functions called from zap_page_range_single. This ensures
> page faults stay locked out of the MADV_DONTNEED VMA until the
> huge pages have actually been freed.
>

Was a -stable backport considered?
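
For anyone following along, the "usually harmless" 4kB case from the quoted changelog can be seen from userspace. This is only a minimal sketch, assuming Linux and a private anonymous mapping rather than hugetlbfs; the helper name refault_after_dontneed is hypothetical and not part of the patch.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* for MAP_ANONYMOUS and madvise() under strict -std= */
#endif
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): dirty one page of private
 * anonymous memory, drop it with MADV_DONTNEED the way a malloc library
 * might, then touch it again from "application" code.
 *
 * Returns the first byte read after the refault, or -1 on error.
 */
int refault_after_dontneed(void)
{
	long page = sysconf(_SC_PAGESIZE);
	assert(page > 0);

	unsigned char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Fault the page in and dirty it. */
	memset(p, 0xaa, (size_t)page);

	/* The allocator's side: throw away the backing memory. */
	if (madvise(p, (size_t)page, MADV_DONTNEED) != 0) {
		munmap(p, (size_t)page);
		return -1;
	}

	/*
	 * The application's side: this read page-faults, and for private
	 * anonymous memory the kernel supplies a fresh zero-filled page.
	 */
	int first = ((volatile unsigned char *)p)[0];

	munmap(p, (size_t)page);
	return first;
}
```

Called from a main(), this should return 0: the old 0xaa contents are gone and the refault is satisfied with a zero page. With hugetlbfs the same refault needs a free huge page from the pool, which is where the window between "shoot down TLB entry" and "free page" in the diagram above becomes a problem.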