Date: Wed, 16 Oct 2024 14:57:39 -0700
From: Andrew Morton
To: John Hubbard
Cc: LKML, linux-mm@kvack.org, Alistair Popple, Shigeru Yoshida,
	David Hildenbrand, Jason Gunthorpe, Minchan Kim, Pasha Tatashin
Subject: Re: [PATCH] mm/gup: stop leaking pinned pages in low memory conditions
Message-Id: <20241016145739.770543d44313967f611f3810@linux-foundation.org>
In-Reply-To: <20241016202242.456953-1-jhubbard@nvidia.com>
References: <20241016202242.456953-1-jhubbard@nvidia.com>

On Wed, 16 Oct 2024 13:22:42 -0700 John Hubbard wrote:

> If a driver tries to call any of the pin_user_pages*(FOLL_LONGTERM)
> family of functions, and requests "too many" pages, then the call will
> erroneously leave pages pinned. This is visible in user space as an
> actual memory leak.
>
> Repro is trivial: just make enough pin_user_pages(FOLL_LONGTERM) calls
> to exhaust memory.
>
> The root cause of the problem is this sequence, within
> __gup_longterm_locked():
>
>     __get_user_pages_locked()
>     rc = check_and_migrate_movable_pages()
>
> ...which gets retried in a loop. The loop error handling is incomplete,
> clearly due to a somewhat unusual and complicated tri-state error API.
> But anyway, if -ENOMEM, or in fact, any unexpected error is returned
> from check_and_migrate_movable_pages(), then __gup_longterm_locked()
> happily returns the error, while leaving the pages pinned.
>
> In the failed case, which is an app that requests (via a device driver)
> 30720000000 bytes to be pinned, and then exits, I see this:
>
>     $ grep foll /proc/vmstat
>     nr_foll_pin_acquired 7502048
>     nr_foll_pin_released 2048
>
> And after applying this patch, it returns to balanced pins:
>
>     $ grep foll /proc/vmstat
>     nr_foll_pin_acquired 7502048
>     nr_foll_pin_released 7502048
>
> Fix this by unpinning the pages that __get_user_pages_locked() has
> pinned, in such error cases.

Thanks.

> Fixes: 24a95998e9ba ("mm/gup.c: simplify and fix check_and_migrate_movable_pages() return codes")

I'll add this to the -stable backport pile, although this seems a bit
marginal?
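
For readers following along, the retry loop described in the quoted text
can be modelled in userspace roughly as below. This is only a sketch under
stated assumptions, not the kernel code and not the patch itself: every
model_* function, struct pin_stats, and the unpin_on_error flag are
hypothetical names invented for illustration, and the tri-state convention
is assumed to be 0 (pages are fine), -EAGAIN (pages were unpinned and
migrated, pin again), or any other negative errno (failure).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical counters, loosely mirroring nr_foll_pin_acquired/released. */
struct pin_stats {
	long acquired;
	long released;
};

/* Stand-in for __get_user_pages_locked(): "pins" nr pages. */
static long model_pin_pages(struct pin_stats *st, long nr)
{
	st->acquired += nr;
	return nr;
}

/* Stand-in for unpinning nr previously pinned pages. */
static void model_unpin_pages(struct pin_stats *st, long nr)
{
	st->released += nr;
}

/*
 * Stand-in for check_and_migrate_movable_pages(). Assumed behaviour:
 * on the attempt numbered fail_on it returns -ENOMEM and leaves the
 * pages pinned; on earlier attempts below 2 it unpins and asks for a
 * retry with -EAGAIN; otherwise it returns 0 and the pins are kept.
 */
static int model_check_and_migrate(struct pin_stats *st, long nr,
				   int attempt, int fail_on)
{
	if (attempt == fail_on)
		return -ENOMEM;
	if (attempt < 2) {
		model_unpin_pages(st, nr);
		return -EAGAIN;
	}
	return 0;
}

/*
 * Model of the loop in __gup_longterm_locked(). With unpin_on_error
 * false, an unexpected error is returned while the pages stay pinned
 * (the leak). With it true, the pages are unpinned before returning.
 */
static long model_gup_longterm(struct pin_stats *st, long nr,
			       int fail_on, bool unpin_on_error)
{
	long pinned;
	int rc;
	int attempt = 0;

	do {
		pinned = model_pin_pages(st, nr);
		rc = model_check_and_migrate(st, pinned, attempt++, fail_on);
		if (rc != 0 && rc != -EAGAIN && unpin_on_error)
			model_unpin_pages(st, pinned);
	} while (rc == -EAGAIN);

	return rc ? rc : pinned;
}
```

Calling model_gup_longterm() with nr = 4 and a failure injected on the
second attempt (fail_on = 1) returns -ENOMEM either way. With
unpin_on_error false the counters end up unbalanced, which is the same
kind of imbalance as in the first /proc/vmstat output above. With
unpin_on_error true, acquired and released match.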