From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH] mm,madvise,hugetlb: fix unexpected data loss with MADV_DONTNEED on hugetlbfs
From: Rik van Riel
To: Mike Kravetz
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, kernel-team@meta.com, Andrew Morton, David Hildenbrand
Date: Fri, 21 Oct 2022 19:29:59 -0400
References: <20221021154546.57df96db@imladris.surriel.com>
Content-Type: multipart/signed; micalg="pgp-sha256"; protocol="application/pgp-signature";
 boundary="=-Ln3tjc1s0eh8wFvc/pi/"
User-Agent: Evolution 3.42.4 (3.42.4-2.fc35)
MIME-Version: 1.0

--=-Ln3tjc1s0eh8wFvc/pi/
Content-Type: text/plain; charset="UTF-8"

On Fri, 2022-10-21 at 13:48 -0700, Mike Kravetz wrote:
> On 10/21/22 15:45, Rik van Riel wrote:
> > A common use case for hugetlbfs is for the application to create
> > memory pools backed by huge pages, which then get handed over to
> > some malloc library (eg. jemalloc) for further management.
> >
> > That malloc library may be doing MADV_DONTNEED calls on memory
> > that is no longer needed, expecting those calls to happen on
> > PAGE_SIZE boundaries.
> >
>
> Thanks Rik.  I tend to agree with this direction as it is 'breaking'
> current code.  David and I discussed this in this thread,
> https://lore.kernel.org/linux-mm/356a4b9a-1f56-ae06-b211-bd32fc93ecda@redhat.com/
>
> One thing to note is that there was not any documentation saying
> madvise would happen on page boundaries.  The system call takes a
> length and rounds up to page size.  However, the man page explicitly
> said it operates on a byte range.  Certainly mm people and others
> know we only operate on pages.  But, that is not what was documented.
>
> When the change was made to add hugetlb support, the decision was made
> to round up the range to hugetlb page boundaries in hugetlb vmas.  This
> was to be consistent with how madvise operated on base pages.  At the
> same time, madvise documentation was updated to say it operates on page
> boundaries as well as the behavior for hugetlb mappings.  If moving
> forward with this change we will need to update the man page.

I'll send in a patch for the man page after the patch gets merged.

I'll change the text to clarify that the system may round up the
specified length to PAGE_SIZE granularity, which is a quantity
programs can get through (IIRC) getconf.

Andrew, I split out the bit of the patch for stable.

-- 
All Rights Reversed.
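
For illustration, the rounding discussed above can be sketched in C. This is
a minimal sketch, not code from the patch: base_page_size() and
round_up_len() are hypothetical helper names, and the 2 MiB huge page size
in the note below is an assumed example value (huge page sizes vary by
architecture and configuration). It uses sysconf(_SC_PAGESIZE), the
C-library counterpart of `getconf PAGESIZE`, to get the base page size at
run time.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Base page size of the running system, as reported by
 * sysconf(_SC_PAGESIZE). Returns 0 if the query fails. */
static size_t base_page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    return ps > 0 ? (size_t)ps : 0;
}

/* Hypothetical helper: round len up to the next multiple of granule.
 * granule must be a nonzero power of two, which page sizes are. */
static size_t round_up_len(size_t len, size_t granule)
{
    assert(granule != 0 && (granule & (granule - 1)) == 0);
    return (len + granule - 1) & ~(granule - 1);
}
```

With a 4096-byte base page, round_up_len(1, 4096) gives 4096. Rounding the
same small length to an assumed 2 MiB huge page boundary,
round_up_len(4096, 2UL << 20), gives 2097152, which is how a
PAGE_SIZE-sized MADV_DONTNEED request could end up covering a whole huge
page when the range is rounded to hugetlb page boundaries.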
--=-Ln3tjc1s0eh8wFvc/pi/--