Message-ID: <8d19b6d092b7b5d9b1d0829e0d99c9915db3ed61.camel@surriel.com>
Subject: Re: [PATCH 2/3] hugetlbfs: close race between MADV_DONTNEED and page fault
From: Rik van Riel
To: Mike Kravetz
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, muchun.song@linux.dev, leit@meta.com, willy@infradead.org
Date: Tue, 03 Oct 2023 15:35:25 -0400
In-Reply-To: <20231002043958.GB11194@monkey>
References: <20231001005659.2185316-1-riel@surriel.com> <20231001005659.2185316-3-riel@surriel.com> <20231002043958.GB11194@monkey>

On Sun, 2023-10-01 at 21:39 -0700, Mike Kravetz wrote:
> 
> Something is not right here.  I have not looked closely at the patch,
> but running libhugetlbfs test suite hits this NULL deref in misalign
> (2M: 32).

Hi Mike,

fixing the null dereference was easy, but I continued running
into a test case failure with linkhuge_rw. After tweaking the
code in my patches quite a few times, I finally ran out of
ideas and tried it on a tree without my patches.
I still see the test failure on upstream
2cf0f7156238 ("Merge tag 'nfs-for-6.6-2' of git://git.linux-nfs.org/projects/anna/linux-nfs")

This is with a modern glibc, and the __morecore assignments in
libhugetlbfs/morecore.c commented out.

HUGETLB_ELFMAP=R HUGETLB_SHARE=1 linkhuge_rw (2M: 32):
Pool state: (('hugepages-2048kB', (('free_hugepages', 1), ('resv_hugepages', 0), ('surplus_hugepages', 0), ('nr_hugepages_mempolicy', 1), ('nr_hugepages', 1), ('nr_overcommit_hugepages', 0))),)
Hugepage pool state not preserved!
BEFORE: (('hugepages-2048kB', (('free_hugepages', 1), ('resv_hugepages', 0), ('surplus_hugepages', 0), ('nr_hugepages_mempolicy', 1), ('nr_hugepages', 1), ('nr_overcommit_hugepages', 0))),)
AFTER: (('hugepages-2048kB', (('free_hugepages', 0), ('resv_hugepages', 0), ('surplus_hugepages', 0), ('nr_hugepages_mempolicy', 1), ('nr_hugepages', 1), ('nr_overcommit_hugepages', 0))),)

It may take a little while to figure this one out. I did some
bpftracing, but don't have a real smoking gun yet. The trace
certainly shows the last user of the leaked huge page going
into __unmap_hugepage_range.

-- 
All Rights Reversed.
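
For anyone trying to reproduce the pool-state mismatch outside the test harness, the BEFORE/AFTER comparison can be sketched roughly as below. This is not the libhugetlbfs test runner's code; the helper names read_pool_state and diff_pool_state are made up for illustration, and the sketch assumes the counters live in the usual sysfs directory /sys/kernel/mm/hugepages/hugepages-2048kB. The sample values are the ones from the report.

```python
import os

# Counter names as they appear in the pool-state output of the report.
COUNTERS = (
    "free_hugepages",
    "resv_hugepages",
    "surplus_hugepages",
    "nr_hugepages_mempolicy",
    "nr_hugepages",
    "nr_overcommit_hugepages",
)

# Assumed sysfs location of the 2 MB pool counters. It is a parameter below
# so the helper can be pointed at a different directory.
SYSFS_2M = "/sys/kernel/mm/hugepages/hugepages-2048kB"


def read_pool_state(pool_dir=SYSFS_2M):
    """Return {counter: value} for each counter file that exists in pool_dir."""
    state = {}
    for name in COUNTERS:
        path = os.path.join(pool_dir, name)
        if os.path.exists(path):  # some counters are absent on some configs
            with open(path) as f:
                state[name] = int(f.read().strip())
    return state


def diff_pool_state(before, after):
    """Return {counter: (before, after)} for every counter that changed."""
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }


if __name__ == "__main__":
    # Values copied from the BEFORE and AFTER lines of the report.
    before = {
        "free_hugepages": 1,
        "resv_hugepages": 0,
        "surplus_hugepages": 0,
        "nr_hugepages_mempolicy": 1,
        "nr_hugepages": 1,
        "nr_overcommit_hugepages": 0,
    }
    after = dict(before, free_hugepages=0)
    print(diff_pool_state(before, after))  # {'free_hugepages': (1, 0)}
```

On a live system, calling read_pool_state() before and after running linkhuge_rw and passing both results to diff_pool_state() should show whether free_hugepages is the only counter that moves, as it is in the output above.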