From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 154E8E9370C for ; Thu, 5 Oct 2023 13:23:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 705706B02B9; Thu, 5 Oct 2023 09:23:55 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6B5C96B02BA; Thu, 5 Oct 2023 09:23:55 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 57F056B02BB; Thu, 5 Oct 2023 09:23:55 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 4700C6B02B9 for ; Thu, 5 Oct 2023 09:23:55 -0400 (EDT) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 146081A01A4 for ; Thu, 5 Oct 2023 13:23:55 +0000 (UTC) X-FDA: 81311475630.20.E443247 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) by imf07.hostedemail.com (Postfix) with ESMTP id E52584001C for ; Thu, 5 Oct 2023 13:23:51 +0000 (UTC) Authentication-Results: imf07.hostedemail.com; dkim=none; spf=none (imf07.hostedemail.com: domain of riel@shelob.surriel.com has no SPF policy when checking 96.67.55.147) smtp.mailfrom=riel@shelob.surriel.com; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1696512233; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=g+LDLUTzE2yFoBWScOZHf6Omi/FMC1yReBcb55wHQJ4=; b=jxSQ7niZgrN2gO9S56jitK3IRrh3+JMeRmelCgVntpdfJ1bB7wQ47aPSFnn5sAbHBgQHpM Bd+lLwaGS8sQygHO7KXRS2q7JG+LHzV9hyFg4TuhzKy/2TFn1loMF0koPPqp8Z+UWSccgo qJ3JH+mw+rs7RUN2YDZDrrtjEDjl8Vg= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1696512233; a=rsa-sha256; cv=none; b=oJKDPzXqHtokSo+npYX+xoLOMg3bhPigTlAYZhJGjeQlNDrgteVFPibV0YCmPknfvgtl5J jILXB8GLIHehye8E2lOYdHSkeZ6CKEdPV9nLrTRqr8L9rqN7XbdPORfD12TddpPDvC5Hxm 3jnILPaWOctjCGGLCBdPldi/G5umfPA= ARC-Authentication-Results: i=1; imf07.hostedemail.com; dkim=none; spf=none (imf07.hostedemail.com: domain of riel@shelob.surriel.com has no SPF policy when checking 96.67.55.147) smtp.mailfrom=riel@shelob.surriel.com; dmarc=none Received: from imladris.home.surriel.com ([10.0.13.28] helo=imladris.surriel.com) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1qoOJl-000234-17; Thu, 05 Oct 2023 09:23:05 -0400 Message-ID: <99b67e2408c260f53958e98226449fd2bb6a58d8.camel@surriel.com> Subject: Re: [PATCH 2/3] hugetlbfs: close race between MADV_DONTNEED and page fault From: Rik van Riel To: Mike Kravetz Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, muchun.song@linux.dev, leit@meta.com, willy@infradead.org, stable@kernel.org Date: Thu, 05 Oct 2023 09:23:05 -0400 In-Reply-To: <20231005031959.GA13437@monkey> References: <20231004032814.3108383-1-riel@surriel.com> <20231004032814.3108383-3-riel@surriel.com> <20231005031959.GA13437@monkey> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.46.4 (3.46.4-1.fc37) MIME-Version: 1.0 X-Rspamd-Queue-Id: E52584001C X-Rspam-User: X-Stat-Signature: 5x8sejr59ky7f9iamotom8hims8xmydy X-Rspamd-Server: rspam03 X-HE-Tag: 1696512231-109179 X-HE-Meta: U2FsdGVkX19+qV63A7KvAxd1YcVqJBP491ikTA4/AZF3w5k3ERsQV6oP1cVvnSJpLOv6YkYbwwPxDcX/2Qf8ekqRGSSemzJXTkBvadQACS72jh61UNFJAAmVRGK2hbCfeOiI3P5Yf6CQycP0EVTfFly4/Tj8zm3/OdS+j8facA1tliTw0GefeIvqLOw86LBc0r5po2fB2+b8rLqKw6UwvSW/jZ9TSySOZHKKFRf2hRe4yVWFjh+xvTwkNMe7foeIwNz6dY3rqKqZ5ue7mTBscDJIFoi04kAgv400X+JIQ2UJ5Nkujx2vyUNH9/NP4MHGd1TW+Bf2BlXgvNqc+YHrQWb7SXHgQp4WTfj/N/CKy6LLDKZJ+qkkhhyyUPJEClDDeVj72sSfgSgNEfJCrgTy8xePv4IihMAy4xU7VQKlYyzf+th88pTMWgPjXQiyy82yiBYiY4RGe7YBsE/xsI3SeHnT2TKVnCQ157rWMwLGxNccLouN6Drf1/ap4Ou00xubwFpC/mpnxkXpBr2uHaX9upmUUgefhU8i35I1RaPRn8PvmCLTt9Si82Vs+//mkrgDAlC4ntrrWDox55+xVPTfsYb3Ra3uXAbhlWrXZeGb+gaUfcuI8mFXn0fvfOKeq2T9ZbdEHop5dQlUOxIX6zGz1KkURY3OfduFjahI7DNRRrdOeUQ6k172roomhLEMNRfwsAV1nKH1nNf+Vpq2l/vMloe2PEXovDsIVhVhoTF35pvokhMinf6iIU3cjniEVxV4YGOEtnQm9UgWrKyeIJZ9H6/guhAkh2CGRbHp3vdrW+RFGANrK77R83Q/WNi+YpXQruFeTk12BFXqUL6aqDvr5Djds8m9XijlyGWq83J636nDRfWICnW83VyijplkCQlOJ59e2n3ExcDSF4XafFVM1In6MQFn3/SB2KB4N9flvY5zZeSo+DfVK6SZseXrB2AOOkzelrKXjh11xjd10ms gjI302cQ ovaTmxw6qbXVfYTgoh4Fs+91qaHzV8I5pP3/ugbwdpCHQMYZhmov76BcElhNP8a/rEaqazoP0egTG0PwBlH9cdcWzLJKB8SR8mu4LFv6yod/jw3pAME4ye/9TU6o8rjic+zXkOHR7B789Ivt2lgkMIx9q5mv269TXvP03OjbCLRt2AZ9bYFS5CCmtiD6SVyJ5zn91j7ejQ7URQJbglVmmZEBDfiEo5UBkk7qzFHvk0G3AA4zTs9EDu1J+h1hX2ZI3CKNagyNBwrT6LHNzZswp1FV2BNGcEEgesVLt7LtfHQYXZNj0buCZkyI7SElsH6RWgGcRuAUERnURRSf5bX8pNE464g== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, 2023-10-04 at 20:19 -0700, Mike Kravetz wrote: > On 10/03/23 23:25, riel@surriel.com=C2=A0wrote: > >=20 > > @@ -5457,11 +5460,12 @@ void __unmap_hugepage_range_final(struct > > mmu_gather *tlb, > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 * someone else. > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 */ > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0__hugetlb_vma_unlock_write_free(vma); > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0i_mmap_unlock_write(vma->vm_file->f_mapping); > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} else { > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0i_mmap_unlock_write(vma->vm_file->f_mapping); > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0hugetlb_vma_unlock_write(vma); > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > > + > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (vma->vm_file) > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0i_mmap_unlock_write(vma->vm_file->f_mapping); > > =C2=A0} >=20 > In the case of a mmap(hugetlbfs_file_mmap) error, the per-vma hugetlb > lock will not be setup.=C2=A0 The hugetlb_vma_lock/unlock routines do not > check for this as they were previously always called after the lock > was > set up.=C2=A0 So, we can now get: Wait, the hugetlb_vma_(un)lock_{read,write} functions do have checks for the presence of the lock: void hugetlb_vma_lock_read(struct vm_area_struct *vma) { if (__vma_shareable_lock(vma)) { struct hugetlb_vma_lock *vma_lock =3D vma- >vm_private_data; down_read(&vma_lock->rw_sema); } else if (__vma_private_lock(vma)) { struct resv_map *resv_map =3D vma_resv_map(vma); down_read(&resv_map->rw_sema); } } Both __vma_shareable_lock and __vma_private_lock check that vma->vm_private_data points at something. Exactly what corner case am I missing here? What leaves vma->vm_private_data pointing at something invalid? >=20 > +++ b/mm/hugetlb.c > @@ -5503,10 +5503,12 @@ void __unmap_hugepage_range(struct mmu_gather > *tlb, struct vm_area_struct *vma, > =C2=A0void __hugetlb_zap_begin(struct vm_area_struct *vma, > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 un= signed long *start, unsigned long *end) > =C2=A0{ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if (!vma->vm_file)=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0/* hugetlbfs_file_mmap error */ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0return; > + This does not seem quite correct, because the locking is needed to avoid the race between MADV_DONTNEED and the page fault path. > Another way to resolve would be to fix up the hugetlb_vma_lock/unlock > routines > to check for and handle a null lock. I thought I had that already.=20 Does __vma_shareable_lock need to check for !vma->vm_file ? --=20 All Rights Reversed.