Date: Tue, 12 Nov 2024 12:24:00 +0100
From: Oscar Salvador
To: Peter Xu
Cc: riel@surriel.com, linux-kernel@vger.kernel.org, kernel-team@meta.com,
    linux-mm@kvack.org, akpm@linux-foundation.org, muchun.song@linux.dev,
    mike.kravetz@oracle.com, leit@meta.com, willy@infradead.org,
    stable@kernel.org, Ackerley Tng
Subject: Re: [PATCH 2/4] hugetlbfs: extend hugetlb_vma_lock to private VMAs
References: <20231006040020.3677377-1-riel@surriel.com>
    <20231006040020.3677377-3-riel@surriel.com>

On Thu, Nov 07, 2024 at 03:18:51PM -0500, Peter Xu wrote:
> +Ackerley +Oscar
>
> I'm reading the resv code recently and just stumbled upon this. So want to
> raise this question.
>
> IIUC __vma_private_lock() will return false for MAP_PRIVATE hugetlb vma if
> the vma is dup()ed from a fork(), with/without commit 187da0f8250a
> ("hugetlb: fix null-ptr-deref in hugetlb_vma_lock_write") which fixed a
> slightly different issue.
>
> The problem is the current vma lock for private mmap() is based on the resv
> map, and the resv map only belongs to the process that mmap()ed this
> private vma. E.g. dup_mmap() has:
>
>                 if (is_vm_hugetlb_page(tmp))
>                         hugetlb_dup_vma_private(tmp);
>
> Which does:
>
>         if (vma->vm_flags & VM_MAYSHARE) {
>                 ...
>         } else
>                 vma->vm_private_data = NULL; <---------------------
>
> So even if I don't know how many of us are even using hugetlb PRIVATE +
> fork(), assuming that's the most controversial use case that I'm aware of
> on hugetlb that people complains about.. with some tricky changes like
> 04f2cbe35699.. Just still want to raise this pure question, that after a
> fork() on private vma, and if I read it alright, lock/unlock operations may
> become noop..

I have been taking a look at this, and yes, __vma_private_lock will return
false for private hugetlb mappings that were forked.
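To make the no-op concrete, here is a minimal userspace model in C of the
behaviour described above. It is only a sketch: every model_* name,
MODEL_VM_MAYSHARE and the reader counter are hypothetical stand-ins, not
kernel code. The real lock is a rw_semaphore, and the shared-VMA path (the
elided "..." branch in the quote) is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's VM_MAYSHARE flag. */
#define MODEL_VM_MAYSHARE 0x1UL

/* Hypothetical stand-in for the resv map; a counter replaces the rwsem. */
struct model_resv_map {
	int readers;
};

/* Trimmed-down VMA: only the two fields the discussion depends on. */
struct model_vma {
	unsigned long vm_flags;
	void *vm_private_data;	/* resv map of the mmap()ing process */
};

/* Private VMA that still points at a resv map: the condition discussed above. */
bool model_vma_private_lock(const struct model_vma *vma)
{
	return !(vma->vm_flags & MODEL_VM_MAYSHARE) && vma->vm_private_data;
}

/* Models the quoted else-branch: a forked private VMA loses the pointer. */
void model_dup_vma_private(struct model_vma *child)
{
	if (!(child->vm_flags & MODEL_VM_MAYSHARE))
		child->vm_private_data = NULL;
}

/* Returns true if a lock was really taken, false if the call did nothing. */
bool model_vma_lock_read(struct model_vma *vma)
{
	if (model_vma_private_lock(vma)) {
		struct model_resv_map *resv = vma->vm_private_data;

		resv->readers++;
		return true;
	}
	return false;
}
```

Copying a parent model_vma, running model_dup_vma_private() on the copy and
then calling model_vma_lock_read() on both shows the asymmetry: the parent
takes the lock, the forked copy silently skips it.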
I quickly checked what protects what and we currently have:

 hugetlb_vma_lock_read - copy_hugetlb_page_range (only sharing)
 hugetlb_vma_lock_read - hugetlb_wp (only for HPAGE_RESV_OWNER)
 hugetlb_vma_lock_read - hugetlb_fault, protects huge_pmd_unshare?
 hugetlb_vma_lock_read - pagewalks

 hugetlb_vma_lock_write - hugetlb_change_protection
 hugetlb_vma_lock_write - hugetlb_unshare_pmds
 hugetlb_vma_lock_write - move_hugetlb_page_tables
 hugetlb_vma_lock_write - __hugetlb_zap_begin (unmap_vmas)

The ones taking the hugetlb_vma_lock in write mode (so, the last four) also
take the i_mmap_lock_write (vma->vm_file->f_mapping), and AFAIK, hugetlb
mappings, private or not, should have vma->vm_file->f_mapping set.
Which means that technically we cannot race between hugetlb_change_protection
and move_hugetlb_page_tables etc.

But, checking

 commit bf4916922c60f43efaa329744b3eef539aa6a2b2
 Author: Rik van Riel
 Date: Thu Oct 5 23:59:07 2023 -0400

     hugetlbfs: extend hugetlb_vma_lock to private VMAs

whose motivation was to protect MADV_DONTNEED vs page faults, I do not see how
it protects private hugetlb mappings that were dupped (forked).

 madvise_dontneed_single_vma
  zap_page_range_single
   __hugetlb_zap_begin
    hugetlb_vma_lock_write - noop for mappings that do not own the reservation
    i_mmap_lock_write

But the hugetlb_fault path only takes hugetlb_vma_lock_*, so theoretically we
could still race between page_fault vs madvise_dontneed_single_vma?

A quick way to prove it would be to map a private hugetlb mapping, fork, and
then have one thread trying to madvise(MADV_DONTNEED) while another one tries
to write to the mapping?

I do not know, maybe we are protected in some other way I cannot see right
now.
I will have a look.

-- 
Oscar Salvador
SUSE Labs
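The reproducer suggested above (private hugetlb mapping, fork(), one thread
doing madvise(MADV_DONTNEED) while another writes) could be sketched roughly
as follows. This is an untested outline, not a confirmed reproducer. The
function name race_dontneed_vs_fault(), the 2 MiB HPAGE_SIZE and the choice to
run both threads in the child are assumptions. It also likely needs a couple
of free huge pages in the pool (e.g. via /proc/sys/vm/nr_hugepages), since the
child does not own the parent's reservation. The sketch only generates the
contention; any evidence of a race would have to come from the kernel log.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdatomic.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: 2 MiB default huge page size; adjust for other setups. */
#define HPAGE_SIZE (2UL << 20)

static char *map;	/* private hugetlb mapping, inherited across fork() */
static atomic_int stop;

/* Thread A: keep zapping the mapping. */
static void *zapper(void *arg)
{
	(void)arg;
	while (!atomic_load(&stop))
		madvise(map, HPAGE_SIZE, MADV_DONTNEED);
	return NULL;
}

/* Thread B: keep writing, which forces a hugetlb fault after each zap. */
static void *writer(void *arg)
{
	(void)arg;
	while (!atomic_load(&stop))
		*(volatile char *)map = 1;
	return NULL;
}

/*
 * Returns 0 if the child ran to completion, 1 if it failed or was killed
 * (e.g. SIGBUS when no huge page is free), -1 if nothing could be exercised.
 */
int race_dontneed_vs_fault(unsigned int seconds)
{
	pthread_t a, b;
	int status;
	pid_t pid;

	map = mmap(NULL, HPAGE_SIZE, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	pid = fork();
	if (pid < 0) {
		munmap(map, HPAGE_SIZE);
		return -1;
	}
	if (pid == 0) {
		/* Child: its VMA was dup()ed, so it does not own the resv map. */
		if (pthread_create(&a, NULL, zapper, NULL) ||
		    pthread_create(&b, NULL, writer, NULL))
			_exit(2);
		sleep(seconds);
		atomic_store(&stop, 1);
		pthread_join(a, NULL);
		pthread_join(b, NULL);
		_exit(0);
	}

	if (waitpid(pid, &status, 0) < 0)
		status = -1;
	munmap(map, HPAGE_SIZE);
	return (status != -1 && WIFEXITED(status) &&
		WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

Built with -pthread and called from a small main(), for example as
race_dontneed_vs_fault(10), it hammers the forked mapping for ten seconds.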