From: Oscar Salvador
To: Andrew Morton
Cc: David Hildenbrand, Muchun Song, Peter Xu, Gavin Guo,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador
Subject: [PATCH v4 0/5] Misc rework on hugetlb faulting path
Date: Mon, 30 Jun 2025 16:42:07 +0200
Message-ID: <20250630144212.156938-1-osalvador@suse.de>

v3 -> v4:
 - Fix the
   deadlock for real in patch#1 (Kudos to Gavin for testing out v3)
 - Keep trylock as the original code does
 - Add reproducer in the cover letter

v2 -> v3:
 - Addressed the issue of taking folio_lock while holding a spinlock
   (per David)
 - Simplify new_anon_folio (per David)
 - Slightly rework patch#2 to make it simpler

v1 -> v2:
 - Addressed feedback from David
 - Settle ideas wrt. locking in faulting path after discussion with David
 - Add Acked-by tags

RFC -> v1:
 - Stop looking up folio in the pagecache for detecting a COW
   on a private mapping.
 - Document the locks

This patchset aims to give some love to the hugetlb faulting path by
removing obsolete comments that are no longer true, sorting out the
folio lock, and changing the mechanism we use to determine whether we are
COWing a private mapping already.

The most important patch of the series is #1, as it fixes a deadlock that
was described in [1], where two processes were holding the same lock
for the folio in the pagecache, and then deadlocked in the mutex.
Note that this can also happen for anonymous folios. This has been tested
using the reproducer in [2].

Looking up and locking the folio in the pagecache was done to check whether
that folio was the same folio we had mapped in our pagetables. If it was
different, we knew that we had already mapped that folio privately, so any
further CoW would be made on a private mapping, which led us to the question:
 __Was the reservation for that address consumed?__

That is all we care about, because if it was indeed consumed, we are the
owner, and we cannot allocate more folios, we need to unmap the folio from
the other processes' pagetables and make it exclusive for us.

We figured we do not need to look up the folio at all. It is enough to
check whether the folio we have mapped is anonymous, which means we mapped
it privately, so the reservation was indeed consumed.
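To make the change in patch#1 easier to picture, here is a minimal userspace
sketch of the two checks described above. It is only a toy model and not the
kernel code: struct toy_folio and the three helper functions are hypothetical
names invented for illustration, and the real patch operates on kernel folios
and VMAs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; only the property the check needs. */
struct toy_folio {
	bool anon;	/* true if the folio was mapped privately */
};

/*
 * Old scheme (modelled): the caller had to look up and lock the pagecache
 * folio first, then compare it with the folio mapped in the pagetables.
 * A different folio means the mapping is already private.
 */
static bool resv_consumed_by_lookup(const struct toy_folio *mapped,
				    const struct toy_folio *pagecache)
{
	assert(mapped != NULL);
	return mapped != pagecache;
}

/*
 * New scheme (modelled): no lookup and no pagecache folio lock, just ask
 * whether the mapped folio is anonymous.
 */
static bool resv_consumed_by_anon(const struct toy_folio *mapped)
{
	assert(mapped != NULL);
	return mapped->anon;
}

/*
 * The case the cover letter cares about: reservation consumed, we are the
 * owner, and no more folios can be allocated.
 */
static bool must_unmap_from_others(bool consumed, bool is_owner,
				   bool alloc_failed)
{
	return consumed && is_owner && alloc_failed;
}
```

In this model both checks give the same answer for a privately mapped folio,
but only the first one needs a second folio (and therefore a lookup and a
lock) to reach it, which is the dependency patch#1 removes.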

Patch#2 sorts out folio locking in the faulting path, reducing its scope:
the lock is only taken when we are dealing with an anonymous folio, and
this is now documented.
More details in the patch.

Patch#3-5 are cleanups.

[1] https://lore.kernel.org/lkml/20250513093448.592150-1-gavinguo@igalia.com/
[2] Here is the reproducer:

 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <sys/mman.h>
 #include <sys/wait.h>

 #define PROTECTION (PROT_READ | PROT_WRITE)
 #define LENGTH (2UL*1024*1024)

 #define ADDR (void *)(0x0UL)
 #define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB)

 /* Read fault: touch the first byte without writing to it. */
 void __read(char *addr)
 {
 	int i = 0;

 	printf("a[%d]: %c\n", i, addr[i]);
 }

 /* Write fault: this is what triggers hugetlb_wp. */
 void fill(char *addr)
 {
 	addr[0] = 'd';

 	printf("addr: %c\n", addr[0]);
 }

 int main(void)
 {
 	void *addr;
 	pid_t pid, wpid;
 	int status;

 	addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, -1, 0);
 	if (addr == MAP_FAILED) {
 		perror("mmap");
 		return -1;
 	}

 	printf("Parent faulting in RO\n");
 	__read(addr);

 	sleep (10);
 	printf("Forking\n");

 	pid = fork();
 	switch (pid) {
 	case -1:
 		perror("fork");
 		break;
 	case 0:
 		/* Child writes while the parent sits in the delay window. */
 		sleep (4);
 		printf("Child: Faulting in\n");
 		fill(addr);
 		exit(0);
 		break;
 	default:
 		printf("Parent: Faulting in\n");
 		fill(addr);
 		while((wpid = wait(&status)) > 0);
 		if (munmap(addr, LENGTH))
 			perror("munmap");
 	}

 	return 0;
 }

You will also have to add a delay in hugetlb_wp, after releasing the mutex and
before unmapping, so the window is large enough to reproduce it reliably:

 diff --git a/mm/hugetlb.c b/mm/hugetlb.c
 index fda6b748e985..5601a9cf819b 100644
 --- a/mm/hugetlb.c
 +++ b/mm/hugetlb.c
 @@ -38,6 +38,7 @@
  #include
  #include
  #include
 +#include <linux/delay.h>

  #include
  #include
 @@ -6261,6 +6262,8 @@ static vm_fault_t hugetlb_wp(struct vm_fault *vmf)
  			hugetlb_vma_unlock_read(vma);
  			mutex_unlock(&hugetlb_fault_mutex_table[hash]);

 +			mdelay(8000);
 +
  			unmap_ref_private(mm, vma, old_folio, vmf->address);

  			mutex_lock(&hugetlb_fault_mutex_table[hash]);

Oscar Salvador (5):
  mm,hugetlb: change mechanism to detect a COW on private mapping
  mm,hugetlb: sort out folio locking in the faulting path
  mm,hugetlb: rename anon_rmap to new_anon_folio and make it boolean
  mm,hugetlb: drop obsolete comment about non-present pte and second
    faults
  mm,hugetlb: drop unlikelys from hugetlb_fault

 mm/hugetlb.c | 132 +++++++++++++++++++++++----------------------------
 1 file changed, 60 insertions(+), 72 deletions(-)

-- 
2.50.0