From: Lance Yang
To: akpm@linux-foundation.org, david@redhat.com, 21cnbao@gmail.com
Cc: baolin.wang@linux.alibaba.com, chrisl@kernel.org, ioworker0@gmail.com,
	kasong@tencent.com, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-riscv@lists.infradead.org, lorenzo.stoakes@oracle.com,
	ryan.roberts@arm.com, v-songbaohua@oppo.com, x86@kernel.org,
	huang.ying.caritas@gmail.com, zhengtangquan@oppo.com, riel@surriel.com,
	Liam.Howlett@oracle.com, vbabka@suse.cz, harry.yoo@oracle.com,
	mingzhe.yang@ly.com, stable@vger.kernel.org, Barry Song, Lance Yang
Subject: [PATCH v4 1/1] mm/rmap: fix potential out-of-bounds page table access during batched unmap
Date: Tue, 1 Jul 2025 22:31:00 +0800
Message-ID: <20250701143100.6970-1-lance.yang@linux.dev>
X-Mailer: git-send-email 2.49.0
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
From: Lance Yang

As pointed out by David[1], the batched unmap logic in try_to_unmap_one()
may read past the end of a PTE table when a large folio's PTE mappings are
not fully contained within a single page table. While this scenario might
be rare, an issue triggerable from userspace must be fixed regardless of
its likelihood.

This patch fixes the out-of-bounds access by refactoring the logic into a
new helper, folio_unmap_pte_batch().

The new helper correctly calculates the safe batch size by capping the
scan at both the VMA and PMD boundaries. To simplify the code, it also
supports partial batching (i.e., any number of pages from 1 up to the
calculated safe maximum), as there is no strong reason to special-case
fully mapped folios.

[1] https://lore.kernel.org/linux-mm/a694398c-9f03-4737-81b9-7e49c857fcbe@redhat.com

Cc: <stable@vger.kernel.org>
Reported-by: David Hildenbrand
Closes: https://lore.kernel.org/linux-mm/a694398c-9f03-4737-81b9-7e49c857fcbe@redhat.com
Fixes: 354dffd29575 ("mm: support batched unmap for lazyfree large folios during reclamation")
Suggested-by: Barry Song
Acked-by: Barry Song
Reviewed-by: Lorenzo Stoakes
Acked-by: David Hildenbrand
Signed-off-by: Lance Yang
---
v3 -> v4:
- Add Reported-by + Closes tags (per David)
- Pick RB from Lorenzo - thanks!
- Pick AB from David - thanks!
- https://lore.kernel.org/linux-mm/20250630011305.23754-1-lance.yang@linux.dev

v2 -> v3:
- Tweak changelog (per Barry and David)
- Pick AB from Barry - thanks!
- https://lore.kernel.org/linux-mm/20250627062319.84936-1-lance.yang@linux.dev

v1 -> v2:
- Update subject and changelog (per Barry)
- https://lore.kernel.org/linux-mm/20250627025214.30887-1-lance.yang@linux.dev

 mm/rmap.c | 46 ++++++++++++++++++++++++++++------------------
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index fb63d9256f09..1320b88fab74 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1845,23 +1845,32 @@ void folio_remove_rmap_pud(struct folio *folio, struct page *page,
 #endif
 }
 
-/* We support batch unmapping of PTEs for lazyfree large folios */
-static inline bool can_batch_unmap_folio_ptes(unsigned long addr,
-			struct folio *folio, pte_t *ptep)
+static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
+			struct page_vma_mapped_walk *pvmw,
+			enum ttu_flags flags, pte_t pte)
 {
 	const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
-	int max_nr = folio_nr_pages(folio);
-	pte_t pte = ptep_get(ptep);
+	unsigned long end_addr, addr = pvmw->address;
+	struct vm_area_struct *vma = pvmw->vma;
+	unsigned int max_nr;
+
+	if (flags & TTU_HWPOISON)
+		return 1;
+	if (!folio_test_large(folio))
+		return 1;
+
+	/* We may only batch within a single VMA and a single page table. */
+	end_addr = pmd_addr_end(addr, vma->vm_end);
+	max_nr = (end_addr - addr) >> PAGE_SHIFT;
 
+	/* We only support lazyfree batching for now ... */
 	if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
-		return false;
+		return 1;
 	if (pte_unused(pte))
-		return false;
-	if (pte_pfn(pte) != folio_pfn(folio))
-		return false;
+		return 1;
 
-	return folio_pte_batch(folio, addr, ptep, pte, max_nr, fpb_flags, NULL,
-			       NULL, NULL) == max_nr;
+	return folio_pte_batch(folio, addr, pvmw->pte, pte, max_nr, fpb_flags,
+			       NULL, NULL, NULL);
 }
 
 /*
@@ -2024,9 +2033,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			if (pte_dirty(pteval))
 				folio_mark_dirty(folio);
 		} else if (likely(pte_present(pteval))) {
-			if (folio_test_large(folio) && !(flags & TTU_HWPOISON) &&
-			    can_batch_unmap_folio_ptes(address, folio, pvmw.pte))
-				nr_pages = folio_nr_pages(folio);
+			nr_pages = folio_unmap_pte_batch(folio, &pvmw, flags, pteval);
 
 			end_addr = address + nr_pages * PAGE_SIZE;
 			flush_cache_range(vma, address, end_addr);
@@ -2206,13 +2213,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			hugetlb_remove_rmap(folio);
 		} else {
 			folio_remove_rmap_ptes(folio, subpage, nr_pages, vma);
-			folio_ref_sub(folio, nr_pages - 1);
 		}
 		if (vma->vm_flags & VM_LOCKED)
 			mlock_drain_local();
-		folio_put(folio);
-		/* We have already batched the entire folio */
-		if (nr_pages > 1)
+		folio_put_refs(folio, nr_pages);
+
+		/*
+		 * If we are sure that we batched the entire folio and cleared
+		 * all PTEs, we can just optimize and stop right here.
+		 */
+		if (nr_pages == folio_nr_pages(folio))
 			goto walk_done;
 		continue;
 walk_abort:
-- 
2.49.0