Date: Fri, 2 May 2025 11:42:20 +0200
Subject: Re: [PATCH 1/1] mm: Fix folio_pte_batch() overcount with zero PTEs
From: David Hildenbrand
To: Petr Vaněk
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Ryan Roberts, linux-mm@kvack.org, stable@vger.kernel.org

On 02.05.25 09:37, David Hildenbrand wrote:
> On 01.05.25 09:45, Petr Vaněk wrote:
>> On Wed, Apr 30, 2025 at 11:25:56PM +0200, David Hildenbrand wrote:
>>> On 30.04.25 18:00, Petr Vaněk wrote:
>>>> On Wed, Apr 30, 2025 at 04:37:21PM +0200, David Hildenbrand wrote:
>>>>> On 30.04.25 13:52, Petr Vaněk wrote:
>>>>>> On Tue, Apr 29, 2025 at 08:56:03PM +0200, David Hildenbrand wrote:
>>>>>>> On 29.04.25 20:33, Petr Vaněk wrote:
>>>>>>>> On Tue, Apr 29, 2025 at 05:45:53PM +0200, David Hildenbrand wrote:
>>>>>>>>> On 29.04.25 16:52, David Hildenbrand wrote:
>>>>>>>>>> On 29.04.25 16:45, Petr Vaněk wrote:
>>>>>>>>>>> On Tue, Apr 29, 2025 at 04:29:30PM +0200, David Hildenbrand wrote:
>>>>>>>>>>>> On 29.04.25 16:22, Petr Vaněk wrote:
>>>>>>>>>>>>> folio_pte_batch() could overcount the number of contiguous PTEs when
>>>>>>>>>>>>> pte_advance_pfn() returns a zero-valued PTE and the following PTE in
>>>>>>>>>>>>> memory also happens to be zero. The loop doesn't break in such a case
>>>>>>>>>>>>> because pte_same() returns true, and the batch size is advanced by one
>>>>>>>>>>>>> more than it should be.
>>>>>>>>>>>>>
>>>>>>>>>>>>> To fix this, bail out early if a non-present PTE is encountered,
>>>>>>>>>>>>> preventing the invalid comparison.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This issue started to appear after commit 10ebac4f95e7 ("mm/memory:
>>>>>>>>>>>>> optimize unmap/zap with PTE-mapped THP") and was discovered via git
>>>>>>>>>>>>> bisect.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Fixes: 10ebac4f95e7 ("mm/memory: optimize unmap/zap with PTE-mapped THP")
>>>>>>>>>>>>> Cc: stable@vger.kernel.org
>>>>>>>>>>>>> Signed-off-by: Petr Vaněk
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>  mm/internal.h | 2 ++
>>>>>>>>>>>>>  1 file changed, 2 insertions(+)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/mm/internal.h b/mm/internal.h
>>>>>>>>>>>>> index e9695baa5922..c181fe2bac9d 100644
>>>>>>>>>>>>> --- a/mm/internal.h
>>>>>>>>>>>>> +++ b/mm/internal.h
>>>>>>>>>>>>> @@ -279,6 +279,8 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr,
>>>>>>>>>>>>>  			dirty = !!pte_dirty(pte);
>>>>>>>>>>>>>  		pte = __pte_batch_clear_ignored(pte, flags);
>>>>>>>>>>>>>
>>>>>>>>>>>>> +		if (!pte_present(pte))
>>>>>>>>>>>>> +			break;
>>>>>>>>>>>>>  		if (!pte_same(pte, expected_pte))
>>>>>>>>>>>>>  			break;
>>>>>>>>>>>>
>>>>>>>>>>>> How could pte_same() suddenly match on a present and non-present PTE.
>>>>>>>>>>>
>>>>>>>>>>> In the problematic case pte.pte == 0 and expected_pte.pte == 0 as well.
>>>>>>>>>>> pte_same() returns a.pte == b.pte -> 0 == 0. Both are non-present PTEs.
>>>>>>>>>>
>>>>>>>>>> Observe that folio_pte_batch() was called *with a present pte*.
>>>>>>>>>>
>>>>>>>>>> do_zap_pte_range()
>>>>>>>>>>   if (pte_present(ptent))
>>>>>>>>>>     zap_present_ptes()
>>>>>>>>>>       folio_pte_batch()
>>>>>>>>>>
>>>>>>>>>> How can we end up with an expected_pte that is !present, if it is based
>>>>>>>>>> on the provided pte that *is present* and we only used pte_advance_pfn()
>>>>>>>>>> to advance the pfn?
>>>>>>>>>
>>>>>>>>> I've been staring at the code for too long and don't see the issue.
>>>>>>>>>
>>>>>>>>> We even have
>>>>>>>>>
>>>>>>>>> 	VM_WARN_ON_FOLIO(!pte_present(pte), folio);
>>>>>>>>>
>>>>>>>>> So the initial pteval we got is present.
>>>>>>>>>
>>>>>>>>> I don't see how
>>>>>>>>>
>>>>>>>>> 	nr = pte_batch_hint(start_ptep, pte);
>>>>>>>>> 	expected_pte = __pte_batch_clear_ignored(pte_advance_pfn(pte, nr), flags);
>>>>>>>>>
>>>>>>>>> would suddenly result in !pte_present(expected_pte).
>>>>>>>>
>>>>>>>> The issue is not happening in __pte_batch_clear_ignored but later in
>>>>>>>> the following line:
>>>>>>>>
>>>>>>>>     expected_pte = pte_advance_pfn(expected_pte, nr);
>>>>>>>>
>>>>>>>> The issue seems to be in the __pte function which converts the PTE value to
>>>>>>>> pte_t in pte_advance_pfn, because the warnings disappear when I change the
>>>>>>>> line to
>>>>>>>>
>>>>>>>>     expected_pte = (pte_t){ .pte = pte_val(expected_pte) + (nr << PFN_PTE_SHIFT) };
>>>>>>>>
>>>>>>>> The kernel probably uses the __pte function from
>>>>>>>> arch/x86/include/asm/paravirt.h because it is configured with
>>>>>>>> CONFIG_PARAVIRT=y:
>>>>>>>>
>>>>>>>>     static inline pte_t __pte(pteval_t val)
>>>>>>>>     {
>>>>>>>>         return (pte_t) { PVOP_ALT_CALLEE1(pteval_t, mmu.make_pte, val,
>>>>>>>>                                           "mov %%rdi, %%rax", ALT_NOT_XEN) };
>>>>>>>>     }
>>>>>>>>
>>>>>>>> I guess it might cause this weird magic, but I need more time to
>>>>>>>> understand what it does :)
>>>>>>
>>>>>> I understand it slightly more. __pte() uses xen_make_pte(), which calls
>>>>>> pte_pfn_to_mfn(), however, mfn for this pfn contains INVALID_P2M_ENTRY
>>>>>> value, therefore the pte_pfn_to_mfn() returns 0, see [1].
>>>>>>
>>>>>> I guess that the mfn was invalidated by xen-balloon driver?
>>>>>>
>>>>>> [1] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/xen/mmu_pv.c?h=v6.15-rc4#n408
>>>>>>
>>>>>>> What XEN does with basic primitives that convert between pteval and
>>>>>>> pte_t is beyond horrible.
>>>>>>>
>>>>>>> How come set_ptes() that uses pte_next_pfn()->pte_advance_pfn() does not
>>>>>>> run into this?
>>>>>>
>>>>>> I don't know, but I guess it is somehow related to pfn->mfn translation.
>>>>>>
>>>>>>> Is it only a problem if we exceed a certain pfn?
>>>>>>
>>>>>> No, it is a problem if the mfn corresponding to the given pfn is invalid.
>>>>>>
>>>>>> I am not sure if my original patch is a good fix.
>>>>>
>>>>> No :)
>>>>>
>>>>>> Maybe it would be
>>>>>> better to have some sort of native_pte_advance_pfn() which will use
>>>>>> native_make_pte() rather than __pte(). Or do you think the issue is in
>>>>>> Xen part?
>>>>>
>>>>> I think what's happening is that -- under XEN only -- we might get garbage when
>>>>> calling pte_advance_pfn() and the next PFN would no longer fall into the folio. And
>>>>> the current code cannot deal with that XEN garbage.
>>>>>
>>>>> But still not 100% sure.
>>>>>
>>>>> The following is completely untested, could you give that a try?
>>>>
>>>> Yes, it solves the issue for me.
>>>
>>> Cool!
>>>
>>>>
>>>> However, maybe it would be better to solve it with the following patch.
>>>> The pte_pfn_to_mfn() actually returns the same value for non-present
>>>> PTEs. I suggest returning the original PTE if the mfn is INVALID_P2M_ENTRY,
>>>> rather than an empty non-present PTE, but with the _PAGE_PRESENT bit set
>>>> to zero. Thus, we will not lose information about the original pfn but it
>>>> will be clear that the page is not present.
>>>>
>>>> From e84781f9ec4fb7275d5e7629cf7e222466caf759 Mon Sep 17 00:00:00 2001
>>>> From: =?UTF-8?q?Petr=20Van=C4=9Bk?=
>>>> Date: Wed, 30 Apr 2025 17:08:41 +0200
>>>> Subject: [PATCH] x86/mm: Reset pte _PAGE_PRESENT bit for invalid mft
>>>> MIME-Version: 1.0
>>>> Content-Type: text/plain; charset=UTF-8
>>>> Content-Transfer-Encoding: 8bit
>>>>
>>>> Signed-off-by: Petr Vaněk
>>>> ---
>>>>  arch/x86/xen/mmu_pv.c | 9 +++------
>>>>  1 file changed, 3 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
>>>> index 38971c6dcd4b..92a6a9af0c65 100644
>>>> --- a/arch/x86/xen/mmu_pv.c
>>>> +++ b/arch/x86/xen/mmu_pv.c
>>>> @@ -392,28 +392,25 @@ static pteval_t pte_mfn_to_pfn(pteval_t val)
>>>>  static pteval_t pte_pfn_to_mfn(pteval_t val)
>>>>  {
>>>>  	if (val & _PAGE_PRESENT) {
>>>>  		unsigned long pfn = (val & PTE_PFN_MASK) >> PAGE_SHIFT;
>>>>  		pteval_t flags = val & PTE_FLAGS_MASK;
>>>>  		unsigned long mfn;
>>>>
>>>>  		mfn = __pfn_to_mfn(pfn);
>>>>
>>>>  		/*
>>>> -		 * If there's no mfn for the pfn, then just create an
>>>> -		 * empty non-present pte. Unfortunately this loses
>>>> -		 * information about the original pfn, so
>>>> -		 * pte_mfn_to_pfn is asymmetric.
>>>> +		 * If there's no mfn for the pfn, then just reset present pte bit.
>>>>  		 */
>>>>  		if (unlikely(mfn == INVALID_P2M_ENTRY)) {
>>>> -			mfn = 0;
>>>> -			flags = 0;
>>>> +			mfn = pfn;
>>>> +			flags &= ~_PAGE_PRESENT;
>>>>  		} else
>>>>  			mfn &= ~(FOREIGN_FRAME_BIT | IDENTITY_FRAME_BIT);
>>>>  		val = ((pteval_t)mfn << PAGE_SHIFT) | flags;
>>>>  	}
>>>>
>>>>  	return val;
>>>>  }
>>>>
>>>>  __visible pteval_t xen_pte_val(pte_t pte)
>>>>  {
>>>
>>> That might do as well.
>>>
>>>
>>> I assume the following would also work? (removing the early ==1 check)
>>
>> Yes, it also works in my case and the removal makes sense to me.
>>
>>> It has the general benefit of removing the pte_pfn() call from the
>>> loop body, which is why I like that fix. (almost looks like a cleanup)
>>
>> Indeed, it looks like a cleanup to me as well :)
>
> Okay, let me polish the patch up and send it out later.

Actually, can you polish it up and send it out?

I think we need a better description of how exactly that problem happens,
in particular involving pte_none().

So stuff from the cover letter should probably be living in here.

Feel free to add my

Co-developed-by: David Hildenbrand
Signed-off-by: David Hildenbrand

and stay first author.

Thanks!

-- 
Cheers,

David / dhildenb
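
For the patch description, the failure mode discussed in this thread can be sketched in plain user-space C. This is only a rough model under stated assumptions, not kernel code: every model_* function and MODEL_* constant is a hypothetical stand-in, the p2m table is invented (identity, with pfn 4 lacking an mfn), and the loop keeps only the pte_same() comparison plus the optional pte_present() check from the originally posted patch. It leaves out the flag handling, the batch hint, and everything else folio_pte_batch() does.

```c
/*
 * User-space model only; nothing here is kernel code. All model_* names
 * and MODEL_* constants are hypothetical stand-ins for illustration.
 */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pteval_t;
typedef struct { pteval_t pte; } pte_t;

#define MODEL_PAGE_SHIFT   12
#define MODEL_PRESENT      ((pteval_t)1)   /* stand-in for _PAGE_PRESENT */
#define MODEL_INVALID_MFN  UINT64_MAX      /* stand-in for INVALID_P2M_ENTRY */
#define MODEL_NR_PFNS      8

/* Invented p2m table: identity, except that pfn 4 has no mfn. */
static const uint64_t model_p2m[MODEL_NR_PFNS] = {
	0, 1, 2, 3, MODEL_INVALID_MFN, 5, 6, 7,
};

/* Mirrors the quoted "mfn = 0; flags = 0;" path: no mfn -> all-zero PTE. */
static pte_t model_make_pte(pteval_t val)
{
	if (val & MODEL_PRESENT) {
		uint64_t pfn = val >> MODEL_PAGE_SHIFT;

		if (pfn >= MODEL_NR_PFNS || model_p2m[pfn] == MODEL_INVALID_MFN)
			val = 0;
	}
	return (pte_t){ .pte = val };
}

/* Advancing the pfn goes through the conversion hook, as with __pte(). */
static pte_t model_pte_advance_pfn(pte_t pte, unsigned long nr)
{
	return model_make_pte(pte.pte + ((pteval_t)nr << MODEL_PAGE_SHIFT));
}

static bool model_pte_same(pte_t a, pte_t b)
{
	return a.pte == b.pte;
}

static bool model_pte_present(pte_t pte)
{
	return (pte.pte & MODEL_PRESENT) != 0;
}

/* Count consecutive entries that match the expected, pfn-advanced PTE. */
static int model_pte_batch(const pte_t *start, int max_nr, bool check_present)
{
	pte_t expected = model_pte_advance_pfn(start[0], 1);
	int nr = 1;

	while (nr < max_nr) {
		pte_t pte = start[nr];

		/* The check added by the originally posted patch. */
		if (check_present && !model_pte_present(pte))
			break;
		if (!model_pte_same(pte, expected))
			break;
		expected = model_pte_advance_pfn(expected, 1);
		nr++;
	}
	return nr;
}

/* Two mapped pages (pfn 2 and 3) followed by two empty (zero) entries. */
int model_demo(bool check_present)
{
	const pte_t table[4] = {
		{ ((pteval_t)2 << MODEL_PAGE_SHIFT) | MODEL_PRESENT },
		{ ((pteval_t)3 << MODEL_PAGE_SHIFT) | MODEL_PRESENT },
		{ 0 },
		{ 0 },
	};

	return model_pte_batch(table, 4, check_present);
}
```

In this model, model_demo(false) returns 3 although only two entries are mapped: advancing past pfn 3 yields an all-zero expected PTE, which then compares equal to the first empty entry. model_demo(true) returns 2.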