From: David Hildenbrand
Date: Wed, 30 Apr 2025 23:25:56 +0200
Subject: Re: [PATCH 1/1] mm: Fix folio_pte_batch() overcount with zero PTEs
To: Petr Vaněk
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Ryan Roberts,
 linux-mm@kvack.org, stable@vger.kernel.org
Message-ID: <91c4e7e6-e4c2-4ff0-8b13-7b3ff138e98e@redhat.com>
In-Reply-To: <202543016037-aBJJJdupFVd_6FTX-arkamar@atlas.cz>
References: <20250429142237.22138-1-arkamar@atlas.cz>
 <20250429142237.22138-2-arkamar@atlas.cz>
 <2025429144547-aBDmGzJBQc9RMBj--arkamar@atlas.cz>
 <2025429183321-aBEbcQQY3WX6dsNI-arkamar@atlas.cz>
 <1df577bb-eaba-4e34-9050-309ee1c7dc57@redhat.com>
 <202543011526-aBIO5nq6Olsmq2E--arkamar@atlas.cz>
 <9c412f4f-3bdf-43c0-a3cd-7ce52233f4e5@redhat.com>
 <202543016037-aBJJJdupFVd_6FTX-arkamar@atlas.cz>
Organization: Red Hat

On 30.04.25 18:00, Petr Vaněk wrote:
> On Wed, Apr 30, 2025 at 04:37:21PM +0200, David Hildenbrand wrote:
>> On 30.04.25 13:52, Petr Vaněk wrote:
>>> On Tue, Apr 29, 2025 at 08:56:03PM +0200, David Hildenbrand wrote:
>>>> On 29.04.25 20:33, Petr Vaněk wrote:
>>>>> On Tue, Apr 29, 2025 at 05:45:53PM +0200, David Hildenbrand wrote:
>>>>>> On 29.04.25 16:52, David Hildenbrand wrote:
>>>>>>> On 29.04.25 16:45, Petr Vaněk wrote:
>>>>>>>> On Tue, Apr 29, 2025 at 04:29:30PM +0200, David Hildenbrand wrote:
>>>>>>>>> On 29.04.25 16:22, Petr Vaněk wrote:
>>>>>>>>>> folio_pte_batch() could overcount the number of contiguous PTEs when
>>>>>>>>>> pte_advance_pfn() returns a zero-valued PTE and the following PTE in
>>>>>>>>>> memory also happens to be zero. The loop doesn't break in such a case
>>>>>>>>>> because pte_same() returns true, and the batch size is advanced by one
>>>>>>>>>> more than it should be.
>>>>>>>>>>
>>>>>>>>>> To fix this, bail out early if a non-present PTE is encountered,
>>>>>>>>>> preventing the invalid comparison.
>>>>>>>>>>
>>>>>>>>>> This issue started to appear after commit 10ebac4f95e7 ("mm/memory:
>>>>>>>>>> optimize unmap/zap with PTE-mapped THP") and was discovered via git
>>>>>>>>>> bisect.
>>>>>>>>>>
>>>>>>>>>> Fixes: 10ebac4f95e7 ("mm/memory: optimize unmap/zap with PTE-mapped THP")
>>>>>>>>>> Cc: stable@vger.kernel.org
>>>>>>>>>> Signed-off-by: Petr Vaněk
>>>>>>>>>> ---
>>>>>>>>>>  mm/internal.h | 2 ++
>>>>>>>>>>  1 file changed, 2 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/mm/internal.h b/mm/internal.h
>>>>>>>>>> index e9695baa5922..c181fe2bac9d 100644
>>>>>>>>>> --- a/mm/internal.h
>>>>>>>>>> +++ b/mm/internal.h
>>>>>>>>>> @@ -279,6 +279,8 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr,
>>>>>>>>>>  		dirty = !!pte_dirty(pte);
>>>>>>>>>>  		pte = __pte_batch_clear_ignored(pte, flags);
>>>>>>>>>>  
>>>>>>>>>> +		if (!pte_present(pte))
>>>>>>>>>> +			break;
>>>>>>>>>>  		if (!pte_same(pte, expected_pte))
>>>>>>>>>>  			break;
>>>>>>>>>
>>>>>>>>> How could pte_same() suddenly match on a present and non-present PTE?
>>>>>>>>
>>>>>>>> In the problematic case pte.pte == 0 and expected_pte.pte == 0 as well.
>>>>>>>> pte_same() returns a.pte == b.pte -> 0 == 0. Both are non-present PTEs.
>>>>>>>
>>>>>>> Observe that folio_pte_batch() was called *with a present pte*.
>>>>>>>
>>>>>>> do_zap_pte_range()
>>>>>>> 	if (pte_present(ptent))
>>>>>>> 		zap_present_ptes()
>>>>>>> 			folio_pte_batch()
>>>>>>>
>>>>>>> How can we end up with an expected_pte that is !present, if it is based
>>>>>>> on the provided pte that *is present* and we only used pte_advance_pfn()
>>>>>>> to advance the pfn?
>>>>>>
>>>>>> I've been staring at the code for too long and don't see the issue.
>>>>>>
>>>>>> We even have
>>>>>>
>>>>>> 	VM_WARN_ON_FOLIO(!pte_present(pte), folio);
>>>>>>
>>>>>> So the initial pteval we got is present.
>>>>>>
>>>>>> I don't see how
>>>>>>
>>>>>> 	nr = pte_batch_hint(start_ptep, pte);
>>>>>> 	expected_pte = __pte_batch_clear_ignored(pte_advance_pfn(pte, nr), flags);
>>>>>>
>>>>>> would suddenly result in !pte_present(expected_pte).
>>>>>
>>>>> The issue is not happening in __pte_batch_clear_ignored but later in
>>>>> the following line:
>>>>>
>>>>> 	expected_pte = pte_advance_pfn(expected_pte, nr);
>>>>>
>>>>> The issue seems to be in the __pte function, which converts a PTE value
>>>>> to pte_t in pte_advance_pfn, because the warnings disappear when I
>>>>> change the line to
>>>>>
>>>>> 	expected_pte = (pte_t){ .pte = pte_val(expected_pte) + (nr << PFN_PTE_SHIFT) };
>>>>>
>>>>> The kernel probably uses the __pte function from
>>>>> arch/x86/include/asm/paravirt.h because it is configured with
>>>>> CONFIG_PARAVIRT=y:
>>>>>
>>>>> static inline pte_t __pte(pteval_t val)
>>>>> {
>>>>> 	return (pte_t) { PVOP_ALT_CALLEE1(pteval_t, mmu.make_pte, val,
>>>>> 					  "mov %%rdi, %%rax", ALT_NOT_XEN) };
>>>>> }
>>>>>
>>>>> I guess it might cause this weird magic, but I need more time to
>>>>> understand what it does :)
>>>
>>> I understand it slightly more. __pte() uses xen_make_pte(), which calls
>>> pte_pfn_to_mfn(). However, the mfn for this pfn contains the
>>> INVALID_P2M_ENTRY value, therefore pte_pfn_to_mfn() returns 0, see [1].
>>>
>>> I guess that the mfn was invalidated by the xen-balloon driver?
>>>
>>> [1] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/xen/mmu_pv.c?h=v6.15-rc4#n408
>>>
>>>> What XEN does with basic primitives that convert between pteval and
>>>> pte_t is beyond horrible.
>>>>
>>>> How come set_ptes() that uses pte_next_pfn()->pte_advance_pfn() does not
>>>> run into this?
>>>
>>> I don't know, but I guess it is somehow related to pfn->mfn translation.
>>>
>>>> Is it only a problem if we exceed a certain pfn?
>>>
>>> No, it is a problem if the mfn corresponding to the given pfn is invalid.
>>>
>>> I am not sure if my original patch is a good fix.
>>
>> No :)
>>
>> Maybe it would be
>>> better to have some sort of native_pte_advance_pfn() which will use
>>> native_make_pte() rather than __pte(). Or do you think the issue is in
>>> Xen part?
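
The mechanism described in the quoted text above can be sketched in a few
lines of standalone C. This is a toy model, not kernel code: every toy_*
name, the four-entry p2m table and the simplified 12-bit flag mask are
assumptions made for illustration. Only the "no mfn for the pfn, so
mfn = 0 and flags = 0" branch follows what the thread says
pte_pfn_to_mfn() does.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_pteval_t;

#define TOY_PAGE_SHIFT   12
#define TOY_PAGE_PRESENT ((toy_pteval_t)1)      /* stand-in for _PAGE_PRESENT */
#define TOY_FLAGS_MASK   ((toy_pteval_t)0xfff)  /* simplified: low 12 bits */
#define TOY_INVALID_P2M  (~(uint64_t)0)         /* stand-in for INVALID_P2M_ENTRY */

/* Hypothetical pfn -> mfn table; pfn 2 has no machine frame behind it. */
static const uint64_t toy_p2m[4] = { 100, 101, TOY_INVALID_P2M, 103 };

/* Present PTEs are translated; an invalid mfn collapses the PTE to zero. */
static toy_pteval_t toy_pfn_to_mfn(toy_pteval_t val)
{
	if (val & TOY_PAGE_PRESENT) {
		uint64_t pfn = val >> TOY_PAGE_SHIFT;   /* caller keeps pfn < 4 */
		toy_pteval_t flags = val & TOY_FLAGS_MASK;
		uint64_t mfn = toy_p2m[pfn];

		if (mfn == TOY_INVALID_P2M) {
			mfn = 0;
			flags = 0;
		}
		val = (mfn << TOY_PAGE_SHIFT) | flags;
	}
	return val;
}

/* Advance a pfn-space PTE value by nr pages, then convert it. */
static toy_pteval_t toy_advance_pfn(toy_pteval_t pfn_val, unsigned int nr)
{
	return toy_pfn_to_mfn(pfn_val + ((toy_pteval_t)nr << TOY_PAGE_SHIFT));
}

/* Returns 1 when the advanced PTE compares equal to an empty (zero) slot. */
static int toy_expected_matches_empty_slot(void)
{
	toy_pteval_t start = ((toy_pteval_t)1 << TOY_PAGE_SHIFT) | TOY_PAGE_PRESENT;
	toy_pteval_t expected = toy_advance_pfn(start, 1); /* pfn 2: invalid */
	toy_pteval_t empty_slot = 0;

	return expected == empty_slot;
}
```

In this model, a present PTE for pfn 1 advanced by one page becomes the
value 0, so a plain equality check against an empty page-table slot
succeeds even though neither side is a present PTE.
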
>>
>> I think what's happening is that -- under XEN only -- we might get garbage when
>> calling pte_advance_pfn() and the next PFN would no longer fall into the folio. And
>> the current code cannot deal with that XEN garbage.
>>
>> But still not 100% sure.
>>
>> The following is completely untested, could you give that a try?
>
> Yes, it solves the issue for me.

Cool!

>
> However, maybe it would be better to solve it with the following patch.
> The pte_pfn_to_mfn() actually returns the same value for non-present
> PTEs. I suggest returning the original PTE if the mfn is INVALID_P2M_ENTRY,
> rather than an empty non-present PTE, but with the _PAGE_PRESENT bit set
> to zero. Thus, we will not lose information about the original pfn but it
> will be clear that the page is not present.
>
> From e84781f9ec4fb7275d5e7629cf7e222466caf759 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Petr=20Van=C4=9Bk?=
> Date: Wed, 30 Apr 2025 17:08:41 +0200
> Subject: [PATCH] x86/mm: Reset pte _PAGE_PRESENT bit for invalid mft
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Signed-off-by: Petr Vaněk
> ---
>  arch/x86/xen/mmu_pv.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
> index 38971c6dcd4b..92a6a9af0c65 100644
> --- a/arch/x86/xen/mmu_pv.c
> +++ b/arch/x86/xen/mmu_pv.c
> @@ -392,28 +392,25 @@ static pteval_t pte_mfn_to_pfn(pteval_t val)
>  static pteval_t pte_pfn_to_mfn(pteval_t val)
>  {
>  	if (val & _PAGE_PRESENT) {
>  		unsigned long pfn = (val & PTE_PFN_MASK) >> PAGE_SHIFT;
>  		pteval_t flags = val & PTE_FLAGS_MASK;
>  		unsigned long mfn;
>  
>  		mfn = __pfn_to_mfn(pfn);
>  
>  		/*
> -		 * If there's no mfn for the pfn, then just create an
> -		 * empty non-present pte. Unfortunately this loses
> -		 * information about the original pfn, so
> -		 * pte_mfn_to_pfn is asymmetric.
> +		 * If there's no mfn for the pfn, then just reset present pte bit.
>  		 */
>  		if (unlikely(mfn == INVALID_P2M_ENTRY)) {
> -			mfn = 0;
> -			flags = 0;
> +			mfn = pfn;
> +			flags &= ~_PAGE_PRESENT;
>  		} else
>  			mfn &= ~(FOREIGN_FRAME_BIT | IDENTITY_FRAME_BIT);
>  		val = ((pteval_t)mfn << PAGE_SHIFT) | flags;
>  	}
>  
>  	return val;
>  }
>  
>  __visible pteval_t xen_pte_val(pte_t pte)
>  {

That might do as well.

I assume the following would also work? (removing the early ==1 check)

It has the general benefit of removing the pte_pfn() call from the
loop body, which is why I like that fix. (almost looks like a cleanup)

From 75948778b586d4759a480bf412fd4682067b12ea Mon Sep 17 00:00:00 2001
From: David Hildenbrand
Date: Wed, 30 Apr 2025 16:35:12 +0200
Subject: [PATCH] tmp

Signed-off-by: David Hildenbrand
---
 mm/internal.h | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index e9695baa59226..25a29872c634b 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -248,11 +248,9 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr,
 		pte_t *start_ptep, pte_t pte, int max_nr, fpb_t flags,
 		bool *any_writable, bool *any_young, bool *any_dirty)
 {
-	unsigned long folio_end_pfn = folio_pfn(folio) + folio_nr_pages(folio);
-	const pte_t *end_ptep = start_ptep + max_nr;
 	pte_t expected_pte, *ptep;
 	bool writable, young, dirty;
-	int nr;
+	int nr, cur_nr;
 
 	if (any_writable)
 		*any_writable = false;
@@ -265,11 +263,15 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr,
 	VM_WARN_ON_FOLIO(!folio_test_large(folio) || max_nr < 1, folio);
 	VM_WARN_ON_FOLIO(page_folio(pfn_to_page(pte_pfn(pte))) != folio, folio);
 
+	/* Limit max_nr to the actual remaining PFNs in the folio we could batch. */
+	max_nr = min_t(unsigned long, max_nr,
+		       folio_pfn(folio) + folio_nr_pages(folio) - pte_pfn(pte));
+
 	nr = pte_batch_hint(start_ptep, pte);
 	expected_pte = __pte_batch_clear_ignored(pte_advance_pfn(pte, nr), flags);
 	ptep = start_ptep + nr;
 
-	while (ptep < end_ptep) {
+	while (nr < max_nr) {
 		pte = ptep_get(ptep);
 		if (any_writable)
 			writable = !!pte_write(pte);
@@ -282,14 +284,6 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr,
 		if (!pte_same(pte, expected_pte))
 			break;
 
-		/*
-		 * Stop immediately once we reached the end of the folio. In
-		 * corner cases the next PFN might fall into a different
-		 * folio.
-		 */
-		if (pte_pfn(pte) >= folio_end_pfn)
-			break;
-
 		if (any_writable)
 			*any_writable |= writable;
 		if (any_young)
@@ -297,12 +291,13 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr,
 		if (any_dirty)
 			*any_dirty |= dirty;
 
-		nr = pte_batch_hint(ptep, pte);
-		expected_pte = pte_advance_pfn(expected_pte, nr);
-		ptep += nr;
+		cur_nr = pte_batch_hint(ptep, pte);
+		expected_pte = pte_advance_pfn(expected_pte, cur_nr);
+		ptep += cur_nr;
+		nr += cur_nr;
 	}
 
-	return min(ptep - start_ptep, max_nr);
+	return min(nr, max_nr);
 }
 
 /**
-- 
2.49.0

-- 
Cheers,

David / dhildenb
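
For readers who want to experiment with the idea outside the kernel, here
is a minimal standalone sketch of the "clamp max_nr up front" approach in
the patch above. It is a toy under stated assumptions, not the kernel
function: toy_pte_t, toy_mk_pte() and toy_pte_batch() are hypothetical
names, PTEs are plain integers, the batch hint is always 1, and the
writable/young/dirty tracking is left out.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_pte_t;

#define TOY_PAGE_SHIFT 12
#define TOY_PRESENT    ((toy_pte_t)1)

/* Build a present toy PTE for the given pfn. */
static toy_pte_t toy_mk_pte(uint64_t pfn)
{
	return ((toy_pte_t)pfn << TOY_PAGE_SHIFT) | TOY_PRESENT;
}

static uint64_t toy_pte_pfn(toy_pte_t pte)
{
	return pte >> TOY_PAGE_SHIFT;
}

/*
 * Count how many consecutive entries, starting at start_ptep[0], map
 * consecutive pfns of the folio [folio_pfn, folio_pfn + folio_nr_pages).
 * start_ptep[0] must be a present PTE inside that folio.
 */
static int toy_pte_batch(const toy_pte_t *start_ptep, int max_nr,
			 uint64_t folio_pfn, uint64_t folio_nr_pages)
{
	toy_pte_t pte = start_ptep[0];
	uint64_t remaining = folio_pfn + folio_nr_pages - toy_pte_pfn(pte);
	toy_pte_t expected;
	int nr = 1;

	/* Never look past the last page of the folio. */
	if ((uint64_t)max_nr > remaining)
		max_nr = (int)remaining;

	expected = pte + ((toy_pte_t)1 << TOY_PAGE_SHIFT);
	while (nr < max_nr) {
		if (start_ptep[nr] != expected)
			break;
		expected += (toy_pte_t)1 << TOY_PAGE_SHIFT;
		nr++;
	}
	return nr;
}
```

Because the loop bound already accounts for the end of the folio, the
expected value computed for the pfn just past the folio is never compared
against anything. In the toy, a batch that starts on the last page of a
folio returns 1 regardless of what the following slots contain.
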