From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <35236bbf-3d9a-40e9-84b5-e10e10295c0c@redhat.com>
Date: Wed, 27 Mar 2024 10:34:30 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [RFC PATCH v1 0/4] Reduce cost of ptep_get_lockless on arm64
To: Ryan Roberts, Mark Rutland, Catalin Marinas, Will Deacon,
 Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
 Andrew Morton, Muchun Song
Cc: linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
References: <20240215121756.2734131-1-ryan.roberts@arm.com>
 <0ae22147-e1a1-4bcb-8a4c-f900f3f8c39e@redhat.com>
 <374d8500-4625-4bff-a934-77b5f34cf2ec@arm.com>
 <8bd9e136-8575-4c40-bae2-9b015d823916@redhat.com>
 <86680856-2532-495b-951a-ea7b2b93872f@arm.com>
From: David Hildenbrand
Organization: Red Hat
In-Reply-To: <86680856-2532-495b-951a-ea7b2b93872f@arm.com>
Content-Language: en-US
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On 26.03.24 18:51, Ryan Roberts wrote:
> On 26/03/2024 17:39, David Hildenbrand wrote:
>> On 26.03.24 18:32, Ryan Roberts wrote:
>>> On 26/03/2024 17:04, David Hildenbrand wrote:
>>>>>>>>
>>>>>>>> Likely, we just want to read "the real deal" on both sides of the
>>>>>>>> pte_same() handling.
>>>>>>>
>>>>>>> Sorry I'm not sure I understand? You mean read the full pte including
>>>>>>> access/dirty? That's the same as dropping the patch, right? Of course
>>>>>>> if we do that, we still have to keep pte_get_lockless() around for
>>>>>>> this case. In an ideal world we would convert everything over to
>>>>>>> ptep_get_lockless_norecency() and delete ptep_get_lockless() to
>>>>>>> remove the ugliness from arm64.
>>>>>>
>>>>>> Yes, agreed. Patch #3 does not look too crazy and it wouldn't really
>>>>>> affect any architecture.
>>>>>>
>>>>>> I do wonder if pte_same_norecency() should be defined per architecture
>>>>>> and the default would be pte_same(). So we could avoid the mkold etc
>>>>>> on all other architectures.
>>>>>
>>>>> Wouldn't that break it's semantics? The "norecency" of
>>>>> ptep_get_lockless_norecency() means "recency information in the returned
>>>>> pte may be incorrect".
>>>>> But the "norecency" of pte_same_norecency() means "ignore the
>>>>> access and dirty bits when you do the comparison".
>>>>
>>>> My idea was that ptep_get_lockless_norecency() would return the actual
>>>> result on these architectures. So e.g., on x86, there would be no actual
>>>> change in generated code.
>>>
>>> I think this is a bad plan... You'll end up with subtle differences between
>>> architectures.
>>>
>>>>
>>>> But yes, the documentation of these functions would have to be improved.
>>>>
>>>> Now I wonder if ptep_get_lockless_norecency() should actively clear
>>>> dirty/accessed bits to more easily find any actual issues where the bits
>>>> still matter ...
>>>
>>> I did a version that took that approach. Decided it was not as good as this
>>> way though. Now for the life of me, I can't remember my reasoning.
>>
>> Maybe because there are some code paths that check accessed/dirty without
>> "correctness" implications? For example, if the PTE is already dirty, no
>> need to set it dirty etc?
>
> I think I decided I was penalizing the architectures that don't care because
> all their ptep_get_norecency() and ptep_get_lockless_norecency() need to
> explicitly clear access/dirty. And I would have needed ptep_get_norecency()
> from day 1 so that I could feed its result into pte_same().

True. With ptep_get_norecency() you're also penalizing other architectures.
Hence my original thought about making the behavior arch-specific; the arch
then has to make sure that the combination of
ptep_get_lockless_norecency() + pte_same_norecency() is right.

So if an arch decides to ignore bits in ptep_get_lockless_norecency(), it must
make sure to also ignore them in pte_same_norecency(), and must be able to
handle access/dirty bit changes differently.

Maybe one could have one variant for "hw-managed access/dirty" vs. "sw-managed
accessed or dirty". Only the former would end up ignoring stuff here, the
latter would not.
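
A rough user-space sketch of the pairing rule described above; it is not kernel code and not taken from the series. The toy pte_t, the bit positions, and the exact_*/relaxed_* names are all made up for illustration, and "recency may be incorrect" is modelled here by simply clearing the bits. It only shows that the getter and the comparison have to be chosen as a matching pair:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: real pte_t layouts and bit positions are arch-specific. */
typedef struct { uint64_t val; } pte_t;

#define PTE_ACCESSED	(UINT64_C(1) << 5)	/* hypothetical bit position */
#define PTE_DIRTY	(UINT64_C(1) << 6)	/* hypothetical bit position */
#define PTE_RECENCY	(PTE_ACCESSED | PTE_DIRTY)

static_assert(PTE_ACCESSED != PTE_DIRTY, "recency bits must be distinct");

/* Exact comparison, standing in for the generic pte_same(). */
static inline bool pte_same(pte_t a, pte_t b)
{
	return a.val == b.val;
}

/*
 * Pair A (hypothetical names): the arch returns the real PTE, so the
 * "norecency" comparison can simply be pte_same().
 */
static inline pte_t exact_get_norecency(const pte_t *ptep)
{
	return *ptep;
}

static inline bool exact_same_norecency(pte_t a, pte_t b)
{
	return pte_same(a, b);
}

/*
 * Pair B (hypothetical names): the arch may hand back wrong access/dirty
 * bits (modelled by clearing them), so the comparison must ignore them too.
 */
static inline pte_t relaxed_get_norecency(const pte_t *ptep)
{
	pte_t pte = *ptep;

	pte.val &= ~PTE_RECENCY;
	return pte;
}

static inline bool relaxed_same_norecency(pte_t a, pte_t b)
{
	return ((a.val ^ b.val) & ~PTE_RECENCY) == 0;
}
```

Mixing the pairs is what goes wrong in this model: a value read with relaxed_get_norecency() and compared with exact_same_norecency() will mismatch whenever the stored PTE has access or dirty set.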

But again, these are just some random thoughts on how this affects other
architectures and how we could avoid it. The issue I describe in patch #3
would be gone if pte_same_norecency() just did a pte_same() check on other
architectures -- and that would make it easier to sell :)

-- 
Cheers,

David / dhildenb