From: David Hildenbrand <david@redhat.com>
Organization: Red Hat
Date: Mon, 26 May 2025 14:39:31 +0200
Subject: Re: [BUG]userfaultfd_move fails to move a folio when swap-in occurs concurrently with swap-out
To: Barry Song <21cnbao@gmail.com>, Peter Xu, Suren Baghdasaryan, Lokesh Gidra, Andrea Arcangeli
Cc: Andrew Morton, Linux-MM, Kairui Song, LKML
Message-ID: <5abe8b0c-2354-4107-9004-ccf86cf90d25@redhat.com>

On 23.05.25 01:23, Barry Song wrote:
> Hi All,

Hi!

>
> I'm encountering another bug that can be easily reproduced using the small
> program below[1], which performs swap-out and swap-in in parallel.
>
> The issue occurs when a folio is being swapped out while it is accessed
> concurrently. In this case, do_swap_page() handles the access. However,
> because the folio is under writeback, do_swap_page() completely removes
> its exclusive attribute.
>
> do_swap_page:
>         } else if (exclusive && folio_test_writeback(folio) &&
>                    data_race(si->flags & SWP_STABLE_WRITES)) {
>                 ...
>                 exclusive = false;
>
> As a result, userfaultfd_move() will return -EBUSY, even though the
> folio is not shared and is in fact exclusively owned.
>
>         folio = vm_normal_folio(src_vma, src_addr, orig_src_pte);
>         if (!folio || !PageAnonExclusive(&folio->page)) {
>                 spin_unlock(src_ptl);
> +               pr_err("%s %d folio:%lx exclusive:%d swapcache:%d\n",
> +                      __func__, __LINE__, folio, PageAnonExclusive(&folio->page),
> +                      folio_test_swapcache(folio));
>                 err = -EBUSY;
>                 goto out;
>         }
>
> I understand that shared folios should not be moved. However, in this
> case, the folio is not shared, yet its exclusive flag is not set.
>
> Therefore, I believe PageAnonExclusive is not a reliable indicator of
> whether a folio is truly exclusive to a process.

It is. The flag *not* being set is not a reliable indicator of whether
the folio is really shared. ;)

The reason we have this PAE workaround (dropping the flag) in place is
that the page must not be written to (SWP_STABLE_WRITES). CoW reuse is
not possible.

uffd moving that page -- and in that same process setting it writable,
see move_present_pte()->pte_mkwrite() -- would be very bad.

>
> The kernel log output is shown below:
> [   23.009516] move_pages_pte 1285 folio:fffffdffc01bba40 exclusive:0 swapcache:1
>
> I'm still struggling to find a real fix; it seems quite challenging.

PAE tells you that you can immediately write to that page without going
through CoW. However, here, CoW is required.

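To make the interaction concrete, here is a small userspace model of the two checks discussed above. It is only a sketch and not kernel code: `struct swapin_state`, `model_pae_after_swapin()` and `model_move_check()` are hypothetical names invented for illustration, and the real do_swap_page() and move_pages_pte() paths consider far more state than this.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the state do_swap_page() consults. */
struct swapin_state {
    bool exclusive;       /* swap-in determined the folio is exclusively owned */
    bool under_writeback; /* models folio_test_writeback(folio) */
    bool stable_writes;   /* models si->flags & SWP_STABLE_WRITES */
};

/*
 * Models whether PAE ends up set after swap-in. While the folio is under
 * writeback on a device that needs stable writes, the flag is dropped even
 * though nobody else maps the folio, because it must not be written to.
 */
bool model_pae_after_swapin(const struct swapin_state *s)
{
    if (s->exclusive && s->under_writeback && s->stable_writes)
        return false;
    return s->exclusive;
}

/*
 * Models the check quoted from move_pages_pte(): the move proceeds only
 * if a folio is present and PAE is set; otherwise the result is -EBUSY.
 */
int model_move_check(bool folio_present, bool pae)
{
    if (!folio_present || !pae)
        return -EBUSY;
    return 0;
}
```

In this model, an exclusively owned folio under writeback with stable writes has PAE cleared, so the move check reports -EBUSY; the same folio after writeback has completed would pass.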
> Please let me know if you have any ideas. In any case It seems
> userspace should fall back to userfaultfd_copy.

We could try detecting whether the page is now exclusive, to reset PAE.
That will only be possible after writeback has completed, so it adds
complexity without being able to move the page in all cases (during
writeback).

Letting userspace deal with that in these rare scenarios is
significantly easier.

-- 
Cheers,

David / dhildenb
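The userspace fallback mentioned above could be structured roughly as follows. This is a sketch under stated assumptions, not a definitive implementation: `struct uffd_ops`, `move_with_copy_fallback()` and the `stub_*` helpers are hypothetical names. In a real program the `move` and `copy` callbacks would wrap the UFFDIO_MOVE and UFFDIO_COPY ioctls and convert a failing ioctl into a negative errno value; they are kept abstract here so the retry logic can be read and exercised without a userfaultfd.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: returns 0 on success, negative errno on failure. */
typedef int (*uffd_op_fn)(void *ctx, unsigned long dst, unsigned long src,
                          unsigned long len);

/* Hypothetical bundle of the two operations. */
struct uffd_ops {
    uffd_op_fn move; /* assumed to wrap ioctl(uffd, UFFDIO_MOVE, ...) */
    uffd_op_fn copy; /* assumed to wrap ioctl(uffd, UFFDIO_COPY, ...) */
    void *ctx;       /* opaque state handed to both callbacks */
};

/*
 * Try to move the range; if the move reports -EBUSY (for example because
 * PAE was dropped during writeback), fall back to copying. Any other
 * result from the move is returned unchanged.
 */
int move_with_copy_fallback(const struct uffd_ops *ops, unsigned long dst,
                            unsigned long src, unsigned long len)
{
    assert(ops != NULL && ops->move != NULL && ops->copy != NULL);

    int ret = ops->move(ops->ctx, dst, src, len);
    if (ret != -EBUSY)
        return ret;
    return ops->copy(ops->ctx, dst, src, len);
}

/* Stand-in callbacks for illustration only; ctx points to a copy counter. */
int stub_move_busy(void *ctx, unsigned long dst, unsigned long src,
                   unsigned long len)
{
    (void)ctx; (void)dst; (void)src; (void)len;
    return -EBUSY;
}

int stub_move_ok(void *ctx, unsigned long dst, unsigned long src,
                 unsigned long len)
{
    (void)ctx; (void)dst; (void)src; (void)len;
    return 0;
}

int stub_copy_ok(void *ctx, unsigned long dst, unsigned long src,
                 unsigned long len)
{
    (void)dst; (void)src; (void)len;
    ++*(int *)ctx; /* record that the copy path was taken */
    return 0;
}
```

Note that a copy, unlike a move, leaves the source range populated, so a caller taking the fallback path would still have to release the source pages itself if it relied on the move to do that.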