Date: Wed, 12 Feb 2025 13:44:58 +0100
From: Alexander Gordeev
To: Matthew Wilcox
Cc: Gerald Schaefer, Heiko Carstens, Vasily Gorbik, Claudio Imbrenda,
    Christian Borntraeger, linux-mm@kvack.org
Subject: Re: [PATCH] s390: Remove PageDirty check inside mk_pte()
References: <20250116212338.653160-1-willy@infradead.org>

On Mon, Feb 03, 2025 at 02:03:05PM +0100, Alexander Gordeev wrote:
> > So the question becomes whether to:
> >
> > (a) Get rid of the conditional pte_mkdirty() entirely (this trial
> >     balloon)
> > (b) Put it in folio_mk_pte() for everybody, not just s390
> > (c) Put it in set_pte_range() as David suggested.
> >
> > It's feeling like (c) is the best idea.
>
> I will check option (c)

These call stacks end up in set_pte_range():

	mk_pte()
	set_pte_range()
	finish_fault()
	do_read_fault()
	do_pte_missing()
	__handle_mm_fault()
	handle_mm_fault()

	mk_pte()
	set_pte_range()
	finish_fault()
	do_shared_fault()
	do_pte_missing()
	__handle_mm_fault()
	handle_mm_fault()

	mk_pte()
	set_pte_range()
	filemap_map_pages()
	do_read_fault()
	do_pte_missing()
	__handle_mm_fault()
	handle_mm_fault()

Moving the PageDirty() check to generic code works (in the sense that
page fault volume does not notably increase):

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 3ca5af4cfe43..b5d88f2b5214 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1454,8 +1454,6 @@ static inline pte_t mk_pte(struct page *page, pgprot_t pgprot)
 	unsigned long physpage = page_to_phys(page);
 	pte_t __pte = mk_pte_phys(physpage, pgprot);
 
-	if (pte_write(__pte) && PageDirty(page))
-		__pte = pte_mkdirty(__pte);
 	return __pte;
 }
 
diff --git a/mm/memory.c b/mm/memory.c
index 539c0f7c6d54..4b04325db2ee 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5116,6 +5116,8 @@ void set_pte_range(struct vm_fault *vmf, struct folio *folio,
 
 	flush_icache_pages(vma, page, nr);
 	entry = mk_pte(page, vma->vm_page_prot);
+	if (pte_write(entry) && PageDirty(page))
+		entry = pte_mkdirty(entry);
 
 	if (prefault && arch_wants_old_prefaulted_pte())
 		entry = pte_mkold(entry);

The above is, however, not exactly the same, since set_pte_range() ->
set_ptes() dirties all PTEs in a folio, unlike the current s390
implementation, which dirties a single PTE based on its struct page
flag.

remove_migration_ptes() probably needs to be updated as well, unless we
are fine with claiming that a PTE need not always be pre-dirtied.
That could also be true for mk_pte() call paths that I did not manage to
find or that will get added in the future.

	mk_pte()
	remove_migration_ptes()
	migrate_pages_batch()
	migrate_pages_sync()
	migrate_pages()
	compact_zone()
	compact_node()
	kcompactd()
	kthread()
	__ret_from_fork()
	ret_from_fork()

Also, with the above change to set_pte_range(), hugetlb PTEs are
affected:

	mk_pte()
	mk_huge_pte()
	make_huge_pte()
	hugetlb_no_page()
	hugetlb_fault()
	handle_mm_fault()

Thus, we need to consider pre-dirtying of hugetlb PTEs as well, which I
think would look like this:

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 65068671e460..1c890b3c9453 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5168,7 +5168,7 @@ static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page,
 	pte_t entry;
 	unsigned int shift = huge_page_shift(hstate_vma(vma));
 
-	if (try_mkwrite && (vma->vm_flags & VM_WRITE)) {
+	if ((vma->vm_flags & VM_WRITE) && (try_mkwrite || PageDirty(page))) {
 		entry = huge_pte_mkwrite(huge_pte_mkdirty(mk_huge_pte(page,
 					 vma->vm_page_prot)));
 	} else {

Thanks!