Date: Thu, 29 Jun 2023 15:59:07 +0200
From: Alexander Gordeev
To: Gerald Schaefer
Cc: Hugh Dickins, Andrew Morton, Vasily Gorbik, Mike Kravetz,
	Mike Rapoport, "Kirill A. Shutemov", Matthew Wilcox,
	David Hildenbrand, Suren Baghdasaryan, Qi Zheng, Yang Shi,
	Mel Gorman, Peter Xu, Peter Zijlstra, Will Deacon, Yu Zhao,
	Alistair Popple, Ralph Campbell, Ira Weiny, Steven Price,
	SeongJae Park, Lorenzo Stoakes, Huang Ying, Naoya Horiguchi,
	Christophe Leroy, Zack Rusin, Jason Gunthorpe, Axel Rasmussen,
	Anshuman Khandual, Pasha Tatashin, Miaohe Lin, Minchan Kim,
	Christoph Hellwig, Song Liu, Thomas Hellstrom, Russell King,
	"David S. Miller", Michael Ellerman, "Aneesh Kumar K.V",
	Heiko Carstens, Christian Borntraeger, Claudio Imbrenda,
	Jann Horn, Vishal Moola, Vlastimil Babka,
	linux-arm-kernel@lists.infradead.org, sparclinux@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page
References: <54cb04f-3762-987f-8294-91dafd8ebfb0@google.com> <20230628211624.531cdc58@thinkpad-T15>
In-Reply-To: <20230628211624.531cdc58@thinkpad-T15>

On Wed, Jun 28, 2023 at 09:16:24PM +0200, Gerald Schaefer wrote:
> On Tue, 20 Jun 2023 00:51:19 -0700 (PDT)
> Hugh Dickins wrote:

Hi Gerald, Hugh!

...
> @@ -407,6 +445,88 @@ void __tlb_remove_table(void *_table)
>  	__free_page(page);
>  }
>
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +static void pte_free_now0(struct rcu_head *head);
> +static void pte_free_now1(struct rcu_head *head);

What about pte_free_lower() / pte_free_upper()?

...
> +void pte_free_defer(struct mm_struct *mm, pgtable_t pgtable)
> +{
> +	unsigned int bit, mask;
> +	struct page *page;
> +
> +	page = virt_to_page(pgtable);
> +	if (mm_alloc_pgste(mm)) {
> +		/*
> +		 * TODO: Do we need gmap_unlink(mm, pgtable, addr), like in
> +		 * page_table_free_rcu()?
> +		 * If yes -> need addr parameter here, like in pte_free_tlb().
> +		 */
> +		call_rcu(&page->rcu_head, pte_free_pgste);
> +		return;
> +	}
> +	bit = ((unsigned long)pgtable & ~PAGE_MASK) / (PTRS_PER_PTE * sizeof(pte_t));
> +
> +	spin_lock_bh(&mm->context.lock);
> +	mask = atomic_xor_bits(&page->_refcount, 0x15U << (bit + 24));

This makes the bit logic increasingly complicated to me.

What if instead we set the rule "one bit at a time only"?
That means an atomic group bit flip is only allowed between
pairs of bits, namely:

	bit flip	initiated from
	-----------	----------------------------------------
	P <- A		page_table_free(), page_table_free_rcu()
	H <- A		pte_free_defer()
	P <- H		pte_free_half()

In the current model the P bit can be set together with the H bit.
That actually adds nothing to the equation.

Besides, this check in page_table_alloc() (while still correct)
makes one (well, me) wonder "what about HH bits?":

	mask = (mask | (mask >> 4)) & 0x03U;
	if (mask != 0x03U) {
		...
	}

By contrast, with the "one bit at a time only" policy each of the
three bits effectively indicates which state a page half is
currently in.

Thanks!
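
To illustrate the "one bit at a time only" rule, here is a minimal
userspace sketch. It is not the proposed kernel change. The names
HALF_A, HALF_H, HALF_P, xor_bits(), half_bits() and the half_*_to_*()
helpers are hypothetical. The bit layout (A at bit 0, H at bit 2, P at
bit 4, shifted by one for the upper half) is only inferred from the
0x15U constant in the quoted patch, and the "+ 24" offset into
_refcount is left out for brevity.

```c
#include <stdatomic.h>

/* Assumed per-half state bits for the lower half (upper half: << 1). */
#define HALF_A	0x01U	/* allocated */
#define HALF_H	0x04U	/* handed to pte_free_defer(), free still pending */
#define HALF_P	0x10U	/* pending removal */

/* Userspace stand-in for atomic_xor_bits(): flip bits, return new value. */
static unsigned int xor_bits(atomic_uint *v, unsigned int bits)
{
	return atomic_fetch_xor(v, bits) ^ bits;
}

/* Extract the three state bits of one half (0 = lower, 1 = upper). */
static unsigned int half_bits(unsigned int mask, unsigned int half)
{
	return (mask >> half) & (HALF_A | HALF_H | HALF_P);
}

/* With the proposed rule a half never has more than one state bit set. */
static int half_state_valid(unsigned int mask, unsigned int half)
{
	unsigned int b = half_bits(mask, half);

	return (b & (b - 1)) == 0;
}

/* P <- A: what page_table_free() / page_table_free_rcu() would do. */
static unsigned int half_a_to_p(atomic_uint *v, unsigned int half)
{
	return xor_bits(v, (HALF_A | HALF_P) << half);
}

/* H <- A: what pte_free_defer() would do. */
static unsigned int half_a_to_h(atomic_uint *v, unsigned int half)
{
	return xor_bits(v, (HALF_A | HALF_H) << half);
}

/* P <- H: what pte_free_half() would do. */
static unsigned int half_h_to_p(atomic_uint *v, unsigned int half)
{
	return xor_bits(v, (HALF_H | HALF_P) << half);
}
```

Every helper flips exactly one pair of bits, so half_bits() always
returns a single state per half, and a check such as the one in
page_table_alloc() could test the A bits alone.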