Date: Tue, 16 May 2023 17:48:18 +0800
From: Baoquan He
To: Thomas Gleixner
Cc: Uladzislau Rezki, Andrew Morton, linux-mm@kvack.org, Christoph Hellwig, Lorenzo Stoakes, Peter Zijlstra, John Ogness, linux-arm-kernel@lists.infradead.org, Russell King, Mark Rutland, Marc Zyngier
Subject: Re: Excessive TLB flush ranges
References: <87a5y5a6kj.ffs@tglx> <87o7mk93tc.ffs@tglx> <878rdo8xn2.ffs@tglx>
In-Reply-To: <878rdo8xn2.ffs@tglx>

On 05/16/23 at 10:54am, Thomas Gleixner wrote:
> On Tue, May 16 2023 at 16:07, Baoquan He wrote:
> > On 05/16/23 at 08:40am, Thomas Gleixner wrote:
> >> On Tue, May 16 2023 at 10:26, Baoquan He wrote:
> >> > On 05/15/23 at 08:17pm, Uladzislau Rezki wrote:
> >> >> For systems which lack a full TLB flush and to flush a long range is
> >> >> a problem (it takes time), probably we can flush VA one by one. Because
> >> >> currently we calculate a flush range [min:max] and that range includes
> >> >> the space that might not be mapped at all. Like below:
> >> >
> >> > It's fine if we only calculate a flush range of [min:max] with VA. In
> >> > vm_reset_perms(), it calculates the flush range with the impacted direct
> >> > mapping range, then merges it with VA's range. That looks really strange
> >> > and surprising. If the vm->pages[] are taken from a lower part of physical
> >> > memory, the final merged flush will span a tremendous range. Wondering why
> >> > we need to merge the direct map range with the VA range, then do the flush.
> >> > Not sure if I misunderstand it.
> >>
> >> So what happens on this BPF teardown is:
> >>
> >> The vfree(8k) ends up flushing 3 entries. The actual vmalloc part (2) and
> >> one extra which is in the direct map. I haven't verified that yet, but I
> >> assume it's the alias of one of the vmalloc'ed pages.
> >
> > It looks like the reason. As Uladzislau pointed out, ARCH-es may
> > have a full TLB flush, so they won't get into trouble from the merged flush
> > in the calculated [min:max] way, e.g. arm64 and x86's flush_tlb_kernel_range().
> > However, arm32 seems to lack the ability to do a full TLB flush.
>
> ARM has a full flush, but it does not check for that in
> flush_tlb_kernel_range().
>
> > If agreed, I can make a draft patch to do the flush for direct map and
> > VA separately, see if it works.
>
> Of course it works. Already done that.
>
> But you are missing the point. Look at the examples I provided.
>
> The current implementation ends up doing a full flush on x86 just to
> flush 3 TLB entries. For the very same reason: the flush range
> (start..end) becomes insanely large due to the direct map and vmalloc
> parts.
>
> But doing individual flushes for direct map and vmalloc space is silly
> too because then it ends up doing two IPIs instead of one. IPIs are
> expensive and the whole point of coalescing the flushes is to spare
> IPIs, no?
>
> So with my hacked up flush_tlb_kernel_vas() I end up having exactly
> _one_ IPI which walks the list and flushes the 3 TLB entries.

Makes sense, thanks for explaining. However, your handling of alias_va may
not be right. I will add inline comments in your patch; please check there.
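
To make the range arithmetic discussed in this thread concrete, here is a
minimal user-space sketch in C. It is not kernel code and not the patch under
discussion: struct flush_range, merged_span_pages() and listed_pages() are
hypothetical names invented for illustration, and a 4 KiB page size is
assumed. It only compares how many pages a single merged [min:max] span
covers against how many pages are covered when the same ranges are kept as a
list and walked one by one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for one range to flush; not a kernel type. */
struct flush_range {
	uint64_t start;
	uint64_t end;	/* exclusive */
};

/* Pages covered when all ranges are merged into one [min:max] span. */
static uint64_t merged_span_pages(const struct flush_range *r, size_t n)
{
	uint64_t min, max;
	size_t i;

	assert(n > 0);
	min = r[0].start;
	max = r[0].end;
	for (i = 1; i < n; i++) {
		if (r[i].start < min)
			min = r[i].start;
		if (r[i].end > max)
			max = r[i].end;
	}
	return (max - min) / SKETCH_PAGE_SIZE;
}

/* Pages covered when each range is handled on its own in one list walk. */
static uint64_t listed_pages(const struct flush_range *r, size_t n)
{
	uint64_t pages = 0;
	size_t i;

	for (i = 0; i < n; i++)
		pages += (r[i].end - r[i].start) / SKETCH_PAGE_SIZE;
	return pages;
}
```

With one page in a low (direct-map-like) address range and two pages in a
high (vmalloc-like) address range, listed_pages() counts 3 pages, while
merged_span_pages() counts every page in between as well. The addresses used
to try this out are arbitrary illustrative values, not real kernel layout
guarantees.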