From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2CCEC4332F for ; Mon, 30 Oct 2023 10:26:42 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 490C86B019B; Mon, 30 Oct 2023 06:26:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 41A0E6B019C; Mon, 30 Oct 2023 06:26:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2BA586B019D; Mon, 30 Oct 2023 06:26:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 15CB66B019B for ; Mon, 30 Oct 2023 06:26:42 -0400 (EDT) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id E06ED120723 for ; Mon, 30 Oct 2023 10:26:41 +0000 (UTC) X-FDA: 81401749002.22.6F60643 Received: from invmail4.hynix.com (exvmail4.skhynix.com [166.125.252.92]) by imf25.hostedemail.com (Postfix) with ESMTP id 2955AA0005 for ; Mon, 30 Oct 2023 10:26:38 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=none; spf=pass (imf25.hostedemail.com: domain of byungchul@sk.com designates 166.125.252.92 as permitted sender) smtp.mailfrom=byungchul@sk.com; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1698661600; a=rsa-sha256; cv=none; b=RKOF1s69J78plcqwtlzuOBj71niBX7Fyc/Ql/JOyDoTBPINNNJP09CM3P6s4FyUnQJlLr4 Ix4/ikVOBqiehKiSsA8aXiJ18Nw06HWPmj4JwHUKbY6EIIX12VWU+qd1hLDEosRUIfxXnN Vm2s6CirTuU3PG0b6HSWPuNR9YEimXk= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=none; spf=pass (imf25.hostedemail.com: domain of byungchul@sk.com designates 166.125.252.92 as permitted sender) smtp.mailfrom=byungchul@sk.com; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1698661600; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=69wBS+Xk0++iD7Vx3SRCYp3GutJqJ+6QKQKlYfQe96w=; b=zKndCuOQ2/D9aORrWuj+JLbgU1k82lWxRCu1v7kcr5MT+nuAuA+zVKgUKMbHbUmoAKYHAx PNYqM6SUDUqPWuznmlYpuP8Pvs7jB+ZCrWZL0/WH8OM+IFRaxIe8q+qQc8Xl4ZuEryDOto 37W7NKoYCVquZeODUbhEK40VYNWECVk= X-AuditID: a67dfc5b-d6dff70000001748-e4-653f84dca000 Date: Mon, 30 Oct 2023 19:26:31 +0900 From: Byungchul Park To: Nadav Amit Cc: Linux Kernel Mailing List , linux-mm , "kernel_team@skhynix.com" , Andrew Morton , "ying.huang@intel.com" , "xhao@linux.alibaba.com" , "mgorman@techsingularity.net" , "hughd@google.com" , "willy@infradead.org" , "david@redhat.com" , "peterz@infradead.org" , Andy Lutomirski , Thomas Gleixner , "mingo@redhat.com" , "bp@alien8.de" , "dave.hansen@linux.intel.com" Subject: Re: [v3 1/3] mm/rmap: Recognize non-writable TLB entries during TLB batch flush Message-ID: <20231030102631.GB81877@system.software.com> References: <20231030072540.38631-1-byungchul@sk.com> <20231030072540.38631-2-byungchul@sk.com> <1DB097E6-6585-4D10-95C9-7BAA5A622B7E@vmware.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1DB097E6-6585-4D10-95C9-7BAA5A622B7E@vmware.com> User-Agent: Mutt/1.9.4 (2018-02-28) X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFjrAIsWRmVeSWpSXmKPExsXC9ZZnoe7dFvtUg1Zjiznr17BZfN7wj83i xYZ2Rouv638xWzz91MdicXnXHDaLe2v+s1qc37WW1WLH0n1MFpcOLGCyuL7rIaPF8d4DTBab N01ltvj9A6huzhQri5OzJrM4CHh8b+1j8ViwqdRj8wotj8V7XjJ5bFrVyeax6dMkdo93586x e5yY8ZvFY+dDS495JwM93u+7yuax9Zedx+dNch7v5r9lC+CL4rJJSc3JLEst0rdL4Mr4sZu1 YKpyxbnDe5kaGG9IdzFyckgImEj0bmpmhLEPfL4DZrMIqEqc+rSWHcRmE1CXuHHjJzOILSKg KHFo/z2wGmaBd6wS3z9pgtjCAlESX2ZuB4vzClhIPN76haWLkYtDSGAqo8SjrglsEAlBiZMz n7BANKtL/Jl3CWgoB5AtLbH8HwdEWF6ieetssDCngJ3Et9YIkLCogLLEgW3HmUBGSghsY5e4 drCRHeJmSYmDK26wTGAUnIVkwywkG2YhbJiFZMMCRpZVjEKZeWW5iZk5JnoZlXmZFXrJ+bmb GIHxuqz2T/QOxk8Xgg8xCnAwKvHwBoTbpQqxJpYVV+YeYpTgYFYS4WV2tEkV4k1JrKxKLcqP LyrNSS0+xCjNwaIkzmv0rTxFSCA9sSQ1OzW1ILUIJsvEwSnVwMhzVVVU7ONmtbrfedLfz7/d oRjgtuzwXtHF8+RqZB8Wm4iePXUozOfUrsXuZ2J0Orqf73dnmM1ZdqD+7Dr/krYLa/T0xZT3 ngsTl17NsZr187WTDa+zef7VL8kJM/zzOfNYR7vF1R/ud+va7gvq7lrr65Nw4tlhJqljLccl NUM+b9I5Z7erUVCJpTgj0VCLuag4EQDalPfl0wIAAA== X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFlrBIsWRmVeSWpSXmKPExsXC5WfdrHunxT7V4P8MJYs569ewWXze8I/N 4sWGdkaLr+t/MVs8/dTHYnF47klWi8u75rBZ3Fvzn9Xi/K61rBY7lu5jsrh0YAGTxfVdDxkt jvceYLLYvGkqs8XvH0B1c6ZYWZycNZnFQdDje2sfi8eCTaUem1doeSze85LJY9OqTjaPTZ8m sXu8O3eO3ePEjN8sHjsfWnrMOxno8X7fVTaPxS8+MHls/WXn8XmTnMe7+W/ZAvijuGxSUnMy y1KL9O0SuDJ+7GYtmKpcce7wXqYGxhvSXYycHBICJhIHPt9hBLFZBFQlTn1ayw5iswmoS9y4 8ZMZxBYRUJQ4tP8eWA2zwDtWie+fNEFsYYEoiS8zt4PFeQUsJB5v/cLSxcjFISQwlVHiUdcE NoiEoMTJmU9YIJrVJf7MuwQ0lAPIlpZY/o8DIiwv0bx1NliYU8BO4ltrBEhYVEBZ4sC240wT GPlmIRk0C8mgWQiDZiEZtICRZRWjSGZeWW5iZo6pXnF2RmVeZoVecn7uJkZg/C2r/TNxB+OX y+6HGAU4GJV4eAPC7VKFWBPLiitzDzFKcDArifAyO9qkCvGmJFZWpRblxxeV5qQWH2KU5mBR Euf1Ck9NEBJITyxJzU5NLUgtgskycXBKNTC2+X1ofPlSZzfHlw9eWZZZ6iVfKncfnqC4yvgQ 37F4R6HgA3+SRIOfv+uQ2uLbqPJ60vxbisuW/JhdG+ixIzg4yDZfdp/L2bytJ6OP2TUsM1T2 iXjgmzBtparB8Z8XTh4vMnNJTd5yVWKdipiWcXug9vW23Ts1M4oK70aLTuNmuxP2nj/Du1iJ pTgj0VCLuag4EQB+QdrXuwIAAA== X-CFilter-Loop: Reflected X-Rspam-User: X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 2955AA0005 X-Stat-Signature: njwicxd4wdj51zj6a5ymgb633w4g5yar X-HE-Tag: 1698661598-537221 X-HE-Meta: U2FsdGVkX196kxTx9GxOINy4fNpzqlbXyRmWZzjhboeL9x0cQkqq+tARJ9O+FgsfDdc8IUpk8F6rwBBxx0rC5k6xxgR64VHZsNLkIoUguPea81JQXaXL/mAEy7Bucu+4uGqpu3IsyESAJkpg20syFfB4JgCh5iXkLshzmpjpT4Qaen2l0Zgx0cs4Km9BJenC+1786lSf5yeAHesqxPTrAHT8FtWJ4XwGPmoFRIBJvr2jLMqVkvdW6vq8SoRORsnpiq3mvQXDTfeGEJ3C/gdYZz6VOZf0vC+wp3gtZSiB55kquLdK0Ty+Fc6jbPLt946NaziZMLITnYW/jSEuMylxs8cWY0pEscviK6s1Bimh4v1+KrjELXMJiEKcpoKGylFMQqE46nkzH3GEEQUpxFD/rpOIdKzJOjAopnlxI8qu3sNMSXl41epju0l5SJFkMSzCRmWq2Nj4Mvi9xE1SlcDDrjWXtZSDtI4cpPLb/B+7ajnN4aD0g6eXiR5Gt+gfKLW3Cba88QP8xkqIWwkEaenxo1Uurp2t7V9O+DqWZrA4Qty+Yb1xi/0UczvEcmebfrpXrUjiTqqFW+qq8c9n7DYM2qy83fsPIIpE+qGhFgnmjzC6WoXMyGE03XrHjryfSP0EJPojUcCcKYA7C/HEfL4If/Pa/Jz6G2ThAVIZZ2BVbg9Gal4icG8dcFnuw7Yhp7NwLSVzd9CeEF7ZpvhHKwXKWY2wG+42aMEKm1SQD3M+ZgsC/trdXY1SxK3RTH+nXkwZB551ZwGCEB1fUZPMVGqFfgKiV+penKFqMKBYXrKg9QOCnd+HL1aiKFOxNhvqz1RALcz5eIrQFd2Je6kzsrdxF3vXdvJJn3xgbDfzQyUCpbLW+Cu4BzHuYEkZcn12V7/vtVcK+HeYuTKj8frd0oNUC0ydxQbgSwQJYkLxyBFS9/hdN5RqJwOo2bPcO7SnMU04OLVpARijQiQ= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Oct 30, 2023 at 07:52:05AM +0000, Nadav Amit wrote: > > Below are some points you might find useful: Thank you! > > + > > /* > > * Blindly accessing user memory from NMI context can be dangerous > > * if we're in the middle of switching the current user task or > > diff --git a/include/linux/mm_types_task.h b/include/linux/mm_types_task.h > > index aa44fff8bb9d..35ba9425d48d 100644 > > --- a/include/linux/mm_types_task.h > > +++ b/include/linux/mm_types_task.h > > @@ -59,8 +59,8 @@ struct tlbflush_unmap_batch { > > */ > > struct arch_tlbflush_unmap_batch arch; > > > > - /* True if a flush is needed. */ > > - bool flush_required; > > + /* The number of flush requested. */ > > Number of what? Base pages I presume. How many times set_tlb_ubc_flush_pending() has been called. > > + int nr_flush_required; > > Perhaps unsigned would be better suited? Will change it to unsigned. > > /* > > * If true then the PTE was dirty when unmapped. The entry must be > > diff --git a/include/linux/sched.h b/include/linux/sched.h > > index 77f01ac385f7..63189c023357 100644 > > --- a/include/linux/sched.h > > +++ b/include/linux/sched.h > > @@ -1324,6 +1324,7 @@ struct task_struct { > > #endif > > > > struct tlbflush_unmap_batch tlb_ubc; > > + struct tlbflush_unmap_batch tlb_ubc_nowr; > > tlb_ubc_nowr is - I think - less informative the tlb_ubc_ro (and a comment > would be useful). At the beginning, I named it tlb_ubc_ro but.. I forgot why I changed it to tlb_ubc_nowr but.. I will change it back and add a comment on it. > > + > > +int nr_flush_required(void) > > +{ > > + return current->tlb_ubc.nr_flush_required; > > +} > > + > > +int nr_flush_required_nowr(void) > > +{ > > + return current->tlb_ubc_nowr.nr_flush_required; > > +} > > I haven’t gone through the users of these functions yet, as they are not included > in this patch (which is usually not great). Right. I will place these two on another patch that uses the functions. Or need to add an explanation in this commit message. > Anyhow, it might be a bit wasteful to have a function call for such a function. See > if it is possible to avoid that call. I will move them to mm/internal.h with inline added if possible. > > + > > /* > > * Flush TLB entries for recently unmapped pages from remote CPUs. It is > > * important if a PTE was dirty when it was unmapped that it's flushed > > @@ -615,11 +641,12 @@ void try_to_unmap_flush(void) > > { > > struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; > > > > - if (!tlb_ubc->flush_required) > > + fold_ubc_nowr(); > > + if (!tlb_ubc->nr_flush_required) > > return; > > > > arch_tlbbatch_flush(&tlb_ubc->arch); > > - tlb_ubc->flush_required = false; > > + tlb_ubc->nr_flush_required = 0; > > tlb_ubc->writable = false; > > } > > > > @@ -627,8 +654,9 @@ void try_to_unmap_flush(void) > > void try_to_unmap_flush_dirty(void) > > { > > struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; > > + struct tlbflush_unmap_batch *tlb_ubc_nowr = ¤t->tlb_ubc_nowr; > > > > - if (tlb_ubc->writable) > > + if (tlb_ubc->writable || tlb_ubc_nowr->writable) > > try_to_unmap_flush(); > > } > > > > @@ -645,15 +673,16 @@ void try_to_unmap_flush_dirty(void) > > static void set_tlb_ubc_flush_pending(struct mm_struct *mm, pte_t pteval, > > unsigned long uaddr) > > { > > - struct tlbflush_unmap_batch *tlb_ubc = ¤t->tlb_ubc; > > + struct tlbflush_unmap_batch *tlb_ubc; > > int batch; > > bool writable = pte_dirty(pteval); > > > > if (!pte_accessible(mm, pteval)) > > return; > > > > + tlb_ubc = pte_write(pteval) || writable ? ¤t->tlb_ubc : ¤t->tlb_ubc_nowr; > > Using the ternary operator here is a bit confusing. You can use an “if” > instead or if you mind is set doing it this way at least make it easier to > read: > > tlb_ubc = (pte_write(pteval) || writable) ? ¤t->tlb_ubc : > ¤t->tlb_ubc_nowr; You are right. I should change it that way. Thanks. > And of course, add a comment. Okay. Also will add a comment. > > arch_tlbbatch_add_pending(&tlb_ubc->arch, mm, uaddr); > > - tlb_ubc->flush_required = true; > > + tlb_ubc->nr_flush_required += 1; > > Presumably overflow is impossible for other reasons, but something like that > worries me. Agree with you. Lemme think it more and fix it. Thank you. Byungchul