From: Qi Zheng
Date: Mon, 18 Nov 2024 18:56:00 +0800
Subject: Re: [PATCH v3 4/9] mm: introduce skip_none_ptes()
To: David Hildenbrand
Cc: jannh@google.com, hughd@google.com, willy@infradead.org,
 muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
 peterx@redhat.com, mgorman@suse.de, catalin.marinas@arm.com,
 will@kernel.org, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, x86@kernel.org, lorenzo.stoakes@oracle.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, zokeefe@google.com,
 rientjes@google.com
References: <574bc9b646c87d878a5048edb63698a1f8483e10.1731566457.git.zhengqi.arch@bytedance.com>
 <617a063e-bd84-4da5-acf4-6ff516512055@bytedance.com>
 <253e5fd0-7e98-43fd-b0d7-8a5b739ae4aa@bytedance.com>
 <77b1eddf-7c1b-43e9-9352-229998ce3fc7@redhat.com>
 <5a3428bd-743a-4d51-8b75-163ab560bca7@bytedance.com>
 <4edccc1a-2761-4a5a-89a6-7869c1b6b08a@redhat.com>
 <2b48d313-4f66-47c8-98d8-8aa78db62b1b@bytedance.com>
 <995804f4-b658-44b2-bb40-c84b8a322616@redhat.com>
 <4ee60b7b-a81e-4b94-96c9-52b1bd9d5061@redhat.com>
In-Reply-To: <4ee60b7b-a81e-4b94-96c9-52b1bd9d5061@redhat.com>

On 2024/11/18 18:41, David Hildenbrand wrote:
> On 18.11.24 11:34, Qi Zheng wrote:
>>
>>
>> On 2024/11/18 17:29, David Hildenbrand wrote:
>>> On 18.11.24 04:35, Qi Zheng wrote:
>>>>
>>>>
>>>> On 2024/11/15 22:59, David Hildenbrand wrote:
>>>>> On 15.11.24 15:41, Qi Zheng wrote:
>>>>>>
>>>>>>
>>>>>> On 2024/11/15 18:22, David Hildenbrand wrote:
>>>>>>>>>> *nr_skip = nr;
>>>>>>>>>>
>>>>>>>>>> and then:
>>>>>>>>>>
>>>>>>>>>> zap_pte_range
>>>>>>>>>> --> nr = do_zap_pte_range(tlb, vma, pte, addr, end, details,
>>>>>>>>>>                           &nr_skip, rss, &force_flush,
>>>>>>>>>>                           &force_break);
>>>>>>>>>>     if (can_reclaim_pt) {
>>>>>>>>>>         none_nr += count_pte_none(pte, nr);
>>>>>>>>>>         none_nr += nr_skip;
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>> Right?
>>>>>>>>>
>>>>>>>>> Yes. I did not look closely at the patch that adds the counting of
>>>>>>>>
>>>>>>>> Got it.
>>>>>>>>
>>>>>>>>> pte_none though (to digest why it is required :) ).
>>>>>>>>
>>>>>>>> Because 'none_nr == PTRS_PER_PTE' is used in patch #7 to detect
>>>>>>>> empty PTE page.
>>>>>>>
>>>>>>> Okay, so the problem is that "nr" would be "all processed entries"
>>>>>>> but there are cases where we "process an entry but not zap it".
>>>>>>>
>>>>>>> What you really only want to know is "was any entry not zapped",
>>>>>>> which could be a simple input boolean variable passed into
>>>>>>> do_zap_pte_range?
>>>>>>>
>>>>>>> Because as soon as any entry was processed but not zapped, you can
>>>>>>> immediately give up on reclaiming that table.
>>>>>>
>>>>>> Yes, we can set can_reclaim_pt to false when a !pte_none() entry is
>>>>>> found in count_pte_none().
>>>>>
>>>>> I'm not sure if we'll need count_pte_none(), but I'll have to take a
>>>>> look at your new patch to see how this fits together with doing the
>>>>> pte_none detection+skipping in do_zap_pte_range().
>>>>>
>>>>> I was wondering if you cannot simply avoid the additional scanning and
>>>>> simply set "can_reclaim_pt" if you skip a zap.
>>>>
>>>> Maybe we can return the information whether the zap was skipped from
>>>> zap_present_ptes() and zap_nonpresent_ptes() through parameters like I
>>>> did in [PATCH v1 3/7] and [PATCH v1 4/7].
>>>>
>>>> In theory, we can detect empty PTE pages in the following two ways:
>>>>
>>>> 1) If no zap is skipped, it means that all pte entries have been
>>>>    zapped, and the PTE page must be empty.
>>>> 2) If all pte entries are detected to be none, then the PTE page is
>>>>    empty.
>>>>
>>>> In the error case, 1) may cause non-empty PTE pages to be reclaimed
>>>> (which is unacceptable), while 2) will at most cause empty PTE pages
>>>> to not be reclaimed.
>>>>
>>>> So the most reliable and efficient method may be:
>>>>
>>>> a. If there is a zap that is skipped, stop scanning and do not reclaim
>>>>    the PTE page;
>>>> b. Otherwise, as now, detect the empty PTE page through
>>>>    count_pte_none()
>>>
>>> Is there a need for count_pte_none() that I am missing?
>>
>> When any_skipped == false, at least add VM_BUG_ON() to recheck none ptes.
>>
>>>
>>> Assume we have
>>>
>>> nr = do_zap_pte_range(&any_skipped)
>>>
>>>
>>> If "nr" is the number of processed entries (including pte_none()), and
>>> "any_skipped" is set whenever we skip zapping a !pte_none entry, we
>>> can detect what we need, no?
>>>
>>> If any_skipped == false after the call, we now have "nr" pte_none()
>>> entries. -> We can continue trying to reclaim
>>
>> I prefer that "nr" should not include pte_none().
>>
>
> Why? do_zap_pte_range() should tell you how far to advance, nothing
> less, nothing more.
>
> Let's just keep it simple and avoid count_pte_none().
>
> I'm probably missing something important?

As we discussed before, we should skip all consecutive none ptes;
pte and addr are already incremented before returning.

>
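
To make the two detection strategies discussed above concrete, here is a
rough userspace sketch. It is not kernel code and not what the patch
does: the toy_* names, the three-state entry enum and the tiny
PTRS_PER_PTE value are all made up for illustration. toy_zap_range()
returns how far to advance (none entries included) and reports through
any_skipped whether a non-none entry was left in place, while
toy_count_none() stands in for the optional recheck by counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy table size (assumption); the real value is architecture specific. */
#define PTRS_PER_PTE 8

/* Hypothetical entry states; real PTEs are far richer than this. */
enum toy_pte { TOY_NONE = 0, TOY_ZAPPABLE, TOY_KEEP };

/*
 * Process nr entries. The return value is the number of entries
 * processed, including entries that were already none.
 * *any_skipped is set when a non-none entry is left in place.
 */
static size_t toy_zap_range(enum toy_pte *pte, size_t nr, bool *any_skipped)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		switch (pte[i]) {
		case TOY_NONE:
			break;			/* nothing to do */
		case TOY_ZAPPABLE:
			pte[i] = TOY_NONE;	/* zap it */
			break;
		case TOY_KEEP:
			*any_skipped = true;	/* processed but not zapped */
			break;
		}
	}
	return i;
}

/* Recheck by counting none entries, in the spirit of count_pte_none(). */
static size_t toy_count_none(const enum toy_pte *pte, size_t nr)
{
	size_t i, none_nr = 0;

	for (i = 0; i < nr; i++)
		if (pte[i] == TOY_NONE)
			none_nr++;
	return none_nr;
}

/* The table may be reclaimed only if it was fully covered and nothing was skipped. */
static bool toy_table_reclaimable(enum toy_pte *table)
{
	bool any_skipped = false;
	size_t nr = toy_zap_range(table, PTRS_PER_PTE, &any_skipped);

	return nr == PTRS_PER_PTE && !any_skipped;
}
```

In this sketch, a table holding only none and zappable entries comes out
reclaimable and toy_count_none() then agrees with PTRS_PER_PTE, which is
the kind of recheck a VM_BUG_ON() could do. A single TOY_KEEP entry
makes toy_table_reclaimable() return false without any counting pass.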