Date: Mon, 24 Jun 2024 14:02:02 -0700
From: Shakeel Butt
To: Matthew Wilcox
Cc: Yosry Ahmed, kernel test robot, Usama Arif, oe-lkp@lists.linux.dev, lkp@intel.com, Linux Memory Management List, Andrew Morton, Chengming Zhou, Nhat Pham, David Hildenbrand, "Huang, Ying", Hugh Dickins, Johannes Weiner, Andi Kleen, linux-kernel@vger.kernel.org
Subject: Re: [linux-next:master] [mm] 0fa2857d23: WARNING:at_mm/page_alloc.c:#__alloc_pages_noprof

On Mon, Jun 24, 2024 at 09:51:33PM GMT, Matthew Wilcox wrote:
> On Mon, Jun 24, 2024 at 01:39:45PM -0700, Shakeel Butt wrote:
> > On Mon, Jun 24, 2024 at 08:50:45PM GMT, Matthew Wilcox wrote:
> > > On Mon, Jun 24, 2024 at 12:34:04PM -0700, Yosry Ahmed wrote:
> > > > On Mon, Jun 24, 2024 at 12:26 PM Matthew Wilcox wrote:
> > > > >
> > > > > On Mon, Jun 24, 2024 at 11:57:45AM -0700, Yosry Ahmed wrote:
> > > > > > On Mon, Jun 24, 2024 at 11:56 AM Matthew Wilcox wrote:
> > > > > > >
> > > > > > > On Mon, Jun 24, 2024 at 11:53:30AM -0700, Yosry Ahmed wrote:
> > > > > > > > After a page is swapped out during reclaim, __remove_mapping() will
> > > > > > > > call __delete_from_swap_cache() to replace the swap cache entry with a
> > > > > > > > shadow entry (which is an xa_value).
> > > > > > >
> > > > > > > Special entries are disjoint from shadow entries. Shadow entries have
> > > > > > > the last two bits as 01 or 11 (are congruent to 1 or 3 modulo 4).
> > > > > > > Special entries have values below 4096 which end in 10 (are congruent
> > > > > > > to 2 modulo 4).
> > > > > >
> > > > > > You are implying that we would no longer have a shadow entry for such
> > > > > > zero folios, because we will be storing a special entry instead.
> > > > > > Right?
> > > > >
> > > > > umm ... maybe I have a misunderstanding here.
> > > > >
> > > > > I'm saying that there wouldn't be a _swap_ entry here because the folio
> > > > > wouldn't be stored anywhere on the swap device. But there could be a
> > > > > _shadow_ entry. Although if the page is full of zeroes, it was probably
> > > > > never referenced and doesn't really need a shadow entry.
> > > >
> > > > Is it possible to have a shadow entry AND a special entry (e.g.
> > > > XA_ZERO_ENTRY) at the same index? This is what would be required to
> > > > maintain the current behavior (assuming we really need the shadow
> > > > entries for such zeroed folios).
> > >
> > > No, just like it's not possible to have a swap entry and a shadow entry
> > > at the same location. You have to choose. But the zero entry is an
> > > alternative to the swap entry, not the shadow entry.
> > >
> > > As I understand the swap cache, at the moment, you can have four
> > > possible results from a lookup:
> > >
> > >  - NULL
> > >  - a swap entry
> > >  - a shadow entry
> > >  - a folio
> > >
> > > Do I have that wrong?
> >
> > I don't think we have swap entry in the swapcache (underlying xarray).
> > The swap entry is used as an index to find the folio or shadow entry.
>
> Ah. I think I understand the procedure now.
>
> We store a swap entry in the page table entry. That tells us both where
> in the swap cache the folio might be found, and where in the swap device
> the data can be found (because there is a very simple calculation for
> both). If the folio is not present, then there's a shadow entry which
> summarises the LRU information that would be stored in the folio had it
> not been evicted from the swapcache.
>
> We can't know at the point where we unmap the page whether it's full
> of zeroes or not, because we can't afford to scan its contents. At the
> point where we decide to swap out the folio, we can afford to make that
> decision because the cost of doing the I/O is high enough.
>
> So the question is whether we can afford to throw away the shadow
> information and just store the information that this was a zero entry.
> I think we can, but it is a more bold proposal than I realised I was
> making.

I agree that we can throw away the shadow entry in favor of a zero entry
but, as you already noted, it requires changes in multiple places. At the
moment I can think of:

1. A zero entry is not reclaimable the way a shadow entry is.
2. We need to decide the right place to allocate the zero folio on swapin.
3. Should this be treated as a major fault for stats purposes?

I have certainly missed other points as well.
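
To make the entry encoding discussed above concrete, here is a minimal
userspace sketch. It assumes the tagging scheme from include/linux/xarray.h
(value entries have the low bit set, internal entries end in 10, and
XA_ZERO_ENTRY is xa_mk_internal(257)). The names mk_value(), mk_internal(),
ZERO_ENTRY and classify() are hypothetical stand-ins for illustration only.
classify() is not an existing kernel function, and the sketch does not model
how the workingset code packs data into a shadow entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the xarray tagging scheme; illustration only. */

/* Value entries (used for shadow entries): low bit set, so 01 or 11. */
static inline void *mk_value(unsigned long v)
{
	assert((long)v >= 0);	/* the top bit would be lost by the shift */
	return (void *)(uintptr_t)((v << 1) | 1);
}

static inline bool is_value(const void *entry)
{
	return (uintptr_t)entry & 1;
}

/* Internal (special) entries: low two bits are 10. */
static inline void *mk_internal(unsigned long v)
{
	return (void *)(uintptr_t)((v << 2) | 2);
}

static inline bool is_internal(const void *entry)
{
	return ((uintptr_t)entry & 3) == 2;
}

/* Assumed to match the kernel's XA_ZERO_ENTRY, xa_mk_internal(257). */
#define ZERO_ENTRY mk_internal(257)

/* Possible lookup results if a zero entry replaced the shadow entry. */
enum lookup_result { LOOKUP_NONE, LOOKUP_SHADOW, LOOKUP_ZERO, LOOKUP_FOLIO };

/* Hypothetical classifier for what a swap cache slot holds. */
static inline enum lookup_result classify(const void *entry)
{
	if (!entry)
		return LOOKUP_NONE;
	if (is_value(entry))
		return LOOKUP_SHADOW;
	if (entry == ZERO_ENTRY)
		return LOOKUP_ZERO;
	return LOOKUP_FOLIO;	/* aligned pointer, low two bits are 00 */
}
```

Since a slot holds exactly one pointer-sized entry, a zero entry and a
shadow entry cannot share an index, which is why the shadow information
would have to be given up for zero-filled folios.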