Date: Fri, 1 Sep 2023 05:36:12 +0100
From: Matthew Wilcox <willy@infradead.org>
To: Ryan Roberts
Cc: Will Deacon, "Aneesh Kumar K.V", Andrew Morton, Nick Piggin,
	Peter Zijlstra, Christian Borntraeger, Sven Schnelle, Arnd Bergmann,
	David Hildenbrand, Yu Zhao, "Kirill A. Shutemov", Yin Fengwei,
	Yang Shi, "Huang, Ying", Zi Yan, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 4/5] mm: Refector release_pages()
Message-ID:
References: <20230830095011.1228673-1-ryan.roberts@arm.com>
 <20230830095011.1228673-5-ryan.roberts@arm.com>
 <31aa40aa-1471-44ca-9cc8-3f5476fbde7e@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <31aa40aa-1471-44ca-9cc8-3f5476fbde7e@arm.com>

On Thu, Aug 31, 2023 at 08:57:51PM +0100, Ryan Roberts wrote:
> On 30/08/2023 20:11, Matthew Wilcox wrote:
> > On Wed, Aug 30, 2023 at 10:50:10AM +0100, Ryan Roberts wrote:
> >> In preparation for implementing folios_put_refs() in the next patch,
> >> refactor release_pages() into a set of helper functions, which can be
> >> reused. The primary difference between release_pages() and
> >> folios_put_refs() is how they iterate over the set of folios. The
> >> per-folio actions are identical.
> >
> > As you noted, we have colliding patchsets.  I'm not hugely happy with
> > how patch 4 turned out,
>
> Could you describe the issues as you see them? I'm keen not to repeat the
> same bad patterns in future.

I think you absolutely had the right idea.  I'd've probably done the
same in the same circumstances.  It's just that I'd done this patch
series getting rid of / simplifying a bunch of the code you were
modifying, and I thought it'd make your 4/5 irrelevant and 5/5 simpler.

> > void release_unref_folios(struct folio_batch *folios)
> > {
> > 	struct lruvec *lruvec = NULL;
> > 	unsigned long flags = 0;
> > 	int i;
> >
> > 	for (i = 0; i < folios->nr; i++) {
> > 		struct folio *folio = folios->folios[i];
> > 		free_swap_cache(folio);
>
> Agree this can't go here - would put it in pfn_range_put().
> But would not want it in folios_put() as you suggested in the other email -
> that would surely change the behaviour of folios_put()?

yes; perhaps there are times when we want to put a batch of folios and
_don't_ want to rip them out of the swapcache.  I haven't thought about
this in much detail, and we're definitely venturing into areas of the MM
where I get myself in trouble (ie anonymous memory).

> > 		__page_cache_release(folio, &lruvec, &flags);
> > 	}
>
> I think you would need to add:
>
> 	if (lruvec)
> 		unlock_page_lruvec_irqrestore(lruvec, flags);

Indeed.

> > 	mem_cgroup_uncharge_folios(folios);
> > 	free_unref_folios(folios);
> > }
> >
> > then this becomes:
> >
> > void folios_put(struct folio_batch *folios)
> > {
> > 	int i, j;
> >
> > 	for (i = 0, j = 0; i < folios->nr; i++) {
> > 		struct folio *folio = folios->folios[i];
> >
> > 		if (is_huge_zero_page(&folio->page))
> > 			continue;
> > 		if (folio_is_zone_device(folio)) {
> > 			if (put_devmap_managed_page(&folio->page))
> > 				continue;
> > 			if (folio_put_testzero(folio))
> > 				free_zone_device_page(&folio->page);
> > 			continue;
> > 		}
> >
> > 		if (!folio_put_testzero(folio))
> > 			continue;
> > 		if (folio_test_hugetlb(folio)) {
> > 			free_huge_folio(folio);
> > 			continue;
> > 		}
> >
> > 		if (j != i)
> > 			folios->folios[j] = folio;
> > 		j++;
> > 	}
> >
> > 	folios->nr = j;
> > 	if (!j)
> > 		return;
> > 	release_unref_folios(folios);
> > }
> >
> > and pfn_range_put() also becomes shorter and loses all the lruvec work.
>
> something like this?
>
> void pfn_range_put(struct pfn_range *range, unsigned int nr)
> {
> 	struct folio_batch folios;
> 	unsigned int i;
>
> 	folio_batch_init(&folios);
> 	for (i = 0; i < nr; i++) {
> 		struct folio *folio = pfn_folio(range[i].start);
> 		unsigned int refs = range[i].end - range[i].start;
>
> 		free_swap_cache(folio); // <<<<< HERE? To be equivalent to
> 					//       free_pages_and_swap_cache()
>
> 		if (is_huge_zero_page(&folio->page))
> 			continue;
>
> 		if (folio_is_zone_device(folio)) {
> 			if (put_devmap_managed_page_refs(&folio->page, refs))
> 				continue;
> 			if (folio_ref_sub_and_test(folio, refs))
> 				free_zone_device_page(&folio->page);
> 			continue;
> 		}
>
> 		if (!folio_ref_sub_and_test(folio, refs))
> 			continue;
>
> 		/* hugetlb has its own memcg */
> 		if (folio_test_hugetlb(folio)) {
> 			free_huge_folio(folio);
> 			continue;
> 		}
>
> 		if (folio_batch_add(&folios, folio) == 0)
> 			release_unref_folios(&folios);
> 	}
>
> 	if (folios.nr)
> 		release_unref_folios(&folios);
> }
>
> > Thoughts?
>
> Looks like it's getting there to me. Although the bulk of the logic inside
> the loop is still common between folios_put() and pfn_range_put(); perhaps
> we can have a common helper for that, which would just need to take
> (folio, refs)?
>
> So by adding free_swap_cache() above, we can call pfn_range_put() directly
> and remove free_pages_and_swap_cache() entirely.

Yes.  Maybe this works?  I'd like it to!

> What's the best way to move forward here? Two options as I see it:
>
>  - I drop patch 4 and create a version of pfn_range_put() that has the same
>    semantics as above but essentially just copies the old release_pages()
>    (similar to my v1). That keeps my series independent. Then you can
>    replace it with the new pfn_range_put() as part of your series.
>
>  - We can just hook my patches up to the end of your series and do it all
>    in one go.
>
> Opinion?

I'm reluctant to tell you "hey, delay until after this series of mine".
We don't have good data yet on whether my patch series helps or hurts.
Plus I'm about to take a few days off (I'll be back for next Wednesday's
meeting).

I think your first option is better (and do feel free to use a different name from pfn_range_put()) just to keep us from colliding in ways that ultimately don't matter.
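
For illustration, a rough sketch of the common per-folio helper suggested
above, taking (folio, refs).  The name folio_put_refs_prepare() is
hypothetical, and it assumes the release_unref_folios() batching quoted
earlier in the thread; it is only a sketch of the idea, not code that
exists in the tree.

/*
 * Sketch only: the per-folio logic shared by folios_put() and
 * pfn_range_put().  Drops @refs references, handling the huge zero
 * page, zone-device and hugetlb cases inline; returns true if the
 * folio should then be queued for release_unref_folios() (memcg
 * uncharge and freeing).
 */
static bool folio_put_refs_prepare(struct folio *folio, unsigned int refs)
{
	if (is_huge_zero_page(&folio->page))
		return false;

	if (folio_is_zone_device(folio)) {
		if (put_devmap_managed_page_refs(&folio->page, refs))
			return false;
		if (folio_ref_sub_and_test(folio, refs))
			free_zone_device_page(&folio->page);
		return false;
	}

	if (!folio_ref_sub_and_test(folio, refs))
		return false;

	/* hugetlb has its own memcg */
	if (folio_test_hugetlb(folio)) {
		free_huge_folio(folio);
		return false;
	}

	return true;
}

folios_put() would presumably call this with refs == 1 (folio_put_testzero()
being the refs == 1 case of folio_ref_sub_and_test()) and keep its in-place
compaction of the batch, while pfn_range_put() would call free_swap_cache()
first and folio_batch_add() the folio whenever the helper returns true.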