From: Yang Shi
Date: Thu, 27 Jan 2022 13:23:41 -0800
Subject: Re: [PATCH RFC v2 6/9] mm/khugepaged: remove reuse_swap_page() usage
To: David Hildenbrand
Cc: Linux Kernel Mailing List, Andrew Morton, Hugh Dickins, Linus Torvalds, David Rientjes, Shakeel Butt, John Hubbard, Jason Gunthorpe, Mike Kravetz, Mike Rapoport, "Kirill A. Shutemov", Matthew Wilcox, Vlastimil Babka, Jann Horn, Michal Hocko, Nadav Amit, Rik van Riel, Roman Gushchin, Andrea Arcangeli, Peter Xu, Donald Dutile, Christoph Hellwig, Oleg Nesterov, Jan Kara, Liang Zhang, Linux MM
In-Reply-To: <20220126095557.32392-7-david@redhat.com>
References: <20220126095557.32392-1-david@redhat.com> <20220126095557.32392-7-david@redhat.com>

On Wed, Jan 26, 2022 at 2:00 AM David Hildenbrand wrote:
>
> reuse_swap_page() currently indicates if we can write to an anon page
> without COW. A COW is required if the page is shared by multiple
> processes (either already mapped or via swap entries) or if there is
> concurrent writeback that cannot tolerate concurrent page modifications.
>
> reuse_swap_page() doesn't check for pending references from other
> processes that already unmapped the page, however,
> is_refcount_suitable() essentially does the same thing in the context of
> khugepaged. khugepaged is the last remaining user of reuse_swap_page()
> and we want to remove that function.
>
> In the context of khugepaged, we are not actually going to write to the
> page and we don't really care about other processes mapping the page:
> for example, without swap, we don't care about shared pages at all.
>
> The current logic seems to be:
> * Writable: -> Not shared, but might be in the swapcache. Nobody can
>   fault it in from the swapcache as there are no other swap entries.
> * Readable and not in the swapcache: Might be shared (but nobody can
>   fault it in from the swapcache).
> * Readable and in the swapcache: Might be shared and someone might be
>   able to fault it in from the swapcache. Make sure we're the exclusive
>   owner via reuse_swap_page().
>
> Having to guess due to lack of comments and documentation, the current
> logic really only wants to make sure that a page that might be shared
> cannot be faulted in from the swapcache while khugepaged is active.
> It's hard to guess why that is the case and if it's really still
> required, but let's try keeping that logic unmodified.

I don't think it could be faulted in while khugepaged is active, since
khugepaged holds mmap_lock in write mode IIUC, so page faults are
serialized against khugepaged.

My wild guess is that collapsing shared pages was not supported before
v5.8, so we needed reuse_swap_page() to tell us whether the page in the
swap cache is shared or not. But that is not true anymore. And
khugepaged just allocates a THP, copies the data from the base pages to
the huge page, then replaces the PTEs with a PMD; it doesn't change the
content of the page, so I fail to see a problem with collapsing a shared
page in the swap cache. But I'm really not entirely sure, I may be
missing something...

>
> Instead of relying on reuse_swap_page(), let's unconditionally
> try_to_free_swap(), special casing PageKsm(). try_to_free_swap() will
> fail if there are still swap entries targeting the page or if the page
> is under writeback.
>
> After a successful try_to_free_swap() that page cannot be readded to
> the swapcache because we're keeping the page locked and removed from
> the LRU until we actually perform the copy. So once we succeeded
> removing a page from the swapcache, it cannot be re-added until we're
> done copying. Add a comment stating that.
>
> Signed-off-by: David Hildenbrand
> ---
>  mm/khugepaged.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 35f14d0a00a6..bc0ff598e98f 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -683,10 +683,10 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  			goto out;
>  		}
>  		if (!pte_write(pteval) && PageSwapCache(page) &&
> -		    !reuse_swap_page(page)) {
> +		    (PageKsm(page) || !try_to_free_swap(page))) {
>  			/*
> -			 * Page is in the swap cache and cannot be re-used.
> -			 * It cannot be collapsed into a THP.
> +			 * Possibly shared page cannot be removed from the
> +			 * swapcache. It cannot be collapsed into a THP.
>  			 */
>  			unlock_page(page);
>  			result = SCAN_SWAP_CACHE_PAGE;
> @@ -702,6 +702,16 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  			result = SCAN_DEL_PAGE_LRU;
>  			goto out;
>  		}
> +
> +		/*
> +		 * We're holding the page lock and removed the page from the
> +		 * LRU. Once done copying, we'll unlock and readd to the
> +		 * LRU via release_pte_page(). If the page is still in the
> +		 * swapcache, we're the exclusive owner. Due to the page lock
> +		 * the page cannot be added to the swapcache until we're done
> +		 * and consequently it cannot be faulted in from the swapcache
> +		 * into another process.
> +		 */
>  		mod_node_page_state(page_pgdat(page),
>  				    NR_ISOLATED_ANON + page_is_file_lru(page),
>  				    compound_nr(page));
> --
> 2.34.1
>