Date: Tue, 16 Feb 2021 13:30:31 -0800
From: Minchan Kim
To: Matthew Wilcox
Cc: Andrew Morton, linux-mm, LKML, cgoldswo@codeaurora.org,
 linux-fsdevel@vger.kernel.org, mhocko@suse.com, david@redhat.com,
 vbabka@suse.cz, viro@zeniv.linux.org.uk, joaodias@google.com
Subject: Re: [RFC 1/2] mm: disable LRU pagevec during the migration temporarily
References: <20210216170348.1513483-1-minchan@kernel.org>
 <20210216182242.GJ2858050@casper.infradead.org>
In-Reply-To: <20210216182242.GJ2858050@casper.infradead.org>

On Tue, Feb 16, 2021 at 06:22:42PM +0000, Matthew Wilcox wrote:
> On Tue, Feb 16, 2021 at 09:03:47AM -0800, Minchan Kim wrote:
> > LRU pagevec holds refcount of pages until the pagevec are drained.
> > It could prevent migration since the refcount of the page is greater
> > than the expection in migration logic. To mitigate the issue,
> > callers of migrate_pages drains LRU pagevec via migrate_prep or
> > lru_add_drain_all before migrate_pages call.
> >
> > However, it's not enough because pages coming into pagevec after the
> > draining call still could stay at the pagevec so it could keep
> > preventing page migration. Since some callers of migrate_pages have
> > retrial logic with LRU draining, the page would migrate at next trail
> > but it is still fragile in that it doesn't close the fundamental race
> > between upcoming LRU pages into pagvec and migration so the migration
> > failure could cause contiguous memory allocation failure in the end.
>
> Have you been able to gather any numbers on this?  eg does migration
> now succeed 5% more often?

What I measured was how many times migrate_pages retried in force mode,
using the debug code below. The test launched Android apps while CMA
allocation ran in the background. The total CMA allocation count was
about 500 over the entire test, and I saw about 400 retries with the
debug code below. With this patchset (with a bug fix applied), the retry
count dropped below 30.

diff --git a/mm/migrate.c b/mm/migrate.c
index 04a98bb2f568..caa661be2d16 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1459,6 +1459,11 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 						private, page, pass > 2, mode,
 						reason);
 
+			if (rc && reason == MR_CONTIG_RANGE && pass > 2) {
+				printk(KERN_ERR "pfn 0x%lx reason %d\n", page_to_pfn(page), rc);
+				dump_page(page, "fail to migrate");
+			}
+