Date: Tue, 12 Apr 2022 17:06:01 +0200
Subject: Re: [PATCH v10 2/5] mm: page_isolation: check specified range for unmovable pages
From: David Hildenbrand
Organization: Red Hat
To: Zi Yan
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, Vlastimil Babka, Mel Gorman, Eric Ren, Mike Rapoport, Oscar Salvador, Christophe Leroy
References: <20220406151858.3149821-1-zi.yan@sent.com> <20220406151858.3149821-3-zi.yan@sent.com> <039317AF-985C-43D7-BB99-714DD6022B5C@nvidia.com> <428828e1-6b59-8db7-62e0-1429863cb13f@redhat.com>

On 12.04.22 17:01, Zi Yan wrote:
> On 12 Apr 2022, at 10:49, David Hildenbrand wrote:
>
>> On 12.04.22 16:07, Zi Yan wrote:
>>> On 12 Apr 2022, at 9:10, David Hildenbrand wrote:
>>>
>>>> On 06.04.22 17:18, Zi Yan wrote:
>>>>> From: Zi Yan
>>>>>
>>>>> Enable set_migratetype_isolate() to check a specified sub-range for
>>>>> unmovable pages during isolation. Page isolation is done at
>>>>> MAX_ORDER_NR_PAGES granularity, but not all pages within that
>>>>> granularity are intended to be isolated. For example,
>>>>> alloc_contig_range(), which uses page isolation, allows ranges
>>>>> without alignment. This commit makes the unmovable page check look
>>>>> only at pages within the range of interest, so that page isolation
>>>>> can succeed for any non-overlapping ranges.
>>>>>
>>>>> Signed-off-by: Zi Yan
>>>>> ---
>>>>
>>>> [...]
>>>>
>>>>>  /*
>>>>> - * This function checks whether pageblock includes unmovable pages or not.
>>>>> + * This function checks whether the range [start_pfn, end_pfn) includes
>>>>> + * unmovable pages or not. The range must fall into a single pageblock and
>>>>> + * consequently belong to a single zone.
>>>>>   *
>>>>>   * PageLRU check without isolation or lru_lock could race so that
>>>>>   * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
>>>>> @@ -28,12 +30,14 @@
>>>>>   * cannot get removed (e.g., via memory unplug) concurrently.
>>>>>   *
>>>>>   */
>>>>> -static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>> -					int migratetype, int flags)
>>>>> +static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long end_pfn,
>>>>> +					int migratetype, int flags)
>>>>>  {
>>>>> -	unsigned long iter = 0;
>>>>> -	unsigned long pfn = page_to_pfn(page);
>>>>> -	unsigned long offset = pfn % pageblock_nr_pages;
>>>>> +	unsigned long pfn = start_pfn;
>>>>> +	struct page *page = pfn_to_page(pfn);
>>>>
>>>>
>>>> Just do
>>>>
>>>> struct page *page = pfn_to_page(start_pfn);
>>>> struct zone *zone = page_zone(page);
>>>>
>>>> here. No need to look up the zone again in the loop because, as you
>>>> document, the range "must ... belong to a single zone".
>>>>
>>>> Then, there is also no need to initialize "pfn" here; doing it in the
>>>> loop header is sufficient.
>>>>
>>>
>>> Sure.
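[For illustration, David's suggested simplification would shape the
function prologue roughly as in the sketch below. This is an editorial
sketch only, not code from the thread; the per-page checks from the
patch are elided as comments.]

static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long end_pfn,
					int migratetype, int flags)
{
	/* The range sits in a single pageblock, hence a single zone: look it up once. */
	struct page *page = pfn_to_page(start_pfn);
	struct zone *zone = page_zone(page);
	unsigned long pfn;

	VM_BUG_ON(ALIGN_DOWN(start_pfn, pageblock_nr_pages) !=
		  ALIGN_DOWN(end_pfn - 1, pageblock_nr_pages));

	/* ... CMA pageblock check on "page", as in the patch ... */

	/* "pfn" is initialized in the loop header only. */
	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		page = pfn_to_page(pfn);
		/* ... per-page checks, reusing the single "zone" from above ... */
	}
	return NULL;
}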
>>>
>>>>> +
>>>>> +	VM_BUG_ON(ALIGN_DOWN(start_pfn, pageblock_nr_pages) !=
>>>>> +		  ALIGN_DOWN(end_pfn - 1, pageblock_nr_pages));
>>>>>
>>>>>  	if (is_migrate_cma_page(page)) {
>>>>>  		/*
>>>>> @@ -47,8 +51,11 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>  		return page;
>>>>>  	}
>>>>>
>>>>> -	for (; iter < pageblock_nr_pages - offset; iter++) {
>>>>> -		page = pfn_to_page(pfn + iter);
>>>>> +	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
>>>>> +		struct zone *zone;
>>>>> +
>>>>> +		page = pfn_to_page(pfn);
>>>>> +		zone = page_zone(page);
>>>>>
>>>>>  		/*
>>>>>  		 * Both, bootmem allocations and memory holes are marked
>>>>> @@ -85,7 +92,7 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>  		}
>>>>>
>>>>>  		skip_pages = compound_nr(head) - (page - head);
>>>>> -		iter += skip_pages - 1;
>>>>> +		pfn += skip_pages - 1;
>>>>>  		continue;
>>>>>  	}
>>>>>
>>>>> @@ -97,7 +104,7 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>  		 */
>>>>>  		if (!page_ref_count(page)) {
>>>>>  			if (PageBuddy(page))
>>>>> -				iter += (1 << buddy_order(page)) - 1;
>>>>> +				pfn += (1 << buddy_order(page)) - 1;
>>>>>  			continue;
>>>>>  		}
>>>>>
>>>>> @@ -134,11 +141,18 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>>>>>  	return NULL;
>>>>>  }
>>>>>
>>>>> -static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags)
>>>>> +/*
>>>>> + * This function sets the pageblock migratetype to MIGRATE_ISOLATE if no
>>>>> + * unmovable page is present in [start_pfn, end_pfn). The pageblock must
>>>>> + * intersect with [start_pfn, end_pfn).
>>>>> + */
>>>>> +static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags,
>>>>> +				   unsigned long start_pfn, unsigned long end_pfn)
>>>>
>>>> I think we might be able to do better, eventually not passing start_pfn
>>>> at all. Hmm.
>>>
>>> IMHO, having start_pfn and end_pfn in the parameter list would make the
>>> interface easier to understand. Otherwise, if we remove start_pfn,
>>> the caller needs to adjust @page to be within the range of [start_pfn,
>>> end_pfn).
>>>
>>>>
>>>> I think we want to pull out the
>>>> start_isolate_page_range()/undo_isolate_page_range() interface change
>>>> into a separate patch.
>>>
>>> You mean a patch just adding
>>>
>>> unsigned long isolate_start = pfn_max_align_down(start_pfn);
>>> unsigned long isolate_end = pfn_max_align_up(end_pfn);
>>>
>>> in start_isolate_page_range()/undo_isolate_page_range()?
>>>
>>> Yes, I can do that.
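[Concretely, the separate patch Zi describes would make
start_isolate_page_range() own the alignment, along the lines of the
sketch below. This is an editorial sketch modeled on the existing
mm/page_isolation.c loop, not the actual patch; error handling is
simplified.]

int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
			     unsigned migratetype, int flags)
{
	/* Widen to the MAX_ORDER-aligned range that isolation operates on. */
	unsigned long isolate_start = pfn_max_align_down(start_pfn);
	unsigned long isolate_end = pfn_max_align_up(end_pfn);
	unsigned long pfn;

	for (pfn = isolate_start; pfn < isolate_end; pfn += pageblock_nr_pages) {
		struct page *page = __first_valid_page(pfn, pageblock_nr_pages);

		/* The unmovable-page check still targets the caller's range. */
		if (page && set_migratetype_isolate(page, migratetype, flags,
						    start_pfn, end_pfn)) {
			undo_isolate_page_range(isolate_start, pfn, migratetype);
			return -EBUSY;
		}
	}
	return 0;
}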
>>
>> I think we have to be careful with memory onlining/offlining. There are
>> corner cases where we get called with only pageblock alignment and
>> must not adjust the range.
>
> In the patch below, you added a new set of start_isolate_pageblocks()
> and undo_isolate_pageblocks(), where start_isolate_pageblocks() still
> calls set_migratetype_isolate() and noted that their range should not
> be adjusted. But in my patch, set_migratetype_isolate() adjusts the
> range for the has_unmovable_pages() check. Do you mean

Right, that's broken in your patch. Memory onlining/offlining behavior
changed recently when "vmemmap on memory" was added. The start range
might only be aligned to pageblocks but not to MAX_ORDER - 1 -- and we
must not align up in that case.

The core thing is that there are two types of users: memory offlining,
which knows what it's doing when it aligns to less than MAX_ORDER - 1,
and range allocators, which just pass in the range of interest.

> start_isolate_pageblocks() should call a different version of
> set_migratetype_isolate() without range adjustment? That can be done
> with an additional parameter in start_isolate_page_range(), like
> bool strict, right?

Random boolean flags are in general frowned upon ;)

-- 
Thanks,

David / dhildenb
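[To make the alternatives concrete: the split Zi describes from David's
draft would separate the two kinds of users at the interface level,
roughly as in the declarations below. Only the function names come from
the thread; the signatures are a guess. If a single entry point were
kept instead, the idiomatic route in this code would be a named
isol_flags bit, like the existing MEMORY_OFFLINE and REPORT_FAILURE
flags, rather than a bare bool.]

/*
 * For range allocators such as alloc_contig_range(): the passed range is
 * the range of interest and may be widened internally for isolation.
 */
int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
			     unsigned migratetype, int flags);

/*
 * For memory offlining: the caller passes pageblock-aligned ranges that
 * must be isolated exactly as given, never adjusted.
 */
int start_isolate_pageblocks(unsigned long start_pfn, unsigned long end_pfn,
			     unsigned migratetype, int flags);
void undo_isolate_pageblocks(unsigned long start_pfn, unsigned long end_pfn,
			     unsigned migratetype);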