From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70B36C433E0 for ; Wed, 23 Dec 2020 03:39:10 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 14573223E8 for ; Wed, 23 Dec 2020 03:39:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 14573223E8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 3A5166B0098; Tue, 22 Dec 2020 22:39:09 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 32EE56B009B; Tue, 22 Dec 2020 22:39:09 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 21DA78D0014; Tue, 22 Dec 2020 22:39:09 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0104.hostedemail.com [216.40.44.104]) by kanga.kvack.org (Postfix) with ESMTP id 06A696B0098 for ; Tue, 22 Dec 2020 22:39:09 -0500 (EST) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id C23DC180AD837 for ; Wed, 23 Dec 2020 03:39:08 +0000 (UTC) X-FDA: 77623141176.01.limit82_6010a2827465 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin01.hostedemail.com (Postfix) with ESMTP id A680910046490 for ; Wed, 23 Dec 2020 03:39:08 +0000 (UTC) X-HE-Tag: limit82_6010a2827465 X-Filterd-Recvd-Size: 8094 Received: from mail-lf1-f54.google.com (mail-lf1-f54.google.com [209.85.167.54]) by imf23.hostedemail.com (Postfix) with ESMTP for ; Wed, 23 Dec 2020 03:39:08 +0000 (UTC) Received: by mail-lf1-f54.google.com with SMTP id s26so36864895lfc.8 for ; Tue, 22 Dec 2020 19:39:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=O+L5nyOzWWMqEdUIpe5JXr21QALODnzsWLxiUcKPjas=; b=dDs6JYxVtvejeDtqdCy8ki6p1iBg9mDU8NZZHInY3C+LMOelK3F2qW3IH1YRqmrRjW voieSGd4gDv9lEMtFNur1Vq9EXfuvdnu+/AAO77kG4oV9kgogGzl9kkynw+VXBTDkWT8 fr/X6AEk1ujIjFVjwgI4d8NJ/OpP7Wzdw2xQvSdfKxVHuuBEJAMUHeKDXqJWWlDqm7mc apz0UZcFLtdyfZMpuYg14Kyd9qIZRsTHBzZGoquSZYmX9YPZlhopnmYPpAF7WshCx5jX 0orYS69921A/GYVFHzhWp1OLpKc+tDkiwkbU2hnjHoAwrye1fQ/rp85UfPPXKerQ4Uqc Ut+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=O+L5nyOzWWMqEdUIpe5JXr21QALODnzsWLxiUcKPjas=; b=Nh6GFKqiQlsjo/F/rFcuw/UStYklQXtoT/Dvo7OQjGWBUxexinGw0F5zAMkwP5dR/C BZvKAiZqG25DquplRpf+eCwzB/K/fWEhiU85232ri4T0YSbsYB74MHs491pF0ndU17em xD7WRXVrGYHvAmCKi4nLo0I5vH7GJWTjJZDU8heYpdhOxur4FsKsAwrTHIDxLVaJ1E4O vh1r4OLjtPKBWreZLdDtl1ct+hMS8H7rslvfprCIAlG+oIKm8jjwx/O7xPu0g9Yg6Cny e5eiNIzA+i3ARKD1cRvlUZnKd0qwkg5LZXI0j3qiP7pDmPxrJRlhKUf7AF59GkFWbJPM O9Cw== X-Gm-Message-State: AOAM531lAgMB1BiFdkhiAjcU2+yTGxFH7yRVVH+OedqXDfT2TC/2NFJ/ NOHPXwgN1V8gUI3aH82i+VYskHhj5ELB7J8jTDY= X-Google-Smtp-Source: ABdhPJymzrqQQDUgKITWsNcm3WJsGpsxFXvhUrlb5SiuxiXJIHgzuljNms0LEFw873DDo+wDJ63ogvZ4tGhx++pgMSc= X-Received: by 2002:a05:651c:1068:: with SMTP id y8mr10587775ljm.76.1608694746482; Tue, 22 Dec 2020 19:39:06 -0800 (PST) MIME-Version: 1.0 References: <20201222074656.GA30035@open-light-1.localdomain> In-Reply-To: From: Liang Li Date: Wed, 23 Dec 2020 11:38:54 +0800 Message-ID: Subject: Re: [RFC PATCH 1/3] mm: support hugetlb free page reporting To: Alexander Duyck Cc: Alexander Duyck , Mel Gorman , Andrew Morton , Andrea Arcangeli , Dan Williams , "Michael S. Tsirkin" , David Hildenbrand , Jason Wang , Dave Hansen , Michal Hocko , Liang Li , Mike Kravetz , linux-mm , LKML , virtualization@lists.linux-foundation.org, qemu-devel@nongnu.org Content-Type: text/plain; charset="UTF-8" X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: > > +hugepage_reporting_cycle(struct page_reporting_dev_info *prdev, > > + struct hstate *h, unsigned int nid, > > + struct scatterlist *sgl, unsigned int *offset) > > +{ > > + struct list_head *list = &h->hugepage_freelists[nid]; > > + unsigned int page_len = PAGE_SIZE << h->order; > > + struct page *page, *next; > > + long budget; > > + int ret = 0, scan_cnt = 0; > > + > > + /* > > + * Perform early check, if free area is empty there is > > + * nothing to process so we can skip this free_list. > > + */ > > + if (list_empty(list)) > > + return ret; > > + > > + spin_lock_irq(&hugetlb_lock); > > + > > + if (huge_page_order(h) > MAX_ORDER) > > + budget = HUGEPAGE_REPORTING_CAPACITY; > > + else > > + budget = HUGEPAGE_REPORTING_CAPACITY * 32; > > Wouldn't huge_page_order always be more than MAX_ORDER? Seems like we > don't even really need budget since this should probably be pulling > out no more than one hugepage at a time. I want to disting a 2M page and 1GB page here. The order of 1GB page is greater than MAX_ORDER while 2M page's order is less than MAX_ORDER. > > > + /* loop through free list adding unreported pages to sg list */ > > + list_for_each_entry_safe(page, next, list, lru) { > > + /* We are going to skip over the reported pages. */ > > + if (PageReported(page)) { > > + if (++scan_cnt >= MAX_SCAN_NUM) { > > + ret = scan_cnt; > > + break; > > + } > > + continue; > > + } > > + > > It would probably have been better to place this set before your new > set. I don't see your new set necessarily being the best use for page > reporting. I haven't really latched on to what you mean, could you explain it again? > > > + /* > > + * If we fully consumed our budget then update our > > + * state to indicate that we are requesting additional > > + * processing and exit this list. > > + */ > > + if (budget < 0) { > > + atomic_set(&prdev->state, PAGE_REPORTING_REQUESTED); > > + next = page; > > + break; > > + } > > + > > If budget is only ever going to be 1 then we probably could just look > at making this the default case for any time we find a non-reported > page. and here again. > > + /* Attempt to pull page from list and place in scatterlist */ > > + if (*offset) { > > + isolate_free_huge_page(page, h, nid); > > + /* Add page to scatter list */ > > + --(*offset); > > + sg_set_page(&sgl[*offset], page, page_len, 0); > > + > > + continue; > > + } > > + > > There is no point in the continue case if we only have a budget of 1. > We should probably just tighten up the loop so that all it does is > search until it finds the 1 page it can pull, pull it, and then return > it. The scatterlist doesn't serve much purpose and could be reduced to > just a single entry. I will think about it more. > > +static int > > +hugepage_reporting_process_hstate(struct page_reporting_dev_info *prdev, > > + struct scatterlist *sgl, struct hstate *h) > > +{ > > + unsigned int leftover, offset = HUGEPAGE_REPORTING_CAPACITY; > > + int ret = 0, nid; > > + > > + for (nid = 0; nid < MAX_NUMNODES; nid++) { > > + ret = hugepage_reporting_cycle(prdev, h, nid, sgl, &offset); > > + > > + if (ret < 0) > > + return ret; > > + } > > + > > + /* report the leftover pages before going idle */ > > + leftover = HUGEPAGE_REPORTING_CAPACITY - offset; > > + if (leftover) { > > + sgl = &sgl[offset]; > > + ret = prdev->report(prdev, sgl, leftover); > > + > > + /* flush any remaining pages out from the last report */ > > + spin_lock_irq(&hugetlb_lock); > > + hugepage_reporting_drain(prdev, h, sgl, leftover, !ret); > > + spin_unlock_irq(&hugetlb_lock); > > + } > > + > > + return ret; > > +} > > + > > If HUGEPAGE_REPORTING_CAPACITY is 1 it would make more sense to > rewrite this code to just optimize for a find and process a page > approach rather than trying to batch pages. Yes, I will make a change. Thanks for your comments! Liang