From: Chuck Lever <chuck.lever@oracle.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"kuba@kernel.org" <kuba@kernel.org>
Subject: Re: [PATCH RFC] SUNRPC: Refresh rq_pages using a bulk page allocator
Date: Mon, 22 Feb 2021 14:58:04 +0000
Message-ID: <33A16CEA-24CA-447A-AE8C-824771E9B3E1@oracle.com>
In-Reply-To: <20210222093505.GG3697@techsingularity.net>
> On Feb 22, 2021, at 4:35 AM, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Mon, Feb 15, 2021 at 11:06:07AM -0500, Chuck Lever wrote:
>> Reduce the rate at which nfsd threads hammer on the page allocator.
>> This improves throughput scalability by enabling the nfsd threads to
>> run more independently of each other.
>>
>
> Sorry this is taking so long, there is a lot going on.
>
> This patch has pre-requisites that are not in mainline which makes it
> harder to evaluate what the semantics of the API should be.
>
>> @@ -659,19 +659,33 @@ static int svc_alloc_arg(struct svc_rqst *rqstp)
>> /* use as many pages as possible */
>> pages = RPCSVC_MAXPAGES;
>> }
>> - for (i = 0; i < pages ; i++)
>> - while (rqstp->rq_pages[i] == NULL) {
>> - struct page *p = alloc_page(GFP_KERNEL);
>> - if (!p) {
>> - set_current_state(TASK_INTERRUPTIBLE);
>> - if (signalled() || kthread_should_stop()) {
>> - set_current_state(TASK_RUNNING);
>> - return -EINTR;
>> - }
>> - schedule_timeout(msecs_to_jiffies(500));
>> +
>> + for (needed = 0, i = 0; i < pages ; i++)
>> + if (!rqstp->rq_pages[i])
>> + needed++;
>> + if (needed) {
>> + LIST_HEAD(list);
>> +
>> +retry:
>> + alloc_pages_bulk(GFP_KERNEL, 0,
>> + /* to test the retry logic: */
>> + min_t(unsigned long, needed, 13),
>> + &list);
>> + for (i = 0; i < pages; i++) {
>> + if (!rqstp->rq_pages[i]) {
>> + struct page *page;
>> +
>> + page = list_first_entry_or_null(&list,
>> + struct page,
>> + lru);
>> + if (unlikely(!page))
>> + goto empty_list;
>> + list_del(&page->lru);
>> + rqstp->rq_pages[i] = page;
>> + needed--;
>> }
>> - rqstp->rq_pages[i] = p;
>> }
>> + }
>> rqstp->rq_page_end = &rqstp->rq_pages[pages];
>> rqstp->rq_pages[pages] = NULL; /* this might be seen in nfsd_splice_actor() */
>>
>
> There is a conflict at the end where rq_page_end gets updated. The 5.11
> code assumes that the loop around the allocator definitely gets all
> the required pages. What tree is this patch based on and is it going in
> during this merge window? While the conflict is "trivial" to resolve,
> it would be buggy because on retry, "i" will be pointing to the wrong
> index and pages potentially leak. Rather than guessing, I'd prefer to
> base a series on code you've tested.

I posted this patch as a proof of concept. There is a clean-up patch
that goes before it to deal properly with rq_page_end. I can post
both if you really want to apply this and play with it.
> The slowpath for the bulk allocator also sucks a bit for the semantics
> required by this caller. As the bulk allocator does not walk the zonelist,
> it can return failures prematurely -- fine for an optimistic bulk allocator
> that can return a subset of pages but not for this caller which really
> wants those pages. The allocator may need NOFAIL-like semantics to walk
> the zonelist if the caller really requires success or at least walk the
> zonelist if the preferred zone is low on pages. This patch would also
> need to preserve the schedule_timeout behaviour so it does not use a lot
> of CPU time retrying allocations in the presence of memory pressure.

Waiting half a second before trying again seems like overkill, though.
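
For discussion, here is a quick userspace mock (not kernel code) of the
re-scan-on-retry flow the patch is aiming for. bulk_alloc() is a stand-in
for alloc_pages_bulk() that deliberately grants at most 13 items per call,
mimicking the min_t(..., 13) test cap in the RFC; the point is that each
retry re-scans from index 0 and skips already-filled slots, so nothing is
leaked or overwritten even when the allocator returns a partial batch:

```c
/* Userspace mock of the svc_alloc_arg() retry flow; not kernel code.
 * bulk_alloc() stands in for alloc_pages_bulk() and deliberately
 * grants at most BATCH items per call to exercise the retry path.
 */
#include <assert.h>

#define NPAGES 19
#define BATCH  13	/* mimics the min_t(..., 13) test cap in the patch */

static int next_id = 1;

/* Hand out up to min(needed, BATCH) fake "pages"; return how many. */
static int bulk_alloc(int needed, int *out)
{
	int granted = needed < BATCH ? needed : BATCH;

	for (int i = 0; i < granted; i++)
		out[i] = next_id++;
	return granted;
}

/* Fill every empty (0) slot, re-scanning from index 0 on each pass so
 * already-filled slots are skipped. Returns the number of allocator
 * calls it took; a real caller would sleep briefly between passes.
 */
static int fill_pages(int pages[NPAGES])
{
	int passes = 0;

	for (;;) {
		int needed = 0;

		for (int i = 0; i < NPAGES; i++)
			if (!pages[i])
				needed++;
		if (!needed)
			return passes;

		int got[NPAGES];
		int granted = bulk_alloc(needed, got);
		int j = 0;

		for (int i = 0; i < NPAGES && j < granted; i++)
			if (!pages[i])
				pages[i] = got[j++];
		passes++;	/* back off here under memory pressure */
	}
}
```

With 19 empty slots and a 13-item cap, the loop completes in two passes
without touching the slots filled on the first pass.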
--
Chuck Lever