From: Chuck Lever <chuck.lever@oracle.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"kuba@kernel.org" <kuba@kernel.org>
Subject: Re: [PATCH RFC] SUNRPC: Refresh rq_pages using a bulk page allocator
Date: Mon, 22 Feb 2021 14:58:04 +0000
Message-ID: <33A16CEA-24CA-447A-AE8C-824771E9B3E1@oracle.com>
In-Reply-To: <20210222093505.GG3697@techsingularity.net>
> On Feb 22, 2021, at 4:35 AM, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Mon, Feb 15, 2021 at 11:06:07AM -0500, Chuck Lever wrote:
>> Reduce the rate at which nfsd threads hammer on the page allocator.
>> This improves throughput scalability by enabling the nfsd threads to
>> run more independently of each other.
>>
>
> Sorry this is taking so long, there is a lot going on.
>
> This patch has pre-requisites that are not in mainline which makes it
> harder to evaluate what the semantics of the API should be.
>
>> @@ -659,19 +659,33 @@ static int svc_alloc_arg(struct svc_rqst *rqstp)
>> /* use as many pages as possible */
>> pages = RPCSVC_MAXPAGES;
>> }
>> - for (i = 0; i < pages ; i++)
>> - while (rqstp->rq_pages[i] == NULL) {
>> - struct page *p = alloc_page(GFP_KERNEL);
>> - if (!p) {
>> - set_current_state(TASK_INTERRUPTIBLE);
>> - if (signalled() || kthread_should_stop()) {
>> - set_current_state(TASK_RUNNING);
>> - return -EINTR;
>> - }
>> - schedule_timeout(msecs_to_jiffies(500));
>> +
>> + for (needed = 0, i = 0; i < pages ; i++)
>> + if (!rqstp->rq_pages[i])
>> + needed++;
>> + if (needed) {
>> + LIST_HEAD(list);
>> +
>> +retry:
>> + alloc_pages_bulk(GFP_KERNEL, 0,
>> + /* to test the retry logic: */
>> + min_t(unsigned long, needed, 13),
>> + &list);
>> + for (i = 0; i < pages; i++) {
>> + if (!rqstp->rq_pages[i]) {
>> + struct page *page;
>> +
>> + page = list_first_entry_or_null(&list,
>> + struct page,
>> + lru);
>> + if (unlikely(!page))
>> + goto empty_list;
>> + list_del(&page->lru);
>> + rqstp->rq_pages[i] = page;
>> + needed--;
>> }
>> - rqstp->rq_pages[i] = p;
>> }
>> + }
>> rqstp->rq_page_end = &rqstp->rq_pages[pages];
>> rqstp->rq_pages[pages] = NULL; /* this might be seen in nfsd_splice_actor() */
>>
>
> There is a conflict at the end where rq_page_end gets updated. The 5.11
> code assumes that the loop around the allocator definitely gets all
> the required pages. What tree is this patch based on and is it going in
> during this merge window? While the conflict is "trivial" to resolve,
> it would be buggy because on retry, "i" will be pointing to the wrong
> index and pages potentially leak. Rather than guessing, I'd prefer to
> base a series on code you've tested.

I posted this patch as a proof of concept. There is a clean-up patch
that goes before it to deal properly with rq_page_end. I can post
both if you really want to apply this and play with it.
> The slowpath for the bulk allocator also sucks a bit for the semantics
> required by this caller. As the bulk allocator does not walk the zonelist,
> it can return failures prematurely -- fine for an optimistic bulk allocator
> that can return a subset of pages but not for this caller which really
> wants those pages. The allocator may need NOFAIL-like semantics to walk
> the zonelist if the caller really requires success or at least walk the
> zonelist if the preferred zone is low on pages. This patch would also
> need to preserve the schedule_timeout behaviour so it does not use a lot
> of CPU time retrying allocations in the presence of memory pressure.

Waiting half a second before trying again seems like overkill, though.
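
For discussion, here is a quick userspace mock (not kernel code) of the
re-scan-on-retry flow the patch is aiming for. bulk_alloc() is a stand-in
for alloc_pages_bulk() that deliberately grants at most 13 items per call,
mimicking the min_t(..., 13) test cap in the RFC; the point is that each
retry re-scans from index 0 and skips already-filled slots, so nothing is
leaked or overwritten even when the allocator returns a partial batch:

```c
/* Userspace mock of the svc_alloc_arg() retry flow; not kernel code.
 * bulk_alloc() stands in for alloc_pages_bulk() and deliberately
 * grants at most BATCH items per call to exercise the retry path.
 */
#include <assert.h>

#define NPAGES 19
#define BATCH  13	/* mimics the min_t(..., 13) test cap in the patch */

static int next_id = 1;

/* Hand out up to min(needed, BATCH) fake "pages"; return how many. */
static int bulk_alloc(int needed, int *out)
{
	int granted = needed < BATCH ? needed : BATCH;

	for (int i = 0; i < granted; i++)
		out[i] = next_id++;
	return granted;
}

/* Fill every empty (0) slot, re-scanning from index 0 on each pass so
 * already-filled slots are skipped. Returns the number of allocator
 * calls it took; a real caller would sleep briefly between passes.
 */
static int fill_pages(int pages[NPAGES])
{
	int passes = 0;

	for (;;) {
		int needed = 0;

		for (int i = 0; i < NPAGES; i++)
			if (!pages[i])
				needed++;
		if (!needed)
			return passes;

		int got[NPAGES];
		int granted = bulk_alloc(needed, got);
		int j = 0;

		for (int i = 0; i < NPAGES && j < granted; i++)
			if (!pages[i])
				pages[i] = got[j++];
		passes++;	/* back off here under memory pressure */
	}
}
```

With 19 empty slots and a 13-item cap, the loop completes in two passes
without touching the slots filled on the first pass.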
--
Chuck Lever