From: Shaohua Li <shaohua.li@intel.com>
To: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>,
Eric Dumazet <eric.dumazet@gmail.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH] percpu: preemptless __per_cpu_counter_add
Date: Wed, 27 Apr 2011 13:43:29 +0800
Message-ID: <1303883009.3981.316.camel@sli10-conroe>
In-Reply-To: <20110426121011.GD878@htj.dyndns.org>
On Tue, 2011-04-26 at 20:10 +0800, Tejun Heo wrote:
> Hello, please pardon the delay (and probably bad temper). I'm still sick
> & slow.
No problem.
> On Fri, Apr 22, 2011 at 10:33:00AM +0800, Shaohua Li wrote:
> > > And, no matter what, that's a separate issue from the this_cpu hot
> > > path optimizations and should be done separately. So, _please_ update
> > > this_cpu patch so that it doesn't change the slow path semantics.
> >
> > In the original implementation, an updater can change the count several
> > times too: it can move the per-cpu count from -(batch - 1) to (batch - 1)
> > without holding the lock, so we always have a deviation of
> > batch * num_cpus * 2.
>
> That would be a pathological case but, even then, after the change the
> number becomes much higher as it becomes a function of batch *
> num_updaters, right?
I don't understand the difference between batch * num_updaters and
batch * num_cpus, except for preemption. So the only problem here is that
_add should run with preemption disabled? I agree that preemption can make
the deviation worse.
Apart from the preemption issue, are there other concerns about the atomic
conversion? With preemption disabled, the deviation is the same before and
after the atomic conversion (batch * num_cpus).
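To make the batch * num_cpus figure concrete, here is a minimal single-threaded user-space model of the batching, not the kernel code itself. The names (struct model_counter, model_add, model_read, model_sum, model_max_deviation) and the fixed NR_CPUS are assumptions made up for this sketch. It only shows how much can sit in the per-cpu deltas, invisible in the global count, when no updater is preempted.

```c
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* Illustrative stand-in for a per-cpu counter, not the kernel struct. */
struct model_counter {
	int64_t count;			/* models the global count */
	int64_t pcpu[NR_CPUS];		/* models the per-cpu deltas */
};

/*
 * Model of _add on one CPU: keep the delta local until it reaches
 * +/- batch, then fold the whole delta into the global count.
 */
static void model_add(struct model_counter *c, int cpu,
		      int64_t amount, int64_t batch)
{
	int64_t v = c->pcpu[cpu] + amount;

	if (v >= batch || v <= -batch) {
		c->count += v;
		c->pcpu[cpu] = 0;
	} else {
		c->pcpu[cpu] = v;
	}
}

/* Cheap read: only the global count, ignoring the per-cpu deltas. */
static int64_t model_read(const struct model_counter *c)
{
	return c->count;
}

/* Full sum: global count plus every per-cpu delta. */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->count;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->pcpu[cpu];
	return sum;
}

/* Largest amount the per-cpu deltas can hide in one direction. */
static int64_t model_max_deviation(int64_t batch)
{
	return (batch - 1) * NR_CPUS;
}
```

With batch = 32 and 31 increments on each of the 4 modelled CPUs, model_read() still returns 0 while model_sum() returns 124, which equals model_max_deviation(32). One more increment on any CPU folds 32 into the global count.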
> I'll try to re-summarize my concerns as my communications don't seem
> to be getting through very well these few days (likely my fault).
>
> The biggest issue I have with the change is that with the suggested
> changes, the deviation seen by _sum becomes much less predictable.
> _sum can't be accurate. It never was and never will be, but the
> deviations have been quite predictable regardless of @batch. It's
> dependent only on the number and frequency of concurrent updaters.
>
> If concurrent updates aren't very frequent and numerous, the caller is
> guaranteed to get a result which deviates only by quite a small margin.
> If concurrent updates are very frequent and numerous, the caller
> naturally can't expect a very accurate result.
>
> However, after the change, especially with high @batch count, the
> result may deviate significantly even with low frequency concurrent
> updates. @batch deviations won't happen often but will happen once in
> a while, which is just nasty and makes the API much less useful and
> those occasional deviations can cause sporadic erratic behaviors -
> e.g. filesystems use it for free block accounting. It's actually used
> for somewhat critical decision making.
>
> If it were in the fast path, sure, we might and plan for slower
> contingencies where accuracy is more important, but we're talking
> about slow path already - it's visiting each per-cpu area for $DEITY's
> sake, so the tradeoff doesn't make a lot of sense to me.
>
> > If we really worry that _sum deviates too much, can we do something
> > like this:
> > percpu_counter_sum()
> > {
> > again:
> >         sum = 0;
> >         old = atomic64_read(&fbc->counter);
> >         for_each_online_cpu(cpu)
> >                 sum += per-cpu counter of cpu;
> >         new = atomic64_read(&fbc->counter);
> >         if (new - old > batch * num_cpus || old - new > batch * num_cpus)
> >                 goto again;
> >         return new + sum;
> > }
> > This way we limit the deviation to the number of concurrent updaters.
> > This doesn't make _sum too slow either, because we have the
> > batch * num_cpus check.
>
> I don't really worry about _sum performance. It's a quite slow path
> and most of the cost is from causing cacheline bounces anyway. That
> said, I don't see how the above would help the deviation problem.
> Let's say an updater reset the per-cpu counter but got preempted before
> updating the global counter. What difference does it make to check
> fbc->counter before & after like above?
Yes, this is a problem. Again, I don't mind disabling preemption in _add.
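The window in question can be sketched with a small single-threaded user-space model. This is an illustration only, not the patch: fold_begin, fold_finish and model_sum are hypothetical names, and the "preemption" is simulated by calling model_sum between the two steps of the fold.

```c
#include <stdint.h>

#define NR_CPUS 2	/* hypothetical CPU count for this model */

/* Illustrative stand-in for a per-cpu counter, not the kernel struct. */
struct model_counter {
	int64_t count;			/* models the global count */
	int64_t pcpu[NR_CPUS];		/* models the per-cpu deltas */
};

/*
 * Step 1 of a lockless fold: take the pending delta out of the per-cpu
 * slot. Until fold_finish() runs, this amount is in neither place.
 */
static int64_t fold_begin(struct model_counter *c, int cpu)
{
	int64_t inflight = c->pcpu[cpu];

	c->pcpu[cpu] = 0;
	return inflight;
}

/* Step 2: publish the in-flight amount to the global count. */
static void fold_finish(struct model_counter *c, int64_t inflight)
{
	c->count += inflight;
}

/* Full sum as a reader would compute it: global plus per-cpu deltas. */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->count;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->pcpu[cpu];
	return sum;
}
```

If a sum runs between fold_begin() and fold_finish(), it misses the whole in-flight amount, which is at least batch in magnitude in the real fold. Reading the global count before and after the per-cpu walk cannot detect this, because the global count has not changed yet. Each updater stuck in that window adds its own missing amount, which is where the batch * num_updaters figure comes from.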
Thanks,
Shaohua