linux-mm.kvack.org archive mirror
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Christoph Lameter <cl@gentwo.de>
Cc: Frederic Weisbecker <frederic@kernel.org>,
	atomlin@atomlin.com, tglx@linutronix.de, mingo@kernel.org,
	peterz@infradead.org, pauld@redhat.com, neelx@redhat.com,
	oleksandr@natalenko.name, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH v13 2/6] mm/vmstat: Use vmstat_dirty to track CPU-specific vmstat discrepancies
Date: Tue, 10 Jan 2023 17:09:06 -0300
Message-ID: <Y73F4tbfxT6Kb9kZ@tpad>
In-Reply-To: <7c2af941-42a9-a59b-6a20-b331a4934a3@gentwo.de>

On Tue, Jan 10, 2023 at 02:39:08PM +0100, Christoph Lameter wrote:
> On Tue, 10 Jan 2023, Frederic Weisbecker wrote:
> 
> > Note I'm absolutely clueless with vmstat. But I was wondering about it as well
> > while reviewing Marcelo's series, so git blame pointed me to:
> >
> > 7c83912062c801738d7d19acaf8f7fec25ea663c ("vmstat: User per cpu atomics to avoid
> > interrupt disable / enable")
> >
> > And this seems to mention that this can race with IRQs as well, hence the
> > local cmpxchg operation.
> 
> The race with irq could be an issue but I thought we avoided that and were
> content with disabling preemption.
> 
> But this issue illustrates the central problem of the patchset: It makes
> the lightweight counters not so lightweight anymore. 

Here are numbers from the test module at

https://lkml.iu.edu/hypermail/linux/kernel/0903.2/00569.html

with the following function added:

static void do_test_preempt(void)
{
        unsigned long flags;
        unsigned int i;
        cycles_t time1, time2, time;
        u32 rem;

        local_irq_save(flags);
        preempt_disable();
        time1 = get_cycles();
        for (i = 0; i < NR_LOOPS; i++) {
                preempt_disable();
                preempt_enable();
        }
        time2 = get_cycles();
        local_irq_restore(flags);
        preempt_enable();
        time = time2 - time1;

        printk(KERN_ALERT "test results: time for disabling/enabling preemption\n");
        printk(KERN_ALERT "number of loops: %d\n", NR_LOOPS);
        printk(KERN_ALERT "total time: %llu\n", time);
        time = div_u64_rem(time, NR_LOOPS, &rem);
        printk(KERN_ALERT "-> enabling/disabling preemption takes %llu cycles\n",
               time);
        printk(KERN_ALERT "test end\n");
}


model name	: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz

[  423.676079] test init
[  423.676249] test results: time for baseline
[  423.676405] number of loops: 200000
[  423.676676] total time: 104274
[  423.676910] -> baseline takes 0 cycles
[  423.677051] test end
[  423.678150] test results: time for locked cmpxchg
[  423.678353] number of loops: 200000
[  423.678498] total time: 2473839
[  423.678630] -> locked cmpxchg takes 12 cycles
[  423.678810] test end
[  423.679204] test results: time for non locked cmpxchg
[  423.679394] number of loops: 200000
[  423.679527] total time: 740298
[  423.679644] -> non locked cmpxchg takes 3 cycles
[  423.679817] test end
[  423.680755] test results: time for locked add return
[  423.680951] number of loops: 200000
[  423.681089] total time: 2118185
[  423.681229] -> locked add return takes 10 cycles
[  423.681411] test end
[  423.681846] test results: time for enabling interrupts (STI)
[  423.682063] number of loops: 200000
[  423.682209] total time: 861591
[  423.682335] -> enabling interrupts (STI) takes 4 cycles
[  423.682532] test end
[  423.683606] test results: time for disabling interrupts (CLI)
[  423.683852] number of loops: 200000
[  423.684006] total time: 2440756
[  423.684141] -> disabling interrupts (CLI) takes 12 cycles
[  423.684588] test end
[  423.686626] test results: time for disabling/enabling interrupts (STI/CLI)
[  423.686879] number of loops: 200000
[  423.687015] total time: 4802297
[  423.687139] -> enabling/disabling interrupts (STI/CLI) takes 24 cycles
[  423.687389] test end
[  423.688025] test results: time for disabling/enabling preemption
[  423.688258] number of loops: 200000
[  423.688396] total time: 1341001
[  423.688526] -> enabling/disabling preemption takes 6 cycles
[  423.689276] test end
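
A note on reading these figures: the test prints only the integer quotient
from div_u64_rem() and drops the remainder, which is why the baseline shows
0 cycles. Below is a minimal userspace sketch, not part of the test module,
that recovers the fractional part from the totals in the log. The names
per_iter, cycles_per_iter and print_per_iter are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Loop count as reported in the log above. */
#define NR_LOOPS 200000u

struct per_iter {
        uint64_t whole;  /* integer cycles per iteration (what the log prints) */
        uint32_t milli;  /* dropped fraction, in thousandths of a cycle */
};

/* Split total/loops into an integer part and a truncated 3-digit fraction. */
static struct per_iter cycles_per_iter(uint64_t total, uint32_t loops)
{
        struct per_iter r;
        uint64_t rem;

        assert(loops != 0);
        rem = total % loops;
        r.whole = total / loops;
        r.milli = (uint32_t)(rem * 1000 / loops);
        return r;
}

/* Print one result line, e.g. for a total taken from the log. */
static void print_per_iter(const char *name, uint64_t total)
{
        struct per_iter r = cycles_per_iter(total, NR_LOOPS);

        printf("%s: %llu.%03u cycles\n", name,
               (unsigned long long)r.whole, (unsigned)r.milli);
}
```

With the totals above this gives about 0.521 cycles per iteration for the
baseline, 6.705 for the preempt_disable()/preempt_enable() pair and 24.011
for the STI/CLI pair.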

> The basic primitives add a lot of weight.

I can't see any alternative, given the need to keep the CPU from being
interrupted by the work that syncs per-CPU vmstats to the global vmstats.

> And the per cpu atomic update operations require the modification of
> multiple values. The operation cannot be "atomic" in that sense anymore
> and we need some other form of synchronization that can span multiple
> instructions.

    So use this_cpu_cmpxchg() to avoid the overhead. Since we can no longer
    count on preemption being disabled we still have some minor issues.
    The fetching of the counter thresholds is racy.
    A threshold from another cpu may be applied if we happen to be
    rescheduled on another cpu.  However, the following vmstat operation
    will then bring the counter again under the threshold limit.

Those small issues are gone, OTOH.
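
For readers following along, the cmpxchg retry pattern that the quoted text
refers to can be sketched in userspace roughly as below. This is an
approximation under stated assumptions, not the kernel code: struct counter
and counter_mod are hypothetical names, C11 atomics stand in for
this_cpu_read()/this_cpu_cmpxchg() (so the compare-exchange here is a locked
operation, unlike the per-CPU one on x86), and the whole delta is folded into
the global counter once the threshold is exceeded.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for one per-CPU delta plus its global counter. */
struct counter {
        _Atomic signed char diff;  /* small local delta */
        signed char threshold;     /* fold into global once |diff| exceeds this */
        atomic_long global;        /* shared total */
};

static void counter_mod(struct counter *c, int delta)
{
        signed char o = atomic_load(&c->diff);
        long z;
        int n;

        assert(c->threshold >= 0);
        do {
                z = 0;
                n = o + delta;
                if (abs(n) > c->threshold) {
                        z = n;  /* overflow: move everything to the global */
                        n = 0;
                }
                /* On failure, o is reloaded with the current value and we retry. */
        } while (!atomic_compare_exchange_weak(&c->diff, &o, (signed char)n));

        if (z)
                atomic_fetch_add(&c->global, z);
}
```

The retry loop is what makes the update safe against an interrupt modifying
the same delta between the read and the write: a concurrent change makes the
compare-exchange fail and the new value is recomputed from scratch.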

