linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Manfred Spraul <manfreds@colorfullife.com>
To: Kanoj Sarcar <kanoj@google.engr.sgi.com>
Cc: linux-kernel@vger.rutgers.edu, linux-mm@kvack.org
Subject: Re: zap_page_range(): TLB flush race
Date: Sun, 09 Apr 2000 00:46:15 +0200	[thread overview]
Message-ID: <38EFB6B7.F737B82A@colorfullife.com> (raw)
In-Reply-To: <200004082111.OAA73647@google.engr.sgi.com>

Kanoj Sarcar wrote:
> 
> >
> > it seems we have a smp race in zap_page_range():
> >
> > When we remove a page from the page tables, we must call:
> >
> >       flush_cache_page();
> >       pte_clear();
> >       flush_tlb_page();
> >       free_page();
> >
> > We must not free the page before we have called flush_tlb_xy(),
> > otherwise the second cpu could access memory that already freed.
> >
> > but zap_page_range() calls free_page() before the flush_tlb() call.
> >
> > Is that really a bug, has anyone a good idea how to fix that?
> 
> Why do you think this is a bug? After the pte_clear, we need to flush
> tlb, so that if anyone wants to drag in the mapping (by accessing the
> virtual address), he will fault (since translation is not in tlb) and
> wait on mmap_sem.

Because the second cpu can use a stale tlb entry. We must free the page
_after_ the flush, not before the flush.

Example:
cpu1, cpu2: execute 2 threads from one process.
cpu3: unrelated thread that allocates a new page.

cpu1:
1) writes to one page in a tight loop. The tlb entry won't be discared
by the cpu without an explicit flush.

cpu2:
2) sys_munmap()
* zap_page_range(): calls free_page() for each page in the area.
do_munmap() for a 500 MB block will take a few milliseconds.

cpu3:
3) somewhere: get_free_page(). Now it gets a pointer to a page that cpu1
still writes to.

cpu2:
4) zap_page_range() returns, now the tlb is flushed.

cpu1:
5) received the ipi, the local tlb is flushed, page fault.
* but: cpu1 stomped on the page that was allocated by cpu3.


> 
> But a race does exist in establish_pte(), when the flush_tlb happens
> _before_ the set_pte(), another thread might drag in the old translation
> on a different cpu.
> 

Yes, establish_pte() is broken. We should reverse the calls:

	set_pte(); /* update the kernel page tables */
	update_mmu(); /* update architecture specific page tables. */
	flush_tlb();  /* and flush the hardware tlb */


> > 
> > filemap_sync() calls flush_tlb_page() for each page, but IMHO this is a
> > really bad idea, the performance will suck with multi-threaded apps on
> > SMP.
> 
> The best you can do probably is a flush_tlb_range?

filemap_sync() calls both flush_tlb_range() and flush_tlb_page() on each
page, I just don't know which call I should remove :-/

Btw, I couldn't find an update_mmu_range() function, what's the exact
purpose of update_mmu_cache()? mips64 uses this function.

Can't we merge update_mmu_cache() with flush_tlb_page()?

--
	Manfred

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

  reply	other threads:[~2000-04-08 22:46 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2000-04-08 20:06 Manfred Spraul
2000-04-08 21:11 ` Kanoj Sarcar
2000-04-08 22:46   ` Manfred Spraul [this message]
2000-04-08 23:31     ` Kanoj Sarcar
2000-04-08 23:37       ` Alan Cox
2000-04-08 23:54         ` Kanoj Sarcar
2000-04-09  9:10         ` Manfred Spraul
2000-04-09  9:19           ` David S. Miller
2000-04-10 22:21         ` Stephen C. Tweedie
2000-04-10 23:12           ` David S. Miller
2000-04-11  9:14             ` Stephen C. Tweedie
2000-04-11 14:41               ` Manfred Spraul
2000-04-11 16:40                 ` Andrea Arcangeli
2000-04-11 17:45                   ` Manfred Spraul
2000-04-11 18:14                     ` Kanoj Sarcar
2000-04-12 10:02                     ` Jamie Lokier
2000-04-11 11:56           ` Alan Cox
2000-04-08 23:44   ` David S. Miller
2000-04-09  0:20     ` Kanoj Sarcar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=38EFB6B7.F737B82A@colorfullife.com \
    --to=manfreds@colorfullife.com \
    --cc=kanoj@google.engr.sgi.com \
    --cc=linux-kernel@vger.rutgers.edu \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox