From: Andrew Morton <akpm@zip.com.au>
To: Robert Love <rml@tech9.net>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] low-latency zap_page_range()
Date: Thu, 29 Aug 2002 13:30:04 -0700
Message-ID: <3D6E844C.4E756D10@zip.com.au>
In-Reply-To: <1030635100.939.2551.camel@phantasy>
Robert Love wrote:
>
> Andrew,
>
> Attached patch implements a low latency version of "zap_page_range()".
>
This doesn't quite do the right thing on SMP.
Note that pages which are to be torn down are buffered in the
mmu_gather_t array. The kernel throws away 507 pages at a
time - this is to reduce the frequency of global TLB invalidations.
(The 507 is, I assume, designed to make the mmu_gather_t be
2048 bytes in size. I recently broke that math, and need to fix
it up).
However with your change, we'll only ever put 256 pages into the
mmu_gather_t. Half of that thing's buffer is unused and the
invalidation rate will be doubled during teardown of large
address ranges.
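
To put rough numbers on that, here is a minimal sketch. It assumes one
TLB invalidation per gather/finish cycle and that each cycle covers at
most one block. The helper name flushes_needed() is made up for
illustration and is not kernel code.

```c
#include <assert.h>

/*
 * Hypothetical helper, not kernel code: how many gather/flush cycles
 * it takes to tear down nr_pages when each cycle handles at most
 * block_pages pages (ceiling division).
 */
static unsigned long flushes_needed(unsigned long nr_pages,
				    unsigned long block_pages)
{
	assert(block_pages > 0);
	return nr_pages / block_pages + (nr_pages % block_pages != 0);
}
```

For a 1 GB range of 4 KB pages (262144 pages, both sizes assumed for
the example), flushes_needed(262144, 256) is 1024 while
flushes_needed(262144, 507) is 518, so the smaller block size nearly
doubles the number of invalidations.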
I suggest that you make ZAP_BLOCK_SIZE be equal to FREE_PTE_NR on
SMP, and 256 on UP.
(We could get fancier and do something like:

	tlb = tlb_gather_mmu(mm, 0);
	while (size) {
		...
		unmap_page_range(ZAP_BLOCK_SIZE pages);
		tlb_flush_mmu(...);
		cond_resched_lock();
	}
	tlb_finish_mmu(..);
	spin_unlock(page_table_lock);

but I don't think that passes the benefit-versus-complexity test.)
Also, if the kernel is not compiled for preemption then we're
doing a little bit of extra work to no advantage, yes? We can
avoid doing that by setting ZAP_BLOCK_SIZE to infinity.
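
Why "infinity" works can be seen in a small userspace model of the
chunking loop. This is only a sketch: zap_passes() and its parameters
are hypothetical names, and the real loop does the unmap and flush work
where this one just counts passes.

```c
#include <assert.h>

/*
 * Hypothetical userspace model, not kernel code: count how many passes
 * a chunking loop makes over `size` bytes when each pass handles at
 * most `block_size` bytes.
 */
static unsigned long zap_passes(unsigned long size, unsigned long block_size)
{
	unsigned long passes = 0;

	assert(block_size > 0);
	while (size) {
		/* Take a full block, or whatever is left if that is smaller. */
		unsigned long block = (size > block_size) ? block_size : size;

		size -= block;
		passes++;
	}
	return passes;
}
```

With block_size set to ~0UL the test size > block_size can never be
true, so any non-zero size is handled in a single pass and the loop
adds no extra gather/flush cycles.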
How does this altered version look? All I changed was the ZAP_BLOCK_SIZE
initialisation.
--- 2.5.32/include/linux/sched.h~llzpr Thu Aug 29 13:01:01 2002
+++ 2.5.32-akpm/include/linux/sched.h Thu Aug 29 13:01:01 2002
@@ -907,6 +907,34 @@ static inline void cond_resched(void)
 		__cond_resched();
 }
+#ifdef CONFIG_PREEMPT
+
+/*
+ * cond_resched_lock() - if a reschedule is pending, drop the given lock,
+ * call schedule, and on return reacquire the lock.
+ *
+ * Note: this does not assume the given lock is the _only_ lock held.
+ * The kernel preemption counter gives us "free" checking that we are
+ * atomic -- let's use it.
+ */
+static inline void cond_resched_lock(spinlock_t * lock)
+{
+	if (need_resched() && preempt_count() == 1) {
+		_raw_spin_unlock(lock);
+		preempt_enable_no_resched();
+		__cond_resched();
+		spin_lock(lock);
+	}
+}
+
+#else
+
+static inline void cond_resched_lock(spinlock_t * lock)
+{
+}
+
+#endif
+
/* Reevaluate whether the task has signals pending delivery.
This is required every time the blocked sigset_t changes.
All callers should have t->sigmask_lock. */
--- 2.5.32/mm/memory.c~llzpr Thu Aug 29 13:01:01 2002
+++ 2.5.32-akpm/mm/memory.c Thu Aug 29 13:26:21 2002
@@ -389,8 +389,8 @@ void unmap_page_range(mmu_gather_t *tlb,
 {
 	pgd_t * dir;
-	if (address >= end)
-		BUG();
+	BUG_ON(address >= end);
+
 	dir = pgd_offset(vma->vm_mm, address);
 	tlb_start_vma(tlb, vma);
 	do {
@@ -401,30 +401,53 @@ void unmap_page_range(mmu_gather_t *tlb,
 	tlb_end_vma(tlb, vma);
 }
-/*
- * remove user pages in a given range.
+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT)
+#define ZAP_BLOCK_SIZE (FREE_PTE_NR * PAGE_SIZE)
+#endif
+
+#if !defined(CONFIG_SMP) && defined(CONFIG_PREEMPT)
+#define ZAP_BLOCK_SIZE (256 * PAGE_SIZE)
+#endif
+
+#if !defined(CONFIG_PREEMPT)
+#define ZAP_BLOCK_SIZE (~(0UL))
+#endif
+
+/**
+ * zap_page_range - remove user pages in a given range
+ * @vma: vm_area_struct holding the applicable pages
+ * @address: starting address of pages to zap
+ * @size: number of bytes to zap
*/
 void zap_page_range(struct vm_area_struct *vma, unsigned long address, unsigned long size)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	mmu_gather_t *tlb;
-	unsigned long start = address, end = address + size;
+	unsigned long end, block;
-	/*
-	 * This is a long-lived spinlock. That's fine.
-	 * There's no contention, because the page table
-	 * lock only protects against kswapd anyway, and
-	 * even if kswapd happened to be looking at this
-	 * process we _want_ it to get stuck.
-	 */
-	if (address >= end)
-		BUG();
 	spin_lock(&mm->page_table_lock);
-	flush_cache_range(vma, address, end);
-	tlb = tlb_gather_mmu(mm, 0);
-	unmap_page_range(tlb, vma, address, end);
-	tlb_finish_mmu(tlb, start, end);
+	/*
+	 * This was once a long-held spinlock. Now we break the
+	 * work up into ZAP_BLOCK_SIZE units and relinquish the
+	 * lock after each iteration. This drastically lowers
+	 * lock contention and allows for a preemption point.
+	 */
+	while (size) {
+		block = (size > ZAP_BLOCK_SIZE) ? ZAP_BLOCK_SIZE : size;
+		end = address + block;
+
+		flush_cache_range(vma, address, end);
+		tlb = tlb_gather_mmu(mm, 0);
+		unmap_page_range(tlb, vma, address, end);
+		tlb_finish_mmu(tlb, address, end);
+
+		cond_resched_lock(&mm->page_table_lock);
+
+		address += block;
+		size -= block;
+	}
+
 	spin_unlock(&mm->page_table_lock);
 }