From: Badari Pulavarty <pbadari@us.ibm.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: Chris Wright <chrisw@osdl.org>, Jeff Dike <jdike@addtoit.com>,
linux-mm <linux-mm@kvack.org>,
dvhltc@us.ibm.com
Subject: Re: [RFC][PATCH] OVERCOMMIT_ALWAYS extension
Date: Wed, 19 Oct 2005 11:50:55 -0700
Message-ID: <1129747855.8716.12.camel@localhost.localdomain>
In-Reply-To: <Pine.LNX.4.61.0510191826280.8674@goblin.wat.veritas.com>
[-- Attachment #1: Type: text/plain, Size: 2700 bytes --]
On Wed, 2005-10-19 at 18:56 +0100, Hugh Dickins wrote:
> On Tue, 18 Oct 2005, Badari Pulavarty wrote:
> >
> > As you suggested, here is the patch to add SHM_NORESERVE which does
> > same thing as MAP_NORESERVE. This flag is ignored for OVERCOMMIT_NEVER.
> > I decided to do SHM_NORESERVE instead of IPC_NORESERVE - just to limit
> > its scope.
>
> Good, yes, SHM_NORESERVE is a better name.
Hugh, a big thank you for the review and help on this.
>
> > BTW, there is a call to security_shm_alloc() earlier, which could
> > be modified to reject shmget() if it needs to.
>
> Excellent. But it can only see shp, and the
> shp->shm_flags = (shmflg & S_IRWXUGO);
> will conceal SHM_NORESERVE from it.
I noticed that, but didn't feel like passing it to security_shm_alloc(),
since even SHM_HUGETLB and others are not getting passed today.
That's why I said, "we could, if need to".
>
> Since nothing in security/ is worrying about MAP_NORESERVE at present,
> perhaps you need not bother about this for now. But easily overlooked
> later if MAP_NORESERVE rejection is added.
>
> > Is this reasonable ? Please review.
>
> Looks fine as far as it goes, except for the typos in the comment
> + * Do not allow no accouting for OVERCOMMIT_NEVER, even
> + * its asked for.
> should be
> * Do not allow no accounting for OVERCOMMIT_NEVER, even
> * if it's asked for.
> (rather a lot of negatives, but okay there I think!)
Initially I wrote it as "For OVERCOMMIT_GUESS and OVERCOMMIT_ALWAYS,
allow no accounting if asked for." - which matches the code. But
if we decide to add another mode in the future, we would then need to
update the comment.
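To make the rule in that comment concrete, here is a minimal sketch in plain C. It is an illustration only, not the code in the patch: the helper name shm_reserve_upfront() is hypothetical, and the enum simply mirrors the values the kernel uses for vm.overcommit_memory.

```c
#include <stdbool.h>

/* Values of the vm.overcommit_memory sysctl, as named in the kernel. */
enum overcommit_mode {
	OVERCOMMIT_GUESS  = 0,
	OVERCOMMIT_ALWAYS = 1,
	OVERCOMMIT_NEVER  = 2,
};

/*
 * Hypothetical helper (not from the patch): decide whether a new shm
 * segment is charged up front, or page by page as it is instantiated.
 */
static bool shm_reserve_upfront(enum overcommit_mode mode, bool noreserve)
{
	/* Do not allow no accounting for OVERCOMMIT_NEVER, even if asked. */
	if (mode == OVERCOMMIT_NEVER)
		return true;

	/* GUESS and ALWAYS honour the caller's "no reserve" request. */
	return !noreserve;
}
```

Writing the test as "mode == OVERCOMMIT_NEVER" rather than listing GUESS and ALWAYS means any mode added later would honour the flag by default, which may or may not be what is wanted.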
> I say "as far as it goes" because I don't think it's actually going to
> achieve the effect you said you wanted in your original post.
>
> As you've probably noticed, switching off VM_ACCOUNT here will mean that
> the shm object is accounted page by page as it's instantiated, and I
> expect you're okay with that. But you want madvise(DONTNEED) to free
> up those reservations: it'll unmap the pages from userspace, but it
> won't free the pages from the shm object, so the reservations will
> still be in force, and accumulate.
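The effect described above can be seen from userspace. The following is a minimal sketch, assuming Linux; it uses a shared anonymous mapping as a stand-in for a shm segment (both are backed by shmem), and the function name is made up for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if a byte written to a shared mapping is still there after
 * madvise(MADV_DONTNEED), 0 if it was lost, -1 if a call failed.
 */
static int dontneed_keeps_shared_data(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int kept;

	if (page <= 0)
		return -1;

	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	p[0] = 0xAB;	/* instantiate the page in the object */

	/* Unmaps the page from this process; the object keeps it. */
	if (madvise(p, (size_t)page, MADV_DONTNEED) != 0) {
		munmap(p, (size_t)page);
		return -1;
	}

	kept = (p[0] == 0xAB);	/* refaults the same page from the object */
	munmap(p, (size_t)page);
	return kept;
}
```

Because the data survives, the page is evidently still held by the object, and so is whatever was charged for it.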
Darren Hart is working on a patch to add madvise(DISCARD) to extend
the functionality of madvise(DONTNEED) to really drop those pages.
I was going to ask your opinion on that approach :)
shmget(SHM_NORESERVE) + madvise(DISCARD) should do what I was
hoping for. (BTW, none of this has been tested with database stuff -
I am just concentrating on reasonable extensions.)
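For reference, a rough userspace sketch of that sequence, under some loud assumptions: MADV_DISCARD is only proposed in the attached patch and is not in libc headers, so the sketch substitutes MADV_REMOVE, a mainline Linux madvise flag that frees a range together with its backing store. The function name is hypothetical, and the SHM_NORESERVE fallback value is the one Linux uses.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>
#include <unistd.h>

#ifndef SHM_NORESERVE
#define SHM_NORESERVE 010000	/* Linux value; assumed if libc lacks it */
#endif

/*
 * Returns 1 if the page reads back as zero after the discard step
 * (backing page freed), 0 if the old data survived, -1 if a call failed.
 */
static int noreserve_then_discard(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int id, freed;

	if (page <= 0)
		return -1;

	/* Ask for a segment that is not charged up front. */
	id = shmget(IPC_PRIVATE, (size_t)page,
		    IPC_CREAT | 0600 | SHM_NORESERVE);
	if (id < 0)
		return -1;

	p = shmat(id, NULL, 0);
	shmctl(id, IPC_RMID, NULL);	/* destroyed on last detach */
	if (p == (void *)-1)
		return -1;

	p[0] = 0xAB;	/* instantiate (and charge) one page */

	/* Stand-in for the proposed MADV_DISCARD: drop the page for real. */
	if (madvise(p, (size_t)page, MADV_REMOVE) != 0) {
		shmdt(p);
		return -1;
	}

	freed = (p[0] == 0);	/* a fresh zero page is faulted in */
	shmdt(p);
	return freed;
}
```

A return of -1 only means the environment refused one of the calls (for example, SysV IPC disabled); it says nothing about the semantics.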
Here is the version of the patch under test.
(Darren - I am sending this out without your permission, I hope
you are okay with it).
Thanks,
Badari
[-- Attachment #2: madvise_discard.patch --]
[-- Type: text/x-patch, Size: 14528 bytes --]
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-alpha/mman.h 2.6.12-madvise/include/asm-alpha/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-alpha/mman.h 2003-12-17 18:58:04.000000000 -0800
+++ 2.6.12-madvise/include/asm-alpha/mman.h 2005-07-06 09:27:11.000000000 -0700
@@ -42,6 +42,7 @@
#define MADV_WILLNEED 3 /* will need these pages */
#define MADV_SPACEAVAIL 5 /* ensure resources are available */
#define MADV_DONTNEED 6 /* don't need these pages */
+#define MADV_DISCARD 7 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-arm/mman.h 2.6.12-madvise/include/asm-arm/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-arm/mman.h 2003-12-17 18:58:39.000000000 -0800
+++ 2.6.12-madvise/include/asm-arm/mman.h 2005-07-06 09:28:31.000000000 -0700
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-arm26/mman.h 2.6.12-madvise/include/asm-arm26/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-arm26/mman.h 2003-12-17 18:58:04.000000000 -0800
+++ 2.6.12-madvise/include/asm-arm26/mman.h 2005-07-06 09:28:40.000000000 -0700
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-cris/mman.h 2.6.12-madvise/include/asm-cris/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-cris/mman.h 2003-12-17 18:59:44.000000000 -0800
+++ 2.6.12-madvise/include/asm-cris/mman.h 2005-07-06 09:28:53.000000000 -0700
@@ -37,6 +37,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-frv/mman.h 2.6.12-madvise/include/asm-frv/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-frv/mman.h 2005-03-02 03:00:08.000000000 -0800
+++ 2.6.12-madvise/include/asm-frv/mman.h 2005-07-06 09:29:01.000000000 -0700
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-h8300/mman.h 2.6.12-madvise/include/asm-h8300/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-h8300/mman.h 2005-06-17 17:21:39.000000000 -0700
+++ 2.6.12-madvise/include/asm-h8300/mman.h 2005-07-06 09:29:05.000000000 -0700
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-i386/mman.h 2.6.12-madvise/include/asm-i386/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-i386/mman.h 2003-12-17 18:58:15.000000000 -0800
+++ 2.6.12-madvise/include/asm-i386/mman.h 2005-07-06 09:29:10.000000000 -0700
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-ia64/mman.h 2.6.12-madvise/include/asm-ia64/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-ia64/mman.h 2004-04-05 16:25:06.000000000 -0700
+++ 2.6.12-madvise/include/asm-ia64/mman.h 2005-07-06 09:29:14.000000000 -0700
@@ -43,6 +43,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-m32r/mman.h 2.6.12-madvise/include/asm-m32r/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-m32r/mman.h 2004-10-18 15:51:10.000000000 -0700
+++ 2.6.12-madvise/include/asm-m32r/mman.h 2005-07-06 09:29:20.000000000 -0700
@@ -37,6 +37,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-m68k/mman.h 2.6.12-madvise/include/asm-m68k/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-m68k/mman.h 2003-12-17 18:58:16.000000000 -0800
+++ 2.6.12-madvise/include/asm-m68k/mman.h 2005-07-06 09:29:25.000000000 -0700
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-mips/mman.h 2.6.12-madvise/include/asm-mips/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-mips/mman.h 2003-12-17 18:58:39.000000000 -0800
+++ 2.6.12-madvise/include/asm-mips/mman.h 2005-07-06 09:29:37.000000000 -0700
@@ -65,6 +65,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-parisc/mman.h 2.6.12-madvise/include/asm-parisc/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-parisc/mman.h 2003-12-17 18:58:58.000000000 -0800
+++ 2.6.12-madvise/include/asm-parisc/mman.h 2005-07-06 09:32:51.000000000 -0700
@@ -38,6 +38,7 @@
#define MADV_SPACEAVAIL 5 /* insure that resources are reserved */
#define MADV_VPS_PURGE 6 /* Purge pages from VM page cache */
#define MADV_VPS_INHERIT 7 /* Inherit parents page size */
+#define MADV_DISCARD 8 /* free memory and page cache now */
/* The range 12-64 is reserved for page size specification. */
#define MADV_4K_PAGES 12 /* Use 4K pages */
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-ppc/mman.h 2.6.12-madvise/include/asm-ppc/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-ppc/mman.h 2003-12-17 19:00:03.000000000 -0800
+++ 2.6.12-madvise/include/asm-ppc/mman.h 2005-07-06 09:33:13.000000000 -0700
@@ -36,6 +36,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-ppc64/mman.h 2.6.12-madvise/include/asm-ppc64/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-ppc64/mman.h 2003-12-17 18:58:47.000000000 -0800
+++ 2.6.12-madvise/include/asm-ppc64/mman.h 2005-07-06 09:33:25.000000000 -0700
@@ -44,6 +44,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-s390/mman.h 2.6.12-madvise/include/asm-s390/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-s390/mman.h 2003-12-17 18:58:08.000000000 -0800
+++ 2.6.12-madvise/include/asm-s390/mman.h 2005-07-06 09:33:36.000000000 -0700
@@ -43,6 +43,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-sh/mman.h 2.6.12-madvise/include/asm-sh/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-sh/mman.h 2003-12-17 18:59:27.000000000 -0800
+++ 2.6.12-madvise/include/asm-sh/mman.h 2005-07-06 09:33:57.000000000 -0700
@@ -35,6 +35,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-sparc/mman.h 2.6.12-madvise/include/asm-sparc/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-sparc/mman.h 2003-12-17 18:59:43.000000000 -0800
+++ 2.6.12-madvise/include/asm-sparc/mman.h 2005-07-06 09:35:02.000000000 -0700
@@ -54,6 +54,7 @@
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
#define MADV_FREE 0x5 /* (Solaris) contents can be freed */
+#define MADV_DISCARD 0x6 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-sparc64/mman.h 2.6.12-madvise/include/asm-sparc64/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-sparc64/mman.h 2003-12-17 18:58:49.000000000 -0800
+++ 2.6.12-madvise/include/asm-sparc64/mman.h 2005-07-06 09:35:15.000000000 -0700
@@ -54,6 +54,7 @@
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
#define MADV_FREE 0x5 /* (Solaris) contents can be freed */
+#define MADV_DISCARD 0x6 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-v850/mman.h 2.6.12-madvise/include/asm-v850/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-v850/mman.h 2003-12-17 18:59:26.000000000 -0800
+++ 2.6.12-madvise/include/asm-v850/mman.h 2005-07-06 09:35:35.000000000 -0700
@@ -32,6 +32,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/include/asm-x86_64/mman.h 2.6.12-madvise/include/asm-x86_64/mman.h
--- /home/linux/views/linux-2.6.12/include/asm-x86_64/mman.h 2003-12-17 18:59:05.000000000 -0800
+++ 2.6.12-madvise/include/asm-x86_64/mman.h 2005-07-06 09:35:40.000000000 -0700
@@ -36,6 +36,7 @@
#define MADV_SEQUENTIAL 0x2 /* read-ahead aggressively */
#define MADV_WILLNEED 0x3 /* pre-fault pages */
#define MADV_DONTNEED 0x4 /* discard these pages */
+#define MADV_DISCARD 0x5 /* free memory and page cache now */
/* compatibility flags */
#define MAP_ANON MAP_ANONYMOUS
diff -purN -X /home/dvhart/.diff.exclude /home/linux/views/linux-2.6.12/mm/madvise.c 2.6.12-madvise/mm/madvise.c
--- /home/linux/views/linux-2.6.12/mm/madvise.c 2005-03-02 03:00:18.000000000 -0800
+++ 2.6.12-madvise/mm/madvise.c 2005-07-06 10:15:09.000000000 -0700
@@ -111,6 +111,37 @@ static long madvise_dontneed(struct vm_a
 	return 0;
 }
+static long madvise_discard(struct vm_area_struct * vma,
+			    unsigned long start, unsigned long end)
+{
+	struct semaphore *i_sem;
+	loff_t offset;
+
+	if (vma->vm_file && vma->vm_file->f_mapping) {
+		if (vma->vm_file->f_mapping == &swapper_space) {
+			printk("%s: vma (%p)'s mapping is swapper_space\n", __FUNCTION__, vma);
+			return -EINVAL;
+		}
+
+		if (!vma->vm_file->f_mapping->host) {
+			printk("%s: vma (%p)'s mapping->host is null\n", __FUNCTION__, vma);
+			return -EINVAL;
+		}
+
+		/* looks good, try and rip it out of page cache */
+		printk("%s: trying to rip shm vma (%p) inode from page cache\n", __FUNCTION__, vma);
+		i_sem = &vma->vm_file->f_mapping->host->i_sem;
+		offset = (loff_t)(start - vma->vm_start);
+		printk("%s: call truncate_inode_pages(%p, %x)\n", __FUNCTION__,
+			vma->vm_file->f_mapping, (unsigned int)offset);
+		down(i_sem);
+		truncate_inode_pages(vma->vm_file->f_mapping, offset);
+		up(i_sem);
+	}
+
+	return 0;
+}
+
 static long madvise_vma(struct vm_area_struct * vma, unsigned long start,
 			unsigned long end, int behavior)
 {
@@ -130,6 +161,9 @@ static long madvise_vma(struct vm_area_s
 	case MADV_DONTNEED:
 		error = madvise_dontneed(vma, start, end);
 		break;
+	case MADV_DISCARD:
+		error = madvise_discard(vma, start, end);
+		break;
 	default:
 		error = -EINVAL;