From: Badari Pulavarty <pbadari@us.ibm.com>
To: Hugh Dickins <hugh@veritas.com>
Cc: Chris Wright <chrisw@osdl.org>, linux-mm <linux-mm@kvack.org>
Subject: [RFC][PATCH] OVERCOMMIT_ALWAYS extension
Date: Tue, 18 Oct 2005 09:05:02 -0700
Message-ID: <1129651502.23632.63.camel@localhost.localdomain>
In-Reply-To: <Pine.LNX.4.61.0510171919150.6548@goblin.wat.veritas.com>
[-- Attachment #1: Type: text/plain, Size: 2827 bytes --]
On Mon, 2005-10-17 at 19:25 +0100, Hugh Dickins wrote:
> On Mon, 17 Oct 2005, Hugh Dickins wrote:
> > On Mon, 17 Oct 2005, Badari Pulavarty wrote:
> > >
> > > I have been looking at possible ways to extend OVERCOMMIT_ALWAYS
> > > to avoid its abuse.
> > >
> > > Few of the applications (database) would like to overcommit
> > > memory (by creating shared memory segments more than RAM+swap),
> > > but use only portion of it at any given time and get rid
> > > of portions of them through madvise(DONTNEED), when needed.
> > > They want this, especially to handle hotplug memory situations
> > > (where apps may not have clear idea on how much memory they have
> > > in the system at the time of shared memory create). Currently,
> > > they are using OVERCOMMIT_ALWAYS system wide to do this - but
> > > they are affecting every other application on the system.
> > >
> > > I am wondering, if there is a better way to do this. Simple solution
> > > would be to add IPC_OVERCOMMIT flag or add CAP_SYS_ADMIN to
> > > do the overcommit. This way only specific applications, requesting
> > > this would be able to overcommit. I am worried about, the over
> > > all affects it has on the system. But again, this can't be worse
> > > than system wide OVERCOMMIT_ALWAYS. Isn't it ?
> >
> > mmap has MAP_NORESERVE, without CAP_SYS_ADMIN or other restriction,
> > which exempts that mmap from security_vm_enough_memory checking -
> > unless current setting is OVERCOMMIT_NEVER, in which case
> > MAP_NORESERVE is ignored.
>
> Having written that, it does seem rather odd that we have a flag
> anyone can set to evade that security_ checking. It was okay when
> it was just vm_enough_memory, but now it's security_vm_enough_memory,
> I wonder if this is a significant oversight, and some CAP required.
> Might break things though. CC'ed Chris.
>
> Ah, there's a security_file_mmap earlier, which could reject the
> MAP_NORESERVE flag if it feels so inclined. Perhaps you'll need
> to allow a similar opportunity for rejection in your approach.
>
> Hugh
>
> > So if you're content to move to the OVERCOMMIT_GUESS world, I
> > don't think you could be blamed for adding an IPC_NORESERVE which
> > behaves in the same way, without CAP_SYS_ADMIN restriction.
> >
> > But if you want to move to OVERCOMMIT_NEVER, yet have a flag which
> > says overcommit now, you'll get into a tussle with NEVER-adherents.
> >
> > Hugh
>
Hugh,
As you suggested, here is a patch to add SHM_NORESERVE, which does the
same thing as MAP_NORESERVE. The flag is ignored under OVERCOMMIT_NEVER.
I chose SHM_NORESERVE rather than IPC_NORESERVE just to limit its scope
to shared memory.
BTW, there is a call to security_shm_alloc() earlier, which could
be modified to reject the shmget() if it needs to.
Is this reasonable? Please review.
Thanks,
Badari
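For illustration, a user-space caller might request the new flag roughly as follows. This is only a sketch and not part of the patch: the helper names create_noreserve_segment() and destroy_segment() are made up for the example, and the fallback #define simply assumes the 010000 value proposed in the attached patch for systems whose headers do not provide SHM_NORESERVE.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Fallback for headers that lack the flag; value taken from the patch. */
#ifndef SHM_NORESERVE
#define SHM_NORESERVE 010000
#endif

/*
 * Hypothetical helper: create a private SysV segment of `size` bytes and
 * ask the kernel not to account it against the commit limit.
 * Returns the segment id, or -1 with errno set by shmget().
 */
int create_noreserve_segment(size_t size)
{
	assert(size > 0);	/* shmget() rejects a zero-sized new segment */
	return shmget(IPC_PRIVATE, size, IPC_CREAT | 0600 | SHM_NORESERVE);
}

/* Hypothetical helper: mark the segment for destruction. */
int destroy_segment(int shmid)
{
	return shmctl(shmid, IPC_RMID, NULL);
}
```

Under OVERCOMMIT_NEVER the flag would be ignored, so the same call would still be charged in full and could fail where it would otherwise succeed.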
[-- Attachment #2: shm-noreserve.patch --]
[-- Type: text/x-patch, Size: 1357 bytes --]
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
--- linux-2.6.14-rc3.org/include/linux/shm.h 2005-10-18 08:44:28.000000000 -0700
+++ linux-2.6.14-rc3/include/linux/shm.h 2005-10-18 08:46:03.000000000 -0700
@@ -92,6 +92,7 @@ struct shmid_kernel /* private to the ke
 #define SHM_DEST	01000	/* segment will be destroyed on last detach */
 #define SHM_LOCKED	02000	/* segment will not be swapped */
 #define SHM_HUGETLB	04000	/* segment will use huge TLB pages */
+#define SHM_NORESERVE	010000	/* don't check for reservations */
 
 #ifdef CONFIG_SYSVIPC
 long do_shmat(int shmid, char __user *shmaddr, int shmflg, unsigned long *addr);
--- linux-2.6.14-rc3.org/ipc/shm.c 2005-10-17 16:57:40.000000000 -0700
+++ linux-2.6.14-rc3/ipc/shm.c 2005-10-18 08:55:50.000000000 -0700
@@ -212,8 +212,16 @@ static int newseg (key_t key, int shmflg
 		file = hugetlb_zero_setup(size);
 		shp->mlock_user = current->user;
 	} else {
+		int acctflag = VM_ACCOUNT;
+		/*
+		 * Do not skip accounting under OVERCOMMIT_NEVER, even
+		 * if it is asked for.
+		 */
+		if ((shmflg & SHM_NORESERVE) &&
+		    sysctl_overcommit_memory != OVERCOMMIT_NEVER)
+			acctflag = 0;
 		sprintf (name, "SYSV%08x", key);
-		file = shmem_file_setup(name, size, VM_ACCOUNT);
+		file = shmem_file_setup(name, size, acctflag);
 	}
 	error = PTR_ERR(file);
 	if (IS_ERR(file))
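
The decision made in the newseg() hunk above can be modelled outside the kernel as a small pure function, which makes the four flag/mode combinations easy to check. This is a stand-alone sketch under stated assumptions, not kernel code: segment_is_accounted() and SHM_NORESERVE_FLAG are names invented for the example, and the enum values are assumed to match the vm.overcommit_memory sysctl settings (0 = guess, 1 = always, 2 = never).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the vm.overcommit_memory sysctl values. */
enum overcommit_mode {
	OVERCOMMIT_GUESS  = 0,
	OVERCOMMIT_ALWAYS = 1,
	OVERCOMMIT_NEVER  = 2,
};

/* Illustrative stand-in for SHM_NORESERVE; same value as in the patch. */
#define SHM_NORESERVE_FLAG 010000

/*
 * Hypothetical model of the patch's acctflag choice: returns true when
 * the segment would still be created with VM_ACCOUNT, false when
 * accounting would be skipped.
 */
bool segment_is_accounted(int shmflg, enum overcommit_mode mode)
{
	assert(mode >= OVERCOMMIT_GUESS && mode <= OVERCOMMIT_NEVER);

	/* The flag only takes effect outside strict (NEVER) mode. */
	if ((shmflg & SHM_NORESERVE_FLAG) && mode != OVERCOMMIT_NEVER)
		return false;
	return true;
}
```

In this model only a segment that asks for the flag while the system is in GUESS or ALWAYS mode escapes accounting; every other combination keeps VM_ACCOUNT.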