From: Michael Pratt
Date: Mon, 20 Feb 2017 22:00:12 -0800
Subject: Re: [PATCHv3 33/33] mm, x86: introduce PR_SET_MAX_VADDR and PR_GET_MAX_VADDR
References: <20170217141328.164563-1-kirill.shutemov@linux.intel.com>
 <20170217141328.164563-34-kirill.shutemov@linux.intel.com>
Sender: owner-linux-mm@kvack.org
To: luto@amacapital.net
Cc: torvalds@linux-foundation.org, kirill.shutemov@linux.intel.com,
 akpm@linux-foundation.org, x86@kernel.org, tglx@linutronix.de,
 mingo@redhat.com, arnd@arndb.de, hpa@zytor.com, ak@linux.intel.com,
 dave.hansen@intel.com, linux-arch@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
 linux-api@vger.kernel.org


On Mon, Feb 20, 2017 at 9:21 PM, Michael Pratt <linux@pratt.im> wrote:
> On Fri, Feb 17, 2017 at 3:02 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> > On Fri, Feb 17, 2017 at 1:01 PM, Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> >> On Fri, Feb 17, 2017 at 12:12 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> >>>
> >>> At the very least, I'd want to see
> >>> MAP_FIXED_BUT_DONT_BLOODY_UNMAP_ANYTHING.  I *hate* the current
> >>> interface.
> >>
> >> That's unrelated, but I guess we could add a MAP_NOUNMAP flag, and then
> >> you can use MAP_FIXED | MAP_NOUNMAP or something.
> >>
> >> But that has nothing to do with the 47-vs-56 bit issue.
> >>
> >>> How about MAP_LIMIT where the address passed in is interpreted as an
> >>> upper bound instead of a fixed address?
> >>
> >> Again, that's an unrelated semantic issue. Right now - if you don't
> >> pass in MAP_FIXED at all, the "addr" argument is used as a starting
> >> value for deciding where to find an unmapped area. But there is no way
> >> to specify the end. That would basically be what the process control
> >> thing would be (not per-system-call, but per-thread ).
> >>
> >
> > What I'm trying to say is: if we're going to do the route of 48-bit
> > limit unless a specific mmap call requests otherwise, can we at least
> > have an interface that doesn't suck?

I've got a set of patches that I've been meaning to send out as an RFC
for a while. They try to address userspace control of address space
layout and cover many of these ideas.

There is a new syscall and a set of prctls for controlling the "mmap
layout" (i.e., the get_unmapped_area search range). They look something
like this:

struct mmap_layout {
        unsigned long start;
        unsigned long end;
        /*
         * These are equivalent to mmap_legacy_base and mmap_base,
         * but are not really needed in this proposal.
         */
        unsigned long low_base;
        unsigned long high_base;
        unsigned long flags;
};

/* For flags */
#define MMAP_TOPDOWN 1

struct layout_mmap_args {
        unsigned long addr;
        unsigned long len;
        unsigned long prot;
        unsigned long flags;
        unsigned long fd;
        unsigned long off;
        struct mmap_layout layout;
};

void *layout_mmap(struct layout_mmap_args *args);

int prctl(PR_GET_MMAP_LAYOUT, struct mmap_layout *layout);
int prctl(PR_SET_MMAP_LAYOUT, struct mmap_layout *layout);

The prctls control the default range from which mmap and friends will
allocate. For a 56-bit user address space, it could default to
[mmap_min_addr, 1<<47), as Linus suggests. Applications that want the
full address space can increase it to cover the entire range.
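
As a rough illustration only (this is not code from the patches), the effect of a per-process default range can be modelled in plain userspace C. Everything named toy_* below is hypothetical, the end of the range is assumed to be exclusive as in [mmap_min_addr, 1<<47), and the search is a simplified stand-in for get_unmapped_area rather than the kernel's algorithm:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_TOPDOWN 1	/* plays the role of MMAP_TOPDOWN */

/* Hypothetical search window [start, end); end is assumed exclusive. */
struct toy_layout {
	uint64_t start;
	uint64_t end;
	unsigned long flags;
};

/* Hypothetical record of an existing mapping covering [start, end). */
struct toy_region {
	uint64_t start;
	uint64_t end;
};

/*
 * Toy stand-in for a range-limited get_unmapped_area(): return the start
 * of a free gap of len bytes inside the layout, or 0 if none fits.
 * The n regions must be sorted by address and must not overlap.
 */
uint64_t toy_find_unmapped(const struct toy_layout *l,
			   const struct toy_region *r, size_t n,
			   uint64_t len)
{
	if (len == 0 || l->end <= l->start || l->end - l->start < len)
		return 0;

	if (l->flags & TOY_TOPDOWN) {
		uint64_t top = l->end;	/* upper edge of the current gap */

		for (size_t i = n; i-- > 0; ) {
			if (r[i].start >= top)
				continue;	/* entirely above the gap */
			if (r[i].end <= l->start)
				break;		/* entirely below the layout */
			if (r[i].end < top && top - r[i].end >= len)
				return top - len;
			top = r[i].start;
		}
		return (top > l->start && top - l->start >= len) ? top - len : 0;
	}

	uint64_t cur = l->start;	/* lower edge of the current gap */

	for (size_t i = 0; i < n; i++) {
		if (r[i].end <= cur)
			continue;		/* entirely below the gap */
		if (r[i].start >= l->end)
			break;			/* entirely above the layout */
		if (r[i].start > cur && r[i].start - cur >= len)
			return cur;
		cur = r[i].end;
	}
	return (cur < l->end && l->end - cur >= len) ? cur : 0;
}
```

In this model, a top-down search with end = 1ULL << 47 never returns an address at or above the 47-bit boundary, however much space is free beyond it. Raising end to 1ULL << 56 is the only change needed to let the same search use the larger range, which is roughly what an application would request through PR_SET_MMAP_LAYOUT.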

The layout_mmap syscall allows one-off mappings that fall outside the
default layout. It also nicely solves the "MAP_FIXED but don't unmap
anything" problem: the caller passes an explicit range to check without
actually setting MAP_FIXED.
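
To make the "explicit range instead of MAP_FIXED" point concrete, here is a minimal userspace sketch. It assumes that the one-off layout is set to exactly [addr, addr + len), so the only candidate placement is addr itself. The toy_* names are hypothetical and none of this is the proposed kernel code:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical search window [start, end); end is assumed exclusive. */
struct toy_layout {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical record of an existing mapping covering [start, end). */
struct toy_region {
	uint64_t start;
	uint64_t end;
};

/*
 * Model of a one-off request whose layout is exactly [addr, addr + len).
 * The request succeeds only if that window is completely free, so an
 * existing mapping is never replaced. Returns addr on success and 0 on
 * failure (this toy treats address 0 as "no placement").
 */
uint64_t toy_place_exact(const struct toy_region *r, size_t n,
			 uint64_t addr, uint64_t len)
{
	if (len == 0 || addr + len < addr)
		return 0;	/* empty or wrapping request */

	struct toy_layout one_off = { addr, addr + len };

	for (size_t i = 0; i < n; i++) {
		/* Two half-open ranges overlap if each starts before the other ends. */
		if (r[i].start < one_off.end && one_off.start < r[i].end)
			return 0;	/* would collide; nothing is unmapped */
	}
	return one_off.start;
}
```

The contrast with MAP_FIXED is in the failure path: where MAP_FIXED would silently unmap whatever overlaps the requested range, this model reports failure and leaves the existing regions untouched.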

This idea is quite similar to the MAX_VADDR + default get_unmapped_area
behavior idea, just more generalized to give userspace more control over
the ultimate behavior of get_unmapped_area.


PS. Apologies if my email client screwed up this message. I didn't have
this thread in my client and have tried to import it from another
account.