From: Igor Stoppa <igor.stoppa@gmail.com>
To: Dave Hansen <dave.hansen@intel.com>,
Mimi Zohar <zohar@linux.vnet.ibm.com>,
Kees Cook <keescook@chromium.org>,
Matthew Wilcox <willy@infradead.org>,
Dave Chinner <david@fromorbit.com>,
James Morris <jmorris@namei.org>,
Michal Hocko <mhocko@kernel.org>,
kernel-hardening@lists.openwall.com,
linux-integrity@vger.kernel.org,
linux-security-module@vger.kernel.org
Cc: igor.stoppa@huawei.com, Dave Hansen <dave.hansen@linux.intel.com>,
Jonathan Corbet <corbet@lwn.net>,
Laura Abbott <labbott@redhat.com>,
Vlastimil Babka <vbabka@suse.cz>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Pavel Tatashin <pasha.tatashin@oracle.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 02/17] prmem: write rare for static allocation
Date: Mon, 29 Oct 2018 20:03:07 +0200
Message-ID: <311d06ab-df6d-134a-82fc-1e2098f8a924@gmail.com>
In-Reply-To: <23022d8a-dcef-20d5-cb07-a218b08b7b9a@intel.com>
On 25/10/2018 01:24, Dave Hansen wrote:
>> +static __always_inline bool __is_wr_after_init(const void *ptr, size_t size)
>> +{
>> + size_t start = (size_t)&__start_wr_after_init;
>> + size_t end = (size_t)&__end_wr_after_init;
>> + size_t low = (size_t)ptr;
>> + size_t high = (size_t)ptr + size;
>> +
>> + return likely(start <= low && low < high && high <= end);
>> +}
>
> size_t is an odd type choice for doing address arithmetic.
It seemed more portable than unsigned long.
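For illustration, a minimal userspace sketch of the same range check done with unsigned long. The buffer `fake_section` and the helpers `in_range()` and `is_in_fake_section()` are hypothetical stand-ins for the linker-provided section bounds and are not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the linker-provided section. */
static char fake_section[256];

/*
 * Range check with unsigned long address arithmetic. The "low < high"
 * term rejects zero-length ranges and ranges whose end wraps around
 * the address space (unsigned overflow is well defined in C).
 */
static bool in_range(unsigned long start, unsigned long end,
                     const void *ptr, size_t size)
{
    unsigned long low = (unsigned long)ptr;
    unsigned long high = low + size;

    assert(start <= end); /* bounds must be ordered */
    return start <= low && low < high && high <= end;
}

/* Apply the check to the stand-in section. */
static bool is_in_fake_section(const void *ptr, size_t size)
{
    unsigned long start = (unsigned long)fake_section;
    unsigned long end = start + sizeof(fake_section);

    return in_range(start, end, ptr, size);
}
```

The sketch assumes an LP64 target, where unsigned long and size_t have the same width, as on 64-bit Linux.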
>> +/**
>> + * wr_memset() - sets n bytes of the destination to the c value
>> + * @dst: beginning of the memory to write to
>> + * @c: byte to replicate
>> + * @size: amount of bytes to copy
>> + *
>> + * Returns true on success, false otherwise.
>> + */
>> +static __always_inline
>> +bool wr_memset(const void *dst, const int c, size_t n_bytes)
>> +{
>> + size_t size;
>> + unsigned long flags;
>> + uintptr_t d = (uintptr_t)dst;
>> +
>> + if (WARN(!__is_wr_after_init(dst, n_bytes), WR_ERR_RANGE_MSG))
>> + return false;
>> + while (n_bytes) {
>> + struct page *page;
>> + uintptr_t base;
>> + uintptr_t offset;
>> + uintptr_t offset_complement;
>
> Again, these are really odd choices for types. vmap() returns a void*
> pointer, on which you can do arithmetic.
I wasn't sure of how much I could rely on the compiler not doing some
unwanted optimizations.
> Why bother keeping another
> type to which you have to cast to and from?
For the above reason. If I'm worrying unnecessarily, I can switch back
to void *
It certainly is easier to use.
> BTW, our usual "pointer stored in an integer type" is 'unsigned long',
> if a pointer needs to be manipulated.
Yes, I noticed that, but it seemed strange: size_t corresponds to
unsigned long, afaik, but it seems that I have not fully understood
where to use it.
Anyway, I can stick to the convention and use unsigned long.
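As a rough userspace sketch of how the per-page splitting in wr_memset() could look with unsigned long throughout: `SKETCH_PAGE_SIZE`, `SKETCH_PAGE_MASK`, `chunk_in_page()` and `count_chunks()` are names made up for this example, and the 4096-byte page size is an assumption:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL                     /* assumed page size */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))  /* mirrors PAGE_MASK */

/*
 * Number of bytes of a write starting at addr that fit in the page
 * containing addr, capped at n_bytes.
 */
static size_t chunk_in_page(unsigned long addr, size_t n_bytes)
{
    unsigned long offset = addr & ~SKETCH_PAGE_MASK;
    unsigned long room = SKETCH_PAGE_SIZE - offset;

    return n_bytes < room ? n_bytes : room;
}

/* Count the per-page chunks needed to write n_bytes starting at addr. */
static unsigned int count_chunks(unsigned long addr, size_t n_bytes)
{
    unsigned int chunks = 0;

    while (n_bytes) {
        size_t size = chunk_in_page(addr, n_bytes);

        assert(size > 0); /* each pass makes progress */
        addr += size;
        n_bytes -= size;
        chunks++;
    }
    return chunks;
}
```

No casts are needed inside the loop, because the address stays in one integer type from start to end.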
>
>> + local_irq_save(flags);
>
> Why are you doing the local_irq_save()?
The idea was to avoid the case where an attack would somehow freeze the
core doing the write-rare operation, while the temporary mapping is
accessible.
I have seen comments about using mappings that are private to the
current core (and I will reply to those comments as well), but this
approach seems architecture-dependent, while I was looking for a
solution that, albeit not 100% reliable, would work on any system with
an MMU. This would not prevent each arch from coming up with its own custom
implementation that provides better coverage, performance, etc.
>> + page = virt_to_page(d);
>> + offset = d & ~PAGE_MASK;
>> + offset_complement = PAGE_SIZE - offset;
>> + size = min(n_bytes, offset_complement);
>> + base = (uintptr_t)vmap(&page, 1, VM_MAP, PAGE_KERNEL);
>
> Can you even call vmap() (which sleeps) with interrupts off?
I had accidentally disabled the sleeping-while-atomic debugging and I
totally missed this problem :-(
However, to answer your question, nothing exploded while I was testing
(without that type of debugging).
I suspect I was just "lucky". Or maybe I was simply not triggering the
sleeping sub-case.
As I understood the code, sleeping _might_ happen, but it's not going to
happen systematically.
I wonder if I could split vmap() into two parts: first the sleeping one,
with interrupts enabled, then the non sleeping one, with interrupts
disabled.
I need to read the code more carefully, but it seems that sleeping might
happen when memory for the mapping meta data is not immediately available.
BTW, wouldn't the might_sleep() call belong in the part that really
sleeps, rather than in vmap() as a whole?
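To make the two-phase idea concrete, here is a userspace illustration only. It does not reflect how vmap() is structured internally. The flag `atomic_section` simulates "interrupts disabled", malloc() stands in for the step that may sleep, and every `sketch_*` name is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Simulated "interrupts disabled" state; purely illustrative. */
static bool atomic_section;

/* Hypothetical handle: resources are reserved before the atomic part. */
struct sketch_mapping {
    unsigned char *window; /* stands in for the temporary mapping */
    size_t size;
};

/* Phase 1: may block (here: malloc), so it runs outside the atomic part. */
static bool sketch_map_prepare(struct sketch_mapping *m, size_t size)
{
    assert(!atomic_section);
    m->window = malloc(size);
    m->size = size;
    return m->window != NULL;
}

/* Phase 2: no allocation, so it is safe inside the atomic part. */
static void sketch_map_fill(struct sketch_mapping *m, int c)
{
    assert(atomic_section);
    memset(m->window, c, m->size);
}

/* Fill dst with c, doing the blocking work before the atomic part. */
static bool sketch_write(unsigned char *dst, int c, size_t size)
{
    struct sketch_mapping m;

    if (size == 0)
        return true;
    if (!sketch_map_prepare(&m, size))
        return false;

    atomic_section = true;  /* local_irq_save() would go here */
    sketch_map_fill(&m, c);
    atomic_section = false; /* local_irq_restore() would go here */

    memcpy(dst, m.window, size);
    free(m.window);
    return true;
}
```

The asserts play the role that might_sleep() plays in the kernel: they flag a blocking step that is reached from the wrong context.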
>> + if (WARN(!base, WR_ERR_PAGE_MSG)) {
>> + local_irq_restore(flags);
>> + return false;
>> + }
>
> You really need some kmap_atomic()-style accessors to wrap this stuff
> for you. This little pattern is repeated over and over.
I really need to learn more about the way the kernel works and is
structured. It's a work in progress. Thanks for the advice.
> ...
>> +const char WR_ERR_RANGE_MSG[] = "Write rare on invalid memory range.";
>> +const char WR_ERR_PAGE_MSG[] = "Failed to remap write rare page.";
>
> Doesn't the compiler de-duplicate duplicated strings for you? Is there
> any reason to declare these like this?
I noticed that I had made some accidental modifications in a couple of
cases, when replicating the command.
So I thought that if I really want to use the same string, why not do
it explicitly? It also seemed easier in case I want to tweak the
message: I need to do it in only one place.
--
igor