From: Toshi Kani <toshi.kani@hp.com>
To: Yigal Korman <yigal@plexistor.com>
Cc: Andy Lutomirski <luto@amacapital.net>,
"H. Peter Anvin" <hpa@zytor.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Arnd Bergmann <arnd@arndb.de>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Juergen Gross <jgross@suse.com>,
Stefan Bader <stefan.bader@canonical.com>,
Henrique de Moraes Holschuh <hmh@hmh.eng.br>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: [PATCH v2 6/6] x86, pat: Update documentation for WT changes
Date: Tue, 16 Sep 2014 16:13:16 -0600 [thread overview]
Message-ID: <1410905596.28990.414.camel@misato.fc.hp.com> (raw)
In-Reply-To: <CACTTzNbP23WAxDFMyJ+hbOGt-8S1_ZCUs2KxKObz1XVHOfJGaw@mail.gmail.com>
On Wed, 2014-09-17 at 00:45 +0300, Yigal Korman wrote:
> Perhaps your patch is still valid in the context of this patch:
> https://lkml.org/lkml/2014/9/9/612
> Part of the reason for creating it was the fact that ioremap is using
> a fake virtual address mapping.
> So I think we can still use set_memory_wt for memory created with
> add_persistent_memory.
> What do you think?
Yes, I am aware of the work. I agree that it will address issue 1), but
we still have issue 2). Since this work will create page tables, we may
be able to treat NV-DIMM ranges as RAM for reservation. However, we
also have an issue with RAM that set_page_memtype() needs additional bit
to track WT type. I am hoping that _PGMT_WB can be redefined for WT,
but I need to check it more carefully. Considering the risk of such
changes, this will be a separate item.
> Also, a thought: maybe the driver that will be managing the memory
> added by add_persistent_memory should be responsible for resolving
> aliasing issues.
Yes, that's one way to do it. Under the current design, though,
reserve_memtype() should work to detect aliasing for supported
use-cases.
Thanks,
-Toshi
>
> Thanks,
> Yigal
>
> On Tue, Sep 16, 2014 at 7:52 PM, Toshi Kani <toshi.kani@hp.com> wrote:
> > On Mon, 2014-09-15 at 18:22 -0700, Andy Lutomirski wrote:
> >> On Mon, Sep 15, 2014 at 2:19 PM, Toshi Kani <toshi.kani@hp.com> wrote:
> >> > On Wed, 2014-09-10 at 15:34 -0600, Toshi Kani wrote:
> >> >> On Wed, 2014-09-10 at 13:29 -0700, Andy Lutomirski wrote:
> >> >> > On Wed, Sep 10, 2014 at 1:12 PM, Toshi Kani <toshi.kani@hp.com> wrote:
> >> >> > > On Wed, 2014-09-10 at 11:30 -0700, Andy Lutomirski wrote:
> >> >> > >> On Wed, Sep 10, 2014 at 9:51 AM, Toshi Kani <toshi.kani@hp.com> wrote:
> >> >> > >> > +Drivers may map the entire NV-DIMM range with ioremap_cache and then change
> >> >> > >> > +a specific range to wt with set_memory_wt.
> >> >> > >>
> >> >> > >> That's mighty specific :)
> >> >> > >
> >> >> > > How about below?
> >> >> > >
> >> >> > > Drivers may use set_memory_wt to set WT type for cached reserve ranges.
> >> >> >
> >> >> > Do they have to be cached?
> >> >>
> >> >> Yes, set_memory_xyz only supports WB->type->WB transition.
> >> >>
> >> >> > How about:
> >> >> >
> >> >> > Drivers may call set_memory_wt on ioremapped ranges. In this case,
> >> >> > there is no need to change the memory type back before calling
> >> >> > iounmap.
> >> >> >
> >> >> > (Or only on cached ioremapped ranges if that is, in fact, the case.)
> >> >>
> >> >> Sounds good. Yes, I will use cashed ioremapped ranges.
> >> >
> >> > Well, testing "no need to change the memory type back before calling
> >> > iounmap" turns out to be a good test case. I realized that
> >> > set_memory_xyz only works properly for RAM. There are two problems for
> >> > using this interface for ioremapped ranges.
> >> >
> >> > 1) set_memory_xyz calls reserve_memtype() with __pa(addr). However,
> >> > __pa() translates the addr into a fake physical address when it is an
> >> > ioremapped address.
> >> >
> >> > 2) reserve_memtype() does not work for set_memory_xyz. For RAM, the WB
> >> > state is managed untracked. Hence, WB->new->WB is not considered as a
> >> > conflict. For ioremapped ranges, WB is tracked in the same way as other
> >> > cache types. Hence, WB->new is considered as a conflict.
> >> >
> >> > In my previous testing, 2) was undetected since 1) led using a fake
> >> > physical address which was not tracked for WB. This made ioremapped
> >> > ranges worked just like RAM. :-(
> >> >
> >> > Anyway, 1) can be fixed by using slow_virt_to_phys() instead of __pa().
> >> > set_memory_xyz is already slow, but this makes it even slower, though.
> >> >
> >> > For 2), WB has to be continuously tracked in order to detect aliasing,
> >> > ex. ioremap_cache and ioremap to a same address. So, I think
> >> > reserve_memtype() needs the following changes:
> >> > - Add a new arg to see if an operation is to create a new mapping or to
> >> > change cache attribute.
> >> > - Track overlapping maps so that cache type change to an overlapping
> >> > range can be detected and failed.
> >> >
> >> > This level of changes requires a separate set of patches if we pursue to
> >> > support ioremapped ranges. So, I am considering to take one of the two
> >> > options below.
> >> >
> >> > A) Drop the patch for set_memory_wt.
> >> >
> >> > B) Keep the patch for set_memory_wt, but document that it fails with
> >> > -EINVAL and its use is for RAM only.
> >> >
> >>
> >> I vote A. I see no great reason to add code that can't be used. Once
> >> someone needs this ability, they can add it :)
> >
> > Agreed. I will drop the patch for now. Since _PGMT_WB does not seem to
> > be used for tracking WB, we might be able to use this bit for WT. But I
> > need to look at the code more carefully for sure.
> >
> >> It's too bad that ioremap is called ioremap and not iomap. Otherwise
> >> the natural solution would be to add a different function call
> >> ioremap_wt that's like set_memory_wt but for ioremap ranges. Calling
> >> it ioreremap_wt sounds kind of disgusting :)
> >
> > :)
> >
> > Thanks,
> > -Toshi
> >
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
prev parent reply other threads:[~2014-09-16 22:23 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-10 16:51 [PATCH v2 0/6] Support Write-Through mapping on x86 Toshi Kani
2014-09-10 16:51 ` [PATCH v2 1/6] x86, mm, pat: Set WT to PA4 slot of PAT MSR Toshi Kani
2014-09-12 19:33 ` Konrad Rzeszutek Wilk
2014-09-12 20:45 ` Toshi Kani
2014-09-10 16:51 ` [PATCH v2 2/6] x86, mm, pat: Change reserve_memtype() to handle WT Toshi Kani
2014-09-10 18:26 ` Andy Lutomirski
2014-09-10 19:14 ` H. Peter Anvin
2014-09-10 19:30 ` Toshi Kani
2014-09-10 20:14 ` H. Peter Anvin
2014-09-10 20:30 ` Toshi Kani
2014-09-10 21:06 ` Andy Lutomirski
2014-09-10 21:11 ` Toshi Kani
2014-09-10 21:27 ` Andy Lutomirski
2014-09-10 21:25 ` Toshi Kani
2014-09-10 21:39 ` H. Peter Anvin
2014-09-10 21:47 ` Toshi Kani
2014-09-10 22:00 ` H. Peter Anvin
2014-09-10 23:24 ` Toshi Kani
2014-09-10 21:39 ` H. Peter Anvin
2014-09-10 20:31 ` Andy Lutomirski
2014-09-12 19:41 ` Konrad Rzeszutek Wilk
2014-09-10 16:51 ` [PATCH v2 3/6] x86, mm, asm-gen: Add ioremap_wt() for WT Toshi Kani
2014-09-10 18:29 ` Andy Lutomirski
2014-09-10 19:40 ` Toshi Kani
2014-09-10 20:08 ` Andy Lutomirski
2014-09-12 19:42 ` Konrad Rzeszutek Wilk
2014-09-10 16:51 ` [PATCH v2 4/6] x86, mm: Add set_memory_wt() " Toshi Kani
2014-09-12 19:47 ` Konrad Rzeszutek Wilk
2014-09-10 16:51 ` [PATCH v2 5/6] x86, mm, pat: Add pgprot_writethrough() " Toshi Kani
2014-09-12 19:47 ` Konrad Rzeszutek Wilk
2014-09-10 16:51 ` [PATCH v2 6/6] x86, pat: Update documentation for WT changes Toshi Kani
2014-09-10 18:30 ` Andy Lutomirski
2014-09-10 20:12 ` Toshi Kani
2014-09-10 20:29 ` Andy Lutomirski
2014-09-10 21:34 ` Toshi Kani
2014-09-15 21:19 ` Toshi Kani
2014-09-16 1:22 ` Andy Lutomirski
2014-09-16 16:52 ` Toshi Kani
2014-09-16 21:45 ` Yigal Korman
2014-09-16 22:13 ` Toshi Kani [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1410905596.28990.414.camel@misato.fc.hp.com \
--to=toshi.kani@hp.com \
--cc=akpm@linux-foundation.org \
--cc=arnd@arndb.de \
--cc=hmh@hmh.eng.br \
--cc=hpa@zytor.com \
--cc=jgross@suse.com \
--cc=konrad.wilk@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@amacapital.net \
--cc=mingo@redhat.com \
--cc=stefan.bader@canonical.com \
--cc=tglx@linutronix.de \
--cc=yigal@plexistor.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox