Re: [PATCH v11 00/26] Speculative page faults


From: Haiyan Song <haiyanx.song@intel.com>
To: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"mhocko@kernel.org" <mhocko@kernel.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"kirill@shutemov.name" <kirill@shutemov.name>,
	"ak@linux.intel.com" <ak@linux.intel.com>,
	"dave@stgolabs.net" <dave@stgolabs.net>,
	"jack@suse.cz" <jack@suse.cz>,
	Matthew Wilcox <willy@infradead.org>,
	"khandual@linux.vnet.ibm.com" <khandual@linux.vnet.ibm.com>,
	"aneesh.kumar@linux.vnet.ibm.com"
	<aneesh.kumar@linux.vnet.ibm.com>,
	"benh@kernel.crashing.org" <benh@kernel.crashing.org>,
	"mpe@ellerman.id.au" <mpe@ellerman.id.au>,
	"paulus@samba.org" <paulus@samba.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "hpa@zytor.com" <hpa@zytor.com>,
	Will Deacon <will.deacon@arm.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	"sergey.senozhatsky.work@gmail.com"
	<sergey.senozhatsky.work@gmail.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	"Wang, Kemi" <kemi.wang@intel.com>,
	Daniel Jordan <daniel.m.jordan@oracle.com>,
	David Rientjes <rientjes@google.com>,
	Jerome Glisse <jglisse@redhat.com>,
	Ganesh Mahendran <opensource.ganesh@gmail.com>,
	Minchan Kim <minchan@kernel.org>,
	Punit Agrawal <punitagrawal@gmail.com>,
	vinayak menon <vinayakm.list@gmail.com>,
	Yang Shi <yang.shi@linux.alibaba.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"haren@linux.vnet.ibm.com" <haren@linux.vnet.ibm.com>,
	"npiggin@gmail.com" <npiggin@gmail.com>,
	"bsingharora@gmail.com" <bsingharora@gmail.com>,
	"paulmck@linux.vnet.ibm.com" <paulmck@linux.vnet.ibm.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH v11 00/26] Speculative page faults
Date: Mon, 28 May 2018 16:22:36 +0800
Message-ID: <20180528082235.e5x4oiaaf7cjoddr@haiyan.lkp.sh.intel.com>
In-Reply-To: <1327633f-8bb9-99f7-fab4-4cfcbf997200@linux.vnet.ibm.com>

Hi Laurent,

Yes, these tests were done on the V9 patch series.


Best regards,
Haiyan Song

On Mon, May 28, 2018 at 09:51:34AM +0200, Laurent Dufour wrote:
> On 28/05/2018 07:23, Song, HaiyanX wrote:
> > 
> > Some regressions and improvements were found by LKP-tools (Linux Kernel Performance) on the V9 patch series,
> > tested on an Intel 4s Skylake platform.
> 
> Hi,
> 
> Thanks for reporting these benchmark results, but you mentioned the "V9 patch
> series" while responding to the v11 header series...
> Were these tests done on v9 or v11?
> 
> Cheers,
> Laurent.
> 
> > 
> > The regression results are sorted by the metric will-it-scale.per_thread_ops.
> > Branch: Laurent-Dufour/Speculative-page-faults/20180316-151833 (V9 patch series)
> > Commit id:
> >     base commit: d55f34411b1b126429a823d06c3124c16283231f
> >     head commit: 0355322b3577eeab7669066df42c550a56801110
> > Benchmark suite: will-it-scale
> > Download link:
> > https://github.com/antonblanchard/will-it-scale/tree/master/tests
> > Metrics:
> >     will-it-scale.per_process_ops=processes/nr_cpu
> >     will-it-scale.per_thread_ops=threads/nr_cpu
> > test box: lkp-skl-4sp1(nr_cpu=192,memory=768G)
> > THP: enable / disable
> > nr_task: 100%
> > 
> > 1. Regressions:
> > a) THP enabled:
> > testcase                        base            change          head       metric
> > page_fault3/ enable THP         10092           -17.5%          8323       will-it-scale.per_thread_ops
> > page_fault2/ enable THP          8300           -17.2%          6869       will-it-scale.per_thread_ops
> > brk1/ enable THP                  957.67         -7.6%           885       will-it-scale.per_thread_ops
> > page_fault3/ enable THP        172821            -5.3%        163692       will-it-scale.per_process_ops
> > signal1/ enable THP              9125            -3.2%          8834       will-it-scale.per_process_ops
> > 
> > b) THP disabled:
> > testcase                        base            change          head       metric
> > page_fault3/ disable THP        10107           -19.1%          8180       will-it-scale.per_thread_ops
> > page_fault2/ disable THP         8432           -17.8%          6931       will-it-scale.per_thread_ops
> > context_switch1/ disable THP   215389            -6.8%        200776       will-it-scale.per_thread_ops
> > brk1/ disable THP                 939.67         -6.6%           877.33    will-it-scale.per_thread_ops
> > page_fault3/ disable THP       173145            -4.7%        165064       will-it-scale.per_process_ops
> > signal1/ disable THP             9162            -3.9%          8802       will-it-scale.per_process_ops
> > 
> > 2. Improvements:
> > a) THP enabled:
> > testcase                        base            change          head       metric
> > malloc1/ enable THP               66.33        +469.8%           383.67    will-it-scale.per_thread_ops
> > writeseek3/ enable THP          2531             +4.5%          2646       will-it-scale.per_thread_ops
> > signal1/ enable THP              989.33          +2.8%          1016       will-it-scale.per_thread_ops
> > 
> > b) THP disabled:
> > testcase                        base            change          head       metric
> > malloc1/ disable THP              90.33        +417.3%           467.33    will-it-scale.per_thread_ops
> > read2/ disable THP             58934            +39.2%         82060       will-it-scale.per_thread_ops
> > page_fault1/ disable THP        8607            +36.4%         11736       will-it-scale.per_thread_ops
> > read1/ disable THP            314063            +12.7%        353934       will-it-scale.per_thread_ops
> > writeseek3/ disable THP         2452            +12.5%          2759       will-it-scale.per_thread_ops
> > signal1/ disable THP             971.33          +5.5%          1024       will-it-scale.per_thread_ops
> > 
> > Note: in the "change" column above, a higher value means that the testcase result
> > on the head commit is better than on the base commit for this benchmark.
> > 
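For readers reproducing the "change" column: most rows above are consistent with the relative difference between head and base, expressed in percent. A minimal C sketch, assuming that definition (the helper name percent_change() is made up for illustration and is not part of LKP-tools):

```c
#include <assert.h>

/* Relative change of `head` versus `base`, in percent.
 * A positive result means the head commit scored higher. */
static double percent_change(double base, double head)
{
    assert(base != 0.0);    /* a zero baseline has no relative change */
    return (head - base) / base * 100.0;
}
```

For example, percent_change(10092, 8323) evaluates to about -17.5, which matches the page_fault3 per_thread_ops row with THP enabled.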
> > 
> > Best regards
> > Haiyan Song
> > 
> > ________________________________________
> > From: owner-linux-mm@kvack.org [owner-linux-mm@kvack.org] on behalf of Laurent Dufour [ldufour@linux.vnet.ibm.com]
> > Sent: Thursday, May 17, 2018 7:06 PM
> > To: akpm@linux-foundation.org; mhocko@kernel.org; peterz@infradead.org; kirill@shutemov.name; ak@linux.intel.com; dave@stgolabs.net; jack@suse.cz; Matthew Wilcox; khandual@linux.vnet.ibm.com; aneesh.kumar@linux.vnet.ibm.com; benh@kernel.crashing.org; mpe@ellerman.id.au; paulus@samba.org; Thomas Gleixner; Ingo Molnar; hpa@zytor.com; Will Deacon; Sergey Senozhatsky; sergey.senozhatsky.work@gmail.com; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan Kim; Punit Agrawal; vinayak menon; Yang Shi
> > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; haren@linux.vnet.ibm.com; npiggin@gmail.com; bsingharora@gmail.com; paulmck@linux.vnet.ibm.com; Tim Chen; linuxppc-dev@lists.ozlabs.org; x86@kernel.org
> > Subject: [PATCH v11 00/26] Speculative page faults
> > 
> > This is a port to kernel 4.17 of the work done by Peter Zijlstra to handle
> > page faults without holding the mm semaphore [1].
> > 
> > The idea is to try to handle user space page faults without holding the
> > mmap_sem. This should allow better concurrency for massively threaded
> > processes, since the page fault handler will not wait for other threads'
> > memory layout changes to complete, assuming that such a change is done in
> > another part of the process's memory space. This type of page fault is
> > named a speculative page fault. If the speculative page fault fails because
> > a concurrent change is detected or because the underlying PMD or PTE tables
> > are not yet allocated, its processing is aborted and a classic page fault
> > is then tried.
> > 
> > The speculative page fault (SPF) has to look for the VMA matching the fault
> > address without holding the mmap_sem. This is done by introducing a rwlock
> > which protects access to the mm_rb tree. Previously this was done using
> > SRCU, but that introduced a lot of scheduling to process the VMA freeing
> > operations, which hurt performance by 20% as reported by Kemi Wang [2].
> > Using a rwlock to protect access to the mm_rb tree limits the locking
> > contention to these operations, which are expected to be O(log n). In
> > addition, to ensure that the VMA is not freed behind our back, a reference
> > count is added and 2 services (get_vma() and put_vma()) are introduced to
> > handle it. Once a VMA is fetched from the RB tree using get_vma(), it must
> > later be released using put_vma(). I no longer see the overhead I
> > previously got with the will-it-scale benchmark.
> > 
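A note on the get_vma()/put_vma() pairing described above: the pattern is a lookup that takes a reference while the index lock is held, so the object cannot be freed between lookup and use. The sketch below is only an illustration under assumptions: struct region, struct space, region_get() and region_put() are made-up names, a linked list stands in for the RB tree, and a plain spinlock built on atomic_flag stands in for the rwlock. It is not the code from the series.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VMA. */
struct region {
    unsigned long start, end;       /* range is [start, end) */
    atomic_int refs;                /* one reference is owned by the index */
    struct region *next;            /* a list stands in for the RB tree */
};

/* Hypothetical stand-in for the mm: an index plus the lock guarding it. */
struct space {
    atomic_flag lock;               /* plain spinlock in place of the rwlock */
    struct region *head;
};

static void space_lock(struct space *s)
{
    while (atomic_flag_test_and_set(&s->lock))
        ;                           /* spin until the flag was clear */
}

static void space_unlock(struct space *s)
{
    atomic_flag_clear(&s->lock);
}

/* Look up the region covering addr and take a reference on it. */
static struct region *region_get(struct space *s, unsigned long addr)
{
    struct region *r;

    space_lock(s);
    for (r = s->head; r != NULL; r = r->next) {
        if (addr >= r->start && addr < r->end) {
            atomic_fetch_add(&r->refs, 1);
            break;
        }
    }
    space_unlock(s);
    return r;                       /* NULL if nothing covers addr */
}

/* Drop a reference; the last one frees the region. */
static void region_put(struct region *r)
{
    int old = atomic_fetch_sub(&r->refs, 1);

    assert(old > 0);                /* more puts than gets is a bug */
    if (old == 1)
        free(r);
}
```

Because the reference is taken before the lock is dropped, a concurrent removal from the index can only drop the index's own reference; the memory stays valid until the matching region_put().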
> > The VMA's attributes checked during the speculative page fault processing
> > have to be protected against parallel changes. This is done by using a per
> > VMA sequence lock. This sequence lock allows the speculative page fault
> > handler to quickly check for parallel changes in progress and to abort the
> > speculative page fault in that case.
> > 
> > Once the VMA has been found, the speculative page fault handler checks the
> > VMA's attributes to verify whether the page fault can be handled. The VMA
> > is protected through a sequence lock which allows fast detection of
> > concurrent VMA changes. If such a change is detected, the speculative page
> > fault is aborted and a *classic* page fault is tried. VMA sequence locking
> > is added wherever VMA attributes which are checked during the page fault
> > are modified.
> > 
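The per-VMA sequence count described above follows the usual sequence-counter pattern: a writer makes the count odd while it is updating, and a reader aborts if the count was odd or has moved since its snapshot. A minimal user-space sketch with C11 atomics, under assumptions: struct seq_region and the region_*() helpers are made-up names, writers are assumed to be serialized by some outer lock, and the default sequentially consistent ordering is used for simplicity. The kernel's own seqcount API is not shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VMA: a range guarded by a sequence count. */
struct seq_region {
    atomic_uint seq;                /* odd while a writer is mid-update */
    _Atomic unsigned long start;    /* range is [start, end) */
    _Atomic unsigned long end;
};

/* Writer side: bump the count before and after touching the fields. */
static void region_resize(struct seq_region *r,
                          unsigned long start, unsigned long end)
{
    assert(start <= end);
    atomic_fetch_add(&r->seq, 1);   /* count becomes odd */
    atomic_store(&r->start, start);
    atomic_store(&r->end, end);
    atomic_fetch_add(&r->seq, 1);   /* count is even again */
}

/* True if a snapshot taken earlier can no longer be trusted. */
static bool region_changed(struct seq_region *r, unsigned int snap)
{
    return (snap & 1u) || atomic_load(&r->seq) != snap;
}

/* Speculative reader: returns false to signal "abort, use the slow path".
 * On success, *inside says whether addr falls in the range. */
static bool region_contains_speculative(struct seq_region *r,
                                        unsigned long addr, bool *inside)
{
    unsigned int snap = atomic_load(&r->seq);
    unsigned long start = atomic_load(&r->start);
    unsigned long end = atomic_load(&r->end);

    if (region_changed(r, snap))
        return false;
    *inside = addr >= start && addr < end;
    return true;
}
```

The reader never blocks and never writes shared state; the cost of a concurrent update is one wasted attempt followed by the fallback path.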
> > When the PTE is fetched, the VMA is checked to see whether it has changed.
> > Once the page table is locked, the VMA is therefore known to be valid, and
> > any other change touching this PTE would need to lock the page table, so
> > no parallel change is possible at that point.
> > 
> > The locking of the PTE is done with interrupts disabled. This allows
> > checking the PMD to ensure that there is no ongoing collapsing operation.
> > Since khugepaged first sets the PMD to pmd_none and then waits for the
> > other CPUs to have handled the IPI, if the PMD is valid at the time the
> > PTE is locked, we have the guarantee that the collapsing operation will
> > have to wait on the PTE lock to move forward. This allows the SPF handler
> > to map the PTE safely. If the PMD value is different from the one recorded
> > at the beginning of the SPF operation, the classic page fault handler will
> > be called to handle the operation while holding the mmap_sem. As the PTE
> > lock is taken with interrupts disabled, it is taken using spin_trylock()
> > to avoid a deadlock when handling a page fault while a TLB invalidation is
> > requested by another CPU holding the PTE lock.
> > 
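The trylock-and-recheck step described above can be sketched in user space as follows. This is only an illustration under assumptions: struct pt_slot, slot_lock_speculative() and slot_unlock() are made-up names, atomic_flag stands in for the PTE spinlock, and user space has no interrupt masking, so the points where kernel code would disable and re-enable interrupts are marked only by comments. Retrying when the lock is busy follows the v11 changelog entry about looping in pte_spinlock() and pte_map_lock().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for one page-table slot and its lock. */
struct pt_slot {
    atomic_flag lock;               /* set while the slot is locked */
    _Atomic unsigned long upper;    /* plays the role of the PMD value */
};

static void slot_unlock(struct pt_slot *s)
{
    bool was_locked = atomic_flag_test_and_set(&s->lock);

    assert(was_locked);             /* unlocking a free slot is a bug */
    (void)was_locked;
    atomic_flag_clear(&s->lock);
}

/* Try to lock the slot on behalf of a speculative fault.
 * upper_snap is the upper-level value recorded when the fault started.
 * Returns true with the lock held, or false if the caller must fall
 * back to the classic path. */
static bool slot_lock_speculative(struct pt_slot *s, unsigned long upper_snap)
{
    for (;;) {
        /* kernel code would disable interrupts here */
        if (atomic_load(&s->upper) != upper_snap)
            return false;           /* upper level changed: abort */
        if (!atomic_flag_test_and_set(&s->lock)) {
            /* lock acquired: confirm nothing changed in between */
            if (atomic_load(&s->upper) == upper_snap)
                return true;
            atomic_flag_clear(&s->lock);
            return false;
        }
        /* lock busy: kernel code would re-enable interrupts here so
         * pending IPIs can be serviced before the next attempt */
    }
}
```

The point of never blocking on the lock is that each failed attempt passes through a window where interrupts would be enabled, so a CPU waiting for an IPI acknowledgement is not stalled by this one.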
> > In pseudo code, this could be seen as:
> >     speculative_page_fault()
> >     {
> >             vma = get_vma()
> >             check vma sequence count
> >             check vma's support
> >             disable interrupt
> >                   check pgd,p4d,...,pte
> >                   save pmd and pte in vmf
> >                   save vma sequence counter in vmf
> >             enable interrupt
> >             check vma sequence count
> >             handle_pte_fault(vma)
> >                     ..
> >                     page = alloc_page()
> >                     pte_map_lock()
> >                             disable interrupt
> >                                     abort if sequence counter has changed
> >                                     abort if pmd or pte has changed
> >                                     pte map and lock
> >                             enable interrupt
> >                     if abort
> >                        free page
> >                        abort
> >                     ...
> >     }
> > 
> >     arch_fault_handler()
> >     {
> >             if (speculative_page_fault(&vma))
> >                goto done
> >     again:
> >             lock(mmap_sem)
> >             vma = find_vma();
> >             handle_pte_fault(vma);
> >             if retry
> >                unlock(mmap_sem)
> >                goto again;
> >     done:
> >             handle fault error
> >     }
> > 
> > THP is not supported because, when checking the PMD, we can be confused
> > by an in-progress collapsing operation done by khugepaged. The issue is
> > that pmd_none() could be true either if the PMD is not yet populated or
> > if the underlying PTEs are in the process of being collapsed. So we
> > cannot safely allocate a PMD if pmd_none() is true.
> > 
> > This series adds a new software performance event named
> > 'speculative-faults' or 'spf'. It counts the number of page fault events
> > successfully handled speculatively. When recording 'faults,spf' events,
> > 'faults' counts the total number of page fault events while 'spf' counts
> > only the part of the faults processed speculatively.
> > 
> > There are some trace events introduced by this series. They allow
> > identifying why page faults were not processed speculatively. This
> > doesn't take into account the faults generated by a monothreaded process,
> > which are directly processed while holding the mmap_sem. These trace
> > events are grouped in a system named 'pagefault'. They are:
> >  - pagefault:spf_vma_changed : the VMA has been changed behind our back
> >  - pagefault:spf_vma_noanon : the vma->anon_vma field was not yet set
> >  - pagefault:spf_vma_notsup : the VMA's type is not supported
> >  - pagefault:spf_vma_access : the VMA's access rights are not respected
> >  - pagefault:spf_pmd_changed : the upper PMD pointer has changed behind
> >    our back
> > 
> > To record all the related events, the easiest way is to run perf with the
> > following arguments:
> > $ perf stat -e 'faults,spf,pagefault:*' <command>
> > 
> > There is also a dedicated vmstat counter showing the number of page
> > faults successfully handled speculatively. It can be seen this way:
> > $ grep speculative_pgfault /proc/vmstat
> > 
> > This series builds on top of v4.16-mmotm-2018-04-13-17-28 and is functional
> > on x86, PowerPC and arm64.
> > 
> > ---------------------
> > Real Workload results
> > 
> > As mentioned in a previous email, we did unofficial runs using a "popular
> > in-memory multithreaded database product" on a 176-core SMT8 Power system,
> > which showed a 30% improvement in the number of transactions processed per
> > second. This run was done on the v6 series, but the changes introduced in
> > this new version should not impact the performance boost seen.
> > 
> > Here are the perf data captured during 2 of these runs on top of the v8
> > series:
> >                 vanilla         spf
> > faults          89.418          101.364         +13%
> > spf                n/a           97.989
> > 
> > With the SPF kernel, most of the page faults were processed in a
> > speculative way.
> > 
> > Ganesh Mahendran backported the series on top of a 4.9 kernel and gave
> > it a try on an Android device. He reported that the application launch time
> > improved on average by 6%, and for large applications (~100 threads) by
> > 20%.
> > 
> > Here are the launch times Ganesh measured on Android 8.0 on top of a Qcom
> > MSM845 (8 cores) with 6GB (lower is better):
> > 
> > Application                             4.9     4.9+spf delta
> > com.tencent.mm                          416     389     -7%
> > com.eg.android.AlipayGphone             1135    986     -13%
> > com.tencent.mtt                         455     454     0%
> > com.qqgame.hlddz                        1497    1409    -6%
> > com.autonavi.minimap                    711     701     -1%
> > com.tencent.tmgp.sgame                  788     748     -5%
> > com.immomo.momo                         501     487     -3%
> > com.tencent.peng                        2145    2112    -2%
> > com.smile.gifmaker                      491     461     -6%
> > com.baidu.BaiduMap                      479     366     -23%
> > com.taobao.taobao                       1341    1198    -11%
> > com.baidu.searchbox                     333     314     -6%
> > com.tencent.mobileqq                    394     384     -3%
> > com.sina.weibo                          907     906     0%
> > com.youku.phone                         816     731     -11%
> > com.happyelements.AndroidAnimal.qq      763     717     -6%
> > com.UCMobile                            415     411     -1%
> > com.tencent.tmgp.ak                     1464    1431    -2%
> > com.tencent.qqmusic                     336     329     -2%
> > com.sankuai.meituan                     1661    1302    -22%
> > com.netease.cloudmusic                  1193    1200    1%
> > air.tv.douyu.android                    4257    4152    -2%
> > 
> > ------------------
> > Benchmarks results
> > 
> > Base kernel is v4.17.0-rc4-mm1
> > SPF is BASE + this series
> > 
> > Kernbench:
> > ----------
> > Here are the results on a 16 CPUs X86 guest using kernbench on a 4.15
> > kernel (the kernel is built 5 times):
> > 
> > Average Half load -j 8
> >                  Run    (std deviation)
> >                  BASE                   SPF
> > Elapsed Time     1448.65 (5.72312)      1455.84 (4.84951)       0.50%
> > User    Time     10135.4 (30.3699)      10148.8 (31.1252)       0.13%
> > System  Time     900.47  (2.81131)      923.28  (7.52779)       2.53%
> > Percent CPU      761.4   (1.14018)      760.2   (0.447214)      -0.16%
> > Context Switches 85380   (3419.52)      84748   (1904.44)       -0.74%
> > Sleeps           105064  (1240.96)      105074  (337.612)       0.01%
> > 
> > Average Optimal load -j 16
> >                  Run    (std deviation)
> >                  BASE                   SPF
> > Elapsed Time     920.528 (10.1212)      927.404 (8.91789)       0.75%
> > User    Time     11064.8 (981.142)      11085   (990.897)       0.18%
> > System  Time     979.904 (84.0615)      1001.14 (82.5523)       2.17%
> > Percent CPU      1089.5  (345.894)      1086.1  (343.545)       -0.31%
> > Context Switches 159488  (78156.4)      158223  (77472.1)       -0.79%
> > Sleeps           110566  (5877.49)      110388  (5617.75)       -0.16%
> > 
> > 
> > During a run on the SPF, perf events were captured:
> >  Performance counter stats for '../kernbench -M':
> >          526743764      faults
> >                210      spf
> >                  3      pagefault:spf_vma_changed
> >                  0      pagefault:spf_vma_noanon
> >               2278      pagefault:spf_vma_notsup
> >                  0      pagefault:spf_vma_access
> >                  0      pagefault:spf_pmd_changed
> > 
> > Very few speculative page faults were recorded as most of the processes
> > involved are monothreaded (it seems that on this architecture some threads
> > were created during the kernel build processing).
> > 
> > Here are the kernbench results on an 80 CPUs Power8 system:
> > 
> > Average Half load -j 40
> >                  Run    (std deviation)
> >                  BASE                   SPF
> > Elapsed Time     117.152 (0.774642)     117.166 (0.476057)      0.01%
> > User    Time     4478.52 (24.7688)      4479.76 (9.08555)       0.03%
> > System  Time     131.104 (0.720056)     134.04  (0.708414)      2.24%
> > Percent CPU      3934    (19.7104)      3937.2  (19.0184)       0.08%
> > Context Switches 92125.4 (576.787)      92581.6 (198.622)       0.50%
> > Sleeps           317923  (652.499)      318469  (1255.59)       0.17%
> > 
> > Average Optimal load -j 80
> >                  Run    (std deviation)
> >                  BASE                   SPF
> > Elapsed Time     107.73  (0.632416)     107.31  (0.584936)      -0.39%
> > User    Time     5869.86 (1466.72)      5871.71 (1467.27)       0.03%
> > System  Time     153.728 (23.8573)      157.153 (24.3704)       2.23%
> > Percent CPU      5418.6  (1565.17)      5436.7  (1580.91)       0.33%
> > Context Switches 223861  (138865)       225032  (139632)        0.52%
> > Sleeps           330529  (13495.1)      332001  (14746.2)       0.45%
> > 
> > During a run on the SPF, perf events were captured:
> >  Performance counter stats for '../kernbench -M':
> >          116730856      faults
> >                  0      spf
> >                  3      pagefault:spf_vma_changed
> >                  0      pagefault:spf_vma_noanon
> >                476      pagefault:spf_vma_notsup
> >                  0      pagefault:spf_vma_access
> >                  0      pagefault:spf_pmd_changed
> > 
> > Most of the processes involved are monothreaded so SPF is not activated but
> > there is no impact on the performance.
> > 
> > Ebizzy:
> > -------
> > The test counts the number of records per second it can process; higher
> > is better. I ran it like this: 'ebizzy -mTt <nrcpus>'. To get consistent
> > results I repeated the test 100 times and measured the average. The
> > numbers below are records processed per second.
> > 
> >                 BASE            SPF             delta
> > 16 CPUs x86 VM  742.57          1490.24         100.69%
> > 80 CPUs P8 node 13105.4         24174.23        84.46%
> > 
> > Here are the performance counters read during a run on a 16 CPUs x86 VM:
> >  Performance counter stats for './ebizzy -mTt 16':
> >            1706379      faults
> >            1674599      spf
> >              30588      pagefault:spf_vma_changed
> >                  0      pagefault:spf_vma_noanon
> >                363      pagefault:spf_vma_notsup
> >                  0      pagefault:spf_vma_access
> >                  0      pagefault:spf_pmd_changed
> > 
> > And the ones captured during a run on an 80 CPUs Power node:
> >  Performance counter stats for './ebizzy -mTt 80':
> >            1874773      faults
> >            1461153      spf
> >             413293      pagefault:spf_vma_changed
> >                  0      pagefault:spf_vma_noanon
> >                200      pagefault:spf_vma_notsup
> >                  0      pagefault:spf_vma_access
> >                  0      pagefault:spf_pmd_changed
> > 
> > In ebizzy's case most of the page faults were handled in a speculative
> > way, leading to the ebizzy performance boost.
> > 
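To put a number on "most of the page faults" in the ebizzy runs, the share of speculatively handled faults can be derived from the 'faults' and 'spf' counters shown above. A small C sketch; the helper name spf_share() is made up for illustration and assumes both counters come from the same perf stat run:

```c
#include <assert.h>

/* Share of page faults handled speculatively, in percent.
 * `faults` and `spf` are the two counters reported by perf stat. */
static double spf_share(unsigned long faults, unsigned long spf)
{
    assert(faults > 0 && spf <= faults);
    return 100.0 * (double)spf / (double)faults;
}
```

With the counters quoted above, this gives roughly 98.1% for the 16 CPUs x86 VM run (1674599 of 1706379) and roughly 77.9% for the 80 CPUs Power node run (1461153 of 1874773).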
> > ------------------
> > Changes since v10 (https://lkml.org/lkml/2018/4/17/572):
> >  - Hopefully accounted for all review feedback from Punit Agrawal,
> >    Ganesh Mahendran and Minchan Kim.
> >  - Remove unneeded check on CONFIG_SPECULATIVE_PAGE_FAULT in
> >    __do_page_fault().
> >  - Loop in pte_spinlock() and pte_map_lock() when pte try lock fails
> >    instead of aborting the speculative page fault handling. Dropping
> >    the now useless trace event pagefault:spf_pte_lock.
> >  - No longer try to reuse the fetched VMA during the speculative page
> >    fault handling when retrying is needed. This added a lot of complexity
> >    and additional tests didn't show a significant performance improvement.
> >  - Convert IS_ENABLED(CONFIG_NUMA) back to #ifdef due to build error.
> > 
> > [1] http://linux-kernel.2935.n7.nabble.com/RFC-PATCH-0-6-Another-go-at-speculative-page-faults-tt965642.html#none
> > [2] https://patchwork.kernel.org/patch/9999687/
> > 
> > 
> > Laurent Dufour (20):
> >   mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT
> >   x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT
> >   powerpc/mm: set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT
> >   mm: introduce pte_spinlock for FAULT_FLAG_SPECULATIVE
> >   mm: make pte_unmap_same compatible with SPF
> >   mm: introduce INIT_VMA()
> >   mm: protect VMA modifications using VMA sequence count
> >   mm: protect mremap() against SPF hanlder
> >   mm: protect SPF handler against anon_vma changes
> >   mm: cache some VMA fields in the vm_fault structure
> >   mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()
> >   mm: introduce __lru_cache_add_active_or_unevictable
> >   mm: introduce __vm_normal_page()
> >   mm: introduce __page_add_new_anon_rmap()
> >   mm: protect mm_rb tree with a rwlock
> >   mm: adding speculative page fault failure trace events
> >   perf: add a speculative page fault sw event
> >   perf tools: add support for the SPF perf event
> >   mm: add speculative page fault vmstats
> >   powerpc/mm: add speculative page fault
> > 
> > Mahendran Ganesh (2):
> >   arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT
> >   arm64/mm: add speculative page fault
> > 
> > Peter Zijlstra (4):
> >   mm: prepare for FAULT_FLAG_SPECULATIVE
> >   mm: VMA sequence count
> >   mm: provide speculative fault infrastructure
> >   x86/mm: add speculative pagefault handling
> > 
> >  arch/arm64/Kconfig                    |   1 +
> >  arch/arm64/mm/fault.c                 |  12 +
> >  arch/powerpc/Kconfig                  |   1 +
> >  arch/powerpc/mm/fault.c               |  16 +
> >  arch/x86/Kconfig                      |   1 +
> >  arch/x86/mm/fault.c                   |  27 +-
> >  fs/exec.c                             |   2 +-
> >  fs/proc/task_mmu.c                    |   5 +-
> >  fs/userfaultfd.c                      |  17 +-
> >  include/linux/hugetlb_inline.h        |   2 +-
> >  include/linux/migrate.h               |   4 +-
> >  include/linux/mm.h                    | 136 +++++++-
> >  include/linux/mm_types.h              |   7 +
> >  include/linux/pagemap.h               |   4 +-
> >  include/linux/rmap.h                  |  12 +-
> >  include/linux/swap.h                  |  10 +-
> >  include/linux/vm_event_item.h         |   3 +
> >  include/trace/events/pagefault.h      |  80 +++++
> >  include/uapi/linux/perf_event.h       |   1 +
> >  kernel/fork.c                         |   5 +-
> >  mm/Kconfig                            |  22 ++
> >  mm/huge_memory.c                      |   6 +-
> >  mm/hugetlb.c                          |   2 +
> >  mm/init-mm.c                          |   3 +
> >  mm/internal.h                         |  20 ++
> >  mm/khugepaged.c                       |   5 +
> >  mm/madvise.c                          |   6 +-
> >  mm/memory.c                           | 612 +++++++++++++++++++++++++++++-----
> >  mm/mempolicy.c                        |  51 ++-
> >  mm/migrate.c                          |   6 +-
> >  mm/mlock.c                            |  13 +-
> >  mm/mmap.c                             | 229 ++++++++++---
> >  mm/mprotect.c                         |   4 +-
> >  mm/mremap.c                           |  13 +
> >  mm/nommu.c                            |   2 +-
> >  mm/rmap.c                             |   5 +-
> >  mm/swap.c                             |   6 +-
> >  mm/swap_state.c                       |   8 +-
> >  mm/vmstat.c                           |   5 +-
> >  tools/include/uapi/linux/perf_event.h |   1 +
> >  tools/perf/util/evsel.c               |   1 +
> >  tools/perf/util/parse-events.c        |   4 +
> >  tools/perf/util/parse-events.l        |   1 +
> >  tools/perf/util/python.c              |   1 +
> >  44 files changed, 1161 insertions(+), 211 deletions(-)
> >  create mode 100644 include/trace/events/pagefault.h
> > 
> > --
> > 2.7.4
> > 
> > 
>


