linux-mm.kvack.org archive mirror
* [PATCH v7 0/6] Enable THP for text section of non-shmem files
@ 2019-06-23  5:47 Song Liu
  2019-06-23  5:47 ` [PATCH v7 1/6] filemap: check compound_head(page)->mapping in filemap_fault() Song Liu
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Song Liu @ 2019-06-23  5:47 UTC
  To: linux-mm, linux-fsdevel, linux-kernel
  Cc: matthew.wilcox, kirill.shutemov, kernel-team, william.kucharski,
	akpm, hdanton, Song Liu

Changes v6 => v7:
1. Avoid accessing vma without holding mmap_sem (Hillf Danton)
2. In collapse_file() use readahead API instead of gup API. This matches
   better with existing logic for shmem.
3. Add inline documentation for @nr_thps (kbuild test robot)

Changes v5 => v6:
1. Improve THP stats in 3/6 (Kirill).

Changes v4 => v5:
1. Move the logic to drop THP from pagecache to open() path (Rik).
2. Revise description of CONFIG_READ_ONLY_THP_FOR_FS.

Changes v3 => v4:
1. Put the logic to drop THP from pagecache in a separate function (Rik).
2. Move the function to drop THP from pagecache to exit_mmap().
3. Revise confusing commit log 6/6.

Changes v2 => v3:
1. Removed the limitation (cannot write to file with THP) by truncating
   whole file during sys_open (see 6/6);
2. Fixed a VM_BUG_ON_PAGE() in filemap_fault() (see 2/6);
3. Split function rename to a separate patch (Rik);
4. Updated condition in hugepage_vma_check() (Rik).

Changes v1 => v2:
1. Fixed a missing mem_cgroup_commit_charge() for non-shmem case.

This set follows up on the discussion at LSF/MM 2019. The motivation is to
put the text section of an application in THP, and thus reduce the iTLB miss
rate and improve performance. Both Facebook and Oracle showed strong
interest in this feature.

To make reviews easier, this set aims for a minimal viable product. The
current version of the work does not change any file system specific
code. This comes with some limitations (discussed later).

This set enables an application to "hugify" its text section by simply
running something like:

          madvise((void *)0x600000, 0x800000, MADV_HUGEPAGE);

Before this call, /proc/<pid>/maps looks like:

    00400000-074d0000 r-xp 00000000 00:27 2006927     app

After this call, part of the text section is split out and mapped to
THP:

    00400000-00425000 r-xp 00000000 00:27 2006927     app
    00600000-00e00000 r-xp 00200000 00:27 2006927     app   <<< on THP
    00e00000-074d0000 r-xp 00a00000 00:27 2006927     app
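
As a rough illustration of how an application could choose the range to
pass to madvise(), the sketch below trims an address range to whole huge
pages before making the call. It is a user-space sketch, not code from this
patch set. The 2 MiB huge page size (x86-64 with 4 KiB base pages) and the
helper names thp_aligned_range() and hugify_range() are assumptions made
for this example.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption: PMD-sized huge pages are 2 MiB (x86-64, 4 KiB base pages). */
#define HPAGE_SIZE ((uintptr_t)2 << 20)

/*
 * Shrink [start, end) to the largest sub-range whose start and end are
 * both multiples of HPAGE_SIZE. Returns 1 and fills the outputs if at
 * least one whole huge page fits, 0 otherwise.
 */
int thp_aligned_range(uintptr_t start, uintptr_t end,
                      uintptr_t *out_start, size_t *out_len)
{
    uintptr_t lo = (start + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1); /* round up */
    uintptr_t hi = end & ~(HPAGE_SIZE - 1);                      /* round down */

    if (lo < start || hi <= lo) /* overflow, or no whole huge page fits */
        return 0;
    *out_start = lo;
    *out_len = (size_t)(hi - lo);
    return 1;
}

/*
 * Hypothetical helper: advise the kernel to back the huge-page-aligned
 * part of [start, end) with THP. Returns 0 on success, -1 with errno set
 * on failure (including kernels or libcs without MADV_HUGEPAGE).
 */
int hugify_range(uintptr_t start, uintptr_t end)
{
    uintptr_t lo;
    size_t len;

    if (!thp_aligned_range(start, end, &lo, &len)) {
        errno = EINVAL;
        return -1;
    }
#ifdef MADV_HUGEPAGE
    return madvise((void *)lo, len, MADV_HUGEPAGE);
#else
    errno = ENOSYS;
    return -1;
#endif
}
```

A successful madvise() only marks the range as eligible; the collapse
itself is done later by khugepaged, so the mapping does not change
immediately.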

Limitations:

1. This only works for the text section (vma with VM_DENYWRITE).
2. Original limitation #2 is removed in v3.

We gated this feature with an experimental config, READ_ONLY_THP_FOR_FS.
Once we get better support on the write path, we can remove the config and
enable it by default.

Tested cases:
1. Tested with btrfs and ext4.
2. Tested with a real-world application (a memcache-like caching service).
3. Tested with "THP aware uprobe":
   https://patchwork.kernel.org/project/linux-mm/list/?series=131339
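
One way to check results like these from user space is to read the file
THP counters that patch 3/6 adds (the diffstat touches fs/proc/meminfo.c).
The sketch below looks up one field in a /proc/meminfo-style buffer. The
field name to query, for example "FileHugePages", is an assumption: it is
not spelled out in this cover letter. The helper names meminfo_field_kb()
and read_meminfo_kb() are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find "name:" at the start of a line in a /proc/meminfo-style buffer and
 * return its value in kB, or -1 if the field is missing or malformed.
 */
long meminfo_field_kb(const char *text, const char *name)
{
    size_t name_len = strlen(name);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, name, name_len) == 0 && line[name_len] == ':') {
            long kb;

            if (sscanf(line + name_len + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++; /* step past the newline to the next line */
    }
    return -1;
}

/* Read /proc/meminfo and look up one field; returns -1 on any failure. */
long read_meminfo_kb(const char *name)
{
    char buf[8192];
    size_t n;
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_field_kb(buf, name);
}
```

Comparing the value of read_meminfo_kb("FileHugePages") before and after
khugepaged has run would show whether any file pages were collapsed,
assuming the running kernel exports a field by that name.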

This set (plus a few uprobe patches) is also available at

   https://github.com/liu-song-6/linux/tree/uprobe-thp

Please share your comments and suggestions on this.

Thanks!

Song Liu (6):
  filemap: check compound_head(page)->mapping in filemap_fault()
  filemap: update offset check in filemap_fault()
  mm,thp: stats for file backed THP
  khugepaged: rename collapse_shmem() and khugepaged_scan_shmem()
  mm,thp: add read-only THP support for (non-shmem) FS
  mm,thp: avoid writes to file with THP in pagecache

 drivers/base/node.c    |   6 +++
 fs/inode.c             |   3 ++
 fs/namei.c             |  22 +++++++-
 fs/proc/meminfo.c      |   4 ++
 fs/proc/task_mmu.c     |   4 +-
 include/linux/fs.h     |  32 ++++++++++++
 include/linux/mmzone.h |   2 +
 mm/Kconfig             |  11 ++++
 mm/filemap.c           |   9 ++--
 mm/khugepaged.c        | 113 +++++++++++++++++++++++++++++++----------
 mm/rmap.c              |  12 +++--
 mm/vmstat.c            |   2 +
 12 files changed, 184 insertions(+), 36 deletions(-)

--
2.17.1


* Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS
@ 2019-06-24  3:16 Hillf Danton
  2019-06-24  4:27 ` Song Liu
  0 siblings, 1 reply; 21+ messages in thread
From: Hillf Danton @ 2019-06-24  3:16 UTC
  To: Song Liu
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, kernel-team, william.kucharski, akpm, hdanton


Hello

On Sun, 23 Jun 2019 13:48:47 +0800 Song Liu wrote:
> This patch is (hopefully) the first step to enable THP for non-shmem
> filesystems.
> 
> This patch enables an application to put part of its text sections to THP
> via madvise, for example:
> 
>     madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);
> 
> We tried to reuse the logic for THP on tmpfs.
> 
> Currently, write is not supported for non-shmem THP. khugepaged will only
> process vma with VM_DENYWRITE. The next patch will handle writes, which
> would only happen when the vma with VM_DENYWRITE is unmapped.
> 
> An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
> feature.
> 
> Acked-by: Rik van Riel <riel@surriel.com>
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---
>  mm/Kconfig      | 11 ++++++
>  mm/filemap.c    |  4 +--
>  mm/khugepaged.c | 90 ++++++++++++++++++++++++++++++++++++++++---------
>  mm/rmap.c       | 12 ++++---
>  4 files changed, 96 insertions(+), 21 deletions(-)
> 
> diff --git a/mm/Kconfig b/mm/Kconfig
> index f0c76ba47695..0a8fd589406d 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -762,6 +762,17 @@ config GUP_BENCHMARK
> 
>  	  See tools/testing/selftests/vm/gup_benchmark.c
> 
> +config READ_ONLY_THP_FOR_FS
> +	bool "Read-only THP for filesystems (EXPERIMENTAL)"
> +	depends on TRANSPARENT_HUGE_PAGECACHE && SHMEM
> +
The cover letter mentions ext4, and the subject line of this patch says
non-shmem, which suggests dropping the dependency on SHMEM.

> +	help
> +	  Allow khugepaged to put read-only file-backed pages in THP.
> +
> +	  This is marked experimental because it is a new feature. Write
> +	  support of file THPs will be developed in the next few release
> +	  cycles.
> +
>  config ARCH_HAS_PTE_SPECIAL
>  	bool

Hillf


* Re: [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS
@ 2019-06-24  7:48 Hillf Danton
  2019-06-24 21:17 ` Song Liu
  0 siblings, 1 reply; 21+ messages in thread
From: Hillf Danton @ 2019-06-24  7:48 UTC
  To: Song Liu
  Cc: linux-mm, linux-fsdevel, linux-kernel, matthew.wilcox,
	kirill.shutemov, Kernel Team, william.kucharski, akpm


Hello

On Mon, 24 Jun 2019 12:28:32 +0800 Song Liu wrote:
>
>Hi Hillf,
>
>> On Jun 23, 2019, at 8:16 PM, Hillf Danton <hdanton@sina.com> wrote:
>>
>>
>> Hello
>>
>> On Sun, 23 Jun 2019 13:48:47 +0800 Song Liu wrote:
>>> This patch is (hopefully) the first step to enable THP for non-shmem
>>> filesystems.
>>>
>>> This patch enables an application to put part of its text sections to THP
>>> via madvise, for example:
>>>
>>>    madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);
>>>
>>> We tried to reuse the logic for THP on tmpfs.
>>>
>>> Currently, write is not supported for non-shmem THP. khugepaged will only
>>> process vma with VM_DENYWRITE. The next patch will handle writes, which
>>> would only happen when the vma with VM_DENYWRITE is unmapped.
>>>
>>> An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
>>> feature.
>>>
>>> Acked-by: Rik van Riel <riel@surriel.com>
>>> Signed-off-by: Song Liu <songliubraving@fb.com>
>>> ---
>>> mm/Kconfig      | 11 ++++++
>>> mm/filemap.c    |  4 +--
>>> mm/khugepaged.c | 90 ++++++++++++++++++++++++++++++++++++++++---------
>>> mm/rmap.c       | 12 ++++---
>>> 4 files changed, 96 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>> index f0c76ba47695..0a8fd589406d 100644
>>> --- a/mm/Kconfig
>>> +++ b/mm/Kconfig
>>> @@ -762,6 +762,17 @@ config GUP_BENCHMARK
>>>
>>> 	  See tools/testing/selftests/vm/gup_benchmark.c
>>>
>>> +config READ_ONLY_THP_FOR_FS
>>> +	bool "Read-only THP for filesystems (EXPERIMENTAL)"
>>> +	depends on TRANSPARENT_HUGE_PAGECACHE && SHMEM
>>> +
>> The cover letter mentions ext4, and the subject line of this patch says
>> non-shmem, which suggests dropping the dependency on SHMEM.
>
>We reuse khugepaged code for SHMEM, so the dependency does exist.
>
On the other hand, I see collapse_file() and khugepaged_scan_file(), and
wonder whether ext4 files can be handled by the new functions. If so, we
can drop that dependency for read-only THP, so that ext4 and shmem are each
handled as what they are.

Hillf



Thread overview: 21+ messages
2019-06-23  5:47 [PATCH v7 0/6] Enable THP for text section of non-shmem files Song Liu
2019-06-23  5:47 ` [PATCH v7 1/6] filemap: check compound_head(page)->mapping in filemap_fault() Song Liu
2019-06-23  5:47 ` [PATCH v7 2/6] filemap: update offset check " Song Liu
2019-06-23  5:47 ` [PATCH v7 3/6] mm,thp: stats for file backed THP Song Liu
2019-06-23  5:47 ` [PATCH v7 4/6] khugepaged: rename collapse_shmem() and khugepaged_scan_shmem() Song Liu
2019-06-23  5:47 ` [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS Song Liu
2019-06-24 12:47   ` Kirill A. Shutemov
2019-06-24 14:01     ` Song Liu
2019-06-24 14:27       ` Kirill A. Shutemov
2019-06-24 14:42         ` Song Liu
2019-06-24 14:54           ` Kirill A. Shutemov
2019-06-24 15:04             ` Song Liu
2019-06-24 15:15               ` Kirill A. Shutemov
2019-06-24 16:33                 ` Song Liu
2019-06-23  5:47 ` [PATCH v7 6/6] mm,thp: avoid writes to file with THP in pagecache Song Liu
2019-06-24 12:49   ` Kirill A. Shutemov
2019-06-24 14:01     ` Song Liu
2019-06-24  3:16 [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS Hillf Danton
2019-06-24  4:27 ` Song Liu
2019-06-24  7:48 [PATCH v7 5/6] mm,thp: add read-only THP support for (non-shmem) FS Hillf Danton
2019-06-24 21:17 ` Song Liu
