From: David Hildenbrand <david@redhat.com>
To: Zhiguo Jiang <justinjiang@vivo.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Will Deacon <will@kernel.org>,
"Aneesh Kumar K.V" <aneesh.kumar@kernel.org>,
Nick Piggin <npiggin@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Arnd Bergmann <arnd@arndb.de>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Shakeel Butt <shakeel.butt@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
linux-arch@vger.kernel.org, cgroups@vger.kernel.org,
Barry Song <21cnbao@gmail.com>
Cc: opensource.kernel@vivo.com
Subject: Re: [PATCH v3 2/2] mm: tlb: add tlb swap entries batch async release
Date: Mon, 9 Sep 2024 08:49:12 +0200
Message-ID: <f58950cd-dbe3-4629-ac92-30c76db7849a@redhat.com>
In-Reply-To: <20240805153639.1057-3-justinjiang@vivo.com>
On 05.08.24 17:36, Zhiguo Jiang wrote:
> One of the main reasons a process with an independent mm takes a long
> time to exit is the time-consuming release of its swap entries.
> The proportion of the process's memory held in swap grows over time,
> because high memory pressure triggers reclaim of anonymous folios into
> swap space; e.g., on Android devices we found this proportion can
> reach 60% or more after a period of time. Additionally, the relatively
> long code path for releasing swap entries further increases the
> release time.
>
> Testing Platform: 8GB RAM
> Testing procedure:
> After booting up, start 15 processes first, and then observe the
> physical memory size occupied by the last launched process at different
> time points.
> Example: The process launched last: com.qiyi.video
> | memory type   | 0min   | 1min   | 5min   | 10min  | 15min  |
> --------------------------------------------------------------
> | VmRSS(KB)     | 453832 | 252300 | 204364 | 199944 | 199748 |
> | RssAnon(KB)   | 247348 | 99296  | 71268  | 67808  | 67660  |
> | RssFile(KB)   | 205536 | 152020 | 132144 | 131184 | 131136 |
> | RssShmem(KB)  | 1048   | 984    | 952    | 952    | 952    |
> | VmSwap(KB)    | 202692 | 334852 | 362880 | 366340 | 366488 |
> | Swap ratio(%) | 30.87% | 57.03% | 63.97% | 64.69% | 64.72% |
> Note: min - minute.
>
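The description does not say how the "Swap ratio" row is computed. The
numbers in the table are consistent with VmSwap / (VmRSS + VmSwap), so a
minimal sketch under that assumption (the function name is made up for
illustration) would be:

```c
/*
 * Swap ratio in percent, ASSUMING ratio = VmSwap / (VmRSS + VmSwap).
 * The formula is inferred from the table values, not stated in the patch.
 * Both arguments are in KB, as reported in /proc/<pid>/status.
 */
double swap_ratio_pct(unsigned long vm_rss_kb, unsigned long vm_swap_kb)
{
	unsigned long total_kb = vm_rss_kb + vm_swap_kb;

	if (total_kb == 0)
		return 0.0;	/* avoid dividing by zero for an empty process */
	return 100.0 * (double)vm_swap_kb / (double)total_kb;
}
```

With the 0min column (VmRSS 453832, VmSwap 202692) this gives roughly
30.87, and with the 15min column (199748, 366488) roughly 64.72, matching
the table.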
> When many processes with independent mm are running and the system is
> under high memory pressure, launching a process that needs a large
> amount of memory is likely to trigger the immediate killing of many of
> those processes. The exiting processes then run concurrently and occupy
> multiple CPU cores, which causes issues such as lag in important
> processes that are not exiting.
>
> To solve this problem, we introduce an asynchronous swap entry release
> mechanism for multiple exiting processes, which isolates and caches the
> swap entries occupied by the exiting processes and hands them over to
> an asynchronous kworker to complete the release. This allows the
> exiting processes to complete quickly and release CPU resources. We have
> validated this modification on Android products and achieved the
> expected benefits.
>
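The cache-and-hand-off idea in the paragraph above can be sketched in
user space roughly as follows. This is not the patch's code: all names
(swap_batch, release_queue, BATCH_MAX, the queue_* functions) are
hypothetical, entries are plain unsigned long stand-ins for swp_entry_t,
and the "worker" is an ordinary function call with no locking, where the
kernel version would run in a kworker and need synchronization.

```c
#include <stddef.h>
#include <stdlib.h>

#define BATCH_MAX 4	/* hypothetical capacity, kept small for illustration */

struct swap_batch {
	struct swap_batch *next;
	size_t nr;				/* entries used in this batch */
	unsigned long entries[BATCH_MAX];	/* stand-ins for swp_entry_t */
};

struct release_queue {
	struct swap_batch *head;	/* oldest batch waiting for the worker */
	struct swap_batch *tail;	/* newest batch waiting for the worker */
	struct swap_batch *cur;		/* batch the exiting task is filling */
	size_t released;		/* total entries released by the worker */
};

/* Move the current batch (if it holds anything) onto the pending list. */
void queue_hand_off(struct release_queue *q)
{
	struct swap_batch *b = q->cur;

	if (!b || b->nr == 0)
		return;
	b->next = NULL;
	if (q->tail)
		q->tail->next = b;
	else
		q->head = b;
	q->tail = b;
	q->cur = NULL;
}

/* Exit path: cache one entry. Returns 0 on success, -1 if allocation fails. */
int queue_add_entry(struct release_queue *q, unsigned long entry)
{
	if (!q->cur) {
		q->cur = calloc(1, sizeof(*q->cur));
		if (!q->cur)
			return -1;	/* caller would release synchronously instead */
	}
	q->cur->entries[q->cur->nr++] = entry;
	if (q->cur->nr == BATCH_MAX)
		queue_hand_off(q);	/* full batch goes to the worker */
	return 0;
}

/* Worker side: drain every pending batch; returns entries released. */
size_t queue_drain(struct release_queue *q)
{
	size_t n = 0;

	while (q->head) {
		struct swap_batch *b = q->head;

		q->head = b->next;
		n += b->nr;	/* the actual per-entry release would happen here */
		free(b);
	}
	q->tail = NULL;
	q->released += n;
	return n;
}
```

The exiting task only pays for queue_add_entry() and a final
queue_hand_off(); the expensive release is concentrated in queue_drain().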
> Testing Platform: 8GB RAM
> Testing procedure:
> After restarting the machine, start 15 app processes first, then
> start the camera app processes and monitor their cold start and
> preview times.
>
> Cold start time of the camera processes (unit: milliseconds):
> | seq    | 1    | 2    | 3    | 4    | 5    | 6    | average |
> | before | 1498 | 1476 | 1741 | 1337 | 1367 | 1655 | 1512    |
> | after  | 1396 | 1107 | 1136 | 1178 | 1071 | 1339 | 1204    |
>
> Preview time of the camera processes (unit: milliseconds):
> | seq    | 1   | 2   | 3   | 4   | 5   | 6   | average |
> | before | 267 | 402 | 504 | 513 | 161 | 265 | 352     |
> | after  | 188 | 223 | 301 | 203 | 162 | 154 | 205     |
>
> Based on the averages of the six test runs above, the patch yields the
> following improvements:
> 1. The cold start time of camera app processes is reduced by about 20%.
> 2. The preview time of camera app processes is reduced by about 42%.
>
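The two percentages follow from the averaged rows as
(before - after) / before. A small sketch of that arithmetic, with
helper names chosen only for illustration:

```c
/* Arithmetic mean of n integer samples (milliseconds). */
double mean_ms(const int *samples, int n)
{
	long sum = 0;

	if (n <= 0)
		return 0.0;
	for (int i = 0; i < n; i++)
		sum += samples[i];
	return (double)sum / n;
}

/* Relative reduction in percent: 100 * (before - after) / before. */
double reduction_pct(double before, double after)
{
	if (before == 0.0)
		return 0.0;	/* no baseline, nothing to compare against */
	return 100.0 * (before - after) / before;
}
```

Using the rounded averages from the tables, reduction_pct(1512, 1204) is
about 20.4 for cold start and reduction_pct(352, 205) is about 41.8 for
preview, in line with the "about 20%" and "about 42%" figures.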
> It offers several benefits:
> 1. Alleviates the high system CPU load caused by multiple exiting
> processes running simultaneously.
> 2. Reduces lock contention in the swap entry free path by using an
> asynchronous kworker instead of multiple exiting processes running in
> parallel.
> 3. Releases pte_present memory occupied by exiting processes more
> efficiently.
>
> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
> ---
> arch/s390/include/asm/tlb.h | 8 +
> include/asm-generic/tlb.h | 44 ++++++
> include/linux/mm_types.h | 58 +++++++
> mm/memory.c | 3 +-
> mm/mmu_gather.c | 296 ++++++++++++++++++++++++++++++++++++
> 5 files changed, 408 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h
> index e95b2c8081eb..3f681f63390f
> --- a/arch/s390/include/asm/tlb.h
> +++ b/arch/s390/include/asm/tlb.h
> @@ -28,6 +28,8 @@ static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
> struct page *page, bool delay_rmap, int page_size);
> static inline bool __tlb_remove_folio_pages(struct mmu_gather *tlb,
> struct page *page, unsigned int nr_pages, bool delay_rmap);
> +static inline bool __tlb_remove_swap_entries(struct mmu_gather *tlb,
> + swp_entry_t entry, int nr);
The problem I am having is that swap entries don't have any intersection
with the TLB. It sounds like we're squeezing something into an existing
concept (MMU gather) that just doesn't belong in there.
--
Cheers,
David / dhildenb