From: jun qian
Date: Tue, 6 Apr 2021 10:14:16 +0800
Subject: Re: [PATCH V2 1/1] mm:improve the performance during fork
To: Andrew Morton
Cc: ast@kernel.org, daniel@iogearbox.net, kafai@fb.com,
    songliubraving@fb.com, yhs@fb.com, andriin@fb.com,
    john.fastabend@gmail.com, kpsingh@chromium.org, Linux-MM,
    linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
    bpf@vger.kernel.org
References: <20210329123635.56915-1-qianjun.kernel@gmail.com>
    <20210330224406.5e195f3b8b971ff2a56c657d@linux-foundation.org>
In-Reply-To: <20210330224406.5e195f3b8b971ff2a56c657d@linux-foundation.org>

On Wed, 31 Mar 2021 at 1:44 PM, Andrew Morton wrote:
>
> On Mon, 29 Mar 2021 20:36:35 +0800 qianjun.kernel@gmail.com wrote:
>
> > From: jun qian
> >
> > In our project, many business delays come from fork, so
> > we started looking for the reason why fork is time-consuming.
> > I used ftrace with function_graph to trace the fork and found
> > that vm_normal_page is called tens of thousands of times, while
> > the execution time of the function itself is only a few
> > nanoseconds. vm_normal_page is not an inline function, so I
> > think that making it inline may reduce the call overhead.
> >
> > I did the following experiment:
> >
> > Use the bpftrace tool to trace the fork time:
> >
> > bpftrace -e 'kprobe:_do_fork/comm=="redis-server"/ {@st=nsecs;} \
> > kretprobe:_do_fork /comm=="redis-server"/{printf("the fork time \
> > is %d us\n", (nsecs-@st)/1000)}'
> >
> > no inline vm_normal_page:
> > result:
> > the fork time is 40743 us
> > the fork time is 41746 us
> > the fork time is 41336 us
> > the fork time is 42417 us
> > the fork time is 40612 us
> > the fork time is 40930 us
> > the fork time is 41910 us
> >
> > inline vm_normal_page:
> > result:
> > the fork time is 39276 us
> > the fork time is 38974 us
> > the fork time is 39436 us
> > the fork time is 38815 us
> > the fork time is 39878 us
> > the fork time is 39176 us
> >
> > In the same test environment, we can get 3% to 4% of
> > performance improvement.
> >
> > note: the test data is from 4.18.0-193.6.3.el8_2.v1.1.x86_64,
> > because my product uses this kernel version to test the redis
> > server. If you need to compare test data from the latest kernel
> > version, you can refer to version 1 of the patch.
> >
> > We need to compare the changes in the size of vmlinux:
> >                 inline           non-inline       diff
> > vmlinux size    9709248 bytes    9709824 bytes    -576 bytes
> >
>
> I get very different results with gcc-7.2.0:
>
> q:/usr/src/25> size mm/memory.o
>    text    data     bss     dec     hex filename
>   74898    3375      64   78337   13201 mm/memory.o-before
>   75119    3363      64   78546   132d2 mm/memory.o-after
>
> That's a somewhat significant increase in code size, and larger code
> size has a worsened cache footprint.
>
> Not that this is necessarily a bad thing for a function which is
> tightly called many times in succession as is vm_normal_page()
>
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -592,7 +592,7 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
> >   * PFNMAP mappings in order to support COWable mappings.
> >   *
> >   */
> > -struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> > +inline struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> >                             pte_t pte)
> >  {
> >         unsigned long pfn = pte_pfn(pte);
>
> I'm a bit surprised this made any difference - rumour has it that
> modern gcc just ignores `inline' and makes up its own mind.  Which is
> why we added __always_inline.
>

The kernel code version: kernel-4.18.0-193.6.3.el8_2
gcc version 8.4.1 20200928 (Red Hat 8.4.1-1) (GCC)

I built it again and got the results below. Later I will test the
latest kernel version with the new gcc.

757368576 vmlinux  inline
757381440 vmlinux  no inline
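
For anyone who wants to check the arithmetic, here is a small Python
sketch that recomputes the mean fork time for each build, the relative
improvement, and the vmlinux size difference from the numbers quoted in
this message. It is not part of the patch; the variable and function
names are illustrative, and the vmlinux sizes are assumed to be in
bytes because the message does not state a unit. It only summarizes
the posted samples and says nothing about run-to-run variance.

```python
from statistics import mean

# Fork times in microseconds, copied from the bpftrace output quoted above.
NO_INLINE_US = [40743, 41746, 41336, 42417, 40612, 40930, 41910]
INLINE_US = [39276, 38974, 39436, 38815, 39878, 39176]

# vmlinux sizes from the rebuild reported at the end of this message.
# Assumption: the unit is bytes (the message does not say).
VMLINUX_INLINE = 757368576
VMLINUX_NO_INLINE = 757381440


def relative_improvement(before, after):
    """Percentage by which mean(after) is lower than mean(before)."""
    return (mean(before) - mean(after)) / mean(before) * 100.0


mean_no_inline = mean(NO_INLINE_US)
mean_inline = mean(INLINE_US)
improvement_pct = relative_improvement(NO_INLINE_US, INLINE_US)
size_delta = VMLINUX_NO_INLINE - VMLINUX_INLINE

print(f"mean fork time, no inline: {mean_no_inline:.1f} us")
print(f"mean fork time, inline:    {mean_inline:.1f} us")
print(f"relative improvement:      {improvement_pct:.2f} %")
print(f"vmlinux size difference:   {size_delta} (no inline minus inline)")
```

Comparing means over six or seven samples is a rough measure; more
runs on both kernels would be needed before reading much into the
exact percentage.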