From: Arnd Bergmann
Date: Fri, 12 Jun 2020 22:07:28 +0200
Subject: Re: [RFC 1/3] lib: copy_{from,to}_user using gup & kmap_atomic()
To: afzal mohammed
Cc: Russell King - ARM Linux admin, Linus Walleij, "linux-kernel@vger.kernel.org", Linux-MM, Linux ARM, Nicolas Pitre, Catalin Marinas, Will Deacon
References: <9e1de19f35e2d5e1d115c9ec3b7c3284b4a4e077.1591885760.git.afzal.mohd.ma@gmail.com> <20200612135538.GA13399@afzalpc>
In-Reply-To: <20200612135538.GA13399@afzalpc>

On Fri, Jun 12, 2020 at 3:55 PM afzal mohammed wrote:
> On Fri, Jun 12, 2020 at 02:02:13PM +0200, Arnd Bergmann wrote:
> > On Fri, Jun 12, 2020 at 12:18 PM afzal mohammed wrote:
>
> > > Roughly a one-third drop in performance. Disabling highmem improves
> > > performance only slightly.
>
> > There are probably some things that can be done to optimize it,
> > but I guess most of the overhead is from the page table operations
> > and cannot be avoided.
>
> Ingo's series did a follow_page() first, then as a fallback did it
> invoke get_user_pages(), i will try that way as well.

Right, that could help, in particular for the small copies. I think a lot
of usercopy calls are only for a few bytes, though this is of course
highly workload dependent and you might only care about the large
ones.

> Yes, i too feel get_user_pages_fast() path is the most time consuming,
> will instrument & check.
>
> > What was the exact 'dd' command you used, in particular the block size?
> > Note that by default, 'dd' will request 512 bytes at a time, so you usually
> > only access a single page.
> > It would be interesting to see the overhead with
> > other typical or extreme block sizes, e.g. '1', '64', '4K', '64K' or '1M'.
>
> It was the default(512), more test results follows (in MB/s),
>
>                 512     1K      4K      16K     32K     64K     1M
>
> w/o series      30      46      89      95      90      85      65
>
> w/ series       22      36      72      79      78      75      61
>
> perf drop       26%     21%     19%     16%     13%     12%     6%
>
> Hmm, results ain't that bad :)

There is also still hope of optimizing small aligned copies like

    set_ttbr0(user_ttbr);
    ldm();
    set_ttbr0(kernel_ttbr);
    stm();

which could do e.g. 32 bytes at a time, but with more overhead
if you have to loop around it.

        Arnd
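To make the "more overhead if you have to loop around it" point concrete, here is a minimal user-space model of the set_ttbr0/ldm/set_ttbr0/stm sequence, not kernel code. Everything in it is an assumption for illustration: the model_* names, the CHUNK_BYTES constant and the ttbr0_switches counter are hypothetical, the two memcpy() calls merely stand in for ldm and stm, and the TTBR0 switch is only counted because user space cannot perform it.

```c
#include <stddef.h>
#include <string.h>

#define CHUNK_BYTES 32  /* the "e.g. 32 bytes at a time" from the mail */

/* Hypothetical stand-ins: these only count how often the real
 * sequence would have to switch TTBR0. */
static unsigned long ttbr0_switches;

static void model_set_ttbr0_user(void)   { ttbr0_switches++; }
static void model_set_ttbr0_kernel(void) { ttbr0_switches++; }

/* Copy len bytes from a "user" buffer to a "kernel" buffer, one
 * CHUNK_BYTES block per pair of TTBR0 switches. Returns the number
 * of bytes NOT copied (the tail shorter than one chunk), mirroring
 * the copy_from_user() return convention. */
static size_t model_copy_from_user(void *dst, const void *src, size_t len)
{
    unsigned char regs[CHUNK_BYTES];  /* stands in for the registers */
    size_t done = 0;

    while (len - done >= CHUNK_BYTES) {
        model_set_ttbr0_user();
        memcpy(regs, (const unsigned char *)src + done, CHUNK_BYTES); /* "ldm" */
        model_set_ttbr0_kernel();
        memcpy((unsigned char *)dst + done, regs, CHUNK_BYTES);       /* "stm" */
        done += CHUNK_BYTES;
    }
    return len - done;
}
```

In this model every 32-byte chunk costs two switches, so a 512-byte request (the 'dd' default discussed above) would need 16 loop iterations and 32 switches. That per-chunk cost is why the approach looks attractive mainly for small aligned copies.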