Date: Thu, 23 Mar 2023 10:52:09 +0800
From: Baoquan He
To: Lorenzo Stoakes, David Hildenbrand
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Andrew Morton, Uladzislau Rezki,
    Matthew Wilcox, Liu Shixin, Jiri Olsa, Jens Axboe, Alexander Viro
Subject: Re: [PATCH v7 4/4] mm: vmalloc: convert vread() to vread_iter()
In-Reply-To: <941f88bc5ab928e6656e1e2593b91bf0f8c81e1b.1679511146.git.lstoakes@gmail.com>

On 03/22/23 at 06:57pm, Lorenzo Stoakes wrote:
> Having previously laid the foundation for converting vread() to an iterator
> function, pull the trigger and do so.
>
> This patch attempts to provide minimal refactoring and to reflect the
> existing logic as best we can, for example we continue to zero portions of
> memory not read, as before.
>
> Overall, there should be no functional difference other than a performance
> improvement in /proc/kcore access to vmalloc regions.
>
> Now we have eliminated the need for a bounce buffer in read_kcore_iter(),
> we dispense with it, and try to write to user memory optimistically but
> with faults disabled via copy_page_to_iter_nofault(). We already have
> preemption disabled by holding a spin lock. We continue faulting in until
> the operation is complete.

I don't understand the sentences here. In vread_iter(), the actual
content reading is done in aligned_vread_iter(); otherwise we zero-fill
the region. In aligned_vread_iter(), we use vmalloc_to_page() to get the
mapped page and read it out; otherwise we zero-fill. In this patch,
though, fault_in_iov_iter_writeable() faults in the memory of iter once
and bails out if that fails. I am wondering why we continue faulting in
until the operation is complete, and how that is done.

If we look at the failure points in vread_iter(), they mainly come from
copy_page_to_iter_nofault(), e.g. the page_copy_sane() check failing or
the i->data_source check failing. If these checks fail, should we keep
reading again and again?
And that is not related to faulting memory in. I saw your discussion
with David, but I am still a little lost. I hope to understand it;
thanks in advance.

......
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index 08b795fd80b4..25b44b303b35 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
......
> @@ -507,13 +503,30 @@ read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
>
>  		switch (m->type) {
>  		case KCORE_VMALLOC:
> -			vread(buf, (char *)start, tsz);
> -			/* we have to zero-fill user buffer even if no read */
> -			if (copy_to_iter(buf, tsz, iter) != tsz) {
> -				ret = -EFAULT;
> -				goto out;
> +		{
> +			const char *src = (char *)start;
> +			size_t read = 0, left = tsz;
> +
> +			/*
> +			 * vmalloc uses spinlocks, so we optimistically try to
> +			 * read memory. If this fails, fault pages in and try
> +			 * again until we are done.
> +			 */
> +			while (true) {
> +				read += vread_iter(iter, src, left);
> +				if (read == tsz)
> +					break;
> +
> +				src += read;
> +				left -= read;
> +
> +				if (fault_in_iov_iter_writeable(iter, left)) {
> +					ret = -EFAULT;
> +					goto out;
> +				}
>  			}
>  			break;
> +		}
>  		case KCORE_USER:
>  			/* User page is handled prior to normal kernel page: */
>  			if (copy_to_iter((char *)start, tsz, iter) != tsz) {
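
To make the question concrete, here is a minimal user-space sketch of the
"copy with faults disabled, then fault in and retry" pattern that the quoted
comment describes. It is not the kernel code and makes no claim about how
vread_iter() behaves: struct sim_dest, sim_copy_nofault(), sim_fault_in() and
sim_read_retry() are all hypothetical names, and page residency is simply
modelled with boolean flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SIM_PAGE_SIZE 4096u	/* assumed page size for the simulation */
#define SIM_MAX_PAGES 8u

/* Hypothetical stand-in for a user buffer whose pages may not be resident. */
struct sim_dest {
	char data[SIM_MAX_PAGES * SIM_PAGE_SIZE];
	bool resident[SIM_MAX_PAGES];	/* true once the page is "faulted in" */
	bool faultable[SIM_MAX_PAGES];	/* false models an unmappable page */
	size_t pos;			/* current write offset */
};

/*
 * Models a copy done with page faults disabled: it copies until it reaches
 * a non-resident page, then stops and reports how many bytes it copied.
 */
static size_t sim_copy_nofault(struct sim_dest *d, const char *src, size_t len)
{
	size_t done = 0;

	while (done < len) {
		size_t page = d->pos / SIM_PAGE_SIZE;
		size_t room = SIM_PAGE_SIZE - d->pos % SIM_PAGE_SIZE;
		size_t chunk = len - done < room ? len - done : room;

		assert(page < SIM_MAX_PAGES);
		if (!d->resident[page])
			break;	/* would fault: stop, the caller faults in */

		memcpy(d->data + d->pos, src + done, chunk);
		d->pos += chunk;
		done += chunk;
	}
	return done;
}

/*
 * Models faulting the destination in outside the no-fault section.
 * Returns the number of bytes that could NOT be made resident (0 = success).
 */
static size_t sim_fault_in(struct sim_dest *d, size_t len)
{
	size_t off = d->pos, end = d->pos + len;

	while (off < end) {
		size_t page = off / SIM_PAGE_SIZE;

		if (page >= SIM_MAX_PAGES || !d->faultable[page])
			return end - off;
		d->resident[page] = true;
		off = (page + 1) * SIM_PAGE_SIZE;
	}
	return 0;
}

/* Returns 0 on success, -1 if the destination cannot be faulted in. */
int sim_read_retry(struct sim_dest *d, const char *src, size_t total,
		   unsigned int *retries)
{
	size_t left = total;

	*retries = 0;
	while (left > 0) {
		size_t n = sim_copy_nofault(d, src, left);

		/* Advance by this pass's progress only, not a running total. */
		src += n;
		left -= n;
		if (left == 0)
			break;

		if (sim_fault_in(d, left) != 0)
			return -1;
		(*retries)++;
	}
	return 0;
}
```

In this sketch the loop terminates only because a short copy is always caused
by a non-resident page, which sim_fault_in() then either fixes or reports as
an error. If sim_copy_nofault() could return short for some other reason
while sim_fault_in() kept succeeding, the loop would spin, which is the kind
of case asked about above.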