Message-ID: <7aee68e9-6e31-925f-68bc-73557c032a42@redhat.com>
Date: Thu, 23 Mar 2023 11:38:52 +0100
Subject: Re: [PATCH v7 4/4] mm: vmalloc: convert vread() to vread_iter()
From: David Hildenbrand
Organization: Red Hat
To: Baoquan He, Lorenzo Stoakes
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, Andrew Morton, Uladzislau Rezki,
 Matthew Wilcox, Liu Shixin, Jiri Olsa, Jens Axboe, Alexander Viro
References: <941f88bc5ab928e6656e1e2593b91bf0f8c81e1b.1679511146.git.lstoakes@gmail.com>
Sender: owner-linux-mm@kvack.org

On 23.03.23 11:36, Baoquan He wrote:
> On 03/23/23 at 06:44am, Lorenzo Stoakes wrote:
>> On Thu, Mar 23, 2023 at 10:52:09AM +0800, Baoquan He wrote:
>>> On 03/22/23 at 06:57pm, Lorenzo Stoakes wrote:
>>>> Having previously laid the foundation for converting vread() to an iterator
>>>> function, pull the trigger and do so.
>>>>
>>>> This patch attempts to provide minimal refactoring and to reflect the
>>>> existing logic as best we can, for example we continue to zero portions of
>>>> memory not read, as before.
>>>>
>>>> Overall, there should be no functional difference other than a performance
>>>> improvement in /proc/kcore access to vmalloc regions.
>>>>
>>>> Now we have eliminated the need for a bounce buffer in read_kcore_iter(),
>>>> we dispense with it, and try to write to user memory optimistically but
>>>> with faults disabled via copy_page_to_iter_nofault(). We already have
>>>> preemption disabled by holding a spin lock. We continue faulting in until
>>>> the operation is complete.
>>>
>>> I don't understand the sentences here. In vread_iter(), the actual
>>> content reading is done in aligned_vread_iter(); otherwise we zero-fill
>>> the region. In aligned_vread_iter(), we use vmalloc_to_page() to get
>>> the mapped page and read it out; otherwise we zero-fill. In this patch,
>>> fault_in_iov_iter_writeable() faults in the iter's memory once and
>>> bails out if that fails. I am wondering why we continue faulting in
>>> until the operation is complete, and how that is done.
>>
>> This is referring to what's happening in kcore.c, not vread_iter(),
>> i.e. the looped read/faultin.
>>
>> The reason we bail out if fault_in_iov_iter_writeable() fails is that
>> this would indicate an error had occurred.
>>
>> The whole point is to _optimistically_ try to perform the operation
>> assuming the pages are faulted in. Ultimately we fault in via
>> copy_to_user_nofault() which will either copy data or fail if the pages are
>> not faulted in (will discuss this below a bit more in response to your
>> other point).
>>
>> If this fails, then we fault in, and try again. We loop because there could
>> be some extremely unfortunate timing with a race on e.g. swapping out or
>> migrating pages between faulting in and trying to write out again.
>>
>> This is extremely unlikely, but to avoid any chance of breaking userland we
>> repeat the operation until it completes. In nearly all real-world
>> situations it'll either work immediately or loop once.
>
> Thanks a lot for patiently providing these helpful details. I get it now.
> I was mainly confused by the while (true) loop in the KCORE_VMALLOC case
> of read_kcore_iter().
>
> Now, is there any chance that the faulted-in memory is swapped out or
> migrated again before vread_iter()? Does fault_in_iov_iter_writeable()
> pin the memory? I couldn't find that in the code or the documentation. It
> seems it only faults in the memory. If so, there's a window between
> faulting in and copy_to_user_nofault().
>

See the documentation of fault_in_safe_writeable():

"Note that we don't pin or otherwise hold the pages referenced that we fault in.
There's no guarantee that they'll stay in memory for any duration of time."

-- 
Thanks,

David / dhildenb
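
For readers following the thread, the optimistic copy-then-fault-in loop discussed above can be illustrated with a small userspace C model. This is only a sketch under stated assumptions, not the kernel code: `struct fake_user_buf`, `copy_nofault()`, `fault_in()` and `read_with_retry()` are all hypothetical names invented for this illustration, and the "eviction" is simulated with a counter rather than real reclaim or migration. It only shows why a loop is needed when faulting in does not pin the pages.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a user buffer whose page may not be resident. */
struct fake_user_buf {
    char data[64];
    bool resident;      /* true if a write would succeed without faulting */
    int evictions_left; /* simulated reclaim events that race with the copy */
};

/*
 * Hypothetical analogue of a nofault copy: never "faults", and returns the
 * number of bytes NOT copied (0 on success).
 */
static size_t copy_nofault(struct fake_user_buf *dst, const char *src,
                           size_t len)
{
    if (!dst->resident)
        return len;
    memcpy(dst->data, src, len);
    return 0;
}

/*
 * Hypothetical analogue of faulting in: makes the buffer resident but does
 * not pin it, so a simulated eviction may undo the work straight away.
 * Returns 0 on success.
 */
static int fault_in(struct fake_user_buf *dst)
{
    dst->resident = true;
    if (dst->evictions_left > 0) {
        dst->evictions_left--;
        dst->resident = false; /* lost the race: evicted again */
    }
    return 0;
}

/*
 * Try the copy optimistically; on failure fault in and retry until the
 * copy completes. *attempts reports how many copy attempts were made.
 */
static int read_with_retry(struct fake_user_buf *dst, const char *src,
                           size_t len, int *attempts)
{
    if (len > sizeof(dst->data))
        return -1;
    *attempts = 0;
    for (;;) {
        (*attempts)++;
        if (copy_nofault(dst, src, len) == 0)
            return 0;
        if (fault_in(dst) != 0)
            return -1; /* a failed fault-in is treated as a real error */
    }
}
```

In this model a buffer that is already resident is copied on the first attempt, a non-resident buffer with no racing eviction takes two attempts, and each simulated eviction costs one more trip around the loop. That mirrors the "work immediately or loop once" expectation stated in the thread, with further iterations reserved for the unlikely race.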