From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4a98a381-f184-1857-a134-efd606a3b807@redhat.com>
Date: Tue, 27 Jun 2023 09:10:32 +0200
From: David Hildenbrand
Organization: Red Hat
To: "Kasireddy, Vivek", Peter Xu
Cc: "dri-devel@lists.freedesktop.org", "linux-mm@kvack.org",
 Mike Kravetz, Gerd Hoffmann, "Kim, Dongwon", Andrew Morton,
 James Houghton, Jerome Marchand, "Chang, Junxiao",
 "Kirill A . Shutemov", "Hocko, Michal", Muchun Song,
 Jason Gunthorpe, John Hubbard
References: <20230622072710.3707315-1-vivek.kasireddy@intel.com>
 <6e429fbc-e0e6-53c0-c545-2e2cbbe757de@redhat.com>
Subject: Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages

On 27.06.23 08:37, Kasireddy, Vivek wrote:
> Hi David,
>

Hi! Sorry for taking a bit longer to reply lately.

[...]

>>> Sounds right, maybe it needs to go back to the old GUP solution, though, as
>>> mmu notifiers are also mm-based not fd-based. Or to be explicit, I think
>>> it'll be pin_user_pages(FOLL_LONGTERM) with the new API. It'll also solve
>>> the movable pages issue on pinning.
>>
>> It better should be pin_user_pages(FOLL_LONGTERM). But I'm afraid we
>> cannot achieve that without breaking the existing kernel interface ...
> Yeah, as you suggest, we unfortunately cannot go back to using GUP
> without breaking udmabuf_create UAPI that expects memfds and file
> offsets.
>
>>
>> So we might have to implement the same page migration as gup does on
>> FOLL_LONGTERM here ... maybe there are more such cases/drivers that
>> actually require that handling when simply taking pages out of the
>> memfd, believing they can hold on to them forever.
> IIUC, I don't think just handling the page migration in udmabuf is going to
> cut it. It might require active cooperation of the Guest GPU driver as well
> if this is even feasible.

The idea is that once you extract the page from the memfd and it
resides somewhere bad (MIGRATE_CMA, ZONE_MOVABLE), you trigger page
migration.
Essentially what migrate_longterm_unpinnable_pages() does:

Why would the guest driver have to be involved? It shouldn't care about
page migration in the hypervisor.

[...]

>> balloon, and then using that memory for communicating with the device]
>>
>> Maybe it's all fine with udmabuf because of the way it is setup/torn
>> down by the guest driver. Unfortunately I can't tell.
> Here are the functions used by virtio-gpu (Guest GPU driver) to allocate
> pages for its resources:
> __drm_gem_shmem_create: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_gem_shmem_helper.c#L97
> Interestingly, the comment in the above function says that the pages
> should not be allocated from the MOVABLE zone.

It doesn't add GFP_MOVABLE, so pages don't end up in
ZONE_MOVABLE/MIGRATE_CMA *in the guest*. But we care about
ZONE_MOVABLE/MIGRATE_CMA *in the host*. (What the guest does is right,
though.)

IOW, what matters is what udmabuf does with guest memory on the
hypervisor side, not what the guest driver does on the guest side.

> The pages along with their dma addresses are then extracted and shared
> with Qemu using these two functions:
> drm_gem_get_pages: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_gem.c#L534
> virtio_gpu_object_shmem_init: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/virtio/virtgpu_object.c#L135

^ so these two target the guest driver as well, right? IOW, there is a
memfd (shmem) in the guest that the guest driver uses to allocate pages
from, and there is the memfd in the hypervisor to back guest RAM.

The latter gets registered with udmabuf.

> Qemu then translates the dma addresses into file offsets and creates
> udmabufs -- as an optimization to avoid data copies only if blob is set
> to true.

If the guest OS doesn't end up freeing/reallocating that memory while
it's registered with udmabuf in the hypervisor, then we should be fine.

Because that way, the guest won't end up triggering MADV_DONTNEED by
"accident".
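
To make the migration idea above a bit more concrete, here is a toy
userspace model of it, not kernel code. Every name in it (toy_page,
toy_zone, toy_migratetype, toy_longterm_pinnable,
toy_migrate_unpinnable) is made up for illustration, and "migration" is
modelled as nothing more than relabelling the page. It only shows the
decision: pages taken out of the memfd that sit in ZONE_MOVABLE or in a
CMA area get moved before we hold on to them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel concepts; none of these are kernel types. */
enum toy_zone { TOY_ZONE_NORMAL, TOY_ZONE_MOVABLE };
enum toy_migratetype {
	TOY_MIGRATE_UNMOVABLE,
	TOY_MIGRATE_MOVABLE,
	TOY_MIGRATE_CMA,
};

struct toy_page {
	enum toy_zone zone;
	enum toy_migratetype mt;
};

/*
 * Assumption for this sketch: a page is fine to hold long-term unless it
 * resides in ZONE_MOVABLE or in a CMA area.
 */
static bool toy_longterm_pinnable(const struct toy_page *p)
{
	assert(p != NULL);
	return p->zone != TOY_ZONE_MOVABLE && p->mt != TOY_MIGRATE_CMA;
}

/*
 * Walk the pages extracted from the memfd and "migrate" every page that
 * resides somewhere bad. Returns how many pages had to be moved.
 */
static size_t toy_migrate_unpinnable(struct toy_page *pages, size_t n)
{
	size_t migrated = 0;

	for (size_t i = 0; i < n; i++) {
		if (toy_longterm_pinnable(&pages[i]))
			continue;
		/* Model migration as a move to an unmovable, non-CMA spot. */
		pages[i].zone = TOY_ZONE_NORMAL;
		pages[i].mt = TOY_MIGRATE_UNMOVABLE;
		migrated++;
	}
	return migrated;
}
```

Note that nothing in this model knows about the guest: the check and
the move only look at where the host page currently resides.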

-- 
Cheers,

David / dhildenb