Date: Wed, 17 May 2023 18:23:18 +0100
From: Lorenzo Stoakes
To: Peter Xu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton,
    "Liam R . Howlett", Mark Rutland, Andrea Arcangeli, Mike Rapoport,
    Alexander Viro, linux-stable
Subject: Re: [PATCH 2/2] mm/uffd: Allow vma to merge as much as possible
References: <20230517150408.3411044-1-peterx@redhat.com>
    <20230517150408.3411044-3-peterx@redhat.com>
In-Reply-To: <20230517150408.3411044-3-peterx@redhat.com>

On Wed, May 17, 2023 at 11:04:08AM -0400, Peter Xu wrote:
> We used to not pass in the pgoff correctly when register/unregister uffd
> regions, it caused incorrect behavior on vma merging and can cause
> mergeable vmas being separate after ioctls return.
>
> For example, when we have:
>
>   vma1(range 0-9, with uffd), vma2(range 10-19, no uffd)
>
> Then someone unregisters uffd on range (5-9), it should logically become:
>
>   vma1(range 0-4, with uffd), vma2(range 5-19, no uffd)
>
> But with current code we'll have:
>
>   vma1(range 0-4, with uffd), vma3(range 5-9, no uffd), vma2(range 10-19, no uffd)
>
> This patch allows such merge to happen correctly before ioctl returns.
>
> This behavior seems to have existed since the 1st day of uffd.  Since pgoff
> for vma_merge() is only used to identify the possibility of vma merging,
> meanwhile here what we did was always passing in a pgoff smaller than what
> we should, so there should have no other side effect besides not merging
> it.  Let's still tentatively copy stable for this, even though I don't see
> anything will go wrong besides vma being split (which is mostly not user
> visible).
>

Maybe a Reported-by for me, since I discovered the fragmentation was already
happening via the repro? :)

> Cc: Andrea Arcangeli
> Cc: Mike Rapoport (IBM)
> Cc: linux-stable
> Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization")
> Signed-off-by: Peter Xu
> ---
>  fs/userfaultfd.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 17c8c345dac4..4e800bb7d2ab 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1332,6 +1332,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>  	bool basic_ioctls;
>  	unsigned long start, end, vma_end;
>  	struct vma_iterator vmi;
> +	pgoff_t pgoff;
>
>  	user_uffdio_register = (struct uffdio_register __user *) arg;
>
> @@ -1484,8 +1485,9 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>  		vma_end = min(end, vma->vm_end);
>
>  		new_flags = (vma->vm_flags & ~__VM_UFFD_FLAGS) | vm_flags;
> +		pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT);
>  		prev = vma_merge(&vmi, mm, prev, start, vma_end, new_flags,
> -				 vma->anon_vma, vma->vm_file, vma->vm_pgoff,
> +				 vma->anon_vma, vma->vm_file, pgoff,
>  				 vma_policy(vma),
>  				 ((struct vm_userfaultfd_ctx){ ctx }),
>  				 anon_vma_name(vma));
> @@ -1565,6 +1567,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
>  	unsigned long start, end, vma_end;
>  	const void __user *buf = (void __user *)arg;
>  	struct vma_iterator vmi;
> +	pgoff_t pgoff;
>
>  	ret = -EFAULT;
>  	if (copy_from_user(&uffdio_unregister, buf, sizeof(uffdio_unregister)))
> @@ -1667,8 +1670,9 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
>  			uffd_wp_range(vma, start, vma_end - start, false);
>
>  		new_flags = vma->vm_flags & ~__VM_UFFD_FLAGS;
> +		pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT);
>  		prev = vma_merge(&vmi, mm, prev, start, vma_end, new_flags,
> -				 vma->anon_vma, vma->vm_file, vma->vm_pgoff,
> +				 vma->anon_vma, vma->vm_file, pgoff,
>  				 vma_policy(vma),
>  				 NULL_VM_UFFD_CTX, anon_vma_name(vma));
>  		if (prev) {
> --
> 2.39.1
>

Acked-by: Lorenzo Stoakes
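
For anyone following along, here is a rough userspace sketch of why the
stale pgoff blocks the merge in the 0-9 / 10-19 example from the commit
message. It is only a toy model, not the kernel code: struct toy_vma,
TOY_PAGE_SHIFT, range_pgoff(), can_merge_with_next() and
unregister_tail_merges() are made-up names, ranges use an exclusive end
(so "0-9" becomes [0, 10) pages), and the check is reduced to the
assumption that the next vma's pgoff has to equal the passed pgoff plus
the length of the range in pages. Flags, anon_vma, file and policy checks
are ignored.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical, heavily simplified stand-in for struct vm_area_struct. */
struct toy_vma {
	unsigned long vm_start; /* inclusive, in bytes */
	unsigned long vm_end;   /* exclusive, in bytes */
	unsigned long vm_pgoff; /* offset of vm_start, in pages */
};

/* The pgoff of a sub-range of vma that begins at start (what the patch computes). */
static unsigned long range_pgoff(const struct toy_vma *vma, unsigned long start)
{
	assert(start >= vma->vm_start && start < vma->vm_end);
	return vma->vm_pgoff + ((start - vma->vm_start) >> TOY_PAGE_SHIFT);
}

/*
 * Simplified offset check: [start, end) can only be merged with the vma
 * that follows it if the offsets stay linear across the boundary.
 */
static bool can_merge_with_next(unsigned long start, unsigned long end,
				unsigned long pgoff, const struct toy_vma *next)
{
	unsigned long pglen = (end - start) >> TOY_PAGE_SHIFT;

	return next->vm_start == end && next->vm_pgoff == pgoff + pglen;
}

/* Unregister uffd on pages 5-9 of vma1 and try to merge with vma2. */
static bool unregister_tail_merges(bool use_fixed_pgoff)
{
	const struct toy_vma vma1 = {
		.vm_start = 0UL << TOY_PAGE_SHIFT,
		.vm_end = 10UL << TOY_PAGE_SHIFT,
		.vm_pgoff = 0,
	};
	const struct toy_vma vma2 = {
		.vm_start = 10UL << TOY_PAGE_SHIFT,
		.vm_end = 20UL << TOY_PAGE_SHIFT,
		.vm_pgoff = 10,
	};
	const unsigned long start = 5UL << TOY_PAGE_SHIFT;
	const unsigned long end = vma1.vm_end;
	/* Old behaviour: pass vma1's own pgoff even though the range starts later. */
	unsigned long pgoff = use_fixed_pgoff ? range_pgoff(&vma1, start)
					      : vma1.vm_pgoff;

	return can_merge_with_next(start, end, pgoff, &vma2);
}
```

In this model unregister_tail_merges(false) passes pgoff 0 for a 5-page
range, so the check compares 0 + 5 against vma2's pgoff of 10 and the
ranges stay separate. With the corrected pgoff of 5 the check compares
5 + 5 against 10 and the merge is allowed.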