Message-ID: <2cdbebb9-4c57-7839-71ab-166cae168c74@redhat.com>
Date: Wed, 24 Nov 2021 15:14:00 +0100
To: Jason Gunthorpe
Cc: Vlastimil Babka, Jens Axboe, Andrew Dona-Couch, Andrew Morton,
 Drew DeVault, Ammar Faizi, linux-kernel@vger.kernel.org,
 linux-api@vger.kernel.org, io_uring Mailing List, Pavel Begunkov,
 linux-mm@kvack.org
References: <20211123140709.GB5112@ziepe.ca>
 <20211123170056.GC5112@ziepe.ca>
 <20211123235953.GF5112@ziepe.ca>
 <2adca04f-92e1-5f99-6094-5fac66a22a77@redhat.com>
 <20211124132353.GG5112@ziepe.ca>
 <20211124132842.GH5112@ziepe.ca>
 <20211124134812.GI5112@ziepe.ca>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [PATCH] Increase default MLOCK_LIMIT to 8 MiB
In-Reply-To: <20211124134812.GI5112@ziepe.ca>

On 24.11.21 14:48, Jason Gunthorpe wrote:
> On Wed, Nov 24, 2021 at 02:29:38PM +0100, David Hildenbrand wrote:
>> On 24.11.21 14:28, Jason Gunthorpe wrote:
>>> On Wed, Nov 24, 2021 at 02:25:09PM +0100, David Hildenbrand wrote:
>>>> On 24.11.21 14:23, Jason Gunthorpe wrote:
>>>>> On Wed, Nov 24, 2021 at 09:57:32AM +0100, David Hildenbrand wrote:
>>>>>
>>>>>> Unfortunately it will only be a band aid AFAIU. I can rewrite my
>>>>>> reproducer fairly easily to pin the whole 2M range first, pin a second
>>>>>> time only a single page, and then unpin the 2M range, resulting in the
>>>>>> very same way to block THP. (I can block some THP less because I always
>>>>>> need the possibility to memlock 2M first, though).
>>>>>
>>>>> Oh!
>>>>>
>>>>> The issue is GUP always pins an entire compound, no matter how little
>>>>> the user requests.
>>>>
>>>> That's a different issue. I make sure to split the compound page before
>>>> pinning anything :)
>>>
>>> ?? Where is that done in GUP?
>>
>> It's done in my reproducer manually.
>
> Aren't there many ways for hostile unpriv userspace to cause memory
> fragmentation? You are picking on pinning here, but any approach that
> forces the kernel to make a kalloc on a THP subpage would do just as
> well.

I'm not aware of any where you can fragment 50% of all pageblocks in the
system as an unprivileged user, essentially consuming almost no memory
and essentially staying inside well-defined memlock limits. But sure, if
there are "many", people will be able to come up with at least one
comparable thing. I'll be happy to learn.

>
> Arguably if we want to point to an issue here it is in MADV_FREE/etc
> that is the direct culprit in allowing userspace to break up THPs and
> then trigger fragmentation.
>
> If the objective is to prevent DOS of THP then MADV_FREE should
> conserve the THP and migrate the subpages to non-THP
> memory.
>
> FOLL_LONGTERM is not the issue here.

Thanks Jason for the discussion, but this is where I'll opt out for now
because we seem to strongly disagree and, as I said:

"I'm going to leave judgment how bad this is or isn't to the educated
reader, and I'll stop spending time on this as I have more important
things to work on."

But I'm going to leave one last comment to eventually give you a
different perspective: after MADV_DONTNEED the compound page sits on the
deferred split queue and will get split either way soon. People are
right now discussing upstream to even split synchronously, which would
move MADV_FREE out of the picture completely.

My position that FOLL_LONGTERM for unprivileged users is a strong no-go
stands as it is. Not MADV_FREE speeding up the compound page split in my
reproducer. Not MADV_DONTNEED allowing us to zap parts of a THP (I could
even have just used munmap or even mmap(MAP_FIXED)).

-- 
Thanks,

David / dhildenb