From: Peng Zhang
Date: Mon, 26 Jun 2023 22:49:24 +0800
Subject: Re: [PATCH v2 14/16] maple_tree: Refine mas_preallocate() node calculations
To: Danilo Krummrich
Cc: maple-tree@lists.infradead.org, linux-mm@kvack.org, Matthew Wilcox,
 Andrew Morton, "Liam R. Howlett", linux-kernel@vger.kernel.org, Peng Zhang,
 David Airlie, Boris Brezillon
References: <20230612203953.2093911-1-Liam.Howlett@oracle.com>
 <20230612203953.2093911-15-Liam.Howlett@oracle.com>
 <26d8fbcf-d34f-0a79-9d91-8c60e66f7341@redhat.com>
 <43ce08db-210a-fec8-51b4-351625b3cdfb@redhat.com>
 <57527c36-57a1-6699-b6f0-373ba895014c@redhat.com>
In-Reply-To: <57527c36-57a1-6699-b6f0-373ba895014c@redhat.com>

On 2023/6/26 22:27, Danilo Krummrich wrote:
> On 6/26/23 15:19, Matthew Wilcox wrote:
>> On Mon, Jun 26, 2023 at 02:38:06AM +0200, Danilo Krummrich wrote:
>>> On the other hand, unless I miss something (and if so, please let me know),
>>> something is bogus with the API then.
>>>
>>> While the documentation of the Advanced API of the maple tree explicitly
>>> claims that the user of the API is responsible for locking, this should be
>>> limited to the bounds set by the maple tree implementation. Which means, the
>>> user must decide for either the internal (spin-) lock or an external lock
>>> (which possibly goes away in the future) and acquire and release it
>>> according to the rules maple tree enforces through lockdep checks.
>>>
>>> Let's say one picks the internal lock. How is one supposed to ensure the
>>> tree isn't modified using the internal lock with mas_preallocate()?
>>>
>>> Besides that, I think the documentation should definitely mention this
>>> limitation and give some guidance for the locking.
>>>
>>> Currently, from an API perspective, I can't see how anyone not familiar with
>>> the implementation details would be able to recognize this limitation.
>>>
>>> In terms of the GPUVA manager, unfortunately, it seems like I need to drop
>>> the maple tree and go back to using a rb-tree, since it seems there is no
>>> sane way doing a worst-case pre-allocation that does not suffer from this
>>> limitation.
>>
>> I haven't been paying much attention here (too many other things going
>> on), but something's wrong.
>>
>> First, you shouldn't need to preallocate.  Preallocation is only there
>
> Unfortunately, I think we really have a case where we have to. Typically
> GPU mappings are created in a dma-fence signalling critical path and
> that is where such mappings need to be added to the maple tree. Hence,
> we can't do any sleeping allocations there.
>
>> for really gnarly cases.  The way this is *supposed* to work is that
>> the store walks down to the leaf, attempts to insert into that leaf
>> and tries to allocate new nodes with __GFP_NOWAIT.  If that fails,
>> it drops the spinlock, allocates with the gfp flags you've specified,
>> then rewalks the tree to retry the store, this time with allocated
>> nodes in its back pocket so that the store will succeed.
>
> You are talking about mas_store_gfp() here, right? And I guess, if the
> tree has changed while the spinlock was dropped and even more nodes are
> needed it just retries until it succeeds?
>
> But what about mas_preallocate()? What happens if the tree changed in
> between mas_preallocate() and mas_store_prealloc()? Does the latter one
> fall back to __GFP_NOWAIT in such a case? I guess not, since
> mas_store_prealloc() has a void return type, and __GFP_NOWAIT could fail
> as well.

mas_store_prealloc() will fall back to __GFP_NOWAIT and issue a warning.
If the __GFP_NOWAIT allocation fails, the BUG_ON() in mas_store_prealloc()
will be triggered.

>
> So, how to use the internal spinlock for mas_preallocate() and
> mas_store_prealloc() to ensure the tree can't change?
>
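
To make the fallback in mas_store_prealloc() described above concrete, here
is a rough userspace model in plain C. It is only a sketch, not the maple
tree code: every name in it (model_state, model_alloc_nowait(),
model_store_prealloc(), the STORE_* results) is hypothetical, the node
counts are made up, and the BUG_ON() is modelled as a return value so the
outcome can be inspected instead of aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum store_result {
	STORE_OK,		/* preallocated nodes were sufficient */
	STORE_OK_FALLBACK,	/* warned, then got more nodes without sleeping */
	STORE_WOULD_BUG,	/* non-sleeping allocation failed; kernel would BUG */
};

struct model_state {
	unsigned int prealloc;		/* nodes reserved ahead of the store */
	unsigned int nowait_avail;	/* nodes a non-sleeping allocation can get */
	unsigned int warnings;		/* warnings issued so far */
};

/* Stand-in for a __GFP_NOWAIT allocation: it never sleeps, so it may fail. */
static bool model_alloc_nowait(struct model_state *s, unsigned int count)
{
	if (count > s->nowait_avail)
		return false;
	s->nowait_avail -= count;
	s->prealloc += count;
	return true;
}

/*
 * Stand-in for a store that consumes `needed` nodes. `needed` can exceed
 * the reservation if the tree changed between preallocation and store.
 */
static enum store_result model_store_prealloc(struct model_state *s,
					      unsigned int needed)
{
	enum store_result res = STORE_OK;

	if (needed > s->prealloc) {
		s->warnings++;
		fprintf(stderr, "warning: reserved %u nodes, need %u\n",
			s->prealloc, needed);
		if (!model_alloc_nowait(s, needed - s->prealloc))
			return STORE_WOULD_BUG;
		res = STORE_OK_FALLBACK;
	}

	/* At this point the reservation covers the store. */
	assert(s->prealloc >= needed);
	s->prealloc -= needed;
	return res;
}
```

In this model, a state with prealloc = 1 and nowait_avail = 4 that then
needs 3 nodes returns STORE_OK_FALLBACK after one warning, while the same
request with nowait_avail = 1 returns STORE_WOULD_BUG. The second case is
the one a caller in a no-sleep path cannot recover from, which is why the
tree must not change between the preallocation and the store.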