From: Christian König
Reply-To: christian.koenig@amd.com
Date: Thu, 17 Oct 2019 10:54:43 +0200
Subject: Re: [PATCH hmm 00/15] Consolidate the mmu notifier interval_tree and locking
To: Jason Gunthorpe, "christian.koenig@amd.com"
Cc: Andrea Arcangeli, Ralph Campbell, "linux-rdma@vger.kernel.org", John Hubbard, "Felix.Kuehling@amd.com", "amd-gfx@lists.freedesktop.org", "linux-mm@kvack.org", Jerome Glisse, "dri-devel@lists.freedesktop.org", Ben Skeggs
Message-ID: <2df298e2-ee91-ef40-5da9-2bc1af3a17be@gmail.com>
In-Reply-To: <20191016160444.GB3430@mellanox.com>
References: <20191015181242.8343-1-jgg@ziepe.ca> <20191016160444.GB3430@mellanox.com>

On 16.10.19 at 18:04, Jason Gunthorpe wrote:
> On Wed, Oct 16, 2019 at 10:58:02AM +0200, Christian König wrote:
>> On 15.10.19 at 20:12, Jason Gunthorpe wrote:
>>> From: Jason Gunthorpe
>>>
>>> 8 of the drivers that use mmu_notifier (i915_gem, radeon_mn, umem_odp, hfi1,
>>> scif_dma, vhost, gntdev, hmm) share a common pattern where
>>> they only use invalidate_range_start/end and immediately check the
>>> invalidating range against some driver data structure to tell if the
>>> driver is interested. Half of them use an interval_tree, the others are
>>> simple linear search lists.
>>>
>>> Of the ones I checked they largely seem to have various kinds of races,
>>> bugs and poor implementation. This is a result of the complexity in how
>>> the notifier interacts with get_user_pages(). It is extremely difficult to
>>> use it correctly.
>>>
>>> Consolidate all of this code together into the core mmu_notifier and
>>> provide a locking scheme similar to hmm_mirror that allows the user to
>>> safely use get_user_pages() and reliably know if the page list still
>>> matches the mm.
>> That sounds really good, but could you outline for a moment how that is
>> achieved?
> It uses the same basic scheme as hmm and rdma odp, outlined in the
> revisions to hmm.rst later on.
>
> Basically,
>
>   seq = mmu_range_read_begin(&mrn);
>
>   // This is a speculative region
>   .. get_user_pages()/hmm_range_fault() ..

How do we ensure that this get_user_pages()/hmm_range_fault() doesn't
see outdated page table information?

In other words, how is the following race prevented:

CPU A                       CPU B
invalidate_range_start()
                            mmu_range_read_begin()
                            get_user_pages()/hmm_range_fault()
Updating the ptes
invalidate_range_end()

I mean get_user_pages() tries to circumvent this issue by grabbing a
reference to the pages in question, but that isn't sufficient for the
SVM use case.

That's the reason why we had this horrible solution with an r/w lock and
a linked list of BOs in an interval tree.

Regards,
Christian.

>   // Result cannot be dereferenced
>
>   take_lock(driver->update);
>   if (mmu_range_read_retry(&mrn, seq)) {
>      // collision! The results are not correct
>      goto again;
>   }
>
>   // no collision, and now under lock. Now we can de-reference the pages/etc
>   // program HW
>   // Now the invalidate callback is responsible to synchronize against changes
>   unlock(driver->update);
>
> Basically, anything that was using hmm_mirror correctly transitions
> over fairly trivially, just with the modification to store a sequence
> number to close that race described in the hmm commit.
>
> For something like AMD gpu I expect it to transition to use dma_fence
> from the notifier for coherency right before it unlocks driver->update.
>
> Jason
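
The collision-retry scheme quoted above can be modelled in a few lines of
userspace C. This is only a sketch under stated assumptions, not the code
from the patch series: struct range_notifier, the rn_*() helpers and
RN_INITIALIZER are hypothetical names, a C11 atomic_flag spinlock stands in
for the driver->update lock, and the model assumes that an odd sequence
number means an invalidation is in flight and counts as a collision. The
series itself may handle that case differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the scheme; not the kernel API. */
struct range_notifier {
	atomic_ulong seq;	/* assumption: odd while an invalidation is in flight */
	atomic_flag update;	/* stand-in for the driver->update lock */
};

#define RN_INITIALIZER { .seq = 0, .update = ATOMIC_FLAG_INIT }

static void rn_lock(struct range_notifier *rn)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&rn->update, memory_order_acquire))
		;
}

static void rn_unlock(struct range_notifier *rn)
{
	atomic_flag_clear_explicit(&rn->update, memory_order_release);
}

/* Invalidation side: make the sequence odd while holding the lock. */
static void rn_invalidate_start(struct range_notifier *rn)
{
	rn_lock(rn);
	atomic_fetch_add(&rn->seq, 1);
	rn_unlock(rn);
}

/* Invalidation side: make the sequence even again. */
static void rn_invalidate_end(struct range_notifier *rn)
{
	rn_lock(rn);
	assert(atomic_load(&rn->seq) & 1);	/* must pair with a start */
	atomic_fetch_add(&rn->seq, 1);
	rn_unlock(rn);
}

/* Reader side: sample the sequence before the speculative region. */
static unsigned long rn_read_begin(struct range_notifier *rn)
{
	return atomic_load(&rn->seq);
}

/*
 * Reader side: the caller must hold the lock. Retry if the sample was taken
 * during an invalidation (odd) or if the sequence has moved since.
 */
static bool rn_read_retry(struct range_notifier *rn, unsigned long seq)
{
	return (seq & 1) || atomic_load(&rn->seq) != seq;
}

/*
 * Replays the CPU A / CPU B interleaving from the mail on a single thread.
 * Returns true if the reader detects the collision.
 */
static bool demo_reader_inside_invalidation(void)
{
	struct range_notifier rn = RN_INITIALIZER;
	unsigned long seq;
	bool retry;

	rn_invalidate_start(&rn);	/* CPU A */
	seq = rn_read_begin(&rn);	/* CPU B: samples an odd sequence */
	/* CPU B: get_user_pages()/hmm_range_fault() would run here */
	rn_invalidate_end(&rn);		/* CPU A: after updating the ptes */

	rn_lock(&rn);
	retry = rn_read_retry(&rn, seq);
	rn_unlock(&rn);
	return retry;
}
```

In this model the interleaving asked about is caught because the reader
sampled an odd sequence, so the check under the lock forces a retry. Once
the check passes under the lock, any later invalidation has to take the
same lock before it can bump the sequence, which is the point at which the
invalidate callback becomes responsible for synchronizing against changes.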