From: Jason Gunthorpe
To: linux-mm@kvack.org, Jerome Glisse, Ralph Campbell, John Hubbard,
	Felix.Kuehling@amd.com
Cc: linux-rdma@vger.kernel.org, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org, Alex Deucher, Ben Skeggs,
	Boris Ostrovsky, Christian König, David Zhou, Dennis Dalessandro,
	Juergen Gross, Mike Marciniszyn, Oleksandr Andrushchenko, Petr Cvek,
	Stefano Stabellini, nouveau@lists.freedesktop.org,
	xen-devel@lists.xenproject.org, Christoph Hellwig, Jason Gunthorpe
Subject: [PATCH v3 06/14] RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv
Date: Tue, 12 Nov 2019 16:22:23 -0400
Message-Id: <20191112202231.3856-7-jgg@ziepe.ca>
In-Reply-To: <20191112202231.3856-1-jgg@ziepe.ca>
References: <20191112202231.3856-1-jgg@ziepe.ca>

From: Jason Gunthorpe

This converts one of the two users of mmu_notifiers to use the new API.
The conversion is fairly straightforward; however, the existing use of
notifiers here seems to be racy.

Tested-by: Dennis Dalessandro
Signed-off-by: Jason Gunthorpe
---
 drivers/infiniband/hw/hfi1/file_ops.c     |   2 +-
 drivers/infiniband/hw/hfi1/hfi.h          |   2 +-
 drivers/infiniband/hw/hfi1/user_exp_rcv.c | 146 +++++++++-------------
 drivers/infiniband/hw/hfi1/user_exp_rcv.h |   3 +-
 4 files changed, 60 insertions(+), 93 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index f9a7e9d29c8ba2..7c5e3fb224139a 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -1138,7 +1138,7 @@ static int get_ctxt_info(struct hfi1_filedata *fd, unsigned long arg, u32 len)
 			HFI1_CAP_UGET_MASK(uctxt->flags, MASK) |
 			HFI1_CAP_KGET_MASK(uctxt->flags, K2U);
 	/* adjust flag if this fd is not able to cache */
-	if (!fd->handler)
+	if (!fd->use_mn)
 		cinfo.runtime_flags |= HFI1_CAP_TID_UNMAP; /* no caching */
 
 	cinfo.num_active = hfi1_count_active_units();
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index fa45350a9a1d32..fc10d65fc3e13c 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1444,7 +1444,7 @@ struct hfi1_filedata {
 	/* for cpu affinity; -1 if none */
 	int rec_cpu_num;
 	u32 tid_n_pinned;
-	struct mmu_rb_handler *handler;
+	bool use_mn;
 	struct tid_rb_node **entry_to_rb;
 	spinlock_t tid_lock; /* protect tid_[limit,used] counters */
 	u32 tid_limit;
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
index 3592a9ec155e85..75a378162162d3 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.c
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
@@ -59,11 +59,11 @@ static int set_rcvarray_entry(struct hfi1_filedata *fd,
 			      struct tid_user_buf *tbuf,
 			      u32 rcventry, struct tid_group *grp,
 			      u16 pageidx, unsigned int npages);
-static int tid_rb_insert(void *arg, struct mmu_rb_node *node);
 static void cacheless_tid_rb_remove(struct hfi1_filedata *fdata,
 				    struct tid_rb_node *tnode);
-static void tid_rb_remove(void *arg, struct mmu_rb_node *node);
-static int tid_rb_invalidate(void *arg, struct mmu_rb_node *mnode);
+static bool tid_rb_invalidate(struct mmu_interval_notifier *mni,
+			      const struct mmu_notifier_range *range,
+			      unsigned long cur_seq);
 static int program_rcvarray(struct hfi1_filedata *fd, struct tid_user_buf *,
 			    struct tid_group *grp,
 			    unsigned int start, u16 count,
@@ -73,10 +73,8 @@ static int unprogram_rcvarray(struct hfi1_filedata *fd, u32 tidinfo,
 			      struct tid_group **grp);
 static void clear_tid_node(struct hfi1_filedata *fd, struct tid_rb_node *node);
 
-static struct mmu_rb_ops tid_rb_ops = {
-	.insert = tid_rb_insert,
-	.remove = tid_rb_remove,
-	.invalidate = tid_rb_invalidate
+static const struct mmu_interval_notifier_ops tid_mn_ops = {
+	.invalidate = tid_rb_invalidate,
 };
 
 /*
@@ -87,7 +85,6 @@ static struct mmu_rb_ops tid_rb_ops = {
 int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd,
 			   struct hfi1_ctxtdata *uctxt)
 {
-	struct hfi1_devdata *dd = uctxt->dd;
 	int ret = 0;
 
 	spin_lock_init(&fd->tid_lock);
@@ -109,20 +106,7 @@ int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd,
 			fd->entry_to_rb = NULL;
 			return -ENOMEM;
 		}
-
-		/*
-		 * Register MMU notifier callbacks. If the registration
-		 * fails, continue without TID caching for this context.
-		 */
-		ret = hfi1_mmu_rb_register(fd, fd->mm, &tid_rb_ops,
-					   dd->pport->hfi1_wq,
-					   &fd->handler);
-		if (ret) {
-			dd_dev_info(dd,
-				    "Failed MMU notifier registration %d\n",
-				    ret);
-			ret = 0;
-		}
+		fd->use_mn = true;
 	}
 
 	/*
@@ -139,7 +123,7 @@ int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd,
 	 * init.
 	 */
 	spin_lock(&fd->tid_lock);
-	if (uctxt->subctxt_cnt && fd->handler) {
+	if (uctxt->subctxt_cnt && fd->use_mn) {
 		u16 remainder;
 
 		fd->tid_limit = uctxt->expected_count / uctxt->subctxt_cnt;
@@ -158,18 +142,10 @@ void hfi1_user_exp_rcv_free(struct hfi1_filedata *fd)
 {
 	struct hfi1_ctxtdata *uctxt = fd->uctxt;
 
-	/*
-	 * The notifier would have been removed when the process'es mm
-	 * was freed.
-	 */
-	if (fd->handler) {
-		hfi1_mmu_rb_unregister(fd->handler);
-	} else {
-		if (!EXP_TID_SET_EMPTY(uctxt->tid_full_list))
-			unlock_exp_tids(uctxt, &uctxt->tid_full_list, fd);
-		if (!EXP_TID_SET_EMPTY(uctxt->tid_used_list))
-			unlock_exp_tids(uctxt, &uctxt->tid_used_list, fd);
-	}
+	if (!EXP_TID_SET_EMPTY(uctxt->tid_full_list))
+		unlock_exp_tids(uctxt, &uctxt->tid_full_list, fd);
+	if (!EXP_TID_SET_EMPTY(uctxt->tid_used_list))
+		unlock_exp_tids(uctxt, &uctxt->tid_used_list, fd);
 
 	kfree(fd->invalid_tids);
 	fd->invalid_tids = NULL;
@@ -201,7 +177,7 @@ static void unpin_rcv_pages(struct hfi1_filedata *fd,
 
 	if (mapped) {
 		pci_unmap_single(dd->pcidev, node->dma_addr,
-				 node->mmu.len, PCI_DMA_FROMDEVICE);
+				 node->npages * PAGE_SIZE, PCI_DMA_FROMDEVICE);
 		pages = &node->pages[idx];
 	} else {
 		pages = &tidbuf->pages[idx];
@@ -777,8 +753,8 @@ static int set_rcvarray_entry(struct hfi1_filedata *fd,
 		return -EFAULT;
 	}
 
-	node->mmu.addr = tbuf->vaddr + (pageidx * PAGE_SIZE);
-	node->mmu.len = npages * PAGE_SIZE;
+	node->notifier.ops = &tid_mn_ops;
+	node->fdata = fd;
 	node->phys = page_to_phys(pages[0]);
 	node->npages = npages;
 	node->rcventry = rcventry;
@@ -787,23 +763,34 @@ static int set_rcvarray_entry(struct hfi1_filedata *fd,
 	node->freed = false;
 	memcpy(node->pages, pages, sizeof(struct page *) * npages);
 
-	if (!fd->handler)
-		ret = tid_rb_insert(fd, &node->mmu);
-	else
-		ret = hfi1_mmu_rb_insert(fd->handler, &node->mmu);
-
-	if (ret) {
-		hfi1_cdbg(TID, "Failed to insert RB node %u 0x%lx, 0x%lx %d",
-			  node->rcventry, node->mmu.addr, node->phys, ret);
-		pci_unmap_single(dd->pcidev, phys, npages * PAGE_SIZE,
-				 PCI_DMA_FROMDEVICE);
-		kfree(node);
-		return -EFAULT;
+	if (fd->use_mn) {
+		ret = mmu_interval_notifier_insert(
+			&node->notifier, tbuf->vaddr + (pageidx * PAGE_SIZE),
+			npages * PAGE_SIZE, fd->mm);
+		if (ret)
+			goto out_unmap;
+		/*
+		 * FIXME: This is in the wrong order, the notifier should be
+		 * established before the pages are pinned by pin_rcv_pages.
+		 */
+		mmu_interval_read_begin(&node->notifier);
 	}
+	fd->entry_to_rb[node->rcventry - uctxt->expected_base] = node;
+
 	hfi1_put_tid(dd, rcventry, PT_EXPECTED, phys, ilog2(npages) + 1);
 	trace_hfi1_exp_tid_reg(uctxt->ctxt, fd->subctxt, rcventry, npages,
-			       node->mmu.addr, node->phys, phys);
+			       node->notifier.interval_tree.start, node->phys,
+			       phys);
 	return 0;
+
+out_unmap:
+	hfi1_cdbg(TID, "Failed to insert RB node %u 0x%lx, 0x%lx %d",
+		  node->rcventry, node->notifier.interval_tree.start,
+		  node->phys, ret);
+	pci_unmap_single(dd->pcidev, phys, npages * PAGE_SIZE,
+			 PCI_DMA_FROMDEVICE);
+	kfree(node);
+	return -EFAULT;
 }
 
 static int unprogram_rcvarray(struct hfi1_filedata *fd, u32 tidinfo,
@@ -833,10 +820,9 @@ static int unprogram_rcvarray(struct hfi1_filedata *fd, u32 tidinfo,
 	if (grp)
 		*grp = node->grp;
 
-	if (!fd->handler)
-		cacheless_tid_rb_remove(fd, node);
-	else
-		hfi1_mmu_rb_remove(fd->handler, &node->mmu);
+	if (fd->use_mn)
+		mmu_interval_notifier_remove(&node->notifier);
+	cacheless_tid_rb_remove(fd, node);
 
 	return 0;
 }
@@ -847,7 +833,8 @@ static void clear_tid_node(struct hfi1_filedata *fd, struct tid_rb_node *node)
 	struct hfi1_devdata *dd = uctxt->dd;
 
 	trace_hfi1_exp_tid_unreg(uctxt->ctxt, fd->subctxt, node->rcventry,
-				 node->npages, node->mmu.addr, node->phys,
+				 node->npages,
+				 node->notifier.interval_tree.start, node->phys,
 				 node->dma_addr);
 
 	/*
@@ -894,30 +881,29 @@ static void unlock_exp_tids(struct hfi1_ctxtdata *uctxt,
 				if (!node || node->rcventry != rcventry)
 					continue;
 
+				if (fd->use_mn)
+					mmu_interval_notifier_remove(
+						&node->notifier);
 				cacheless_tid_rb_remove(fd, node);
 			}
 		}
 	}
 }
 
-/*
- * Always return 0 from this function. A non-zero return indicates that the
- * remove operation will be called and that memory should be unpinned.
- * However, the driver cannot unpin out from under PSM. Instead, retain the
- * memory (by returning 0) and inform PSM that the memory is going away. PSM
- * will call back later when it has removed the memory from its list.
- */
-static int tid_rb_invalidate(void *arg, struct mmu_rb_node *mnode)
+static bool tid_rb_invalidate(struct mmu_interval_notifier *mni,
+			      const struct mmu_notifier_range *range,
+			      unsigned long cur_seq)
 {
-	struct hfi1_filedata *fdata = arg;
-	struct hfi1_ctxtdata *uctxt = fdata->uctxt;
 	struct tid_rb_node *node =
-		container_of(mnode, struct tid_rb_node, mmu);
+		container_of(mni, struct tid_rb_node, notifier);
+	struct hfi1_filedata *fdata = node->fdata;
+	struct hfi1_ctxtdata *uctxt = fdata->uctxt;
 
 	if (node->freed)
-		return 0;
+		return true;
 
-	trace_hfi1_exp_tid_inval(uctxt->ctxt, fdata->subctxt, node->mmu.addr,
+	trace_hfi1_exp_tid_inval(uctxt->ctxt, fdata->subctxt,
+				 node->notifier.interval_tree.start,
 				 node->rcventry, node->npages, node->dma_addr);
 	node->freed = true;
 
@@ -946,18 +932,7 @@ static int tid_rb_invalidate(void *arg, struct mmu_rb_node *mnode)
 		fdata->invalid_tid_idx++;
 	}
 	spin_unlock(&fdata->invalid_lock);
-	return 0;
-}
-
-static int tid_rb_insert(void *arg, struct mmu_rb_node *node)
-{
-	struct hfi1_filedata *fdata = arg;
-	struct tid_rb_node *tnode =
-		container_of(node, struct tid_rb_node, mmu);
-	u32 base = fdata->uctxt->expected_base;
-
-	fdata->entry_to_rb[tnode->rcventry - base] = tnode;
-	return 0;
+	return true;
 }
 
 static void cacheless_tid_rb_remove(struct hfi1_filedata *fdata,
@@ -968,12 +943,3 @@ static void cacheless_tid_rb_remove(struct hfi1_filedata *fdata,
 	fdata->entry_to_rb[tnode->rcventry - base] = NULL;
 	clear_tid_node(fdata, tnode);
 }
-
-static void tid_rb_remove(void *arg, struct mmu_rb_node *node)
-{
-	struct hfi1_filedata *fdata = arg;
-	struct tid_rb_node *tnode =
-		container_of(node, struct tid_rb_node, mmu);
-
-	cacheless_tid_rb_remove(fdata, tnode);
-}
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.h b/drivers/infiniband/hw/hfi1/user_exp_rcv.h
index 43b105de1d5427..6257eee083a1a3 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.h
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.h
@@ -65,7 +65,8 @@ struct tid_user_buf {
 };
 
 struct tid_rb_node {
-	struct mmu_rb_node mmu;
+	struct mmu_interval_notifier notifier;
+	struct hfi1_filedata *fdata;
 	unsigned long phys;
 	struct tid_group *grp;
 	u32 rcventry;
-- 
2.24.0