From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yk0-f181.google.com (mail-yk0-f181.google.com [209.85.160.181]) by kanga.kvack.org (Postfix) with ESMTP id D80E16B0258 for ; Mon, 3 Aug 2015 03:50:16 -0400 (EDT) Received: by ykdu72 with SMTP id u72so104068675ykd.2 for ; Mon, 03 Aug 2015 00:50:16 -0700 (PDT) Received: from mail-yk0-x230.google.com (mail-yk0-x230.google.com. [2607:f8b0:4002:c07::230]) by mx.google.com with ESMTPS id g83si10177873ywc.125.2015.08.03.00.50.13 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 03 Aug 2015 00:50:14 -0700 (PDT) Received: by ykeo23 with SMTP id o23so7248823yke.3 for ; Mon, 03 Aug 2015 00:50:13 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <1437159145-6548-6-git-send-email-jglisse@redhat.com> References: <1437159145-6548-1-git-send-email-jglisse@redhat.com> <1437159145-6548-6-git-send-email-jglisse@redhat.com> Date: Mon, 3 Aug 2015 13:20:13 +0530 Message-ID: Subject: Re: [PATCH 05/15] HMM: introduce heterogeneous memory management v4. From: Girish KS Content-Type: multipart/alternative; boundary=001a113a346c6fbb6b051c636bb2 Sender: owner-linux-mm@kvack.org List-ID: To: =?UTF-8?B?SsOpcsO0bWUgR2xpc3Nl?= Cc: Christophe Harle , Mark Hairgrove , Dave Airlie , Arvind Gopalakrishnan , Jatin Kumar , joro@8bytes.org, Greg Stoner , akpm@linux-foundation.org, Cameron Buschardt , Rik van Riel , Paul Blinzer , Lucien Dunning , Johannes Weiner , Haggai Eran , Michael Mantor , Laurent Morichetti , Larry Woodman , John Hubbard , Brendan Conoboy , John Bridgman , Subhash Gutti , Roland Dreier , Duncan Poole , linux-mm@kvack.org, Alexander Deucher , Linus Torvalds , Andrea Arcangeli , Leonid Shamis , Sherry Cheung , Linux Kernel Mailing List , Shachar Raindel , Liran Liss , Ben Sander , Joe Donohue , Mel Gorman , "H. 
Peter Anvin" , Peter Zijlstra , ks.giri@samsung.com --001a113a346c6fbb6b051c636bb2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 18-Jul-2015 12:47 am, "J=C3=A9r=C3=B4me Glisse" wro= te: > > This patch only introduce core HMM functions for registering a new > mirror and stopping a mirror as well as HMM device registering and > unregistering. > > The lifecycle of HMM object is handled differently then the one of > mmu_notifier because unlike mmu_notifier there can be concurrent > call from both mm code to HMM code and/or from device driver code > to HMM code. Moreover lifetime of HMM can be uncorrelated from the > lifetime of the process that is being mirror (GPU might take longer > time to cleanup). > > Changed since v1: > - Updated comment of hmm_device_register(). > > Changed since v2: > - Expose struct hmm for easy access to mm struct. > - Simplify hmm_mirror_register() arguments. > - Removed the device name. > - Refcount the mirror struct internaly to HMM allowing to get > rid of the srcu and making the device driver callback error > handling simpler. > - Safe to call several time hmm_mirror_unregister(). > - Rework the mmu_notifier unregistration and release callback. > > Changed since v3: > - Rework hmm_mirror lifetime rules. > - Synchronize with mmu_notifier srcu before droping mirror last > reference in hmm_mirror_unregister() > - Use spinlock for device's mirror list. > - Export mirror ref/unref functions. > - English syntax fixes. 
> > Signed-off-by: J=C3=A9r=C3=B4me Glisse > Signed-off-by: Sherry Cheung > Signed-off-by: Subhash Gutti > Signed-off-by: Mark Hairgrove > Signed-off-by: John Hubbard > Signed-off-by: Jatin Kumar > --- > MAINTAINERS | 7 + > include/linux/hmm.h | 173 +++++++++++++++++++++ > include/linux/mm.h | 11 ++ > include/linux/mm_types.h | 14 ++ > kernel/fork.c | 2 + > mm/Kconfig | 14 ++ > mm/Makefile | 1 + > mm/hmm.c | 381 +++++++++++++++++++++++++++++++++++++++++++++++ > 8 files changed, 603 insertions(+) > create mode 100644 include/linux/hmm.h > create mode 100644 mm/hmm.c > > diff --git a/MAINTAINERS b/MAINTAINERS > index 2d3d55c..8ebdc17 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -4870,6 +4870,13 @@ F: include/uapi/linux/if_hippi.h > F: net/802/hippi.c > F: drivers/net/hippi/ > > +HMM - Heterogeneous Memory Management > +M: J=C3=A9r=C3=B4me Glisse > +L: linux-mm@kvack.org > +S: Maintained > +F: mm/hmm.c > +F: include/linux/hmm.h > + > HOST AP DRIVER > M: Jouni Malinen > L: hostap@shmoo.com (subscribers-only) > diff --git a/include/linux/hmm.h b/include/linux/hmm.h > new file mode 100644 > index 0000000..b559c0b > --- /dev/null > +++ b/include/linux/hmm.h > @@ -0,0 +1,173 @@ > +/* > + * Copyright 2013 Red Hat Inc. > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * Authors: J=C3=A9r=C3=B4me Glisse > + */ > +/* This is a heterogeneous memory management (hmm). 
In a nutshell this provide > + * an API to mirror a process address on a device which has its own mmu using > + * its own page table for the process. It supports everything except special > + * vma. > + * > + * Mandatory hardware features : > + * - An mmu with pagetable. > + * - Read only flag per cpu page. > + * - Page fault ie hardware must stop and wait for kernel to service fault. > + * > + * Optional hardware features : > + * - Dirty bit per cpu page. > + * - Access bit per cpu page. > + * > + * The hmm code handle all the interfacing with the core kernel mm code and > + * provide a simple API. It does support migrating system memory to device > + * memory and handle migration back to system memory on cpu page fault. > + * > + * Migrated memory is considered as swaped from cpu and core mm code point of > + * view. > + */ > +#ifndef _HMM_H > +#define _HMM_H > + > +#ifdef CONFIG_HMM > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > + > +struct hmm_device; > +struct hmm_mirror; > +struct hmm; > + > + > +/* hmm_device - Each device must register one and only one hmm_device. > + * > + * The hmm_device is the link btw HMM and each device driver. > + */ > + > +/* struct hmm_device_operations - HMM device operation callback > + */ > +struct hmm_device_ops { > + /* release() - mirror must stop using the address space. > + * > + * @mirror: The mirror that link process address space with the device. > + * > + * When this is called, device driver must kill all device thread using > + * this mirror. It is call either from : > + * - mm dying (all process using this mm exiting). > + * - hmm_mirror_unregister() (if no other thread holds a reference) > + * - outcome of some device error reported by any of the device > + * callback against that mirror. > + */ > + void (*release)(struct hmm_mirror *mirror); > + > + /* free() - mirror can be freed. > + * > + * @mirror: The mirror that link process address space with the device. 
> + * > + * When this is called, device driver can free the underlying memory > + * associated with that mirror. Note this is call from atomic context > + * so device driver callback can not sleep. > + */ > + void (*free)(struct hmm_mirror *mirror); > +}; > + > + > +/* struct hmm - per mm_struct HMM states. > + * > + * @mm: The mm struct this hmm is associated with. > + * @mirrors: List of all mirror for this mm (one per device). > + * @vm_end: Last valid address for this mm (exclusive). > + * @kref: Reference counter. > + * @rwsem: Serialize the mirror list modifications. > + * @mmu_notifier: The mmu_notifier of this mm. > + * @rcu: For delayed cleanup call from mmu_notifier.release() callback. > + * > + * For each process address space (mm_struct) there is one and only one hmm > + * struct. hmm functions will redispatch to each devices the change made to > + * the process address space. > + * > + * Device driver must not access this structure other than for getting the > + * mm pointer. > + */ > +struct hmm { > + struct mm_struct *mm; > + struct hlist_head mirrors; > + unsigned long vm_end; > + struct kref kref; > + struct rw_semaphore rwsem; > + struct mmu_notifier mmu_notifier; > + struct rcu_head rcu; > +}; > + > + > +/* struct hmm_device - per device HMM structure > + * > + * @dev: Linux device structure pointer. > + * @ops: The hmm operations callback. > + * @mirrors: List of all active mirrors for the device. > + * @lock: Lock protecting mirrors list. > + * > + * Each device that want to mirror an address space must register one of this > + * struct (only once per linux device). > + */ > +struct hmm_device { > + struct device *dev; > + const struct hmm_device_ops *ops; > + struct list_head mirrors; > + spinlock_t lock; > +}; > + > +int hmm_device_register(struct hmm_device *device); > +int hmm_device_unregister(struct hmm_device *device); > + > + > +/* hmm_mirror - device specific mirroring functions. 
> + * > + * Each device that mirror a process has a uniq hmm_mirror struct associating > + * the process address space with the device. Same process can be mirrored by > + * several different devices at the same time. > + */ > + > +/* struct hmm_mirror - per device and per mm HMM structure > + * > + * @device: The hmm_device struct this hmm_mirror is associated to. > + * @hmm: The hmm struct this hmm_mirror is associated to. > + * @kref: Reference counter (private to HMM do not use). > + * @dlist: List of all hmm_mirror for same device. > + * @mlist: List of all hmm_mirror for same process. > + * > + * Each device that want to mirror an address space must register one of this > + * struct for each of the address space it wants to mirror. Same device can > + * mirror several different address space. As well same address space can be > + * mirror by different devices. > + */ > +struct hmm_mirror { > + struct hmm_device *device; > + struct hmm *hmm; > + struct kref kref; > + struct list_head dlist; > + struct hlist_node mlist; > +}; > + > +int hmm_mirror_register(struct hmm_mirror *mirror); > +void hmm_mirror_unregister(struct hmm_mirror *mirror); > +struct hmm_mirror *hmm_mirror_ref(struct hmm_mirror *mirror); > +void hmm_mirror_unref(struct hmm_mirror **mirror); > + > + > +#endif /* CONFIG_HMM */ > +#endif > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 2e872f9..b5bf210 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -2243,5 +2243,16 @@ void __init setup_nr_node_ids(void); > static inline void setup_nr_node_ids(void) {} > #endif > > +#ifdef CONFIG_HMM > +static inline void hmm_mm_init(struct mm_struct *mm) > +{ > + mm->hmm =3D NULL; > +} > +#else /* !CONFIG_HMM */ > +static inline void hmm_mm_init(struct mm_struct *mm) > +{ > +} > +#endif /* !CONFIG_HMM */ > + > #endif /* __KERNEL__ */ > #endif /* _LINUX_MM_H */ > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > index 0038ac7..fa05917 100644 > --- 
a/include/linux/mm_types.h > +++ b/include/linux/mm_types.h > @@ -15,6 +15,10 @@ > #include > #include > > +#ifdef CONFIG_HMM > +struct hmm; > +#endif > + > #ifndef AT_VECTOR_SIZE_ARCH > #define AT_VECTOR_SIZE_ARCH 0 > #endif > @@ -451,6 +455,16 @@ struct mm_struct { > #ifdef CONFIG_MMU_NOTIFIER > struct mmu_notifier_mm *mmu_notifier_mm; > #endif > +#ifdef CONFIG_HMM > + /* > + * hmm always register an mmu_notifier we rely on mmu notifier to keep > + * refcount on mm struct as well as forbiding registering hmm on = a > + * dying mm > + * > + * This field is set with mmap_sem held in write mode. > + */ > + struct hmm *hmm; > +#endif > #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS > pgtable_t pmd_huge_pte; /* protected by page_table_lock */ > #endif > diff --git a/kernel/fork.c b/kernel/fork.c > index 1bfefc6..0d1f446 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -27,6 +27,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -597,6 +598,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p) > mm_init_aio(mm); > mm_init_owner(mm, p); > mmu_notifier_mm_init(mm); > + hmm_mm_init(mm); > clear_tlb_flush_pending(mm); > #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS > mm->pmd_huge_pte =3D NULL; > diff --git a/mm/Kconfig b/mm/Kconfig > index e79de2b..e1e0a82 100644 > --- a/mm/Kconfig > +++ b/mm/Kconfig > @@ -654,3 +654,17 @@ config DEFERRED_STRUCT_PAGE_INIT > when kswapd starts. This has a potential performance impact on > processes running early in the lifetime of the systemm until kswapd > finishes the initialisation. > + > +if STAGING > +config HMM > + bool "Enable heterogeneous memory management (HMM)" > + depends on MMU > + select MMU_NOTIFIER > + default n > + help > + Heterogeneous memory management provide infrastructure for a device > + to mirror a process address space into an hardware mmu or into any > + things supporting pagefault like event. 
> + > + If unsure, say N to disable hmm. > +endif # STAGING > diff --git a/mm/Makefile b/mm/Makefile > index 98c4eae..90ca9c4 100644 > --- a/mm/Makefile > +++ b/mm/Makefile > @@ -78,3 +78,4 @@ obj-$(CONFIG_CMA) +=3D cma.o > obj-$(CONFIG_MEMORY_BALLOON) +=3D balloon_compaction.o > obj-$(CONFIG_PAGE_EXTENSION) +=3D page_ext.o > obj-$(CONFIG_CMA_DEBUGFS) +=3D cma_debug.o > +obj-$(CONFIG_HMM) +=3D hmm.o > diff --git a/mm/hmm.c b/mm/hmm.c > new file mode 100644 > index 0000000..198fe37 > --- /dev/null > +++ b/mm/hmm.c > @@ -0,0 +1,381 @@ > +/* > + * Copyright 2013 Red Hat Inc. > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * Authors: J=C3=A9r=C3=B4me Glisse > + */ > +/* This is the core code for heterogeneous memory management (HMM). HMM intend > + * to provide helper for mirroring a process address space on a device as well > + * as allowing migration of data between system memory and device memory refer > + * as remote memory from here on out. > + * > + * Refer to include/linux/hmm.h for further information on general design. > + */ > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "internal.h" > + > +static struct mmu_notifier_ops hmm_notifier_ops; > + > + > +/* hmm - core HMM functions. > + * > + * Core HMM functions that deal with all the process mm activities. 
> + */ > + > +static int hmm_init(struct hmm *hmm) > +{ > + hmm->mm =3D current->mm; > + hmm->vm_end =3D TASK_SIZE; > + kref_init(&hmm->kref); > + INIT_HLIST_HEAD(&hmm->mirrors); > + init_rwsem(&hmm->rwsem); > + > + /* register notifier */ > + hmm->mmu_notifier.ops =3D &hmm_notifier_ops; > + return __mmu_notifier_register(&hmm->mmu_notifier, current->mm); > +} > + > +static int hmm_add_mirror(struct hmm *hmm, struct hmm_mirror *mirror) > +{ > + struct hmm_mirror *tmp; > + > + down_write(&hmm->rwsem); > + hlist_for_each_entry(tmp, &hmm->mirrors, mlist) > + if (tmp->device =3D=3D mirror->device) { > + /* Same device can mirror only once. */ > + up_write(&hmm->rwsem); > + return -EINVAL; > + } > + hlist_add_head(&mirror->mlist, &hmm->mirrors); > + hmm_mirror_ref(mirror); > + up_write(&hmm->rwsem); > + > + return 0; > +} > + > +static inline struct hmm *hmm_ref(struct hmm *hmm) > +{ > + if (!hmm || !kref_get_unless_zero(&hmm->kref)) > + return NULL; > + return hmm; > +} > + > +static void hmm_destroy_delayed(struct rcu_head *rcu) > + > + struct hmm *hmm; > + > + hmm =3D container_of(rcu, struct hmm, rcu); > + kfree(hmm); > +} > + > +static void hmm_destroy(struct kref *kref) > +{ > + struct hmm *hmm; > + > + hmm =3D container_of(kref, struct hmm, kref); > + BUG_ON(!hlist_empty(&hmm->mirrors)); > + > + down_write(&hmm->mm->mmap_sem); > + /* A new hmm might have been register before reaching that point. */ > + if (hmm->mm->hmm =3D=3D hmm) > + hmm->mm->hmm =3D NULL; > + up_write(&hmm->mm->mmap_sem); > + > + mmu_notifier_unregister_no_release(&hmm->mmu_notifier, hmm->mm); > + > + mmu_notifier_call_srcu(&hmm->rcu, &hmm_destroy_delayed); > +} > + > +static inline struct hmm *hmm_unref(struct hmm *hmm) > +{ > + if (hmm) > + kref_put(&hmm->kref, hmm_destroy); > + return NULL; > +} > + > + > +/* hmm_notifier - HMM callback for mmu_notifier tracking change to process mm. > + * > + * HMM use use mmu notifier to track change made to process address space. 
> + */ > +static void hmm_notifier_release(struct mmu_notifier *mn, struct mm_struct *mm) > +{ > + struct hmm *hmm; > + > + hmm =3D hmm_ref(container_of(mn, struct hmm, mmu_notifier)); > + if (!hmm) > + return; > + > + down_write(&hmm->rwsem); > + while (hmm->mirrors.first) { > + struct hmm_mirror *mirror; > + > + /* > + * Here we are holding the mirror reference from the mirror > + * list. As list removal is synchronized through rwsem, n= o > + * other thread can assume it holds that reference. > + */ > + mirror =3D hlist_entry(hmm->mirrors.first, > + struct hmm_mirror, > + mlist); > + hlist_del_init(&mirror->mlist); > + up_write(&hmm->rwsem); > + > + mirror->device->ops->release(mirror); > + hmm_mirror_unref(&mirror); > + > + down_write(&hmm->rwsem); > + } > + up_write(&hmm->rwsem); > + > + hmm_unref(hmm); > +} > + > +static struct mmu_notifier_ops hmm_notifier_ops =3D { > + .release =3D hmm_notifier_release, > +}; > + > + > +/* hmm_mirror - per device mirroring functions. > + * > + * Each device that mirror a process has a uniq hmm_mirror struct. A process > + * can be mirror by several devices at the same time. > + * > + * Below are all the functions and their helpers use by device driver to mirror > + * the process address space. Those functions either deals with updating the > + * device page table (through hmm callback). Or provide helper functions use by > + * the device driver to fault in range of memory in the device page table. 
> + */ > +struct hmm_mirror *hmm_mirror_ref(struct hmm_mirror *mirror) > +{ > + if (!mirror || !kref_get_unless_zero(&mirror->kref)) > + return NULL; > + return mirror; > +} > +EXPORT_SYMBOL(hmm_mirror_ref); > + > +static void hmm_mirror_destroy(struct kref *kref) > +{ > + struct hmm_device *device; > + struct hmm_mirror *mirror; > + > + mirror =3D container_of(kref, struct hmm_mirror, kref); > + device =3D mirror->device; > + > + hmm_unref(mirror->hmm); > + > + spin_lock(&device->lock); > + list_del_init(&mirror->dlist); > + device->ops->free(mirror); > + spin_unlock(&device->lock); > +} > + > +void hmm_mirror_unref(struct hmm_mirror **mirror) > +{ > + struct hmm_mirror *tmp =3D mirror ? *mirror : NULL; > + > + if (tmp) { > + *mirror =3D NULL; > + kref_put(&tmp->kref, hmm_mirror_destroy); > + } > +} > +EXPORT_SYMBOL(hmm_mirror_unref); > + > +/* hmm_mirror_register() - register mirror against current process for a device. > + * > + * @mirror: The mirror struct being registered. > + * Returns: 0 on success or -ENOMEM, -EINVAL on error. > + * > + * Call when device driver want to start mirroring a process address space. The > + * HMM shim will register mmu_notifier and start monitoring process address > + * space changes. Hence callback to device driver might happen even before this > + * function return. > + * > + * The task device driver want to mirror must be current ! > + * > + * Only one mirror per mm and hmm_device can be created, it will return NULL if > + * the hmm_device already has an hmm_mirror for the the mm. > + */ > +int hmm_mirror_register(struct hmm_mirror *mirror) > +{ > + struct mm_struct *mm =3D current->mm; > + struct hmm *hmm =3D NULL; > + int ret =3D 0; > + > + /* Sanity checks. */ > + BUG_ON(!mirror); > + BUG_ON(!mirror->device); > + BUG_ON(!mm); > + > + /* > + * Initialize the mirror struct fields, the mlist init and del dance is > + * necessary to make the error path easier for driver and for hmm= . 
> + */
> +	kref_init(&mirror->kref);
> +	INIT_HLIST_NODE(&mirror->mlist);
> +	INIT_LIST_HEAD(&mirror->dlist);
> +	spin_lock(&mirror->device->lock);
> +	list_add(&mirror->dlist, &mirror->device->mirrors);
> +	spin_unlock(&mirror->device->lock);
> +
> +	down_write(&mm->mmap_sem);
> +
> +	hmm = mm->hmm ? hmm_ref(hmm) : NULL;

This should pass mm->hmm rather than hmm. The local hmm was initialized to
NULL at the top of the function and has not been assigned since, so
hmm_ref(hmm) returns NULL even when mm->hmm is set.

> +	if (hmm == NULL) {

The usual practice for a NULL check in drivers is if (!hmm).

> +		/* no hmm registered yet so register one */
> +		hmm = kzalloc(sizeof(*mm->hmm), GFP_KERNEL);
> +		if (hmm == NULL) {
> +			up_write(&mm->mmap_sem);
> +			ret = -ENOMEM;
> +			goto error;
> +		}
> +
> +		ret = hmm_init(hmm);
> +		if (ret) {
> +			up_write(&mm->mmap_sem);
> +			kfree(hmm);
> +			goto error;
> +		}
> +
> +		mm->hmm = hmm;
> +	}
> +
> +	mirror->hmm = hmm;
> +	ret = hmm_add_mirror(hmm, mirror);
> +	up_write(&mm->mmap_sem);
> +	if (ret) {
> +		mirror->hmm = NULL;
> +		hmm_unref(hmm);
> +		goto error;
> +	}
> +	return 0;
> +
> +error:
> +	spin_lock(&mirror->device->lock);
> +	list_del_init(&mirror->dlist);
> +	spin_unlock(&mirror->device->lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL(hmm_mirror_register);
> +
> +static void hmm_mirror_kill(struct hmm_mirror *mirror)
> +{
> +	struct hmm_device *device = mirror->device;
> +	struct hmm *hmm = hmm_ref(mirror->hmm);
> +
> +	if (!hmm)
> +		return;
> +
> +	down_write(&hmm->rwsem);
> +	if (!hlist_unhashed(&mirror->mlist)) {
> +		hlist_del_init(&mirror->mlist);
> +		up_write(&hmm->rwsem);
> +		device->ops->release(mirror);
> +		hmm_mirror_unref(&mirror);
> +	} else
> +		up_write(&hmm->rwsem);
> +
> +	hmm_unref(hmm);
> +}
> +
> +/* hmm_mirror_unregister() - unregister a mirror.
> + *
> + * @mirror: The mirror that link process address space with the device.
> + * > + * Driver can call this function when it wants to stop mirroring a process. > + * This will trigger a call to the ->release() callback if it did not aleady > + * happen. > + * > + * Note that caller must hold a reference on the mirror. > + * > + * THIS CAN NOT BE CALL FROM device->release() CALLBACK OR IT WILL DEADLOCK. > + */ > +void hmm_mirror_unregister(struct hmm_mirror *mirror) > +{ > + if (mirror =3D=3D NULL) > + return; > + > + hmm_mirror_kill(mirror); > + mmu_notifier_synchronize(); > + hmm_mirror_unref(&mirror); > +} > +EXPORT_SYMBOL(hmm_mirror_unregister); > + > + > +/* hmm_device - Each device driver must register one and only one hmm_device > + * > + * The hmm_device is the link btw HMM and each device driver. > + */ > + > +/* hmm_device_register() - register a device with HMM. > + * > + * @device: The hmm_device struct. > + * Returns: 0 on success or -EINVAL otherwise. > + * > + * > + * Call when device driver want to register itself with HMM. Device driver must > + * only register once. > + */ > +int hmm_device_register(struct hmm_device *device) > +{ > + /* sanity check */ > + BUG_ON(!device); > + BUG_ON(!device->ops); > + BUG_ON(!device->ops->release); > + > + spin_lock_init(&device->lock); > + INIT_LIST_HEAD(&device->mirrors); > + > + return 0; > +} > +EXPORT_SYMBOL(hmm_device_register); > + > +/* hmm_device_unregister() - unregister a device with HMM. > + * > + * @device: The hmm_device struct. > + * Returns: 0 on success or -EBUSY otherwise. > + * > + * Call when device driver want to unregister itself with HMM. This will check > + * that there is no any active mirror and returns -EBUSY if so. 
> + */ > +int hmm_device_unregister(struct hmm_device *device) > +{ > + spin_lock(&device->lock); > + if (!list_empty(&device->mirrors)) { > + spin_unlock(&device->lock); > + return -EBUSY; > + } > + spin_unlock(&device->lock); > + return 0; > +} > +EXPORT_SYMBOL(hmm_device_unregister); > -- > 1.9.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" i= n > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ --001a113a346c6fbb6b051c636bb2 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable


On 18-Jul-2015 12:47 am, "Jérôme Glisse" <jglisse@redhat.com> wrote:
>
> This patch only introduce core HMM functions for registering a new
> mirror and stopping a mirror as well as HMM device registering and
> unregistering.
>
> The lifecycle of HMM object is handled differently then the one of
> mmu_notifier because unlike mmu_notifier there can be concurrent
> call from both mm code to HMM code and/or from device driver code
> to HMM code. Moreover lifetime of HMM can be uncorrelated from the
> lifetime of the process that is being mirror (GPU might take longer
> time to cleanup).
>
> Changed since v1:
>   - Updated comment of hmm_device_register().
>
> Changed since v2:
>   - Expose struct hmm for easy access to mm struct.
>   - Simplify hmm_mirror_register() arguments.
>   - Removed the device name.
>   - Refcount the mirror struct internaly to HMM allowing to get
>     rid of the srcu and making the device driver callback error
>     handling simpler.
>   - Safe to call several time hmm_mirror_unregister().
>   - Rework the mmu_notifier unregistration and release callback.
>
> Changed since v3:
>   - Rework hmm_mirror lifetime rules.
>   - Synchronize with mmu_notifier srcu before droping mirror last
>     reference in hmm_mirror_unregister()
>   - Use spinlock for device's mirror list.
>   - Export mirror ref/unref functions.
>   - English syntax fixes.
>
> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> Signed-off-by: Sherry Cheung <SCheung@nvidia.com>
> Signed-off-by: Subhash Gutti <sgutti@nvidia.com>
> Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> Signed-off-by: Jatin Kumar <jakumar@nvidia.com>
> ---
>  MAINTAINERS              |   7 +
>  include/linux/hmm.h      | 173 +++++++++++++++++++++
>  include/linux/mm.h       |  11 ++
>  include/linux/mm_types.h |  14 ++
>  kernel/fork.c            |   2 +
>  mm/Kconfig               |  14 ++
>  mm/Makefile              |   1 +
>  mm/hmm.c                 | 381 +++++++++++++++++++++++++++++++++++++++++++++++
>  8 files changed, 603 insertions(+)
>  create mode 100644 include/linux/hmm.h
>  create mode 100644 mm/hmm.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 2d3d55c..8ebdc17 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4870,6 +4870,13 @@ F:	include/uapi/linux/if_hippi.h
>  F:	net/802/hippi.c
>  F:	drivers/net/hippi/
>
> +HMM - Heterogeneous Memory Management
> +M:	Jérôme Glisse <jglisse@redhat.com>
> +L:	linux-mm@kvack.org
> +S:	Maintained
> +F:	mm/hmm.c
> +F:	include/linux/hmm.h
> +
>  HOST AP DRIVER
>  M:	Jouni Malinen <j@w1.fi>
>  L:	hostap@shmoo.com (subscribers-only)
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> new file mode 100644
> index 0000000..b559c0b
> --- /dev/null
> +++ b/include/linux/hmm.h
> @@ -0,0 +1,173 @@
> +/*
> + * Copyright 2013 Red Hat Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * Authors: Jérôme Glisse <jglisse@redhat.com>
> + */
> +/* This is a heterogeneous memory management (hmm). In a nutshell this provide
> + * an API to mirror a process address on a device which has its own mmu using
> + * its own page table for the process. It supports everything except special
> + * vma.
> + *
> + * Mandatory hardware features :
> + *   - An mmu with pagetable.
> + *   - Read only flag per cpu page.
> + *   - Page fault ie hardware must stop and wait for kernel to service fault.
> + *
> + * Optional hardware features :
> + *   - Dirty bit per cpu page.
> + *   - Access bit per cpu page.
> + *
> + * The hmm code handle all the interfacing with the core kernel mm code and
> + * provide a simple API. It does support migrating system memory to device
> + * memory and handle migration back to system memory on cpu page fault.
> + *
> + * Migrated memory is considered as swaped from cpu and core mm code point of
> + * view.
> + */
> +#ifndef _HMM_H
> +#define _HMM_H
> +
> +#ifdef CONFIG_HMM
> +
> +#include <linux/list.h>
> +#include <linux/spinlock.h>
> +#include <linux/atomic.h>
> +#include <linux/mm_types.h>
> +#include <linux/mmu_notifier.h>
> +#include <linux/workqueue.h>
> +#include <linux/mman.h>
> +
> +
> +struct hmm_device;
> +struct hmm_mirror;
> +struct hmm;
> +
> +
> +/* hmm_device - Each device must register one and only one hmm_device.
> + *
> + * The hmm_device is the link btw HMM and each device driver.
> + */
> +
> +/* struct hmm_device_operations - HMM device operation callback
> + */
> +struct hmm_device_ops {
> +	/* release() - mirror must stop using the address space.
> +	 *
> +	 * @mirror: The mirror that link process address space with the device.
> +	 *
> +	 * When this is called, device driver must kill all device thread using
> +	 * this mirror. It is call either from :
> +	 *   - mm dying (all process using this mm exiting).
> +	 *   - hmm_mirror_unregister() (if no other thread holds a reference)
> +	 *   - outcome of some device error reported by any of the device
> +	 *     callback against that mirror.
> +	 */
> +	void (*release)(struct hmm_mirror *mirror);
> +
> +	/* free() - mirror can be freed.
> +	 *
> +	 * @mirror: The mirror that link process address space with the device.
> +	 *
> +	 * When this is called, device driver can free the underlying memory
> +	 * associated with that mirror. Note this is call from atomic context
> +	 * so device driver callback can not sleep.
> +	 */
> +	void (*free)(struct hmm_mirror *mirror);
> +};
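
A minimal userspace sketch of how a driver might wire up these two callbacks, for readers following along. The two struct definitions are stand-ins that mirror the declarations quoted above; my_mirror, my_release(), my_free() and demo_mirror_lifecycle() are hypothetical names and not part of the patch. It assumes the driver embeds struct hmm_mirror in its own structure and recovers the wrapper by pointer arithmetic, the way container_of() would in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel declarations quoted above. */
struct hmm_mirror { int placeholder; };

struct hmm_device_ops {
	void (*release)(struct hmm_mirror *mirror);
	void (*free)(struct hmm_mirror *mirror);
};

/* Hypothetical driver-private wrapper; not part of the patch. */
struct my_mirror {
	struct hmm_mirror base;
	int stopped;		/* set once device threads were told to stop */
};

static int my_free_calls;	/* counts free() invocations, for the demo */

/* Recover the wrapper from the embedded hmm_mirror (container_of style). */
static struct my_mirror *to_my_mirror(struct hmm_mirror *mirror)
{
	return (struct my_mirror *)((char *)mirror -
				    offsetof(struct my_mirror, base));
}

/* release(): stop using the address space; the mirror stays allocated. */
static void my_release(struct hmm_mirror *mirror)
{
	to_my_mirror(mirror)->stopped = 1;
}

/* free(): the patch calls this from atomic context, so it must not sleep. */
static void my_free(struct hmm_mirror *mirror)
{
	my_free_calls++;
	free(to_my_mirror(mirror));
}

static const struct hmm_device_ops my_ops = {
	.release = my_release,
	.free = my_free,
};

/* Walk one mirror through release() then free(); return the stopped flag
 * observed between the two calls, or -1 if allocation failed. */
static int demo_mirror_lifecycle(void)
{
	struct my_mirror *m = calloc(1, sizeof(*m));
	int stopped;

	if (!m)
		return -1;
	my_ops.release(&m->base);
	stopped = m->stopped;
	my_ops.free(&m->base);
	return stopped;
}
```

Calling demo_mirror_lifecycle() from a main() exercises the order described in the comments: release() first, free() only once nothing uses the mirror any more.
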
> +
> +
> +/* struct hmm - per mm_struct HMM states.
> + *
> + * @mm: The mm struct this hmm is associated with.
> + * @mirrors: List of all mirror for this mm (one per device).
> + * @vm_end: Last valid address for this mm (exclusive).
> + * @kref: Reference counter.
> + * @rwsem: Serialize the mirror list modifications.
> + * @mmu_notifier: The mmu_notifier of this mm.
> + * @rcu: For delayed cleanup call from mmu_notifier.release() callback.
> + *
> + * For each process address space (mm_struct) there is one and only one hmm
> + * struct. hmm functions will redispatch to each devices the change made to
> + * the process address space.
> + *
> + * Device driver must not access this structure other than for getting the
> + * mm pointer.
> + */
> +struct hmm {
> +       struct mm_struct        *mm;
> +       struct hlist_head       mirrors;
> +       unsigned long           vm_end;
> +       struct kref             kref;
> +       struct rw_semaphore     rwsem;
> +       struct mmu_notifier     mmu_notifier;
> +       struct rcu_head         rcu;
> +};
> +
> +
> +/* struct hmm_device - per device HMM structure
> + *
> + * @dev: Linux device structure pointer.
> + * @ops: The hmm operations callback.
> + * @mirrors: List of all active mirrors for the device.
> + * @lock: Lock protecting mirrors list.
> + *
> + * Each device that want to mirror an address space must register one of this
> + * struct (only once per linux device).
> + */
> +struct hmm_device {
> +       struct device                   *dev;
> +       const struct hmm_device_ops     *ops;
> +       struct list_head                mirrors;
> +       spinlock_t                      lock;
> +};
> +
> +int hmm_device_register(struct hmm_device *device);
> +int hmm_device_unregister(struct hmm_device *device);
> +
> +
> +/* hmm_mirror - device specific mirroring functions.
> + *
> + * Each device that mirror a process has a uniq hmm_mirror struct associating
> + * the process address space with the device. Same process can be mirrored by
> + * several different devices at the same time.
> + */
> +
> +/* struct hmm_mirror - per device and per mm HMM structure
> + *
> + * @device: The hmm_device struct this hmm_mirror is associated to.
> + * @hmm: The hmm struct this hmm_mirror is associated to.
> + * @kref: Reference counter (private to HMM do not use).
> + * @dlist: List of all hmm_mirror for same device.
> + * @mlist: List of all hmm_mirror for same process.
> + *
> + * Each device that want to mirror an address space must register one of this
> + * struct for each of the address space it wants to mirror. Same device can
> + * mirror several different address space. As well same address space can be
> + * mirror by different devices.
> + */
> +struct hmm_mirror {
> +       struct hmm_device       *device;
> +       struct hmm              *hmm;
> +       struct kref             kref;
> +       struct list_head        dlist;
> +       struct hlist_node       mlist;
> +};
> +
> +int hmm_mirror_register(struct hmm_mirror *mirror);
> +void hmm_mirror_unregister(struct hmm_mirror *mirror);
> +struct hmm_mirror *hmm_mirror_ref(struct hmm_mirror *mirror);
> +void hmm_mirror_unref(struct hmm_mirror **mirror);
> +
> +
> +#endif /* CONFIG_HMM */
> +#endif
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 2e872f9..b5bf210 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2243,5 +2243,16 @@ void __init setup_nr_node_ids(void);
>  static inline void setup_nr_node_ids(void) {}
>  #endif
>
> +#ifdef CONFIG_HMM
> +static inline void hmm_mm_init(struct mm_struct *mm)
> +{
> +       mm->hmm = NULL;
> +}
> +#else /* !CONFIG_HMM */
> +static inline void hmm_mm_init(struct mm_struct *mm)
> +{
> +}
> +#endif /* !CONFIG_HMM */
> +
>  #endif /* __KERNEL__ */
>  #endif /* _LINUX_MM_H */
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 0038ac7..fa05917 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -15,6 +15,10 @@
>  #include <asm/page.h>
>  #include <asm/mmu.h>
>
> +#ifdef CONFIG_HMM
> +struct hmm;
> +#endif
> +
>  #ifndef AT_VECTOR_SIZE_ARCH
>  #define AT_VECTOR_SIZE_ARCH 0
>  #endif
> @@ -451,6 +455,16 @@ struct mm_struct {
>  #ifdef CONFIG_MMU_NOTIFIER
>         struct mmu_notifier_mm *mmu_notifier_mm;
>  #endif
> +#ifdef CONFIG_HMM
> +       /*
> +        * hmm always register an mmu_notifier we rely on mmu notifier to keep
> +        * refcount on mm struct as well as forbiding registering hmm on a
> +        * dying mm
> +        *
> +        * This field is set with mmap_sem held in write mode.
> +        */
> +       struct hmm *hmm;
> +#endif
>  #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
>         pgtable_t pmd_huge_pte; /* protected by page_table_lock */
>  #endif
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 1bfefc6..0d1f446 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -27,6 +27,7 @@
>  #include <linux/binfmts.h>
>  #include <linux/mman.h>
>  #include <linux/mmu_notifier.h>
> +#include <linux/hmm.h>
>  #include <linux/fs.h>
>  #include <linux/mm.h>
>  #include <linux/vmacache.h>
> @@ -597,6 +598,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p)
>         mm_init_aio(mm);
>         mm_init_owner(mm, p);
>         mmu_notifier_mm_init(mm);
> +       hmm_mm_init(mm);
>         clear_tlb_flush_pending(mm);
>  #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
>         mm->pmd_huge_pte = NULL;
> diff --git a/mm/Kconfig b/mm/Kconfig
> index e79de2b..e1e0a82 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -654,3 +654,17 @@ config DEFERRED_STRUCT_PAGE_INIT
>           when kswapd starts. This has a potential performance impact on
>           processes running early in the lifetime of the systemm until kswapd
>           finishes the initialisation.
> +
> +if STAGING
> +config HMM
> +       bool "Enable heterogeneous memory management (HMM)"
> +       depends on MMU
> +       select MMU_NOTIFIER
> +       default n
> +       help
> +         Heterogeneous memory management provide infrastructure for a device
> +         to mirror a process address space into an hardware mmu or into any
> +         things supporting pagefault like event.
> +
> +         If unsure, say N to disable hmm.
> +endif # STAGING
> diff --git a/mm/Makefile b/mm/Makefile
> index 98c4eae..90ca9c4 100644
> --- a/mm/Makefile
> +++ b/mm/Makefile
> @@ -78,3 +78,4 @@ obj-$(CONFIG_CMA)     += cma.o
>  obj-$(CONFIG_MEMORY_BALLOON) += balloon_compaction.o
>  obj-$(CONFIG_PAGE_EXTENSION) += page_ext.o
>  obj-$(CONFIG_CMA_DEBUGFS) += cma_debug.o
> +obj-$(CONFIG_HMM) += hmm.o
> diff --git a/mm/hmm.c b/mm/hmm.c
> new file mode 100644
> index 0000000..198fe37
> --- /dev/null
> +++ b/mm/hmm.c
> @@ -0,0 +1,381 @@
> +/*
> + * Copyright 2013 Red Hat Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * Authors: Jérôme Glisse <jglisse@redhat.com>
> + */
> +/* This is the core code for heterogeneous memory management (HMM). HMM intend
> + * to provide helper for mirroring a process address space on a device as well
> + * as allowing migration of data between system memory and device memory refer
> + * as remote memory from here on out.
> + *
> + * Refer to include/linux/hmm.h for further information on general design.
> + */
> +#include <linux/export.h>
> +#include <linux/bitmap.h>
> +#include <linux/list.h>
> +#include <linux/rculist.h>
> +#include <linux/slab.h>
> +#include <linux/mmu_notifier.h>
> +#include <linux/mm.h>
> +#include <linux/hugetlb.h>
> +#include <linux/fs.h>
> +#include <linux/file.h>
> +#include <linux/ksm.h>
> +#include <linux/rmap.h>
> +#include <linux/swap.h>
> +#include <linux/swapops.h>
> +#include <linux/mmu_context.h>
> +#include <linux/memcontrol.h>
> +#include <linux/hmm.h>
> +#include <linux/wait.h>
> +#include <linux/mman.h>
> +#include <linux/delay.h>
> +#include <linux/workqueue.h>
> +
> +#include "internal.h"
> +
> +static struct mmu_notifier_ops hmm_notifier_ops;
> +
> +
> +/* hmm - core HMM functions.
> + *
> + * Core HMM functions that deal with all the process mm activities.
> + */
> +
> +static int hmm_init(struct hmm *hmm)
> +{
> +       hmm->mm = current->mm;
> +       hmm->vm_end = TASK_SIZE;
> +       kref_init(&hmm->kref);
> +       INIT_HLIST_HEAD(&hmm->mirrors);
> +       init_rwsem(&hmm->rwsem);
> +
> +       /* register notifier */
> +       hmm->mmu_notifier.ops = &hmm_notifier_ops;
> +       return __mmu_notifier_register(&hmm->mmu_notifier, current->mm);
> +}
> +
> +static int hmm_add_mirror(struct hmm *hmm, struct hmm_mirror *mirror)
> +{
> +       struct hmm_mirror *tmp;
> +
> +       down_write(&hmm->rwsem);
> +       hlist_for_each_entry(tmp, &hmm->mirrors, mlist)
> +               if (tmp->device == mirror->device) {
> +                       /* Same device can mirror only once. */
> +                       up_write(&hmm->rwsem);
> +                       return -EINVAL;
> +               }
> +       hlist_add_head(&mirror->mlist, &hmm->mirrors);
> +       hmm_mirror_ref(mirror);
> +       up_write(&hmm->rwsem);
> +
> +       return 0;
> +}
> +
> +static inline struct hmm *hmm_ref(struct hmm *hmm)
> +{
> +       if (!hmm || !kref_get_unless_zero(&hmm->kref))
> +               return NULL;
> +       return hmm;
> +}
> +
> +static void hmm_destroy_delayed(struct rcu_head *rcu)
> +{
> +       struct hmm *hmm;
> +
> +       hmm = container_of(rcu, struct hmm, rcu);
> +       kfree(hmm);
> +}
> +
> +static void hmm_destroy(struct kref *kref)
> +{
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0struct hmm *hmm;
> +
> +       hmm = container_of(kref, struct hmm, kref);
> +       BUG_ON(!hlist_empty(&hmm->mirrors));
> +
> +       down_write(&hmm->mm->mmap_sem);
> +       /* A new hmm might have been register before reaching that point. */
> +       if (hmm->mm->hmm == hmm)
> +               hmm->mm->hmm = NULL;
> +       up_write(&hmm->mm->mmap_sem);
> +
> +       mmu_notifier_unregister_no_release(&hmm->mmu_notifier, hmm->mm);
> +
> +       mmu_notifier_call_srcu(&hmm->rcu, &hmm_destroy_delayed);
> +}
> +
> +static inline struct hmm *hmm_unref(struct hmm *hmm)
> +{
> +       if (hmm)
> +               kref_put(&hmm->kref, hmm_destroy);
> +       return NULL;
> +}
> +
> +
> +/* hmm_notifier - HMM callback for mmu_notifier tracking change to process mm.
> + *
> + * HMM use use mmu notifier to track change made to process address space.
> + */
> +static void hmm_notifier_release(struct mmu_notifier *mn, struct mm_struct *mm)
> +{
> +       struct hmm *hmm;
> +
> +       hmm = hmm_ref(container_of(mn, struct hmm, mmu_notifier));
> +       if (!hmm)
> +               return;
> +
> +       down_write(&hmm->rwsem);
> +       while (hmm->mirrors.first) {
> +               struct hmm_mirror *mirror;
> +
> +               /*
> +                * Here we are holding the mirror reference from the mirror
> +                * list. As list removal is synchronized through rwsem, no
> +                * other thread can assume it holds that reference.
> +                */
> +               mirror = hlist_entry(hmm->mirrors.first,
> +                                    struct hmm_mirror,
> +                                    mlist);
> +               hlist_del_init(&mirror->mlist);
> +               up_write(&hmm->rwsem);
> +
> +               mirror->device->ops->release(mirror);
> +               hmm_mirror_unref(&mirror);
> +
> +               down_write(&hmm->rwsem);
> +       }
> +       up_write(&hmm->rwsem);
> +
> +       hmm_unref(hmm);
> +}
> +
> +static struct mmu_notifier_ops hmm_notifier_ops = {
> +       .release                = hmm_notifier_release,
> +};
> +
> +
> +/* hmm_mirror - per device mirroring functions.
> + *
> + * Each device that mirror a process has a uniq hmm_mirror struct. A process
> + * can be mirror by several devices at the same time.
> + *
> + * Below are all the functions and their helpers use by device driver to mirror
> + * the process address space. Those functions either deals with updating the
> + * device page table (through hmm callback). Or provide helper functions use by
> + * the device driver to fault in range of memory in the device page table.
> + */
> +struct hmm_mirror *hmm_mirror_ref(struct hmm_mirror *mirror)
> +{
> +       if (!mirror || !kref_get_unless_zero(&mirror->kref))
> +               return NULL;
> +       return mirror;
> +}
> +EXPORT_SYMBOL(hmm_mirror_ref);
> +
> +static void hmm_mirror_destroy(struct kref *kref)
> +{
> +       struct hmm_device *device;
> +       struct hmm_mirror *mirror;
> +
> +       mirror = container_of(kref, struct hmm_mirror, kref);
> +       device = mirror->device;
> +
> +       hmm_unref(mirror->hmm);
> +
> +       spin_lock(&device->lock);
> +       list_del_init(&mirror->dlist);
> +       device->ops->free(mirror);
> +       spin_unlock(&device->lock);
> +}
> +
> +void hmm_mirror_unref(struct hmm_mirror **mirror)
> +{
> +       struct hmm_mirror *tmp = mirror ? *mirror : NULL;
> +
> +       if (tmp) {
> +               *mirror = NULL;
> +               kref_put(&tmp->kref, hmm_mirror_destroy);
> +       }
> +}
> +EXPORT_SYMBOL(hmm_mirror_unref);
> +
> +/* hmm_mirror_register() - register mirror against current process for a device.
> + *
> + * @mirror: The mirror struct being registered.
> + * Returns: 0 on success or -ENOMEM, -EINVAL on error.
> + *
> + * Call when device driver want to start mirroring a process address space. The
> + * HMM shim will register mmu_notifier and start monitoring process address
> + * space changes. Hence callback to device driver might happen even before this
> + * function return.
> + *
> + * The task device driver want to mirror must be current !
> + *
> + * Only one mirror per mm and hmm_device can be created, it will return NULL if
> + * the hmm_device already has an hmm_mirror for the the mm.
> + */
> +int hmm_mirror_register(struct hmm_mirror *mirror)
> +{
> +       struct mm_struct *mm = current->mm;
> +       struct hmm *hmm = NULL;
> +       int ret = 0;
> +
> +       /* Sanity checks. */
> +       BUG_ON(!mirror);
> +       BUG_ON(!mirror->device);
> +       BUG_ON(!mm);
> +
> +       /*
> +        * Initialize the mirror struct fields, the mlist init and del dance is
> +        * necessary to make the error path easier for driver and for hmm.
> +        */
> +       kref_init(&mirror->kref);
> +       INIT_HLIST_NODE(&mirror->mlist);
> +       INIT_LIST_HEAD(&mirror->dlist);
> +       spin_lock(&mirror->device->lock);
> +       list_add(&mirror->dlist, &mirror->device->mirrors);
> +       spin_unlock(&mirror->device->lock);
> +
> +       down_write(&mm->mmap_sem);
> +
> +       hmm = mm->hmm ? hmm_ref(hmm) : NULL;

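Instead of hmm, mm->hmm should be the parameter passed to hmm_ref() here. As written, hmm_ref() returns NULL even when mm->hmm is non-NULL, because the local hmm has not been updated since it was initialized to NULL at the top of the function. As far as I can see, this means the allocation path below is always taken, even when an hmm is already registered for this mm.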
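
A small userspace model may make the problem concrete. This is only a sketch, not kernel code: struct hmm, struct mm_struct and hmm_ref() below are simplified stand-ins (a plain int replaces the kref), and lookup_as_posted() / lookup_suggested() are names I made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the fields needed here. */
struct hmm {
	int refcount;		/* models struct kref */
};

struct mm_struct {
	struct hmm *hmm;
};

/* Models hmm_ref(): take a reference unless the pointer is NULL or dead. */
static struct hmm *hmm_ref(struct hmm *hmm)
{
	if (!hmm || hmm->refcount == 0)
		return NULL;
	hmm->refcount++;
	return hmm;
}

/* Lookup as posted in the patch: the local, still NULL, is passed. */
static struct hmm *lookup_as_posted(struct mm_struct *mm)
{
	struct hmm *hmm = NULL;

	hmm = mm->hmm ? hmm_ref(hmm) : NULL;
	return hmm;
}

/* Suggested lookup: pass mm->hmm itself; hmm_ref() already handles NULL. */
static struct hmm *lookup_suggested(struct mm_struct *mm)
{
	return hmm_ref(mm->hmm);
}
```

With an mm whose hmm field already points to a live hmm, lookup_as_posted() returns NULL and leaves the reference count untouched, while lookup_suggested() returns the existing hmm with one extra reference held.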

> +       if (hmm == NULL) {

The general practice for a NULL check in drivers is if (!hmm).

> +               /* no hmm registered yet so register one */
> +               hmm = kzalloc(sizeof(*mm->hmm), GFP_KERNEL);
> +               if (hmm == NULL) {
> +                       up_write(&mm->mmap_sem);
> +                       ret = -ENOMEM;
> +                       goto error;
> +               }
> +
> +               ret = hmm_init(hmm);
> +               if (ret) {
> +                       up_write(&mm->mmap_sem);
> +                       kfree(hmm);
> +                       goto error;
> +               }
> +
> +               mm->hmm = hmm;
> +       }
> +
> +       mirror->hmm = hmm;
> +       ret = hmm_add_mirror(hmm, mirror);
> +       up_write(&mm->mmap_sem);
> +       if (ret) {
> +               mirror->hmm = NULL;
> +               hmm_unref(hmm);
> +               goto error;
> +       }
> +       return 0;
> +
> +error:
> +       spin_lock(&mirror->device->lock);
> +       list_del_init(&mirror->dlist);
> +       spin_unlock(&mirror->device->lock);
> +       return ret;
> +}
> +EXPORT_SYMBOL(hmm_mirror_register);
> +
> +static void hmm_mirror_kill(struct hmm_mirror *mirror)
> +{
> +       struct hmm_device *device = mirror->device;
> +       struct hmm *hmm = hmm_ref(mirror->hmm);
> +
> +       if (!hmm)
> +               return;
> +
> +       down_write(&hmm->rwsem);
> +       if (!hlist_unhashed(&mirror->mlist)) {
> +               hlist_del_init(&mirror->mlist);
> +               up_write(&hmm->rwsem);
> +               device->ops->release(mirror);
> +               hmm_mirror_unref(&mirror);
> +       } else
> +               up_write(&hmm->rwsem);
> +
> +       hmm_unref(hmm);
> +}
> +
> +/* hmm_mirror_unregister() - unregister a mirror.
> + *
> + * @mirror: The mirror that link process address space with the device.
> + *
> + * Driver can call this function when it wants to stop mirroring a process.
> + * This will trigger a call to the ->release() callback if it did not aleady
> + * happen.
> + *
> + * Note that caller must hold a reference on the mirror.
> + *
> + * THIS CAN NOT BE CALL FROM device->release() CALLBACK OR IT WILL DEADLOCK.
> + */
> +void hmm_mirror_unregister(struct hmm_mirror *mirror)
> +{
> +       if (mirror == NULL)
> +               return;
> +
> +       hmm_mirror_kill(mirror);
> +       mmu_notifier_synchronize();
> +       hmm_mirror_unref(&mirror);
> +}
> +EXPORT_SYMBOL(hmm_mirror_unregister);
> +
> +
> +/* hmm_device - Each device driver must register one and only one hmm_device
> + *
> + * The hmm_device is the link btw HMM and each device driver.
> + */
> +
> +/* hmm_device_register() - register a device with HMM.
> + *
> + * @device: The hmm_device struct.
> + * Returns: 0 on success or -EINVAL otherwise.
> + *
> + *
> + * Call when device driver want to register itself with HMM. Device driver must
> + * only register once.
> + */
> +int hmm_device_register(struct hmm_device *device)
> +{
> +       /* sanity check */
> +       BUG_ON(!device);
> +       BUG_ON(!device->ops);
> +       BUG_ON(!device->ops->release);
> +
> +       spin_lock_init(&device->lock);
> +       INIT_LIST_HEAD(&device->mirrors);
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(hmm_device_register);
> +
> +/* hmm_device_unregister() - unregister a device with HMM.
> + *
> + * @device: The hmm_device struct.
> + * Returns: 0 on success or -EBUSY otherwise.
> + *
> + * Call when device driver want to unregister itself with HMM. This will check
> + * that there is no any active mirror and returns -EBUSY if so.
> + */
> +int hmm_device_unregister(struct hmm_device *device)
> +{
> +       spin_lock(&device->lock);
> +       if (!list_empty(&device->mirrors)) {
> +               spin_unlock(&device->lock);
> +               return -EBUSY;
> +       }
> +       spin_unlock(&device->lock);
> +       return 0;
> +}
> +EXPORT_SYMBOL(hmm_device_unregister);
> --
> 1.9.3
>