On 12/19/2017 11:12 AM, Matthew Wilcox wrote:
> On Tue, Dec 19, 2017 at 09:52:27AM -0800, rao.shoaib@oracle.com wrote:
>> This patch updates kfree_rcu to use new bulk memory free functions as they
>> are more efficient. It also moves kfree_call_rcu() out of rcu related code to
>> mm/slab_common.c
>>
>> Signed-off-by: Rao Shoaib
>> ---
>>  include/linux/mm.h |   5 ++
>>  kernel/rcu/tree.c  |  14 ----
>>  kernel/sysctl.c    |  40 +++++++++++
>>  mm/slab.h          |  23 +++++++
>>  mm/slab_common.c   | 198 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  5 files changed, 264 insertions(+), 16 deletions(-)
> You've added an awful lot of code. Do you have any performance measurements
> that shows this to be a win?

I did some micro-benchmarking while developing the code and did see
performance gains -- see attached.

I tried several networking benchmarks but was not able to get any
improvement. The reason is that these benchmarks do not exercise the code
we are improving. So I looked at the kernel source for users of
kfree_rcu(). It turns out that the directory deletion code calls
kfree_rcu() to free the data structure when an entry is deleted. Based on
that I created two benchmarks.

1) make_dirs -- This benchmark creates a multi-level directory structure
and then deletes it. It's the delete part where we see the performance
gain of about 8.3%. The creation time remains the same.

This benchmark was derived from the fdtree benchmark at
https://computing.llnl.gov/?set=code&page=sio_downloads ==>
https://github.com/llnl/fdtree

2) tsock -- I also noticed that a socket has an entry in a directory, and
when the socket is closed the directory entry is deleted. So I wrote a
simple benchmark that loops a million times and opens and closes 10
sockets per iteration. This shows an improvement of 7.6%.

I have attached the benchmarks and results. "Unchanged" results are for
the stock kernel; "Changed" results are for the modified kernel.

Shoaib
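
For readers without the attachments, the tsock loop described above could
look roughly like the sketch below. This is not the attached benchmark:
the function name tsock_loop(), the SOCKS_PER_ITER macro, and the choice
of AF_UNIX/SOCK_STREAM sockets are assumptions made for illustration. Only
the shape (open 10 sockets, close them, repeat) comes from the description.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* From the description: 10 sockets are opened and closed per iteration. */
#define SOCKS_PER_ITER 10

/*
 * Hypothetical helper (name and signature are assumptions, not taken from
 * the attached benchmark). Opens SOCKS_PER_ITER sockets, then closes them,
 * once per iteration. Returns the number of sockets opened and closed,
 * or -1 if socket() fails.
 */
long tsock_loop(long iterations)
{
    int fds[SOCKS_PER_ITER];
    long total = 0;

    for (long i = 0; i < iterations; i++) {
        /* Open phase. The socket family and type are assumptions. */
        for (int j = 0; j < SOCKS_PER_ITER; j++) {
            fds[j] = socket(AF_UNIX, SOCK_STREAM, 0);
            if (fds[j] < 0) {
                perror("socket");
                /* Close whatever was opened in this iteration. */
                while (j-- > 0)
                    close(fds[j]);
                return -1;
            }
        }
        /* Close phase: this is the teardown path the benchmark targets. */
        for (int j = 0; j < SOCKS_PER_ITER; j++) {
            close(fds[j]);
            total++;
        }
    }
    return total;
}
```

Calling tsock_loop(1000000) from a small main() and timing the run (for
example with time(1)) on the stock and the modified kernel would give a
comparison of the kind reported above.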