On Thu, Jan 03, 2019 at 09:10:13AM -0800, Yang Shi wrote:
> How about the below description:
>
> The test with page_fault1 of will-it-scale (sometimes tracing may just show
> runtest.py that is the wrapper script of page_fault1), which basically
> launches NR_CPU threads to generate 128MB anonymous pages for each thread,
> on my virtual machine with congested HDD shows long tail latency is reduced
> significantly.
>
> Without the patch
>  page_fault1_thr-1490  [023]   129.311706: funcgraph_entry: #57377.796 us |  do_swap_page();
>  page_fault1_thr-1490  [023]   129.369103: funcgraph_entry: 5.642us   |  do_swap_page();
>  page_fault1_thr-1490  [023]   129.369119: funcgraph_entry: #1289.592 us |  do_swap_page();
>  page_fault1_thr-1490  [023]   129.370411: funcgraph_entry: 4.957us   |  do_swap_page();
>  page_fault1_thr-1490  [023]   129.370419: funcgraph_entry: 1.940us   |  do_swap_page();
>  page_fault1_thr-1490  [023]   129.378847: funcgraph_entry: #1411.385 us |  do_swap_page();
>  page_fault1_thr-1490  [023]   129.380262: funcgraph_entry: 3.916us   |  do_swap_page();
>  page_fault1_thr-1490  [023]   129.380275: funcgraph_entry: #4287.751 us |  do_swap_page();
>
> With the patch
>       runtest.py-1417  [020]   301.925911: funcgraph_entry: #9870.146 us |  do_swap_page();
>       runtest.py-1417  [020]   301.935785: funcgraph_entry: 9.802us   |  do_swap_page();
>       runtest.py-1417  [020]   301.935799: funcgraph_entry: 3.551us   |  do_swap_page();
>       runtest.py-1417  [020]   301.935806: funcgraph_entry: 2.142us   |  do_swap_page();
>       runtest.py-1417  [020]   301.935853: funcgraph_entry: 6.938us   |  do_swap_page();
>       runtest.py-1417  [020]   301.935864: funcgraph_entry: 3.765us   |  do_swap_page();
>       runtest.py-1417  [020]   301.935871: funcgraph_entry: 3.600us   |  do_swap_page();
>       runtest.py-1417  [020]   301.935878: funcgraph_entry: 7.202us   |  do_swap_page();

That's better, thanks!