From: "Huang, Ying"
Subject: Re: [PATCH v4 0/9] mm/swap: Regular page swap optimizations
Date: Wed, 28 Dec 2016 11:31:06 +0800
Message-ID: <871sws3f2d.fsf@yhuang-dev.intel.com>
In-Reply-To: <8760m43frm.fsf@yhuang-dev.intel.com> (Ying Huang's message of
 "Wed, 28 Dec 2016 11:15:57 +0800")
To: "Huang, Ying"
Cc: Minchan Kim, Tim Chen, Andrew Morton, dave.hansen@intel.com,
 ak@linux.intel.com, aaron.lu@intel.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Hugh Dickins, Shaohua Li, Rik van Riel,
 Andrea Arcangeli, "Kirill A. Shutemov", Vladimir Davydov,
 Johannes Weiner, Michal Hocko, Hillf Danton, Christian Borntraeger,
 Jonathan Corbet, jack@suse.cz

"Huang, Ying" writes:

> Minchan Kim writes:
>
>> Hi Huang,
>>
>> On Wed, Dec 28, 2016 at 09:54:27AM +0800, Huang, Ying wrote:
>>
>> < snip >
>>
>>> > The patchset has used several techniques to reduce lock contention,
>>> > for example, batching alloc/free, fine-grained locks, and cluster
>>> > distribution to avoid cache false-sharing. Each item has different
>>> > complexity and benefits, so could you show the numbers for each step
>>> > of the patchset? It would be better to include the numbers in each
>>> > description. That helps show how important a patch is when we weigh
>>> > it against the patch's complexity.
>>>
>>> One common problem of scalability optimization is that, after you have
>>> optimized one lock, the end result may still not be very good, because
>>> another lock becomes heavily contended. A similar problem occurs here:
>>> there are mainly two locks involved in swap out/in, one protecting the
>>> swap cache and the other protecting the swap device. We can achieve
>>> good scalability only after optimizing both locks.
>>
>> Yes. You can describe that situation in the patch description. For
>> example: "with this patch, perf shows less swap_lock contention, but
>> overall performance is still not good because the swap cache lock is
>> still heavily contended, as the data below shows; the next patch will
>> solve that problem".
>>
>> That will make each patch's justification clear.
>>
>>>
>>> You cannot say that one patch is not important just because the test
>>> result for that single patch is not very good, because without it the
>>> end result of the whole series would not be very good either.
>>
>> I know that, but this patchset lacks too many numbers to justify each
>> piece of work. You can show the raw numbers for a technique even when
>> the benefit is small or even negative, and explain why it was not
>> good, which is enough motivation for the next patch.
>>
>> The numbers themselves may not matter much, but justification is
>> crucial for reviewing and merging a patchset, and numbers help a lot
>> with that, especially in the MM community.
>>
>>>
>>> >>
>>> >> Patch 1 is a clean up patch.
>>> >
>>> > Could it be a separate patch?
>>> >
>>> >> Patch 2 creates a lock per cluster. This gives us a more
>>> >> fine-grained lock that can be used for accessing swap_map, without
>>> >> locking the whole swap device.
>>> >
>>> > I hope you split this into three steps to make review easier. You
>>> > can create functions like swap_map_lock and cluster_lock that are
>>> > wrapper functions which just hold swap_lock. That doesn't change
>>> > anything performance-wise, but it clearly shows what kind of lock we
>>> > should use in each specific context.
>>> >
>>> > Then you can introduce the more fine-grained lock in the next patch
>>> > and apply it inside those wrapper functions.
>>> >
>>> > And in the last patch, you can adjust the cluster distribution to
>>> > avoid false-sharing. The description should show how bad the
>>> > false-sharing is in testing, so the change is clearly worthwhile.
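To make the three-step idea above concrete, here is a minimal, untested
sketch of what the step-one wrappers could look like. The names
swap_map_lock and cluster_lock come from the suggestion above; the
'offset' parameter and the use of si->lock are my illustrative
assumptions, not the actual patchset code:

/*
 * Step 1: wrappers that still take the coarse per-device si->lock,
 * so each call site documents which data it intends to protect.
 */
static inline void swap_map_lock(struct swap_info_struct *si,
				 unsigned long offset)
{
	/* 'offset' is unused for now; step 2 will use it to select
	 * the per-cluster lock covering this swap slot. */
	spin_lock(&si->lock);
}

static inline void swap_map_unlock(struct swap_info_struct *si,
				   unsigned long offset)
{
	spin_unlock(&si->lock);
}

/*
 * Step 2 then re-implements only these helpers on top of the
 * fine-grained per-cluster lock, leaving the call sites untouched;
 * performance is unchanged until that step.
 */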
>>> >
>>> > Frankly speaking, although I'm a heavy user of bit_spin_lock (zram
>>> > and zsmalloc use it heavily), I don't like the swap subsystem using
>>> > it. During zram development it really hurt debugging, because we
>>> > lose lockdep coverage. The reason zram uses it is the memory-size
>>> > concern of the embedded world, but for servers that would not be
>>> > critical, so please consider the trade-off of spinlock vs.
>>> > bit_spin_lock.
>>>
>>> There will be one struct swap_cluster_info for every 1MB of swap
>>> space. So, for example, for 1TB of swap space there will be one
>>> million instances of struct swap_cluster_info. To reduce the RAM
>>> usage, we chose bit_spin_lock; otherwise, a spinlock would be better.
>>> The code will be used on embedded, PC, and server systems, so the RAM
>>> usage is important.
>>
>> It seems you already increased swap_cluster_info by 4 bytes to support
>> bit_spin_lock.
>
> The increase only occurs on 64-bit platforms. On 32-bit platforms, the
> size is the same as before.
>
>> Compared to that, how much memory does a spinlock increase it by?
>
> The size of struct swap_cluster_info will increase from 4 bytes to 16
> bytes on 64-bit platforms. I guess it will increase from 4 bytes to 8
> bytes on 32-bit platforms at least, but I did not test that.

Sorry, I made a mistake in that test. The size of struct
swap_cluster_info will increase from 4 bytes to 8 bytes on 64-bit
platforms. I think it will increase from 4 bytes to 8 bytes on 32-bit
platforms too (not tested). A rough illustration of the two layouts is
in the P.S. below.

Best Regards,
Huang, Ying
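P.S. To illustrate the size trade-off discussed above, here is a rough
kernel-style sketch of the two layouts. The field layout is my
assumption for illustration, not the actual patch code:

/*
 * bit_spin_lock variant: the cluster state is packed into a single
 * unsigned long (e.g. 24 bits of count/index plus 8 flag bits), and
 * one flag bit is taken with bit_spin_lock(). sizeof() is 8 bytes
 * on 64-bit and 4 bytes on 32-bit, so 1TB of swap (about one million
 * clusters) costs roughly 8MB -- but lockdep cannot see the lock.
 */
struct swap_cluster_info_bitlock {
	unsigned long data;	/* count/index + flags + lock bit */
};

/*
 * spinlock variant: an embedded spinlock_t keeps lockdep coverage.
 * Without lock debugging, spinlock_t is typically 4 bytes, so the
 * struct is 8 bytes on both 32-bit and 64-bit (also ~8MB per 1TB of
 * swap); debug options such as lockdep grow spinlock_t well beyond
 * that.
 */
struct swap_cluster_info_spinlock {
	spinlock_t lock;
	unsigned int data:24;	/* e.g. free count / next cluster index */
	unsigned int flags:8;	/* CLUSTER_FLAG_* bits */
};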