From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pg0-f72.google.com (mail-pg0-f72.google.com [74.125.83.72]) by kanga.kvack.org (Postfix) with ESMTP id 026206B0069 for ; Thu, 15 Dec 2016 20:12:26 -0500 (EST) Received: by mail-pg0-f72.google.com with SMTP id q10so147595584pgq.7 for ; Thu, 15 Dec 2016 17:12:25 -0800 (PST) Received: from mga06.intel.com (mga06.intel.com. [134.134.136.31]) by mx.google.com with ESMTPS id w16si4758226pgc.313.2016.12.15.17.12.24 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 15 Dec 2016 17:12:25 -0800 (PST) From: "Li, Liang Z" Subject: RE: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration Date: Fri, 16 Dec 2016 01:12:21 +0000 Message-ID: References: <0b83db29-ebad-2a70-8d61-756d33e33a48@intel.com> <2171e091-46ee-decd-7348-772555d3a5e3@redhat.com> <20161207183817.GE28786@redhat.com> <20161207202824.GH28786@redhat.com> <060287c7-d1af-45d5-70ea-ad35d4bbeb84@intel.com> <01886693-c73e-3696-860b-086417d695e1@intel.com> <20161215173901-mutt-send-email-mst@kernel.org> In-Reply-To: <20161215173901-mutt-send-email-mst@kernel.org> Content-Language: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Sender: owner-linux-mm@kvack.org List-ID: To: "Michael S. Tsirkin" , "Hansen, Dave" Cc: Andrea Arcangeli , David Hildenbrand , "kvm@vger.kernel.org" , "mhocko@suse.com" , "linux-kernel@vger.kernel.org" , "qemu-devel@nongnu.org" , "linux-mm@kvack.org" , "dgilbert@redhat.com" , "pbonzini@redhat.com" , "akpm@linux-foundation.org" , "virtualization@lists.linux-foundation.org" , "kirill.shutemov@linux.intel.com" > On Thu, Dec 15, 2016 at 07:34:33AM -0800, Dave Hansen wrote: > > On 12/14/2016 12:59 AM, Li, Liang Z wrote: > > >> Subject: Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend > > >> virtio-balloon for fast (de)inflating & fast live migration > > >> > > >> On 12/08/2016 08:45 PM, Li, Liang Z wrote: > > >>> What's the conclusion of your discussion? It seems you want some > > >>> statistic before deciding whether to ripping the bitmap from the > > >>> ABI, am I right? > > >> > > >> I think Andrea and David feel pretty strongly that we should remove > > >> the bitmap, unless we have some data to support keeping it. I > > >> don't feel as strongly about it, but I think their critique of it > > >> is pretty valid. I think the consensus is that the bitmap needs to = go. > > >> > > >> The only real question IMNHO is whether we should do a power-of-2 > > >> or a length. But, if we have 12 bits, then the argument for doing > > >> length is pretty strong. We don't need anywhere near 12 bits if doi= ng > power-of-2. > > > > > > Just found the MAX_ORDER should be limited to 12 if use length > > > instead of order, If the MAX_ORDER is configured to a value bigger > > > than 12, it will make things more complex to handle this case. > > > > > > If use order, we need to break a large memory range whose length is > > > not the power of 2 into several small ranges, it also make the code > complex. > > > > I can't imagine it makes the code that much more complex. It adds a > > for loop. Right? > > > > > It seems we leave too many bit for the pfn, and the bits leave for > > > length is not enough, How about keep 45 bits for the pfn and 19 bits > > > for length, 45 bits for pfn can cover 57 bits physical address, that = should > be enough in the near feature. > > > > > > What's your opinion? > > > > I still think 'order' makes a lot of sense. But, as you say, 57 bits > > is enough for x86 for a while. Other architectures.... who knows? >=20 > I think you can probably assume page size >=3D 4K. But I would not want t= o > make any other assumptions. E.g. there are systems that absolutely requir= e > you to set high bits for DMA. >=20 > I think we really want both length and order. >=20 > I understand how you are trying to pack them as tightly as possible. >=20 > However, I thought of a trick, we don't need to encode all possible order= s. > For example, with 2 bits of order, we can make them mean: > 00 - 4K pages > 01 - 2M pages > 02 - 1G pages >=20 > guest can program the sizes for each order through config space. >=20 > We will have 10 bits left for legth. >=20 Please don't, we just get rid of the bitmap for simplification. :) > It might make sense to also allow guest to program the number of bits use= d > for order, this will make it easy to extend without host changes. >=20 There still exist the case if the MAX_ORDER is configured to a large value,= e.g. 36 for a system with huge amount of memory, then there is only 28 bits left for the pfn, wh= ich is not enough. Should we limit the MAX_ORDER? I don't think so. It seems use order is better.=20 Thanks! Liang > -- > MST -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org