From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTP id 15DFE996
	for ; Tue, 6 May 2014 13:37:35 +0000 (UTC)
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
	by smtp1.linuxfoundation.org (Postfix) with ESMTP id B11091FAA9
	for ; Tue, 6 May 2014 13:37:34 +0000 (UTC)
Date: Tue, 6 May 2014 09:37:01 -0400
From: Dave Jones
To: "Rafael J. Wysocki"
Message-ID: <20140506133701.GB16222@redhat.com>
References: <1998761.B2k0A5OtQR@vostro.rjw.lan>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1998761.B2k0A5OtQR@vostro.rjw.lan>
Cc: Len Brown, ksummit-discuss@lists.linuxfoundation.org, Peter Zijlstra,
	Daniel Lezcano, Amit Kucheria, Ingo Molnar
Subject: Re: [Ksummit-discuss] [TECH(CORE?) TOPIC] Energy conservation bias interfaces

On Tue, May 06, 2014 at 02:54:03PM +0200, Rafael J. Wysocki wrote:

> First of all, it would be good to have a place where subsystems and device
> drivers can go and check what the current "energy conservation bias" is in
> case they need to make a decision between delivering more performance and
> using less energy. Second, it would be good to provide user space with
> a means to tell the kernel whether it should care more about performance or
> energy. Finally, it would be good to be able to adjust the overall "energy
> conservation bias" automatically in response to certain "power" events such
> as "battery is low/critical" etc.
>
> It doesn't seem to be clear currently what level and scope of such interfaces
> is appropriate and where to place them. Would a global knob be useful? Or
> should they be per-subsystem, per-driver, per-task, per-cgroup etc?

I had thoughts about something along these lines a few years ago, when I was
still doing cpufreq stuff.
Using s/cpuidle/cpufreq/ but same principles.

> It also is not particularly clear what representation of "energy conservation
> bias" would be most useful. Should that be a number or a set of well-defined
> discrete levels that can be given names (like "max performance", "high
> performance", "balanced" etc.)? If a number, then what units to use and
> how many different values to take into account?

I always thought that exposing frequencies to userspace was cpufreq's biggest
mistake. If I were to do it all over again, I would probably do something like
the latter example above. Switching governors from working system-wide to
per-process would allow users to make a lot more decisions like "don't ever
change speed for this pid", which isn't really doable with our existing
framework.

What /proc/pid/power/policy defaults to for each new pid would likely still
need to be configurable, but having users able to set the global policy to
dynamic (i.e., on-demand) scaling, while also being able to do

	echo powersave > /proc/$(pidof seti-alien-detector)/power/policy

would, I think, be a much more deterministic interface than what we have now.
(Plus, apps themselves could set their own policy this way.)

Moving to policy names rather than frequencies also means that we could use a
single power saving policy for cpufreq, cpuidle, and whatever else we come up
with.

The scheduler might also be able to make better decisions if we maintain
separate lists for each policy type, prioritizing performance over power-save
etc.

	Dave
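
To make the per-process policy idea above a bit more concrete, here is a rough userspace model in Python. It is only a sketch under stated assumptions: /proc/pid/power/policy is a proposed file, not an existing kernel interface, and the names Policy, PolicyTable, set_policy, get_policy and run_order are hypothetical, invented for this illustration. The policy names and their ordering (performance first, power-save last) are likewise assumptions based on the text.

```python
from enum import IntEnum


class Policy(IntEnum):
    """Hypothetical named policies; a lower value is favoured in run_order()."""
    PERFORMANCE = 0
    ONDEMAND = 1
    POWERSAVE = 2


class PolicyTable:
    """Toy stand-in for a configurable global default plus per-pid overrides."""

    def __init__(self, default=Policy.ONDEMAND):
        # Policy that a new pid gets when nothing was written for it.
        self.default = default
        self._per_pid = {}

    def set_policy(self, pid, name):
        # Models "echo <name> > /proc/<pid>/power/policy" (assumed interface).
        # An unknown policy name raises KeyError.
        self._per_pid[pid] = Policy[name.upper()]

    def get_policy(self, pid):
        # Fall back to the global default for pids without an override.
        return self._per_pid.get(pid, self.default)

    def run_order(self, pids):
        # Group pids by policy, performance first. sorted() is stable, so
        # pids sharing a policy keep their original relative order.
        return sorted(pids, key=self.get_policy)


if __name__ == "__main__":
    table = PolicyTable(default=Policy.ONDEMAND)
    table.set_policy(4242, "powersave")    # e.g. the seti-alien-detector pid
    table.set_policy(7, "performance")
    for pid in table.run_order([4242, 1, 7]):
        print(pid, table.get_policy(pid).name)
```

The point of the model is that callers only ever deal in policy names, never frequencies, so the same table could in principle be consulted by cpufreq, cpuidle, or the scheduler.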