From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20200107234526.GA19034@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com>
 <20200108105011.GY2827@hirez.programming.kicks-ass.net>
 <20200110153520.GC8214@u40b0340c692b58f6553c.ant.amazon.com>
 <20200113101609.GT2844@hirez.programming.kicks-ass.net>
 <857b42b2e86b2ae09a23f488daada3b1b2836116.camel@amazon.com>
In-Reply-To: <857b42b2e86b2ae09a23f488daada3b1b2836116.camel@amazon.com>
From: "Rafael J. Wysocki"
Date: Mon, 13 Jan 2020 12:48:08 +0100
Subject: Re: [RFC PATCH V2 11/11] x86: tsc: avoid system instability in hibernation
To: "Singh, Balbir"
Cc: "peterz@infradead.org", "Valentin, Eduardo",
 "boris.ostrovsky@oracle.com", "linux-kernel@vger.kernel.org",
 "Agarwal, Anchal", "Woodhouse, David", "vkuznets@redhat.com",
 "sstabellini@kernel.org", "tglx@linutronix.de",
 "linux-pm@vger.kernel.org",
 "Woodhouse@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com",
 "linux-mm@kvack.org", "jgross@suse.com", "pavel@ucw.cz",
 "axboe@kernel.dk", "x86@kernel.org", "roger.pau@citrix.com",
 "hpa@zytor.com", "rjw@rjwysocki.net", "mingo@redhat.com",
 "Kamata, Munehisa", "bp@alien8.de", "netdev@vger.kernel.org",
 "konrad.wilk@oracle.co", "len.brown@intel.com", "davem@davemloft.net",
 "fllinden@amaozn.com", "xen-devel@lists.xenproject.org"
Content-Type: text/plain; charset="UTF-8"

On Mon, Jan 13, 2020 at 12:43 PM Singh, Balbir wrote:
>
> On Mon, 2020-01-13 at 11:16 +0100, Peter Zijlstra wrote:
> > On Fri, Jan 10, 2020 at 07:35:20AM -0800, Eduardo Valentin wrote:
> > > Hey Peter,
> > >
> > > On Wed, Jan 08, 2020 at 11:50:11AM +0100, Peter Zijlstra wrote:
> > > > On Tue, Jan 07, 2020 at 11:45:26PM +0000, Anchal Agarwal wrote:
> > > > > From: Eduardo Valentin
> > > > >
> > > > > System instability are seen during resume from hibernation when system
> > > > > is under heavy CPU load. This is due to the lack of update of sched
> > > > > clock data, and the scheduler would then think that heavy CPU hog
> > > > > tasks need more time in CPU, causing the system to freeze
> > > > > during the unfreezing of tasks. For example, threaded irqs,
> > > > > and kernel processes servicing network interface may be delayed
> > > > > for several tens of seconds, causing the system to be unreachable.
> > > > > The fix for this situation is to mark the sched clock as unstable
> > > > > as early as possible in the resume path, leaving it unstable
> > > > > for the duration of the resume process. This will force the
> > > > > scheduler to attempt to align the sched clock across CPUs using
> > > > > the delta with time of day, updating sched clock data. In a post
> > > > > hibernation event, we can then mark the sched clock as stable
> > > > > again, avoiding unnecessary syncs with time of day on systems
> > > > > in which TSC is reliable.
> > > >
> > > > This makes no frigging sense what so bloody ever. If the clock is
> > > > stable, we don't care about sched_clock_data. When it is stable you get
> > > > a linear function of the TSC without complicated bits on.
> > > >
> > > > When it is unstable, only then do we care about the sched_clock_data.
> > > >
> > >
> > > Yeah, maybe what is not clear here is that we covering for situation
> > > where clock stability changes over time, e.g. at regular boot clock is
> > > stable, hibernation happens, then restore happens in a non-stable clock.
> >
> > Still confused, who marks the thing unstable? The patch seems to suggest
> > you do yourself, but it is not at all clear why.
> >
> > If TSC really is unstable, then it needs to remain unstable. If the TSC
> > really is stable then there is no point in marking is unstable.
> >
> > Either way something is off, and you're not telling me what.
> >
>
> Hi, Peter
>
> For your original comment, just wanted to clarify the following:
>
> 1. After hibernation, the machine can be resumed on a different but compatible
> host (these are VM images hibernated)
> 2. This means the clock between host1 and host2 can/will be different

So the problem is specific to this particular use case.

I'm not sure why to impose this hack on hibernation in all cases.