From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 9 Jan 2020 12:53:27 +0100
From: Michal Hocko
To: Chris Murphy
Cc: linux-mm@kvack.org
Subject: Re: user space unresponsive, followup: lsf/mm congestion
Message-ID: <20200109115327.GQ4951@dhcp22.suse.cz>
References: <20200107205824.GM32178@dhcp22.suse.cz>
 <20200108092501.GO32178@dhcp22.suse.cz>
 <20200109115147.GP4951@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="jRHKVT23PllUwdXP"
Content-Disposition: inline
In-Reply-To: <20200109115147.GP4951@dhcp22.suse.cz>
User-Agent: Mutt/1.12.2 (2019-09-21)

--jRHKVT23PllUwdXP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu 09-01-20 12:51:48, Michal Hocko wrote:
> > I've updated the bug, attaching kernel messages and /proc/vmstate in
> > 1s increments, although quite often during the build multiple seconds
> > of sampling were just skipped as the system was under too much
> > pressure.
>
> I have a tool to reduce that problem (see attached).

Now for real
--
Michal Hocko
SUSE Labs

--jRHKVT23PllUwdXP
Content-Type: text/x-csrc; charset=us-ascii
Content-Disposition: attachment; filename="read_vmstat.c"

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <limits.h>
#include <time.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

/*
 * A simple /proc/vmstat collector into a file. It tries hard to guarantee
 * that the content will get into the output file even under strong memory
 * pressure.
 *
 * Usage
 * ./read_vmstat output_file timeout output_size
 *
 * output_file can be either a non-existing file or - for stdout
 * timeout - time period between two snapshots.
 *           s - seconds, ms - milliseconds
 *           and m - minutes suffix is allowed
 * output_size - size of the output file. The file is preallocated and pre-filled.
 *
 * If the output reaches the end of the file it will start over, overwriting
 * the oldest data. Each snapshot is enclosed by a header and a footer.
 * =S timestamp
 * [...]
 * E=
 *
 * Please note that your ulimit has to be sufficient to allow mlocking the
 * code + all the buffers.
 *
 * This comes under GPL v2
 * Copyright: Michal Hocko 2015
 */

#define NS_PER_MS (1000*1000)
#define NS_PER_SEC (1000*NS_PER_MS)

/* Returns the new file descriptor or -1 on failure */
int open_file(const char *str)
{
	int fd;

	fd = open(str, O_CREAT|O_EXCL|O_RDWR, 0755);
	if (fd == -1) {
		perror("open output");
		return -1;
	}

	return fd;
}

int read_timeout(const char *str, struct timespec *timeout)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(str, &end, 10);
	if (val == ULONG_MAX && errno == ERANGE) {
		perror("Invalid number");
		return 1;
	}
	switch(*end) {
	case 's':
		timeout->tv_sec = val;
		break;
	case 'm':
		/* ms vs minute */
		if (*(end+1) == 's') {
			timeout->tv_sec = (val * NS_PER_MS) / NS_PER_SEC;
			timeout->tv_nsec = (val * NS_PER_MS) % NS_PER_SEC;
		} else {
			timeout->tv_sec = val*60;
		}
		break;
	default:
		fprintf(stderr, "Unknown number %s\n", str);
		return 1;
	}
	return 0;
}

size_t read_size(const char *str)
{
	char *end;
	size_t val = strtoul(str, &end, 10);

	switch (*end) {
	case 'K':
		val <<= 10;
		break;
	case 'M':
		val <<= 20;
		break;
	case 'G':
		val <<= 30;
		break;
	}

	return val;
}

/* Copy in[] into the ring buffer at pos and return the new position */
size_t dump_str(char *buffer, size_t buffer_size, size_t pos, const char *in, size_t size)
{
	size_t i;

	for (i = 0; i < size; i++) {
		buffer[pos] = in[i];
		pos = (pos + 1) % buffer_size;
	}

	return pos;
}

/* buffer == NULL -> stdout */
int __collect_logs(const struct timespec *timeout, char *buffer, size_t buffer_size)
{
	char buff[4096]; /* dump to the file automatically */
	time_t before, after;
	int in_fd = open("/proc/vmstat", O_RDONLY);
	size_t out_pos = 0;
	size_t size = 0;
	ssize_t len;
	int estimate = 0;

	if (in_fd == -1) {
		perror("open vmstat:");
		return 1;
	}

	/* lock everything in */
	if (mlockall(MCL_CURRENT) == -1) {
		perror("mlockall. Continuing anyway");
	}

	while (1) {
		before = time(NULL);

		size = snprintf(buff, sizeof(buff), "=S %lu\n", (unsigned long)before);
		lseek(in_fd, 0, SEEK_SET);
		/* keep room for the "E=\n" footer and the terminating NUL */
		len = read(in_fd, buff + size, sizeof(buff) - size - 4);
		if (len == -1) {
			perror("read vmstat");
			len = 0;
		}
		size += len;
		size += snprintf(buff + size, sizeof(buff) - size, "E=\n");
		if (buffer && !estimate) {
			printf("Estimated %zu entries fit to the buffer\n", buffer_size/size);
			estimate = 1;
		}

		/* Dump to stdout */
		if (!buffer) {
			printf("%s", buff);
		} else {
			size_t pos;

			pos = dump_str(buffer, buffer_size, out_pos, buff, size);
			if (pos < out_pos)
				fprintf(stderr, "%lu: Buffer wrapped\n", (unsigned long)before);
			out_pos = pos;
		}

		after = time(NULL);

		if (after - before > 2) {
			fprintf(stderr, "%lu: Snapshotting took %lds!!!\n",
					(unsigned long)before, (long)(after - before));
		}
		if (nanosleep(timeout, NULL) == -1)
			if (errno == EINTR)
				return 0;
		/* kick in the flushing */
		if (buffer)
			msync(buffer, buffer_size, MS_ASYNC);
	}
}

int collect_logs(int fd, const struct timespec *timeout, size_t buffer_size)
{
	char *buffer = NULL;

	if (fd != -1) {
		if (ftruncate(fd, buffer_size) == -1) {
			perror("ftruncate");
			return 1;
		}

		if (fallocate(fd, 0, 0, buffer_size) && errno != EOPNOTSUPP) {
			perror("fallocate");
			return 1;
		}

		/* commit it to the disk */
		sync();

		buffer = mmap(NULL, buffer_size, PROT_READ | PROT_WRITE,
				MAP_SHARED | MAP_POPULATE, fd, 0);
		if (buffer == MAP_FAILED) {
			perror("mmap");
			return 1;
		}
	}

	return __collect_logs(timeout, buffer, buffer_size);
}

int main(int argc, char **argv)
{
	struct timespec timeout = {.tv_sec = 1};
	int fd = -1;
	size_t buffer_size = 10UL<<20;

	if (argc > 1) {
		/* output file */
		if (strcmp(argv[1], "-")) {
			fd = open_file(argv[1]);
			if (fd == -1)
				return 1;
		}

		/* timeout */
		if (argc > 2) {
			if (read_timeout(argv[2], &timeout))
				return 1;

			/* buffer size */
			if (argc > 3) {
				buffer_size = read_size(argv[3]);
				if (buffer_size == -1UL)
					return 1;
			}
		}
	}
	printf("file:%s timeout:%ld.%03lds buffer_size:%zu\n",
			(fd == -1)? "stdout" : argv[1],
			(long)timeout.tv_sec, timeout.tv_nsec / NS_PER_MS, buffer_size);

	return collect_logs(fd, &timeout, buffer_size);
}

--jRHKVT23PllUwdXP--
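
A note for anyone reading the output file afterwards: because the file is
used as a ring, a snapshot can be torn across the end of the file once the
writer wraps. The following is only a minimal sketch of that effect and is
not part of the attached tool. ring_write() mirrors dump_str() from
read_vmstat.c; count_complete_snapshots() is a hypothetical helper, named
here for illustration, and it assumes that the "=S " and "E=" markers always
start a line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same wrap-around copy as dump_str() in read_vmstat.c. */
static size_t ring_write(char *ring, size_t ring_size, size_t pos,
			 const char *in, size_t size)
{
	size_t i;

	assert(ring_size > 0);
	for (i = 0; i < size; i++) {
		ring[pos] = in[i];
		pos = (pos + 1) % ring_size;
	}
	return pos;
}

/*
 * Hypothetical reader helper (an assumption, not from the tool): count the
 * snapshots that have both the "=S " header and the "E=\n" footer in a
 * plain front-to-back scan. Records torn by the wrap-around are skipped.
 */
static size_t count_complete_snapshots(const char *buf, size_t len)
{
	size_t i, count = 0;
	int in_snapshot = 0;

	for (i = 0; i < len; i++) {
		/* markers only count at the start of a line */
		if (i > 0 && buf[i - 1] != '\n')
			continue;
		if (len - i >= 3 && !memcmp(buf + i, "=S ", 3)) {
			in_snapshot = 1;
		} else if (len - i >= 3 && !memcmp(buf + i, "E=\n", 3)) {
			count += in_snapshot;
			in_snapshot = 0;
		}
	}
	return count;
}
```

With a 32-byte ring and three 14-byte snapshots, the third write wraps and
a front-to-back scan finds only the second snapshot intact. A real reader
would rather start at the oldest header and follow the wrap.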