From: Yu Zhao
Date: Tue, 22 Nov 2022 13:05:32 -0700
Subject: Re: Low TCP throughput due to vmpressure with swap enabled
To: Ivan Babrou
Cc: Linux MM, Linux Kernel Network Developers, linux-kernel,
 Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
 Muchun Song, Andrew Morton, Eric Dumazet, "David S. Miller",
 Hideaki YOSHIFUJI, David Ahern, Jakub Kicinski, Paolo Abeni,
 cgroups@vger.kernel.org, kernel-team

On Tue, Nov 22, 2022 at 12:46 PM Yu Zhao wrote:
>
> On Mon, Nov 21, 2022 at 5:53 PM
> Ivan Babrou wrote:
> >
> > Hello,
> >
> > We have observed a negative TCP throughput behavior from the following commit:
> >
> > * 8e8ae645249b mm: memcontrol: hook up vmpressure to socket pressure
> >
> > It landed back in 2016 in v4.5, so it's not exactly a new issue.
> >
> > The crux of the issue is that in some cases with swap present the
> > workload can be unfairly throttled in terms of TCP throughput.
> >
> > I am able to reproduce this issue in a VM locally on v6.1-rc6 with 8
> > GiB of RAM with zram enabled.
> >
> > The setup is fairly simple:
> >
> > 1. Run the following go proxy in one cgroup (it has some memory
> > ballast to simulate useful memory usage):
> >
> > * https://gist.github.com/bobrik/2c1a8a19b921fefe22caac21fda1be82
> >
> > sudo systemd-run --scope -p MemoryLimit=6G go run main.go
> >
> > 2. Run the following fio config in another cgroup to simulate mmapped
> > page cache usage:
> >
> > [global]
> > size=8g
> > bs=256k
> > iodepth=256
> > direct=0
> > ioengine=mmap
> > group_reporting
> > time_based
> > runtime=86400
> > numjobs=8
> > name=randread
> > rw=randread
>
> Is it practical for your workload to apply some madvise/fadvise hint?
> For the above repro, it would be fadvise_hint=1 which is mapped into
> MADV_RANDOM automatically. The kernel also supports MADV_SEQUENTIAL,
> but not POSIX_FADV_NOREUSE at the moment.

Actually fadvise_hint already defaults to 1. At least with MGLRU, the
page cache should be thrown away without causing you any problem. It
might be mapped to POSIX_FADV_RANDOM rather than MADV_RANDOM.
POSIX_FADV_RANDOM is ignored at the moment.

Sorry for all the noise. Let me dig into this and get back to you later today.

> We actually have similar issues but unfortunately I haven't been able
> to come up with any solution beyond recommending the above flags.
> The problem is that harvesting the accessed bit from mmapped memory is
> costly, and when random accesses happen fast enough, the cost of doing
> that prevents LRU from collecting more information to make better
> decisions. In a nutshell, LRU can't tell whether there is genuine
> memory locality with your test case.
>
> It's a very difficult problem to solve from LRU's POV. I'd like to
> hear more about your workloads and see whether there are workarounds
> other than tackling the problem head-on, if applying hints is not
> practical or preferable.
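
For anyone who wants to see what the two hints discussed above look like
from user space, here is a minimal sketch in Python (3.8+, Linux assumed).
It sets POSIX_FADV_RANDOM on the file descriptor and MADV_RANDOM on the
mapping, then touches random pages, loosely like the fio randread job with
ioengine=mmap. The helper names read_random_pages and make_scratch_file
are made up for illustration. The sketch does not show which of the two
hints reclaim actually honors; that is the open question in this thread.

```python
import mmap
import os
import random
import tempfile

PAGE = mmap.PAGESIZE


def make_scratch_file(pages=256):
    """Create a zero-filled temporary file of `pages` pages; return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * (pages * PAGE))
    return path


def read_random_pages(path, samples=64, seed=0):
    """Map `path` read-only, apply random-access hints, touch random pages.

    Returns the number of page-aligned reads performed.
    """
    rng = random.Random(seed)
    touched = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        # Hint on the file descriptor; a length of 0 means "to end of file".
        os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_RANDOM)
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ) as m:
            # Hint on the mapping itself (whole mapping by default).
            m.madvise(mmap.MADV_RANDOM)
            for _ in range(samples):
                offset = rng.randrange(size // PAGE) * PAGE
                _ = m[offset]  # read one byte so the page is accessed
                touched += 1
    return touched
```

Typical use would be to create a scratch file with make_scratch_file(),
call read_random_pages() on it inside the cgroup under test, and remove
the file afterwards.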