From: Amir Goldstein
Date: Thu, 21 Apr 2022 08:33:56 +0300
Subject: Re: [PATCH v3 0/3] shmem: Allow userspace monitoring of tmpfs for lack of space.
References: <20220418213713.273050-1-krisman@collabora.com> <20220418204204.0405eda0c506fd29e857e1e4@linux-foundation.org> <87h76pay87.fsf@collabora.com>
In-Reply-To: <87h76pay87.fsf@collabora.com>
To: Gabriel Krisman Bertazi
Cc: Andrew Morton, Hugh Dickins, Al Viro, kernel@collabora.com, Khazhismel Kumykov, Linux MM, linux-fsdevel, Theodore Tso

On Tue, Apr 19, 2022 at 6:29 PM Gabriel Krisman Bertazi wrote:
>
> Andrew Morton writes:
>
> Hi Andrew,
>
> > On Mon, 18 Apr 2022 17:37:10 -0400 Gabriel Krisman Bertazi wrote:
> >
> >> When provisioning containerized applications, multiple very small tmpfs
> >
> > "files"?
>
> Actually, filesystems. In cloud environments, we have several small
> tmpfs associated with containerized tasks.
>
> >> are used, for which one cannot always predict the proper file system
> >> size ahead of time. We want to be able to reliably monitor filesystems
> >> for ENOSPC errors, without depending on the application being executed
> >> reporting the ENOSPC after a failure.
> >
> > Well that sucks. We need a kernel-side workaround for applications
> > that fail to check and report storage errors?
> >
> > We could do this for every syscall in the kernel. What's special about
> > tmpfs in this regard?
> >
> > Please provide additional justification and usage examples for such an
> > extraordinary thing.
>
> For a cloud provider deploying containerized applications, they might
> not control the application, so patching userspace wouldn't be a
> solution. More importantly - and why this is shmem specific -
> they want to differentiate between a user getting ENOSPC due to
> insufficiently provisioned fs size, vs. due to running out of memory in
> a container, both of which return ENOSPC to the process.
>

Isn't there already a per-memcg OOM handler that could be used by the
orchestrator to detect the latter?

> A system administrator can then use this feature to monitor a fleet of
> containerized applications in a uniform way, detect provisioning issues
> caused by different reasons and address the deployment.
>
> I originally submitted this as a new fanotify event, but given the
> specificity of shmem, Amir suggested the interface I'm implementing
> here. We've raised this discussion originally here:
>
> https://lore.kernel.org/linux-mm/CACGdZYLLCqzS4VLUHvzYG=rX3SEJaG7Vbs8_Wb_iUVSvXsqkxA@mail.gmail.com/
>

To put things in context, the points I was trying to make in this
discussion are:

1. Why isn't monitoring with statfs() a sufficient solution? More
   specifically, the shared disk space provisioning problem does not
   sound very tmpfs specific to me. It is a well known issue for thin
   provisioned storage in environments with shared resources such as
   the ones that you describe.
2. OTOH, exporting internal fs stats via /sys/fs for debugging, health
   monitoring or whatever seems legit to me and is widely practiced by
   other filesystems, so exposing those tmpfs stats as this patch set is
   doing seems fine to me.

Another point worth considering in favor of /sys/fs/tmpfs -
since tmpfs is FS_USERNS_MOUNT, the ability of a sysadmin to monitor all
tmpfs mounts in the system and their usage is limited.

Therefore, having a central way to enumerate all tmpfs instances in the
system, like blockdev fs instances and like fuse fs instances, does not
sound like a terrible idea in general.

> > Whatever that action is, I see no user-facing documentation which
> > guides the user info how to take advantage of this?
>
> I can follow up with a new version with documentation, if we agree this
> feature makes sense.
>

Given the time of year and the participants involved, shall we continue
this discussion at LSFMM?

I am not sure if this even requires a shared FS/MM session,
but I don't mind trying to allocate a shared FS/MM slot if Andrew and
the MM guys are interested in taking part in the discussion.

As long as memcg is able to report OOM to the orchestrator,
the problem does not sound very tmpfs specific to me.

As Ted explained, cloud providers (for some reason) charge by disk size
and not by disk usage, so for non-tmpfs filesystems as well, growing the
fs online on demand could prove to be a rewarding practice for cloud
applications.

Thanks,
Amir.
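
For reference, the statfs() polling from point 1 could look roughly like
the sketch below. It is only an illustration and not part of the patch
set: the helper names (tmpfs_mountpoints, usage, nearly_full, scan) and
the 90% threshold are made up for the example. It also only sees the
tmpfs mounts visible in the caller's mount namespace, which is the
limitation mentioned above for FS_USERNS_MOUNT.

```python
import os
from typing import Iterator, List, NamedTuple


class Usage(NamedTuple):
    mountpoint: str
    total_bytes: int
    free_bytes: int


def tmpfs_mountpoints(mounts_text: str) -> Iterator[str]:
    """Yield mount points of tmpfs entries from /proc/self/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass.
        # Note: special characters in paths are octal-escaped (e.g. \040)
        # and are not decoded here.
        if len(fields) >= 3 and fields[2] == "tmpfs":
            yield fields[1]


def usage(mountpoint: str) -> Usage:
    """Query one mount with statvfs(), the Python wrapper around statfs-style calls."""
    st = os.statvfs(mountpoint)
    return Usage(mountpoint, st.f_blocks * st.f_frsize, st.f_bavail * st.f_frsize)


def nearly_full(u: Usage, threshold: float = 0.9) -> bool:
    """True if the used fraction is at or above the (hypothetical) threshold."""
    if u.total_bytes == 0:
        # No block limit reported, so there is no fs-size ceiling to hit.
        return False
    used = u.total_bytes - u.free_bytes
    return used / u.total_bytes >= threshold


def scan(mounts_path: str = "/proc/self/mounts") -> List[Usage]:
    """Collect usage for every tmpfs mount visible to this process."""
    with open(mounts_path) as f:
        mountpoints = list(tmpfs_mountpoints(f.read()))
    results = []
    for mp in mountpoints:
        try:
            results.append(usage(mp))
        except OSError:
            # The mount may have gone away or be inaccessible; skip it.
            continue
    return results
```

Running `[u for u in scan() if nearly_full(u)]` periodically lists the
mounts above the threshold. Sampling like this can miss an ENOSPC that
happens and clears between two polls, and it says nothing about whether a
failure came from the fs size limit or from the memcg limit.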