MIME-Version: 1.0
References: <91b4ed6727712cb6d426cf60c740fe2f473f7638.1578225806.git.chris@chrisdown.name> <4106bf3f-5c99-77a4-717e-10a0ffa6a3fa@huawei.com>
In-Reply-To:
 <4106bf3f-5c99-77a4-717e-10a0ffa6a3fa@huawei.com>
From: Amir Goldstein
Date: Mon, 6 Jan 2020 08:41:03 +0200
Subject: Re: [PATCH v5 1/2] tmpfs: Add per-superblock i_ino support
To: "zhengbin (A)"
Cc: Chris Down, Linux MM, Hugh Dickins, Andrew Morton, Al Viro,
 Matthew Wilcox, Jeff Layton, Johannes Weiner, Tejun Heo,
 linux-fsdevel, linux-kernel, kernel-team@fb.com
Content-Type: text/plain; charset="UTF-8"

On Mon, Jan 6, 2020 at 4:04 AM zhengbin (A) wrote:
>
>
> On 2020/1/5 20:06, Chris Down wrote:
> > get_next_ino has a number of problems:
> >
> > - It uses and returns a uint, which is susceptible to become overflowed
> >   if a lot of volatile inodes that use get_next_ino are created.
> > - It's global, with no specificity per-sb or even per-filesystem. This
> >   means it's not that difficult to cause inode number wraparounds on a
> >   single device, which can result in having multiple distinct inodes
> >   with the same inode number.
> >
> > This patch adds a per-superblock counter that mitigates the second case.
> > This design also allows us to later have a specific i_ino size
> > per-device, for example, allowing users to choose whether to use 32- or
> > 64-bit inodes for each tmpfs mount. This is implemented in the next
> > commit.
> >
> > Signed-off-by: Chris Down
> > Reviewed-by: Amir Goldstein
> > Cc: Hugh Dickins
> > Cc: Andrew Morton
> > Cc: Al Viro
> > Cc: Matthew Wilcox
> > Cc: Jeff Layton
> > Cc: Johannes Weiner
> > Cc: Tejun Heo
> > Cc: linux-mm@kvack.org
> > Cc: linux-fsdevel@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: kernel-team@fb.com
> > ---
> >  include/linux/shmem_fs.h |  1 +
> >  mm/shmem.c               | 30 +++++++++++++++++++++++++++++-
> >  2 files changed, 30 insertions(+), 1 deletion(-)
> >
> > v5: Nothing in code, just resending with correct linux-mm domain.
> >
> > diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
> > index de8e4b71e3ba..7fac91f490dc 100644
> > --- a/include/linux/shmem_fs.h
> > +++ b/include/linux/shmem_fs.h
> > @@ -35,6 +35,7 @@ struct shmem_sb_info {
> >          unsigned char huge;           /* Whether to try for hugepages */
> >          kuid_t uid;                   /* Mount uid for root directory */
> >          kgid_t gid;                   /* Mount gid for root directory */
> > +        ino_t next_ino;               /* The next per-sb inode number to use */
> >          struct mempolicy *mpol;       /* default memory policy for mappings */
> >          spinlock_t shrinklist_lock;   /* Protects shrinklist */
> >          struct list_head shrinklist;  /* List of shinkable inodes */
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 8793e8cc1a48..9e97ba972225 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -2236,6 +2236,12 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
> >          return 0;
> >  }
> >
> > +/*
> > + * shmem_get_inode - reserve, allocate, and initialise a new inode
> > + *
> > + * If this tmpfs is from kern_mount we use get_next_ino, which is global, since
> > + * inum churn there is low and this avoids taking locks.
> > + */
> >  static struct inode *shmem_get_inode(struct super_block *sb, const struct inode *dir,
> >                                       umode_t mode, dev_t dev, unsigned long flags)
> >  {
> > @@ -2248,7 +2254,28 @@ static struct inode *shmem_get_inode(struct super_block *sb, const struct inode
> >
> >          inode = new_inode(sb);
> >          if (inode) {
> > -                inode->i_ino = get_next_ino();
> > +                if (sb->s_flags & SB_KERNMOUNT) {
> > +                        /*
> > +                         * __shmem_file_setup, one of our callers, is lock-free:
> > +                         * it doesn't hold stat_lock in shmem_reserve_inode
> > +                         * since max_inodes is always 0, and is called from
> > +                         * potentially unknown contexts. As such, use the global
> > +                         * allocator which doesn't require the per-sb stat_lock.
> > +                         */
> > +                        inode->i_ino = get_next_ino();
> > +                } else {
> > +                        spin_lock(&sbinfo->stat_lock);
>
> Use spin_lock will affect performance, how about define
>
> unsigned long __percpu *last_ino_number; /* Last inode number */
> atomic64_t shared_last_ino_number; /* Shared last inode number */
> in shmem_sb_info, whose performance will be better?
>

Please take a look at shmem_reserve_inode().
The spin lock is already taken in shmem_get_inode() (via
shmem_reserve_inode()), so there is nothing to be gained from
complicating the next_ino counter.

This would have been a lot clearer if next_ino were incremented
inside shmem_reserve_inode() and its value returned for use by
shmem_get_inode(), but I am also fine with the code as it is, given
the comment above.

Thanks,
Amir.
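
For illustration only, here is a minimal userspace sketch of the alternative
mentioned above: reserving an inode and picking its number in the same
critical section. It is not the patch and not the kernel code. The names
sb_info_sketch and reserve_inode_sketch are hypothetical, a pthread mutex
stands in for sbinfo->stat_lock, and the SB_KERNMOUNT case (which skips the
lock and uses get_next_ino()) is left out.

```c
#include <pthread.h>

/* Hypothetical stand-in for the relevant fields of shmem_sb_info. */
struct sb_info_sketch {
	pthread_mutex_t stat_lock;  /* models sbinfo->stat_lock */
	unsigned long max_inodes;   /* 0 means "no limit" */
	unsigned long free_inodes;  /* only meaningful if max_inodes != 0 */
	unsigned long next_ino;     /* next per-sb inode number to hand out */
};

/*
 * Reserve one inode and allocate its number under a single lock
 * acquisition. Returns 0 and stores the number in *inop on success,
 * or -1 if the inode limit has been reached.
 */
static int reserve_inode_sketch(struct sb_info_sketch *sbi,
				unsigned long *inop)
{
	int ret = 0;

	pthread_mutex_lock(&sbi->stat_lock);
	if (sbi->max_inodes && sbi->free_inodes == 0) {
		ret = -1;               /* limit reached, nothing handed out */
	} else {
		if (sbi->max_inodes)
			sbi->free_inodes--;
		*inop = sbi->next_ino++; /* same critical section as the reservation */
	}
	pthread_mutex_unlock(&sbi->stat_lock);
	return ret;
}
```

The caller would then assign the returned number to inode->i_ino without
taking the lock a second time, which is why a per-cpu or atomic counter
would not save anything on this path.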