From: Filipe Manana
Date: Wed, 31 Jul 2024 14:10:00 +0100
Subject: Re: [PATCH] libfs: fix infinite directory reads for offset dir
To: yangerkun
Cc: Jan Kara, yangerkun, hch@infradead.org, chuck.lever@oracle.com, brauner@kernel.org, viro@zeniv.linux.org.uk, hughd@google.com, zlang@kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
References: <20240731043835.1828697-1-yangerkun@huawei.com> <20240731115134.tkiklyu72lwnhbxg@quack3> <57de6354-f53d-d106-aed8-9dff3e88efa6@huaweicloud.com>
In-Reply-To: <57de6354-f53d-d106-aed8-9dff3e88efa6@huaweicloud.com>
Content-Type: text/plain; charset="UTF-8"

On Wed, Jul 31, 2024 at 1:51 PM yangerkun wrote:
>
> Hi!
>
> On 2024/7/31 19:51, Jan Kara wrote:
> > On Wed 31-07-24 12:38:35, yangerkun wrote:
> >> After we switched tmpfs dir operations from simple_dir_operations to
> >> simple_offset_dir_operations, every rename adds the new dentry to the
> >> dest dir's maple tree (&SHMEM_I(inode)->dir_offsets->mt) under a free
> >> key starting from octx->next_offset, and then sets next_offset to
> >> free key + 1.
> >> This leads to an infinite readdir when renames happen
> >> at the same time, which makes generic/736 in xfstests fail (details
> >> below).
> >>
> >> 1. create 5000 files (1 2 3...) under one dir
> >> 2. call readdir (man 3 readdir) once, and get one entry
> >> 3. rename(entry, "TEMPFILE"), then rename("TEMPFILE", entry)
> >> 4. loop 2~3, until readdir returns nothing or we loop too many
> >>    times (tmpfs breaks the test with the second condition)
> >>
> >> We use the same logic as commit 9b378f6ad48cf ("btrfs: fix infinite
> >> directory reads") to fix it: record the last_index when we open the
> >> dir, and do not emit entries whose index >= last_index. The
> >> file->private_data already used by offset dirs can be used directly
> >> for this, and we also update the last_index when we llseek the dir file.
> >
> > The patch looks good! Just I'm not sure about the llseek part. As far as I
> > understand it was added due to this sentence in the standard:
> >
> > "If a file is removed from or added to the directory after the most recent
> > call to opendir() or rewinddir(), whether a subsequent call to readdir()
> > returns an entry for that file is unspecified."
> >
> > So if the offset used in offset_dir_llseek() is 0, then we should update
> > last_index. But otherwise I'd leave it alone because IMHO it would do more
> > harm than good.
>
> IIUC, what you mean is that we should only reset the private_data to
> new last_index when we call rewinddir (which will call lseek to set
> offset of dir file to 0)?
>
> Yeah, I prefer the logic you describe! Besides, we may also change
> btrfs to do the same (e60aa5da14d0 ("btrfs: refresh dir last index
> during a rewinddir(3) call")). Filipe, what do you think?

What problem does it solve?
The standard doesn't forbid it, and I can't see anything wrong with it.

>
> Thanks,
> Erkun.
>
> > 								Honza
> >
> >>
> >> Fixes: a2e459555c5f ("shmem: stable directory offsets")
> >> Signed-off-by: yangerkun
> >> ---
> >>  fs/libfs.c | 34 +++++++++++++++++++++++-----------
> >>  1 file changed, 23 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/fs/libfs.c b/fs/libfs.c
> >> index 8aa34870449f..38b306738c00 100644
> >> --- a/fs/libfs.c
> >> +++ b/fs/libfs.c
> >> @@ -450,6 +450,14 @@ void simple_offset_destroy(struct offset_ctx *octx)
> >>  	mtree_destroy(&octx->mt);
> >>  }
> >>
> >> +static int offset_dir_open(struct inode *inode, struct file *file)
> >> +{
> >> +	struct offset_ctx *ctx = inode->i_op->get_offset_ctx(inode);
> >> +
> >> +	file->private_data = (void *)ctx->next_offset;
> >> +	return 0;
> >> +}
> >> +
> >>  /**
> >>   * offset_dir_llseek - Advance the read position of a directory descriptor
> >>   * @file: an open directory whose position is to be updated
> >> @@ -463,6 +471,9 @@ void simple_offset_destroy(struct offset_ctx *octx)
> >>   */
> >>  static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
> >>  {
> >> +	struct inode *inode = file->f_inode;
> >> +	struct offset_ctx *ctx = inode->i_op->get_offset_ctx(inode);
> >> +
> >>  	switch (whence) {
> >>  	case SEEK_CUR:
> >>  		offset += file->f_pos;
> >> @@ -476,7 +487,7 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
> >>  	}
> >>
> >>  	/* In this case, ->private_data is protected by f_pos_lock */
> >> -	file->private_data = NULL;
> >> +	file->private_data = (void *)ctx->next_offset;
> >>  	return vfs_setpos(file, offset, LONG_MAX);
> >>  }
> >>
> >> @@ -507,7 +518,7 @@ static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
> >>  			  inode->i_ino, fs_umode_to_dtype(inode->i_mode));
> >>  }
> >>
> >> -static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
> >> +static void offset_iterate_dir(struct inode *inode, struct dir_context *ctx, long last_index)
> >>  {
> >>  	struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
> >>  	struct dentry *dentry;
> >> @@ -515,17 +526,21 @@ static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
> >>  	while (true) {
> >>  		dentry = offset_find_next(octx, ctx->pos);
> >>  		if (!dentry)
> >> -			return ERR_PTR(-ENOENT);
> >> +			return;
> >> +
> >> +		if (dentry2offset(dentry) >= last_index) {
> >> +			dput(dentry);
> >> +			return;
> >> +		}
> >>
> >>  		if (!offset_dir_emit(ctx, dentry)) {
> >>  			dput(dentry);
> >> -			break;
> >> +			return;
> >>  		}
> >>
> >>  		ctx->pos = dentry2offset(dentry) + 1;
> >>  		dput(dentry);
> >>  	}
> >> -	return NULL;
> >>  }
> >>
> >>  /**
> >> @@ -552,22 +567,19 @@ static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
> >>  static int offset_readdir(struct file *file, struct dir_context *ctx)
> >>  {
> >>  	struct dentry *dir = file->f_path.dentry;
> >> +	long last_index = (long)file->private_data;
> >>
> >>  	lockdep_assert_held(&d_inode(dir)->i_rwsem);
> >>
> >>  	if (!dir_emit_dots(file, ctx))
> >>  		return 0;
> >>
> >> -	/* In this case, ->private_data is protected by f_pos_lock */
> >> -	if (ctx->pos == DIR_OFFSET_MIN)
> >> -		file->private_data = NULL;
> >> -	else if (file->private_data == ERR_PTR(-ENOENT))
> >> -		return 0;
> >> -	file->private_data = offset_iterate_dir(d_inode(dir), ctx);
> >> +	offset_iterate_dir(d_inode(dir), ctx, last_index);
> >>  	return 0;
> >>  }
> >>
> >>  const struct file_operations simple_offset_dir_operations = {
> >> +	.open		= offset_dir_open,
> >>  	.llseek		= offset_dir_llseek,
> >>  	.iterate_shared	= offset_readdir,
> >>  	.read		= generic_read_dir,
> >> --
> >> 2.39.2
> >>
>
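
For readers following the thread, here is a rough userspace toy model of the readdir/rename loop described in the quoted commit message (steps 1-4), with and without the last_index cap. It is only a sketch under stated assumptions and not kernel code: struct toy_dir, the toy_* helpers, TOY_NFILES and the starting offset of 2 are all made up for illustration, and a plain array stands in for the maple tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NFILES 100 /* the quoted reproducer uses 5000; kept small here */

/* Hypothetical toy directory: entry i is known only by its current offset. */
struct toy_dir {
	long offsets[TOY_NFILES];
	long next_offset; /* next free key handed out on create/rename */
};

static void toy_dir_init(struct toy_dir *d)
{
	d->next_offset = 2; /* assumption: first usable offset in this model */
	for (size_t i = 0; i < TOY_NFILES; i++)
		d->offsets[i] = d->next_offset++;
}

/* Model of a rename: the entry gets a fresh key and next_offset advances. */
static void toy_rename(struct toy_dir *d, size_t idx)
{
	d->offsets[idx] = d->next_offset++;
}

/* Return the index of the entry with the smallest offset >= pos, or -1. */
static long toy_find_next(const struct toy_dir *d, long pos)
{
	long best = -1;

	for (size_t i = 0; i < TOY_NFILES; i++) {
		if (d->offsets[i] < pos)
			continue;
		if (best < 0 || d->offsets[i] < d->offsets[best])
			best = (long)i;
	}
	return best;
}

/*
 * Read one entry, rename it away and back, repeat. Returns how many
 * entries were read; gives up once max_reads is reached.
 */
static long toy_readdir_rename_loop(bool cap_at_last_index, long max_reads)
{
	struct toy_dir d;
	long pos = 0, reads = 0, last_index;

	assert(max_reads > 0);
	toy_dir_init(&d);
	last_index = d.next_offset; /* snapshot taken at "open" time */

	while (reads < max_reads) {
		long idx = toy_find_next(&d, pos);

		if (idx < 0)
			break; /* nothing left at or after pos */
		if (cap_at_last_index && d.offsets[idx] >= last_index)
			break; /* key was handed out after open: do not emit */

		pos = d.offsets[idx] + 1;
		reads++;
		toy_rename(&d, (size_t)idx); /* rename(entry, "TEMPFILE") */
		toy_rename(&d, (size_t)idx); /* rename("TEMPFILE", entry) */
	}
	return reads;
}
```

In this model, toy_readdir_rename_loop(true, 10000) stops after TOY_NFILES reads, while toy_readdir_rename_loop(false, 10000) only stops because it hits max_reads, since every rename pushes the entry back in front of the read position.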