linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Race in mm/readahead.c:140 file_ra_state_init / block/ioctl.c:497 blkdev_common_ioctl
@ 2024-01-17 20:08 Gabriel Ryan
  2024-01-17 20:38 ` Matthew Wilcox
  0 siblings, 1 reply; 3+ messages in thread
From: Gabriel Ryan @ 2024-01-17 20:08 UTC (permalink / raw)
  To: Matthew Wilcox, Andrew Morton, linux-fsdevel, linux-mm

Hi,

We found a race in the mm subsystem in kernel version v5.18-rc5 that
appears to be potentially harmful using a race testing tool we are
developing. The race occurs between:

mm/readahead.c:140 file_ra_state_init

    ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;

block/ioctl.c:497 blkdev_common_ioctl

    bdev->bd_disk->bdi->ra_pages = (arg * 512) / PAGE_SIZE;


which both set the ra->ra_pages value. It appears this race could lead
to undefined behavior, if multiple threads set ra->ra_pages to
different values simultaneously for a single file inode.

Best,
Gabe


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Race in mm/readahead.c:140 file_ra_state_init / block/ioctl.c:497 blkdev_common_ioctl
  2024-01-17 20:08 Race in mm/readahead.c:140 file_ra_state_init / block/ioctl.c:497 blkdev_common_ioctl Gabriel Ryan
@ 2024-01-17 20:38 ` Matthew Wilcox
  2024-01-17 21:02   ` Gabriel Ryan
  0 siblings, 1 reply; 3+ messages in thread
From: Matthew Wilcox @ 2024-01-17 20:38 UTC (permalink / raw)
  To: Gabriel Ryan; +Cc: Andrew Morton, linux-fsdevel, linux-mm

On Wed, Jan 17, 2024 at 03:08:47PM -0500, Gabriel Ryan wrote:
> We found a race in the mm subsystem in kernel version v5.18-rc5 that

What an odd kernel version to be analysing.  It's simultaneously almost
two years old, and yet not an actual release.  You'd be better off
choosing an LTS kernel as your version to analyse, something like v6.6
or v6.1.  Or staying right on the bleeding edge and using something more
recent like v6.7.

> appears to be potentially harmful using a race testing tool we are
> developing. The race occurs between:
> 
> mm/readahead.c:140 file_ra_state_init
> 
>     ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
> 
> block/ioctl.c:497 blkdev_common_ioctl
> 
>     bdev->bd_disk->bdi->ra_pages = (arg * 512) / PAGE_SIZE;
> 
> 
> which both set the ra->ra_pages value. It appears this race could lead
> to undefined behavior, if multiple threads set ra->ra_pages to
> different values simultaneously for a single file inode.

Undefined behaviour from the C spec point of view, sure.  But from a
realistic hardware/compiler point of view, not really.  These are going
to be simple loads & stores. since bdi->ra_pages is an unsigned long.
And if it does happen to be garbage, how much harm is really done?
We'll end up with the wrong readahead value for a single open file.
Performance might not be so great, but it's not like we're going to
grant root access as a result.

We should probably add READ_ONCE / WRITE_ONCE annotations to be
certain the compiler doesn't get all clever, but that brings me to the
question about why you're developing this tool.  We already have tooling
to annoy people with these kinds of nitpicks (KCSAN).


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Race in mm/readahead.c:140 file_ra_state_init / block/ioctl.c:497 blkdev_common_ioctl
  2024-01-17 20:38 ` Matthew Wilcox
@ 2024-01-17 21:02   ` Gabriel Ryan
  0 siblings, 0 replies; 3+ messages in thread
From: Gabriel Ryan @ 2024-01-17 21:02 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Andrew Morton, linux-fsdevel, linux-mm

Thank you for your response!

To clarify, the tool we are developing uses modified KCSAN watchpoints
in conjunction with additional enforced thread scheduling to expose
new races. Most of these races are harmless, but we are reporting ones
that appear potentially undesirable just in case they are real bugs.


On Wed, Jan 17, 2024 at 3:38 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Jan 17, 2024 at 03:08:47PM -0500, Gabriel Ryan wrote:
> > We found a race in the mm subsystem in kernel version v5.18-rc5 that
>
> What an odd kernel version to be analysing.  It's simultaneously almost
> two years old, and yet not an actual release.  You'd be better off
> choosing an LTS kernel as your version to analyse, something like v6.6
> or v6.1.  Or staying right on the bleeding edge and using something more
> recent like v6.7.
>
> > appears to be potentially harmful using a race testing tool we are
> > developing. The race occurs between:
> >
> > mm/readahead.c:140 file_ra_state_init
> >
> >     ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
> >
> > block/ioctl.c:497 blkdev_common_ioctl
> >
> >     bdev->bd_disk->bdi->ra_pages = (arg * 512) / PAGE_SIZE;
> >
> >
> > which both set the ra->ra_pages value. It appears this race could lead
> > to undefined behavior, if multiple threads set ra->ra_pages to
> > different values simultaneously for a single file inode.
>
> Undefined behaviour from the C spec point of view, sure.  But from a
> realistic hardware/compiler point of view, not really.  These are going
> to be simple loads & stores. since bdi->ra_pages is an unsigned long.
> And if it does happen to be garbage, how much harm is really done?
> We'll end up with the wrong readahead value for a single open file.
> Performance might not be so great, but it's not like we're going to
> grant root access as a result.
>
> We should probably add READ_ONCE / WRITE_ONCE annotations to be
> certain the compiler doesn't get all clever, but that brings me to the
> question about why you're developing this tool.  We already have tooling
> to annoy people with these kinds of nitpicks (KCSAN).


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-01-17 21:02 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-17 20:08 Race in mm/readahead.c:140 file_ra_state_init / block/ioctl.c:497 blkdev_common_ioctl Gabriel Ryan
2024-01-17 20:38 ` Matthew Wilcox
2024-01-17 21:02   ` Gabriel Ryan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox