linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Gong, Sishuai" <sishuai@purdue.edu>
To: "axboe@kernel.dk" <axboe@kernel.dk>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [Race] data race between blkdev_ioctl() and generic_fadvise()
Date: Tue, 1 Dec 2020 20:33:51 +0000	[thread overview]
Message-ID: <15C63910-2ED5-4706-929B-0FC65F8DB757@purdue.edu> (raw)
In-Reply-To: <7F866A14-69D1-4F05-B521-05212A3F7ED7@purdue.edu>

We also want to provide more information about this data race and hope this could be helpful for you. 

------------------------------------------
Interleavings
With more experiments, we observed two interleavings of these two racy instructions.

Interleaving 1
Writer								Reader
								file->f_ra.ra_pages = bdi->ra_pages;
								// read value = 20
bdev->bd_bdi->ra_pages = (arg * 512) / PAGE_SIZE;
// write value = 0

Interleaving 2
bdev->bd_bdi->ra_pages = (arg * 512) / PAGE_SIZE;
// write value = 0
								file->f_ra.ra_pages = bdi->ra_pages;
								// read value = 0

In both interleavings, the read value would be passed to file->f_ra.ra_pages. However, in our test input, this variable was never touched after that, so we are not sure what impact it can have afterwards.

------------------------------------------
Test input
The concurrent input behind this data race is a pair of single-thread kernel input. We attach them in Syzkaller’s format here.
 
Input-1
r0 = syz_open_dev$loop(&(0x7f0000000000)='/dev/loop#\x00', 0x0, 0x0)
ioctl$BLKFRASET(r0, 0x1264, 0x0)

Input-2
r0 = syz_open_dev$loop(&(0x7f0000000180)='/dev/loop#\x00', 0x0, 0x0)
fadvise64(r0, 0x0, 0x4130122d, 0x5)


Thanks,
Sishuai

> On Nov 30, 2020, at 9:58 AM, Gong, Sishuai <sishuai@purdue.edu> wrote:
> 
> Hi,
> 
> We found a data race in linux kernel 5.3.11 that we are able to reproduce in x86 under specific interleavings. Currently, we are not sure about the consequence of this race so we would like to confirm with the community if this is a harmful bug.
> 
> ------------------------------------------
> Writer site
> 
> /tmp/tmp.B7zb7od2zE-5.3.11/extract/linux-5.3.11/block/ioctl.c:573
>        553      case BLKPBSZGET: /* get block device physical block size */
>        554          return put_uint(arg, bdev_physical_block_size(bdev));
>        555      case BLKIOMIN:
>        556          return put_uint(arg, bdev_io_min(bdev));
>        557      case BLKIOOPT:
>        558          return put_uint(arg, bdev_io_opt(bdev));
>        559      case BLKALIGNOFF:
>        560          return put_int(arg, bdev_alignment_offset(bdev));
>        561      case BLKDISCARDZEROES:
>        562          return put_uint(arg, 0);
>        563      case BLKSECTGET:
>        564          max_sectors = min_t(unsigned int, USHRT_MAX,
>        565                      queue_max_sectors(bdev_get_queue(bdev)));
>        566          return put_ushort(arg, max_sectors);
>        567      case BLKROTATIONAL:
>        568          return put_ushort(arg, !blk_queue_nonrot(bdev_get_queue(bdev)));
>        569      case BLKRASET:
>        570      case BLKFRASET:
>        571          if(!capable(CAP_SYS_ADMIN))
>        572              return -EACCES;
> ==>    573          bdev->bd_bdi->ra_pages = (arg * 512) / PAGE_SIZE;
>        574          return 0;
>        575      case BLKBSZSET:
>        576          return blkdev_bszset(bdev, mode, argp);
>        577      case BLKPG:
>        578          return blkpg_ioctl(bdev, argp);
>        579      case BLKRRPART:
>        580          return blkdev_reread_part(bdev);
>        581      case BLKGETSIZE:
>        582          size = i_size_read(bdev->bd_inode);
>        583          if ((size >> 9) > ~0UL)
>        584              return -EFBIG;
>        585          return put_ulong(arg, size >> 9);
>        586      case BLKGETSIZE64:
>        587          return put_u64(arg, i_size_read(bdev->bd_inode));
>        588      case BLKTRACESTART:
>        589      case BLKTRACESTOP:
>        590      case BLKTRACESETUP:
>        591      case BLKTRACETEARDOWN:
>        592          return blk_trace_ioctl(bdev, cmd, argp);
>        593      case IOC_PR_REGISTER:
> 
> ------------------------------------------
> Reader site
> /tmp/tmp.B7zb7od2zE-5.3.11/extract/linux-5.3.11/mm/fadvise.c:79
>         66      /*
>         67       * Careful about overflows. Len == 0 means "as much as possible".  Use
>         68       * unsigned math because signed overflows are undefined and UBSan
>         69       * complains.
>         70       */
>         71      endbyte = (u64)offset + (u64)len;
>         72      if (!len || endbyte < len)
>         73          endbyte = -1;
>         74      else
>         75          endbyte--;      /* inclusive */
>         76
>         77      switch (advice) {
>         78      case POSIX_FADV_NORMAL:
> ==>     79          file->f_ra.ra_pages = bdi->ra_pages;
>         80          spin_lock(&file->f_lock);
>         81          file->f_mode &= ~FMODE_RANDOM;
>         82          spin_unlock(&file->f_lock);
>         83          break;
>         84      case POSIX_FADV_RANDOM:
>         85          spin_lock(&file->f_lock);
>         86          file->f_mode |= FMODE_RANDOM;
>         87          spin_unlock(&file->f_lock);
>         88          break;
>         89      case POSIX_FADV_SEQUENTIAL:
>         90          file->f_ra.ra_pages = bdi->ra_pages * 2;
>         91          spin_lock(&file->f_lock);
>         92          file->f_mode &= ~FMODE_RANDOM;
>         93          spin_unlock(&file->f_lock);
>         94          break;
>         95      case POSIX_FADV_WILLNEED:
>         96          /* First and last PARTIAL page! */
>         97          start_index = offset >> PAGE_SHIFT;
>         98          end_index = endbyte >> PAGE_SHIFT;
> 
> 
> 
> ------------------------------------------
> Writer calling trace
> 
> - ksys_ioctl
> -- do_vfs_ioctl
> --- vfs_ioctl
> ---- blkdev_ioctl
> 
> ------------------------------------------
> Reader calling trace
> - ksys_fadvise64_64
> -- vfs_fadvise
> --- generic_fadvise
> 
> 
> 
> Thanks,
> Sishuai
> 


      reply	other threads:[~2020-12-01 20:33 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-30 14:58 Gong, Sishuai
2020-12-01 20:33 ` Gong, Sishuai [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=15C63910-2ED5-4706-929B-0FC65F8DB757@purdue.edu \
    --to=sishuai@purdue.edu \
    --cc=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox