From: Mikulas Patocka <mpatocka@redhat.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <axboe@kernel.dk>,
"Alasdair G. Kergon" <agk@redhat.com>,
Mike Snitzer <msnitzer@redhat.com>,
dm-devel@redhat.com, "David S. Miller" <davem@davemloft.net>,
linux-ide@vger.kernel.org, linux-scsi@vger.kernel.org,
linux-kernel@vger.kernel.org, Neil Brown <neilb@suse.de>,
linux-raid@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] block devices: validate block device capacity
Date: Thu, 30 Jan 2014 19:20:49 -0500 (EST)
Message-ID: <alpine.LRH.2.02.1401301905520.25766@file01.intranet.prod.int.rdu2.redhat.com>
In-Reply-To: <1391125027.2181.114.camel@dabdike.int.hansenpartnership.com>
On Thu, 30 Jan 2014, James Bottomley wrote:
> On Thu, 2014-01-30 at 18:10 -0500, Mikulas Patocka wrote:
> >
> > On Thu, 30 Jan 2014, James Bottomley wrote:
> >
> > > Why is this? the whole reason for CONFIG_LBDAF is supposed to be to
> > > allow 64 bit offsets for block devices on 32 bit. It sounds like
> > > there's somewhere not using sector_t ... or using it wrongly which needs
> > > fixing.
> >
> > The page cache uses unsigned long as a page index. Therefore, if unsigned
> > long is 32-bit, the block device may have at most 2^32-1 pages.
>
> Um, that's the index into the mapping, not the device; a device can have
> multiple mappings and each mapping has a radix tree of pages. For most
> filesystems a mapping is equivalent to a file, so we can have large
> filesystems, but they can't have files over actually 4GB on 32 bits
> otherwise mmap fails.
A device may also be accessed directly (by opening /dev/sdX), and that
creates a mapping too. Thus, the size of a mapping limits the size of a
block device.
The main problem is that pgoff_t is only 4 bytes wide on 32-bit
architectures. Changing it to 8 bytes may fix it, but there may be some
hidden places where a pgoff is converted to unsigned long. Who knows
whether they exist?
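
As a rough illustration of that concern, the sketch below models a page
index that has been widened to 64 bits but still passes through a helper
taking a 32-bit unsigned long. All names here (wide_pgoff_t, ulong32_t,
legacy_index, index_survives, checked_narrow) are hypothetical and are not
kernel identifiers; uint32_t stands in for unsigned long on a 32-bit
architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical widened page index; not the kernel's pgoff_t. */
typedef uint64_t wide_pgoff_t;

/* Stand-in for unsigned long on a 32-bit architecture. */
typedef uint32_t ulong32_t;

/* Models a helper that still takes the index as a 32-bit unsigned long. */
ulong32_t legacy_index(ulong32_t index)
{
	return index;
}

/* True if the index comes back unchanged after the 32-bit helper. */
bool index_survives(wide_pgoff_t index)
{
	return (wide_pgoff_t)legacy_index((ulong32_t)index) == index;
}

/* Debug check one might place at a suspected narrowing site. */
ulong32_t checked_narrow(wide_pgoff_t index)
{
	assert(index <= UINT32_MAX); /* fires if high bits would be lost */
	return (ulong32_t)index;
}
```

Any index of 2^32 or above fails index_survives(), which is the kind of
silent truncation that an audit of pgoff users would have to rule out.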
> Are we running into a problem with struct address_space where we've
> assumed the inode belongs to the file and lvm is doing something where
> it's the whole device?
lvm creates a 64TiB device, udev runs blkid on it, and blkid opens the
device and gets stuck because of an unsigned long overflow.
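
The arithmetic behind that overflow can be sketched as follows. This does
not reproduce the blkid hang itself; it only shows that a 64TiB device
needs more pages than a 32-bit index can count, so distinct offsets end up
with the same index. The 4096-byte page size is an assumption (as on x86),
and PAGE_BYTES, TIB, device_pages and page_index_32 are illustrative names,
not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES 4096ULL       /* assumed page size, as on x86 */
#define TIB        (1ULL << 40)  /* one tebibyte in bytes */

/* Number of pages needed to cover a device of the given size. */
uint64_t device_pages(uint64_t device_bytes)
{
	/* The sketch assumes a page-aligned device size. */
	assert(device_bytes % PAGE_BYTES == 0);
	return device_bytes / PAGE_BYTES;
}

/* Page index as a 32-bit type would hold it: the high bits are dropped. */
uint32_t page_index_32(uint64_t byte_offset)
{
	return (uint32_t)(byte_offset / PAGE_BYTES);
}
```

With these definitions, device_pages(64 * TIB) is 2^34, which exceeds
UINT32_MAX, and page_index_32(16 * TIB) wraps around to the same index as
offset 0.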
> > > > On 32-bit architectures, we must limit block device size to
> > > > PAGE_SIZE*(2^32-1).
> > >
> > > So you're saying CONFIG_LBDAF can never work, why?
> > >
> > > James
> >
> > CONFIG_LBDAF works, but it doesn't allow unlimited capacity: on x86,
> > without CONFIG_LBDAF, the limit is 2TiB. With CONFIG_LBDAF, the limit is
> > 16TiB (4096*2^32).
>
> I don't think the people who did the large block device work expected to
> gain only 3 bits for all their pain.
>
> James
One could change it to have three choices:
2TiB limit - 32-bit sector_t and 32-bit pgoff_t
16TiB limit - 64-bit sector_t and 32-bit pgoff_t
32PiB limit - 64-bit sector_t and 64-bit pgoff_t
Though, we need to know if the people who designed memory management agree
with changing pgoff_t to 64 bits.
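
The first two limits follow from the index width and the unit being
indexed, as the sketch below shows under the assumption of 512-byte sectors
and 4096-byte pages. limit_bytes is a hypothetical helper, not a kernel
function. The 32PiB figure is left out because its derivation is not given
here.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BYTES 512ULL      /* block-layer sector size */
#define PAGE_BYTES   4096ULL     /* assumed page size, as on x86 */
#define TIB          (1ULL << 40)

/*
 * Size in bytes covered by 2^index_bits units of unit_bytes each.
 * Hypothetical helper for illustration only.
 */
uint64_t limit_bytes(uint64_t unit_bytes, unsigned index_bits)
{
	/* Refuse inputs whose result would not fit in 64 bits. */
	assert(index_bits < 64);
	assert(unit_bytes <= (UINT64_MAX >> index_bits));
	return unit_bytes << index_bits;
}
```

limit_bytes(SECTOR_BYTES, 32) gives 2TiB and limit_bytes(PAGE_BYTES, 32)
gives 16TiB. The ratio between them is 4096/512 = 8, which is the three
bits mentioned above.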
Mikulas