* the max size of block device on 32bit os,when using do_generic_file_read() proceed.
@ 2012-05-24 13:38 majianpeng
2012-05-26 21:23 ` Hugh Dickins
2012-05-28 6:26 ` Re: the max size of block device on 32bit os,when usingdo_generic_file_read() proceed majianpeng
0 siblings, 2 replies; 3+ messages in thread
From: majianpeng @ 2012-05-24 13:38 UTC (permalink / raw)
To: akpm, hughd; +Cc: linux-mm
Hi all,
I read from a RAID5 array whose size is 30 TB. The OS is 32-bit RHEL6.
I read the array as a whole device (not partitioned) and found that the data came from an address other than the one I asked for.
I then tested the newest kernel code, and the problem is still there.
I reviewed the code. In the function do_generic_file_read():
index = *ppos >> PAGE_CACHE_SHIFT;
index is u32 and *ppos is long long.
So when *ppos is larger than 0xFFFFFFFF * PAGE_CACHE_SIZE (16 TB), the index is wrong.
My question: on a 32-bit OS, can a block device not be larger than 16 TB? In other words, must a block device larger than 16 TB be partitioned?
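The truncation described above can be reproduced in user space. The following is only a minimal sketch, assuming 4 KiB pages and a 32-bit page index; the names PAGE_SHIFT_4K, page_index32() and aliased_offset() are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12 /* assumption: 4 KiB pages */

/* Page index as it ends up in a 32-bit variable: the 64-bit quotient
 * pos >> PAGE_SHIFT_4K loses its upper 32 bits on assignment. */
static uint32_t page_index32(int64_t pos)
{
    return (uint32_t)((uint64_t)pos >> PAGE_SHIFT_4K);
}

/* Byte offset that the truncated index really refers to. */
static uint64_t aliased_offset(int64_t pos)
{
    uint64_t in_page = (uint64_t)pos & ((1ULL << PAGE_SHIFT_4K) - 1);

    return ((uint64_t)page_index32(pos) << PAGE_SHIFT_4K) | in_page;
}
```

Under these assumptions, any offset at or above 2^44 bytes (16 TiB) maps back onto an offset below 16 TiB, which matches the symptom of reading data from the wrong address.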
Thanks all.
--------------
majianpeng
2012-05-24
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: the max size of block device on 32bit os,when using do_generic_file_read() proceed.
2012-05-24 13:38 the max size of block device on 32bit os,when using do_generic_file_read() proceed majianpeng
@ 2012-05-26 21:23 ` Hugh Dickins
2012-05-28 6:26 ` Re: the max size of block device on 32bit os,when usingdo_generic_file_read() proceed majianpeng
1 sibling, 0 replies; 3+ messages in thread
From: Hugh Dickins @ 2012-05-26 21:23 UTC (permalink / raw)
To: majianpeng; +Cc: Al Viro, Andrew Morton, linux-mm, linux-fsdevel
On Thu, 24 May 2012, majianpeng wrote:
> Hi all,
> I read from a RAID5 array whose size is 30 TB. The OS is 32-bit RHEL6.
> I read the array as a whole device (not partitioned) and found that the data came from an address other than the one I asked for.
> I then tested the newest kernel code, and the problem is still there.
> I reviewed the code. In the function do_generic_file_read():
>
> index = *ppos >> PAGE_CACHE_SHIFT;
> index is u32 and *ppos is long long.
> So when *ppos is larger than 0xFFFFFFFF * PAGE_CACHE_SIZE (16 TB), the index is wrong.
>
> My question: on a 32-bit OS, can a block device not be larger than 16 TB? In other words, must a block device larger than 16 TB be partitioned?
I am not surprised that the page cache limitation prevents you from
reading the whole device with a 32-bit kernel. See MAX_LFS_FILESIZE in
include/linux/fs.h. Our answer to that is just to use a 64-bit kernel.
#if BITS_PER_LONG==32
#define MAX_LFS_FILESIZE (((u64)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1)
#elif BITS_PER_LONG==64
#define MAX_LFS_FILESIZE 0x7fffffffffffffffUL
#endif
But I am a little surprised that you get as far as 16TiB (with 4k page):
I would have expected you to be stopped just before 8TiB (although I
suspect that the limitation to 8TiB rather than 16TiB is unnecessary).
And if I understand you correctly, read() or pread() gave you no error
at those large offsets, but supplied data from the low offset instead?
That does surprise me - have we missed a check there?
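The 8TiB and 16TiB figures can be checked with a little arithmetic. This is a user-space sketch only, assuming BITS_PER_LONG == 32 and a 4 KiB PAGE_CACHE_SIZE; the function names are hypothetical and are not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the quoted 32-bit definition:
 * ((u64)PAGE_CACHE_SIZE << (BITS_PER_LONG-1)) - 1, with BITS_PER_LONG == 32. */
static uint64_t max_lfs_filesize32(uint64_t page_size)
{
    const unsigned bits_per_long = 32;

    return (page_size << (bits_per_long - 1)) - 1;
}

/* Largest byte offset reachable if all 32 bits of the page index were used. */
static uint64_t max_offset_full_index32(uint64_t page_size)
{
    return (page_size << 32) - 1;
}
```

With page_size = 4096, the first function yields 2^43 - 1 (just under 8TiB) and the second 2^44 - 1 (just under 16TiB), which is the gap between the two limits discussed above.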
Hugh
* Re: Re: the max size of block device on 32bit os,when usingdo_generic_file_read() proceed.
2012-05-24 13:38 the max size of block device on 32bit os,when using do_generic_file_read() proceed majianpeng
2012-05-26 21:23 ` Hugh Dickins
@ 2012-05-28 6:26 ` majianpeng
1 sibling, 0 replies; 3+ messages in thread
From: majianpeng @ 2012-05-28 6:26 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Al Viro, Andrew Morton, linux-mm, linux-fsdevel
Sorry for the late reply. I reviewed the code again and found some problems.
I created a software RAID array larger than 16 TB.
The OS is 32-bit x86 Ubuntu 12.04.
udev creates the block device node in the /dev directory (a tmpfs).
So I read the tmpfs code:
in mm/shmem.c:shmem_fill_super()
>sb->s_maxbytes = MAX_LFS_FILESIZE;
On my machine, MAX_LFS_FILESIZE equals 8 TB - 1.
But the read path is:
generic_file_aio_read --> do_generic_file_read (when the direct I/O flag is not used)
In the function do_generic_file_read():
>index = *ppos >> PAGE_CACHE_SHIFT;
index has the type pgoff_t.
So if *ppos is larger than 16 TB, index overflows. As you said, the read then returns data from a low offset.
I also tested the write operation:
blkdev_aio_write --> __generic_file_aio_write
In the function __generic_file_aio_write(),
the range is checked by generic_write_checks().
But in that function:
>if (likely(!isblk)) {
>    if (unlikely(*pos >= inode->i_sb->s_maxbytes)) {
>        if (*count || *pos > inode->i_sb->s_maxbytes) {
>            return -EFBIG;
>        }
>        /* zero-length writes at ->s_maxbytes are OK */
>    }
>    if (unlikely(*pos + *count > inode->i_sb->s_maxbytes))
>        *count = inode->i_sb->s_maxbytes - *pos;
>} else {
>#ifdef CONFIG_BLOCK
>    loff_t isize;
>    if (bdev_read_only(I_BDEV(inode)))
>        return -EPERM;
>    isize = i_size_read(inode);
>    if (*pos >= isize) {
>        if (*count || *pos > isize)
>            return -ENOSPC;
>    }
>    if (*pos + *count > isize)
>        *count = isize - *pos;
>#else
>    return -EPERM;
>#endif
>}
So s_maxbytes (MAX_LFS_FILESIZE) is checked for regular files, but if the file is a block device it is not checked; only the real device size is checked.
That is also a bug: if the block device is larger than 16 TB, no error is returned and execution continues.
Execution then goes through generic_file_buffered_write() (no direct I/O) --> generic_perform_write --> write_begin (blkdev_write_begin)
--> block_write_begin
In the function block_write_begin():
>pgoff_t index = pos >> PAGE_CACHE_SHIFT;
index will overflow.
I once thought about writing a patch for these bugs (I might become well known, haha). But I can't, because of this comment in generic_write_checks():
>/*
> * Are we about to exceed the fs block limit ?
> *
> * If we have written data it becomes a short write. If we have
> * exceeded without writing data we send a signal and return EFBIG.
> * Linus frestrict idea will clean these up nicely..
> */
> if (likely(!isblk)) {
How should a block device be handled? As a regular file or not?
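One way to picture the question is a block-device check that also honours the page cache limit. The sketch below is purely illustrative and is not a patch from this thread: blk_write_checks_sketch() is a hypothetical user-space stand-in for the block-device branch of generic_write_checks(), where isize plays the role of i_size_read(inode) and limit the role of s_maxbytes. The choice of -EFBIG for the limit case is an assumption borrowed from the regular-file branch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical check: clamp a write to min(device size, page cache limit).
 * Assumes pos >= 0. Returns 0 on success (possibly shortening *count),
 * -ENOSPC past the end of the device, -EFBIG past the limit. */
static int blk_write_checks_sketch(int64_t pos, uint64_t *count,
                                   int64_t isize, int64_t limit)
{
    int64_t end = isize < limit ? isize : limit;

    if (pos >= end) {
        if (*count || pos > end)
            return pos >= isize ? -ENOSPC : -EFBIG;
        return 0; /* zero-length write exactly at the end is OK */
    }
    if ((uint64_t)(end - pos) < *count)
        *count = (uint64_t)(end - pos); /* becomes a short write */
    return 0;
}
```

With a 30 TiB device and limit = 2^43 - 1, a write at offset 2^44 would be rejected in this sketch instead of reaching block_write_begin() with an index that has already wrapped.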
------------------
majianpeng
2012-05-28