linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/fadvise: validate offset in generic_fadvise
@ 2025-12-22 14:18 klourencodev
  2025-12-22 17:52 ` Andrew Morton
  2025-12-23  9:45 ` David Hildenbrand (Red Hat)
  0 siblings, 2 replies; 5+ messages in thread
From: klourencodev @ 2025-12-22 14:18 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, david, Kevin Lourenco, Kevin Lourenco

From: Kevin Lourenco <klourencodev@gmail.com>

When converted to (u64) for page calculations, a negative offset
can produce extremely large page indices. This may lead to issues in certain advice modes (excessive readahead or
cache invalidation)

offsets are normally non-negative, but the API does not guarantee this. Since 'len' is already
validated, checking 'offset' here is reasonable to prevent potential system instability.

Signed-off-by: Kevin Lourenco <k.lourenco@criteo.com>
---
 mm/fadvise.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/fadvise.c b/mm/fadvise.c
index 67028e30aa91..b63fe21416ff 100644
--- a/mm/fadvise.c
+++ b/mm/fadvise.c
@@ -43,7 +43,7 @@ int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
 		return -ESPIPE;
 
 	mapping = file->f_mapping;
-	if (!mapping || len < 0)
+	if (!mapping || len < 0 || offset < 0)
 		return -EINVAL;
 
 	bdi = inode_to_bdi(mapping->host);
-- 
2.47.3



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mm/fadvise: validate offset in generic_fadvise
  2025-12-22 14:18 [PATCH] mm/fadvise: validate offset in generic_fadvise klourencodev
@ 2025-12-22 17:52 ` Andrew Morton
  2025-12-22 23:38   ` Kevin Lourenco
  2025-12-23  9:45 ` David Hildenbrand (Red Hat)
  1 sibling, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2025-12-22 17:52 UTC (permalink / raw)
  To: klourencodev; +Cc: linux-mm, david, Kevin Lourenco

On Mon, 22 Dec 2025 15:18:17 +0100 klourencodev@gmail.com wrote:

> From: Kevin Lourenco <klourencodev@gmail.com>
> 
> When converted to (u64) for page calculations, a negative offset
> can produce extremely large page indices. This may lead to issues in certain advice modes (excessive readahead or
> cache invalidation)
> 
> offsets are normally non-negative, but the API does not guarantee this. Since 'len' is already
> validated, checking 'offset' here is reasonable to prevent potential system instability.
> 

Yeah, seems this code really didn't think about negative offsets.

	fadvise(fd, -20, 40, advice);

does make some sense I guess - equivalent to

	fadvise(fd, 0, 20, advice);

but whether this is how the implementation actually treats -20,40 is
unclear.  Probably not, and what will filemap_flush_range() make of
this.

ugh.

I suppose I'll toss this into -next for now, see if it breaks any tests. 
But more thought is needed.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mm/fadvise: validate offset in generic_fadvise
  2025-12-22 17:52 ` Andrew Morton
@ 2025-12-22 23:38   ` Kevin Lourenco
  0 siblings, 0 replies; 5+ messages in thread
From: Kevin Lourenco @ 2025-12-22 23:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, david, Kevin Lourenco

Thank you, Andrew, for taking a look and for queueing in -next.

I'll keep an eye on it and will adjust the patch if needed.

Kevin

Le lun. 22 déc. 2025 à 18:52, Andrew Morton
<akpm@linux-foundation.org> a écrit :
>
> On Mon, 22 Dec 2025 15:18:17 +0100 klourencodev@gmail.com wrote:
>
> > From: Kevin Lourenco <klourencodev@gmail.com>
> >
> > When converted to (u64) for page calculations, a negative offset
> > can produce extremely large page indices. This may lead to issues in certain advice modes (excessive readahead or
> > cache invalidation)
> >
> > offsets are normally non-negative, but the API does not guarantee this. Since 'len' is already
> > validated, checking 'offset' here is reasonable to prevent potential system instability.
> >
>
> Yeah, seems this code really didn't think about negative offsets.
>
>         fadvise(fd, -20, 40, advice);
>
> does make some sense I guess - equivalent to
>
>         fadvise(fd, 0, 20, advice);
>
> but whether this is how the implementation actually treats -20,40 is
> unclear.  Probably not, and what will filemap_flush_range() make of
> this.
>
> ugh.
>
> I suppose I'll toss this into -next for now, see if it breaks any tests.
> But more thought is needed.
>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mm/fadvise: validate offset in generic_fadvise
  2025-12-22 14:18 [PATCH] mm/fadvise: validate offset in generic_fadvise klourencodev
  2025-12-22 17:52 ` Andrew Morton
@ 2025-12-23  9:45 ` David Hildenbrand (Red Hat)
  2025-12-23 16:16   ` Kevin Lourenco
  1 sibling, 1 reply; 5+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-12-23  9:45 UTC (permalink / raw)
  To: klourencodev, linux-mm; +Cc: akpm, Kevin Lourenco

On 12/22/25 15:18, klourencodev@gmail.com wrote:
> From: Kevin Lourenco <klourencodev@gmail.com>
> 
> When converted to (u64) for page calculations, a negative offset
> can produce extremely large page indices. This may lead to issues in certain advice modes (excessive readahead or
> cache invalidation)
> 
> offsets are normally non-negative, but the API does not guarantee this. Since 'len' is already
> validated, checking 'offset' here is reasonable to prevent potential system instability.

Hi,

we tend to break lines as 72 chars in the patch description. I assume 
Andrew will fix that up or already did it :)

> 
> Signed-off-by: Kevin Lourenco <k.lourenco@criteo.com>
> ---
>   mm/fadvise.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/fadvise.c b/mm/fadvise.c
> index 67028e30aa91..b63fe21416ff 100644
> --- a/mm/fadvise.c
> +++ b/mm/fadvise.c
> @@ -43,7 +43,7 @@ int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
>   		return -ESPIPE;
>   
>   	mapping = file->f_mapping;
> -	if (!mapping || len < 0)
> +	if (!mapping || len < 0 || offset < 0)
>   		return -EINVAL;
>   
>   	bdi = inode_to_bdi(mapping->host);

The man page of fadvise64()/posix_madvise() is a bit unclear. It doesn't 
really specify what's supposed to happen on negative size or offset.

Staring at test cases in LTP, we seem to have:

* Check the value that posix_fadvise returns for wrong ADVISE value
* Check the value that posix_fadvise returns for wrong file descriptor
* Check the value that posix_fadvise returns for wrong ADVISE value
* Check the value that posix_fadvise returns for pipe descriptor

And we primarily only seem to test what's documented in the man page to 
fail.

Which raises the questions:
(1) Could we accidentally break some users out there?
(2) Should we update the man page to document what is supposed to happen
     with negative size or offset.

-- 
Cheers

David


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] mm/fadvise: validate offset in generic_fadvise
  2025-12-23  9:45 ` David Hildenbrand (Red Hat)
@ 2025-12-23 16:16   ` Kevin Lourenco
  0 siblings, 0 replies; 5+ messages in thread
From: Kevin Lourenco @ 2025-12-23 16:16 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat); +Cc: linux-mm, akpm, Kevin Lourenco

Hi David,

Thank you for your thoughts and recommendations. Well noted, I will
now start limiting the commit message to those specifications :)

For point (1):

I see POSIX definition like this: "The byte position in the file where
the next I/O operation begins."
(https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html?utm_source=chatgpt.com),
so what is almost certain is that a negative offset makes no sense /
there is no hidden mechanism or anything. I don't think we can
accidentally break users, my intuition is that nobody ever
intentionally passes a negative offset (probably the same for len).

We could rely exclusively on the user to ensure the offset is positive
and follow POSIX recommendation. But since errors can occur, I think
it can be hard to debug where a simple test, the same for len, could
have made their experience easier. I think this is also the path
FreeBSD chose: https://github.com/freebsd/freebsd-src/blob/main/sys/kern/vfs_syscalls.c#L4918

For point (2):

Since I think nobody intentionally passes negative values and since
that makes no sense semantically speaking, I think we can consider it
unnecessary to update the man page right now and only think about it
in the future if a negative offset gains a new meaning, for example.

Wdyt?

Le mar. 23 déc. 2025 à 10:45, David Hildenbrand (Red Hat)
<david@kernel.org> a écrit :
>
> On 12/22/25 15:18, klourencodev@gmail.com wrote:
> > From: Kevin Lourenco <klourencodev@gmail.com>
> >
> > When converted to (u64) for page calculations, a negative offset
> > can produce extremely large page indices. This may lead to issues in certain advice modes (excessive readahead or
> > cache invalidation)
> >
> > offsets are normally non-negative, but the API does not guarantee this. Since 'len' is already
> > validated, checking 'offset' here is reasonable to prevent potential system instability.
>
> Hi,
>
> we tend to break lines as 72 chars in the patch description. I assume
> Andrew will fix that up or already did it :)
>
> >
> > Signed-off-by: Kevin Lourenco <k.lourenco@criteo.com>
> > ---
> >   mm/fadvise.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/fadvise.c b/mm/fadvise.c
> > index 67028e30aa91..b63fe21416ff 100644
> > --- a/mm/fadvise.c
> > +++ b/mm/fadvise.c
> > @@ -43,7 +43,7 @@ int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
> >               return -ESPIPE;
> >
> >       mapping = file->f_mapping;
> > -     if (!mapping || len < 0)
> > +     if (!mapping || len < 0 || offset < 0)
> >               return -EINVAL;
> >
> >       bdi = inode_to_bdi(mapping->host);
>
> The man page of fadvise64()/posix_madvise() is a bit unclear. It doesn't
> really specify what's supposed to happen on negative size or offset.
>
> Staring at test cases in LTP, we seem to have:
>
> * Check the value that posix_fadvise returns for wrong ADVISE value
> * Check the value that posix_fadvise returns for wrong file descriptor
> * Check the value that posix_fadvise returns for wrong ADVISE value
> * Check the value that posix_fadvise returns for pipe descriptor
>
> And we primarily only seem to test what's documented in the man page to
> fail.
>
> Which raises the questions:
> (1) Could we accidentally break some users out there?
> (2) Should we update the man page to document what is supposed to happen
>      with negative size or offset.
>
> --
> Cheers
>
> David


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-12-23 16:16 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-12-22 14:18 [PATCH] mm/fadvise: validate offset in generic_fadvise klourencodev
2025-12-22 17:52 ` Andrew Morton
2025-12-22 23:38   ` Kevin Lourenco
2025-12-23  9:45 ` David Hildenbrand (Red Hat)
2025-12-23 16:16   ` Kevin Lourenco

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox