From: Mark Tomlinson <Mark.Tomlinson@alliedtelesis.co.nz>
To: "linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: MIPS: Memory corruption forking within pthread environment
Date: Tue, 14 Mar 2017 01:15:41 +0000
Message-ID: <7e632811-4ce5-7af9-6560-3d146984592c@alliedtelesis.co.nz>
I am using a MIPS CPU running Linux 4.4.6, and have a problem when
forking from a process which is also using pthreads. I believe this is
OK to do, although there are limits on what the forked process is
allowed to do.
The CPU is a BCM53003, which uses a MIPS32 74K core. Although we had to
bring in some patches from Broadcom for this CPU, there are no changes
in the memory management that I can see. The kernel is compiled without
CONFIG_PREEMPT or CONFIG_SMP.
I have traced the problem down to the duplication of the memory map for
the newly forked process. As the parent has created pthreads, this
memory map is currently shared, since pthreads all share the same memory
space. In copy_pte_range() there is a cond_resched(), which will allow
other tasks to run while this copying is taking place. If I remove the
cond_resched (and associated logic), then the problem goes away.
So I guess my question is why this copy is not working successfully with
the reschedule in the middle of it (and yet works on other platforms).
The memory corruption is occurring in the original memory map, not the
copy, as it is the existing pthreads that segfault. I do know that some
pages will get set to read-only to allow COW, so I am wondering whether
that is somehow related. Is there some extra cache flush or MMU
invalidation that needs to occur on this CPU?
Here is the test code which will segfault when run on this CPU. The same
code runs fine on other architectures (x86, MIPS64, PowerPC, ARM).
/**
 * Simple test app to reproduce a problem we were seeing with stack corruption
 * within a pthread, while another thread is doing a fork() operation.
 */
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#include <string.h>
#include <assert.h>
#include <syslog.h>
#include <stdio.h>
#include <time.h>
#include <errno.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <linux/if_packet.h>
#include <net/ethernet.h>
#include <arpa/inet.h>
#include <pthread.h>
/* If the corruption doesn't occur, this should be more like 1000000 to avoid
 * screeds of debug output */
#define PERIODIC_DEBUG 1000
#define TEST_PKT_ID 3434
#define TEST_ETHERTYPE htonl(0xbeef)
struct test_packet
{
    uint8_t eth_dst[ETHER_ADDR_LEN];
    uint8_t eth_src[ETHER_ADDR_LEN];
    uint32_t eth_type;
    uint32_t sequence;
    uint8_t mac[ETHER_ADDR_LEN];
    uint32_t id;
    uint32_t time;
};
uint8_t test_dst_addr[ETHER_ADDR_LEN] = { 0x01, 0x00, 0x11, 0x22, 0x33, 0x44 };
uint8_t test_src_addr[ETHER_ADDR_LEN] = { 0x0b, 0xad, 0xde, 0xad, 0xbe, 0xef };
int test_sequence = 0;
uint32_t num_forks = 0;
uint32_t num_corruptions = 0;
uint32_t num_loops = 0;
/**
 * This is based off code that was originally sending a packet. It turns out
 * that all we need to do to see the corruption problem is set a few fields in
 * the packet, and then sanity-check that they're all still correct afterwards.
 * @note this achieves much the same as compiling with -fstack-protector-all,
 * however it just casts a slightly wider net to check for stack corruption.
 */
void
test_packet_send (void)
{
    struct test_packet pktPt;
    uint8_t buf[32];

    /* construct a test pkt */
    memcpy (pktPt.eth_dst, test_dst_addr, ETHER_ADDR_LEN);
    memcpy (pktPt.eth_src, test_src_addr, ETHER_ADDR_LEN);
    pktPt.eth_type = TEST_ETHERTYPE;
    pktPt.sequence = test_sequence++;
    memcpy (pktPt.mac, test_src_addr, ETHER_ADDR_LEN);
    pktPt.id = TEST_PKT_ID;
    pktPt.time = 0;

    /* this line/variable isn't strictly needed. It just seems to make the
     * corruption more likely to occur in the pktPt variable, where we can
     * detect it more easily */
    memset (buf, 0, sizeof (buf));

    /* Sanity-check the pkt values set at the start of the function are still
     * intact, i.e. the stack hasn't been stomped on */
    if (memcmp (pktPt.eth_dst, test_dst_addr, ETHER_ADDR_LEN) != 0 ||
        memcmp (pktPt.eth_src, test_src_addr, ETHER_ADDR_LEN) != 0 ||
        pktPt.eth_type != TEST_ETHERTYPE ||
        memcmp (pktPt.mac, test_src_addr, ETHER_ADDR_LEN) != 0 ||
        pktPt.id != TEST_PKT_ID ||
        pktPt.time != 0)
    {
        fprintf (stdout, "%u corruptions out of %u loops (%u forks)\n",
                 ++num_corruptions, num_loops, num_forks);
        fflush (stdout);
    }
}
static void *
test_send_thread (void *unused)
{
    while (true)
    {
        /* without some sort of yield here, the problem doesn't seem to occur.
         * The original code did a select() here, but a 1us sleep seems to
         * reproduce the problem a lot better */
        usleep (1);
        test_packet_send ();

        /* Report on the progress of the test every so often. One problem that
         * sometimes occurs is that a corruption causes a system call to
         * lock-up. So it looks like no corruptions are detected, when really
         * the test isn't running properly */
        if ((++num_loops % PERIODIC_DEBUG) == 0)
        {
            fprintf (stdout, "%u corruptions out of %u loops (%u forks)\n",
                     num_corruptions, num_loops, num_forks);
            fflush (stdout);
        }
    }
    return NULL;
}
/**
 * Polls to see if the last child process forked has been cleaned up. If so,
 * then it fork()s a new child again (the child process does nothing - it just
 * exits).
 * @note the problem also occurs if the parent process does a blocking call to
 * waitpid() - it just seems more reproducible using a non-blocking waitpid().
 */
void
try_fork_again (void)
{
    static pid_t last_pid = -1;
    int status;

    if (last_pid >= 0)
    {
        /* wait for the child process to exit before proceeding (to make sure
         * the zombie process gets cleaned up properly) */
        if (waitpid (last_pid, &status, WNOHANG) > 0)
        {
            last_pid = -1;
        }
    }

    /* if any previous children are now cleaned up, then fork() again */
    if (last_pid < 0)
    {
        last_pid = fork ();
        num_forks++;
        if (last_pid < 0)
        {
            fprintf (stderr, "fork() failed - %s\n", strerror (errno));
        }
        else if (last_pid == 0)
        {
            _exit (0);
        }
    }
}
int
main (int argc, char *argv[])
{
    pthread_t tid;
    int err;

    /* create a separate thread that pretends to send packets. Note that
     * pthread_create() returns the error number rather than setting errno */
    err = pthread_create (&tid, NULL, test_send_thread, NULL);
    if (err != 0)
    {
        fprintf (stderr, "Could not create pthread - %s\n", strerror (err));
        return EXIT_FAILURE;
    }

    /* meanwhile in the main thread do lots and lots of forks */
    while (true)
    {
        try_fork_again ();
    }
    return EXIT_FAILURE;
}