So the one bf2 process still lingers. It is, however, defunct:
Code:
PID TTY STAT TIME COMMAND
18821 ? Zs 1:01 [bf2] <defunct>
So my questions are:
1. How often does GC try to restart dead servers?
2. Why does it keep polling for resource usage when the
server is dead? Is this a bug in the Linux version?
For most games, it doesn't really matter if it takes fifteen
minutes to restart, since a crash only happens every other
week or so. But for BF2 it's a totally different matter: large
BF2 servers crash more than once per day. Our previous
solution, which used Monit to monitor servers, would poll
servers every minute and restart them if they were dead. Is it
possible to make GC check the status of servers more frequently?
Edit: BTW, the timestamp for the RSS feed is off by one
hour. The server was restored at 02:16 AM, which is stated
by both the failure report and the RSS feed, but the
datestamp for the feed says 03:16. This is with the timezone
set to UTC+1.
I've done a quick bit of research with my uncle Google:
Apparently a defunct process is the same as a zombie, i.e. a
process that has exited but whose parent has not yet
collected its exit status (with wait()).
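To make the zombie state concrete, here is a small Python sketch (Linux only, since it reads /proc). It forks a child that exits at once, looks at the state letter while the parent has not yet called wait(), and then reaps it. The helper names proc_state and zombie_demo are made up for this illustration and have nothing to do with GC itself.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the PID is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The command name sits in parentheses and may contain spaces,
    # so the state letter is located relative to the last ')'.
    return data[data.rindex(")") + 2]


def zombie_demo():
    """Create a zombie, observe it, then reap it. Returns (state_before, state_after)."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately

    # Parent: give the child a moment to exit, but do NOT wait() on it yet.
    deadline = time.time() + 5
    while proc_state(pid) != "Z" and time.time() < deadline:
        time.sleep(0.01)

    before = proc_state(pid)  # zombie: exited, but exit status not yet collected
    os.waitpid(pid, 0)        # reap: the process table entry goes away
    after = proc_state(pid)
    return before, after
```

The "Z" seen before the waitpid() call is the same state that ps shows as Zs / <defunct> in the listing above.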
If I start a BF2 server in a screen session (screen ./start
etc.), then that screen becomes the parent of the BF2
process. If the BF2 process is killed or otherwise crashes, the
kernel notifies the screen process with a SIGCHLD, and the
screen exits immediately. The server itself typically dies from
a SIGSEGV if it crashes, but it can also be a SIGTERM or a
SIGKILL if the server is stopped with intent.
So it seems to me that GameCreate doesn't monitor for
signals from its children. Is this correct? In my opinion it
would be a great feature for GC to set up signal traps for its
children since that would detect a crashed server immediately,
and the new restarted server could be up and running within
seconds! Setting up such traps takes next to no resources, so
there should be no noticeable overhead from this.
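A rough Python sketch of the idea, for illustration only: this is not how GameCreate works internally, and supervise, describe_exit and DEMO_CMD are names invented here. Instead of installing a SIGCHLD handler it simply blocks in wait() on the child, which gives the same immediate notification without any polling. DEMO_CMD is a stand-in for the real start script; it is a tiny process that kills itself with SIGSEGV.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for the game server: a process that segfaults right away.
DEMO_CMD = [sys.executable, "-c",
            "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]


def describe_exit(returncode):
    """Translate a Popen return code into a readable reason."""
    if returncode < 0:
        # subprocess reports death-by-signal as the negated signal number.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def supervise(cmd, max_restarts=3):
    """Start cmd, block until it dies, and start it again straight away."""
    reasons = []
    for _ in range(max_restarts):
        child = subprocess.Popen(cmd)
        # wait() sleeps until the child exits and collects its exit status,
        # so the child never lingers as a zombie.
        child.wait()
        reasons.append(describe_exit(child.returncode))
    return reasons
```

Calling supervise(DEMO_CMD, max_restarts=2) restarts the stand-in twice and returns the reason for each exit. A real supervisor would also want a back-off so that a server which crashes on startup is not restarted in a tight loop.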
If using signal traps is not your cup of tea, then it would at
least be appreciated if the interval between checks were
lowered to a minute or even less. To illustrate the effect of
not restarting the server shortly after a crash, here are two
graphs that show the number of players on two different
servers. The first server is monitored by GC and takes 10-
15 minutes to restart after a crash. The second server is
monitored by Monit and takes up to 1 minute to restart:
(query failure detected by GC at 16:55, 21:44 and 12:37)
(crashed at 16:54, 17:36, 20:52 and 14:00)
As can be seen, the first server takes forever to refill since
players decide to join other servers instead of waiting, while
the second server fills up immediately after a crash.
Last edited by Kybber on Sun Mar 05, 2006 1:32 am; edited 2 times in total
Well, technically it does what you say. But it doesn't solve
the problem, and brings with it even more serious problems.
When I start a new BF2 server, gamecreate spawns a new
child called gamecreate.x86. This is different from the previous
version which spawned the BF2 process itself, as can be seen
in my pstrees above. Now it looks like this:
There were only four gamecreate.x86 processes before the
server was started. Process 31012 is new. 31012 is also
the PID reported in gamecreate.log:
Code:
Mon Mar 6 12:40:44 2006: ** Start GameCreate Booking: ID 280060 **
Mon Mar 6 12:40:44 2006: Launching process: /home/server/games/bf2/start.sh +config "configs/16617/serversettings.con" +mapList "configs/16617/maplist.con" +pbPath "configs/16617/pb" from directory /home/server/games/bf2
Mon Mar 6 12:40:44 2006: SSL >> ACK: START id=280060&pid=31012
However, 31012 is now defunct:
Code:
$ ps 31012
PID TTY STAT TIME COMMAND
31012 ? Zs 0:00 [gamecreate.x86] <defunct>
Now for the real problem: I am unable to restart the game
server from GC. Here's the log:
Code:
Mon Mar 6 12:45:26 2006: ** Stop GameCreate Booking: ID 280060 **
Mon Mar 6 12:45:27 2006: SSL >> ACK: STOP id=280060
[20 bytes] - result: 20
Mon Mar 6 12:45:27 2006: Sending PONG, waiting for response
Mon Mar 6 12:45:27 2006: SSL >> ACK: PONG
[11 bytes] - result: 11
Mon Mar 6 12:46:06 2006: ** Stop GameCreate Booking: ID 280060 **
Mon Mar 6 12:46:12 2006: Waiting for PID: 31012 failed
Mon Mar 6 12:46:12 2006: SSL >> ACK: STOP id=280060
[20 bytes] - result: 20
Mon Mar 6 12:46:18 2006: ** Create directory: /home/server/games/bf2/pbscreens/Battlefield_2_Server **
Mon Mar 6 12:46:18 2006: ** Create directory: /home/server/games/bf2/logs/280060 **
Mon Mar 6 12:46:18 2006: ** Remove directory: /home/server/games/bf2/mods/bf2/settings **
Mon Mar 6 12:46:18 2006: ** Remove directory: /home/server/games/bf2/mods/bf2/configs **
Mon Mar 6 12:46:18 2006: ** Got file, name: /home/server/games/bf2/configs/16617/maplist.con, size: 512 **
Mon Mar 6 12:46:18 2006: ** Got file, name: /home/server/games/bf2/configs/16617/pb/pb/pbsv.cfg, size: 2589 **
Mon Mar 6 12:46:18 2006: ** Got file, name: /home/server/games/bf2/configs/16617/modmanager.con, size: 5396 **
Mon Mar 6 12:46:18 2006: ** Got file, name: /home/server/games/bf2/admin/admin_rcon16617.cfg, size: 151 **
Mon Mar 6 12:46:18 2006: ** Got file, name: /home/server/games/bf2/configs/16617/adminsettings.con, size: 485 **
Mon Mar 6 12:46:18 2006: ** Got file, name: /home/server/games/bf2/configs/16617/serversettings.con, size: 1684 **
Mon Mar 6 12:46:21 2006: ** Got file, name: /home/server/games/bf2/configs/16617/pb/pb/pbsvlog.cfg, size: 122 **
Mon Mar 6 12:46:21 2006: ** Got file, name: /home/server/games/bf2/configs/16617/actlogpbss.cfg, size: 129 **
Mon Mar 6 12:46:21 2006: ** Got file, name: /home/server/games/bf2/configs/16617/admincommands.con, size: 14472 **
Mon Mar 6 12:46:21 2006: ** Start GameCreate Booking: ID 280060 **
Mon Mar 6 12:46:21 2006: Launching process: /home/server/games/bf2/start.sh +config "configs/16617/serversettings.con" +mapList "configs/16617/maplist.con" +pbPath "configs/16617/pb" from directory /home/server/games/bf2
Mon Mar 6 12:46:21 2006: SSL >> ACK: START id=280060&pid=31523
First of all, it takes close to a minute from when I issue the
stop command until GC starts the server. However, the
original instance hasn't actually been stopped, so now we
have two BF2 processes running:
And back to the original problem: It appears that the time it
takes for the server to restart after I kill the bf2 process is
about the same as before. GC keeps polling the defunct
gamecreate.x86 process every other minute and reporting
0% CPU and RAM usage for 13 minutes before it restarts
the server.
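I don't know how GC decides whether a PID is still alive, so the following is only a guess at why a defunct process could keep being polled. In this Python sketch (Linux only, helper names invented for the example), the common "signal 0" test still succeeds for a zombie, because the PID remains in the process table until the parent reaps it. A check that reads the state from /proc can treat "Z" as dead instead.

```python
import os


def pid_exists(pid):
    """Naive check: signal 0 only tests whether the PID is in the process table."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the PID exists but belongs to another user
    return True


def is_running(pid):
    """Stricter check: a zombie (state Z) or dead (state X) process counts as not running."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return False
    # The state letter follows the parenthesised command name.
    state = data[data.rindex(")") + 2]
    return state not in ("Z", "X")
```

For a defunct PID, pid_exists() says True while is_running() says False. A monitor built on the first kind of check would keep reporting on a dead server, which at least matches the 0% CPU and RAM readings.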
Excellent! I've tested it and it seems to be working perfectly
now. The server is detected as being down within 15
seconds, GC makes sure it is really stopped 15 seconds
later, and the new server instance is started after another
15 seconds.
If you do a search for BF2 or BF2142, you'll find
that this is a very common problem which seems
to affect everyone, yet it hasn't been fixed. A
workaround can be found here:
http://www.gamecreate.com/forum/viewtopic.php?p=6198#6198