PDA

View Full Version : Yuurei: Utterly Bizarre Crash on 2.50.x Tree Builds



ZoriaRPG
01-05-2017, 12:46 PM
This quest (http://www.purezc.net/index.php?page=quests&id=616), crashes when scrolling on all builds of the 2.50.x branch. It did not seem to crash on 2.50.3RC1, but it sure did on a clean build of the repo from 4-Dec-16.

This seems to happen in conjunction with super-fast scrolling, and some mad method that the quest creator is using to handle adding and removing enemies using arrays.

From Aevin:


This is probably related to my enemy spawn script. Enemies in Yuurei use a custom spawn system that works by storing the number of enemies remaining on each screen in an array, then spawning them based on a number of checks triggered by the screen scrolling. It ditches any enemies that normally spawn by moving them offscreen, then spawns the correct number of npcs from the array. The opening screen's "remaining enemies" number should always be zero with this method, though, so I'm confused why anything would ever try to spawn there ...


Would someone do an msys build of the 2.50.x repo tree, and verify that this happens for them?

We tested it on this build: http://timelord.insomnia247.nl/zc/zc_dev/2.50.3_Beta_2.zip

This was built from the 2.50.x tree on 4-Dec-16.

It does not happen in 2.50.2, and from what we could tell, it did not happen on 2.50.3RC1, but it did happen on a build fro the tree. That's why I'm asking for a separate confirmation on builds of that tree. Perhaps there is some change that is in 2.50.3 that was not integrated properly in the tree, such as the parser array changes that Saffith made. Anything like that should have been corrected when we rebuild the flex files, yet this bug is carried through the entirety of our fork like a contagion.

All you need to to create this crash, is move one screen in any direction from the start, then go back tot he starting screen. That has seemed to randomly crash the game. The quest is also generating this error, in conjunction with its scripts, and autoghost, and probably npc arrays.

http://timelord.insomnia247.nl/zc/zc_dev/yuurei_err.png

Dimentio
01-05-2017, 02:49 PM
I only tested the start screen crash by returning from the screen to the right. I never got crashes from scrolling right, only scrolling left. In fact, moving left onto most screens either causes noticeable lag or outright crashes, while scrolling right has no problems. Also, guaranteed crash when moving up or down a screen. This bug also only applies to the latest version of Yuurei: the original release does not crash.

DarkDragon
01-05-2017, 10:24 PM
Is "smart scrolling" enabled in this quest by any chance?

ZoriaRPG
01-06-2017, 04:52 AM
Is "smart scrolling" enabled in this quest by any chance?

I'm not sure. I'll check tomorrow, but based on what Aevin has to say, the problem can't just be related to scrolling:



Well, consider me confused. It isn't my spawn system; I commented out the entire thing and still got the crash. Then, I turned on "view ffc scripts" through the cheat menu, and ... Right before the crash, I saw that it had begun running the script for one of my ghosted enemies; the script 36 listed in the error log.
*
There's nothing that should be running that script, and it certainly doesn't run on any of the previous ZC versions. I really don't know what could be causing this. I'm really doing nothing terribly complicated on my end outside of the spawn system, which is now fully disabled. I even searched for every instance of running ffc scripts in my script file, and that particular script is never once called outside of the specific enemy spawning on certain screens. But the enemy doesn't appear on any of those screens, and it's that specific script that causes the crash on any of the other screens, too, none of which have the enemy.
*
It seems to me that ZC is just randomly calling this script, then crashing because there's no enemy associated with it.
*
Edit: Looking closer frame by frame reveals that as soon as scrolling begins, it calls a series of ffc scripts. I can't see any rhyme or reason to the ones it calls, but they are consistent depending on which screen I scroll to. For example, going right from the start doesn't crash, but going back left runs three separate ffc scripts in the same slot in quick succession, ending with script 36 before the error. However, cheating and going up through the wall runs a different script, then crashes. The other odd thing is that there are gaps before the scripts that cause the crash, as if there are "blank" scripts being called from further down the script list. For example ... My opening screen has a "ContinuePoint" script that runs at screen init. What I see when walking back left onto the screen is this:
*
ContinuePoint
-----------
-----------
ultimadome ->SandCroc->Earth Elemental (in quick succession)
*
These blank slots (totally blank, not actually dotted lines) also happen when walking up through the wall, but the script that causes the crash there is different.
*
Not sure what's going on here ...


That seems to indicate that the ffengine is losing its mind, from something. it could be related to the parser changes that Saffith implemented. I'm only guessing at present, as I need to run some tests, but perhaps if a quest has more than N ffcs loaded into slots, it isn't properly storing the datum. I truly have no out-of-hand explanation for this; which is why I would like anyone else to compile 2.50.x from 4-Dec-16, as I want to rule out some bizarre compiler issue on our end, using the latest gcc via MSYS.

If gcc is mucking something up terribly, then clearly it is not a viable option and this needs to be addressed.

Saffith
01-06-2017, 01:14 PM
(gdb) print tmpscr->ffscript
$6 = {0 <repeats 14 times>, 4096, 39, 0 <repeats 16 times>}
Well, that's not good. But in ZQuest, they're all 0, as expected. Hm...


Hardware watchpoint 1: tmpscr->ffscript[14]

Old value = 0
New value = 4096
set_register (arg=13, value=2560000) at src/ffscript.cpp:2912
2912 break;
(gdb) bt
#0 set_register (arg=13, value=2560000) at src/ffscript.cpp:2912
#1 0x0806c28d in do_set (v=false, whichFFC=255 '\377')
at src/ffscript.cpp:4857
#2 0x08072031 in run_script (type=0 '\000', script=1, i=255 '\377')
at src/ffscript.cpp:6724
#3 0x080735c9 in ZScriptVersion::RunScript (type=0 '\000', script=1,
i=255 '\377') at src/zscriptversion.h:35
#4 0x081135b3 in LinkClass::run_scrolling_script (this=0x8877d80 <Link>,
scrolldir=2, cx=17, sx=256, sy=0, end_frames=false) at src/link.cpp:12204
#5 0x081c3248 in ZScriptVersion::ScrollingScript (scrolldir=2, cx=17, sx=256,
sy=0, end_frames=false) at src/zscriptversion.cpp:16
#6 0x0811f257 in ZScriptVersion::RunScrollingScript (scrolldir=2, cx=17,
sx=256, sy=0, end_frames=false) at src/zscriptversion.h:40
#7 0x0811474e in LinkClass::scrollscr (this=0x8877d80 <Link>, scrolldir=2,
destscr=-1, destdmap=-1) at src/link.cpp:12577
#8 0x08112ae1 in LinkClass::checkscroll (this=0x8877d80 <Link>)
at src/link.cpp:11935
#9 0x080e8c8d in LinkClass::animate (this=0x8877d80 <Link>)
at src/link.cpp:4266
#10 0x081bd6f8 in game_loop () at src/zelda.cpp:2359
#11 0x081c0630 in main (argc=4, argv=0xffffd234) at src/zelda.cpp:3589

It's coming from the scrolling script. It could be a script error that ZC isn't catching. But it looks like it's actually setting ffc->X?


(gdb) print ri->ffcref
$20 = 255 '\377'
Oh, there we go. It's trying to set ffc->X for FFC #256. That'll do it.

ZoriaRPG
01-06-2017, 05:43 PM
(gdb) print tmpscr->ffscript
$6 = {0 <repeats 14 times>, 4096, 39, 0 <repeats 16 times>}
Well, that's not good. But in ZQuest, they're all 0, as expected. Hm...


Hardware watchpoint 1: tmpscr->ffscript[14]

Old value = 0
New value = 4096
set_register (arg=13, value=2560000) at src/ffscript.cpp:2912
2912 break;
(gdb) bt
#0 set_register (arg=13, value=2560000) at src/ffscript.cpp:2912
#1 0x0806c28d in do_set (v=false, whichFFC=255 '\377')
at src/ffscript.cpp:4857
#2 0x08072031 in run_script (type=0 '\000', script=1, i=255 '\377')
at src/ffscript.cpp:6724
#3 0x080735c9 in ZScriptVersion::RunScript (type=0 '\000', script=1,
i=255 '\377') at src/zscriptversion.h:35
#4 0x081135b3 in LinkClass::run_scrolling_script (this=0x8877d80 <Link>,
scrolldir=2, cx=17, sx=256, sy=0, end_frames=false) at src/link.cpp:12204
#5 0x081c3248 in ZScriptVersion::ScrollingScript (scrolldir=2, cx=17, sx=256,
sy=0, end_frames=false) at src/zscriptversion.cpp:16
#6 0x0811f257 in ZScriptVersion::RunScrollingScript (scrolldir=2, cx=17,
sx=256, sy=0, end_frames=false) at src/zscriptversion.h:40
#7 0x0811474e in LinkClass::scrollscr (this=0x8877d80 <Link>, scrolldir=2,
destscr=-1, destdmap=-1) at src/link.cpp:12577
#8 0x08112ae1 in LinkClass::checkscroll (this=0x8877d80 <Link>)
at src/link.cpp:11935
#9 0x080e8c8d in LinkClass::animate (this=0x8877d80 <Link>)
at src/link.cpp:4266
#10 0x081bd6f8 in game_loop () at src/zelda.cpp:2359
#11 0x081c0630 in main (argc=4, argv=0xffffd234) at src/zelda.cpp:3589

It's coming from the scrolling script. It could be a script error that ZC isn't catching. But it looks like it's actually setting ffc->X?


(gdb) print ri->ffcref
$20 = 255 '\377'
Oh, there we go. It's trying to set ffc->X for FFC #256. That'll do it.

Let me know where you fix this source-side please. (Further, when was this issue introduced?)

Or are you saying that this is from the quest script, and not run_scrolling_script in ZC?

Saffith
01-06-2017, 06:57 PM
Yes, it's the quest script, unless the refInfo itself is getting corrupted somehow. I'll have to check with Aevin.

The issue of ZC failing to validate the FFC has always been there. It manifests differently now because script data was rearranged for 2.50.3. Previously, it would have been corrupting something else, most likely ffc->Misc, and would probably be harmless.

ZoriaRPG
01-07-2017, 02:39 AM
Yes, it's the quest script, unless the refInfo itself is getting corrupted somehow. I'll have to check with Aevin.

The issue of ZC failing to validate the FFC has always been there. It manifests differently now because script data was rearranged for 2.50.3. Previously, it would have been corrupting something else, most likely ffc->Misc, and would probably be harmless.

Oh, lovely.

I was mostly surprised, because the official 2.50.3RC1 release didn't seem to do it. My main concern was that the builds based on the 2.50.x tree, did, and that somehow when we compiled those, values were being stored improperly, corrupted as a result. It would be interesting to see what this was doing on 2.50.2 that its effects weren't noticed.

What did you use to generate that debug output stream? Was that ffdebug.cpp's output, or something else? Would be handy to have.

Thank you for figuring this out.

Saffith
01-09-2017, 04:08 PM
That's GDB, the GNU debugger. I think the Windows version is included with MinGW.

ZoriaRPG
01-10-2017, 10:43 PM
That's GDB, the GNU debugger. I think the Windows version is included with MinGW.

Sigh, same quest, new issue:

http://www.purezc.net/forums/index.php?showtopic=71414&page=2#entry1013450

DarkDragon
01-11-2017, 06:35 AM
Can you copy the specific issue (something to do with reflected magic not working correctly?) into a new thread?

Saffith
01-19-2017, 07:33 PM
Going back to the original issue, rather than adding a whole bunch of if(ri->ffcref<=31), how about this as a solution:

byte ffcref: 5;
Any problems with that?

ZoriaRPG
01-19-2017, 11:25 PM
Going back to the original issue, rather than adding a whole bunch of if(ri->ffcref<=31), how about this as a solution:

byte ffcref: 5;
Any problems with that?

We were planning to allow calling ffcs > 32 on demand. These would not be preserved in global memory, and would use resources only when in use.

The allowed range for ffcref needs defines MIN_FFCREF and MAXFFCREF. Just modify LoadFFC and vbound it's input value to a range between those values.

That's how I'd do it.

Saffith
01-24-2017, 04:43 PM
Also a possible concern is that this may affect existing quests. Some may be doing things with invalid FFC pointers, and it just happens to be having no visible effect. If an FFC pointer can no longer be invalid, those benign invalid operations could become quest-breaking.
I don't think that's a reason not to change it - surely invalid operations have undefined behavior if the game doesn't catch them - but it's something to warn people about, at least.

ZoriaRPG
01-25-2017, 12:10 AM
Also a possible concern is that this may affect existing quests. Some may be doing things with invalid FFC pointers, and it just happens to be having no visible effect. If an FFC pointer can no longer be invalid, those benign invalid operations could become quest-breaking.
I don't think that's a reason not to change it - surely invalid operations have undefined behavior if the game doesn't catch them - but it's something to warn people about, at least.

I could always report an invalid pointer, and break, instead of returning the index, instead of vbounding them. Returning 0 or -1 there would probably be bad, for similar reasons.

I read your post, and I hope you will stick around for these little discussions in the future.

Saffith
07-22-2017, 11:10 AM
Here's a thought: add a 33rd FFC. Don't save it or load it; just make the arrays bigger and don't do anything else. Whenever ffcref is set, if it would be out of range, point it to #33. It won't hurt anything, and it won't need further verification.
Of course, if FFCs were changed to structs (which shouldn't be hard), you'd only need one global dummy rather than one per screen.

DarkDragon
07-22-2017, 12:43 PM
Doesn't FFCREF only need to be validated (i) when the interpreter first begins executing the script; and (ii) when a script explicitly changes its value? That doesn't seem like too many places to check if the ref is valid and if not, terminate the script with a warning message.

EDIT: Ah, I see the issue.

I'm not so against a lot of if(isValid(ffcref)), though. Sure it's ugly, but it's safe, makes it clear what the code is intending to do, makes it easy to later change the number of FFCs (or allow the refs to be nonconsecutive integers, i.e. UIDs), and makes it slightly more painful to add gratuitous new FFC variables ;)

ZoriaRPG
07-23-2017, 03:28 AM
Doesn't FFCREF only need to be validated (i) when the interpreter first begins executing the script; and (ii) when a script explicitly changes its value? That doesn't seem like too many places to check if the ref is valid and if not, terminate the script with a warning message.

EDIT: Ah, I see the issue.

I'm not so against a lot of if(isValid(ffcref)), though. Sure it's ugly, but it's safe, makes it clear what the code is intending to do, makes it easy to later change the number of FFCs (or allow the refs to be nonconsecutive integers, i.e. UIDs), and makes it slightly more painful to add gratuitous new FFC variables ;)


I think it would be a good idea to find out what Gleeok is working on in this regard. A month or two back, he said that he was still planning to work on FFC memory management, user-created FFCs, and stuff of that sort. if(isValid(ffcref)) would become quite important at that point. The only things that I had at all considered adding to ffcs, that would require new values and variables are: Solidity flags, and a running flag. A running flag would be useful say, to pause/suspend and unpause/resume a script. That one is surely gratuitous.

The solidity, I figured I would just use the present 'solid' flag, and we would read the combo solidity for each combo comprising the ffc. (I wouldn't mind making ffcs larger than 4x4, too, but that is incidental.)

All of that more belongs in a dev topic. I like the solutions of validity checking, and I also like the idea of putting ffcs into their own class or struct. I'm concerned that Gleeok may already have some of this legwork done, and I do not want to step on his toes.

DarkDragon
07-23-2017, 03:54 AM
Sure, that makes sense. And I completely agree that ffcs (and the ASM interpreter) could use a good refactoring.

Gleeok
07-23-2017, 04:15 AM
I think it would be a good idea to find out what Gleeok is working on in this regard. A month or two back, he said that he was still planning to work on FFC memory management, user-created FFCs, and stuff of that sort. if(isValid(ffcref)) would become quite important at that point. The only things that I had at all considered adding to ffcs, that would require new values and variables are: Solidity flags, and a running flag. A running flag would be useful say, to pause/suspend and unpause/resume a script. That one is surely gratuitous.

The solidity, I figured I would just use the present 'solid' flag, and we would read the combo solidity for each combo comprising the ffc. (I wouldn't mind making ffcs larger than 4x4, too, but that is incidental.)

All of that more belongs in a dev topic. I like the solutions of validity checking, and I also like the idea of putting ffcs into their own class or struct. I'm concerned that Gleeok may already have some of this legwork done, and I do not want to step on his toes.

Different things: The FFC memory issue has to do with how they are stored inefficiently (in the map struct) after being read from the quest file; the user-created scripts was an AS thing. I haven't started the ffc refactor because I haven't had time... :/ ..Also it's possible I might break the build for a day or two once I actually start it, so I'll start a thread about it first, depends on how long it takes.


Surely "isValid(ffcref)" is like the most horrible last-resort anti-solution for the problem, isn't it? I mean it has to be valid for it to run in the first place, right?

ZoriaRPG
07-23-2017, 07:39 AM
Different things: The FFC memory issue has to do with how they are stored inefficiently (in the map struct) after being read from the quest file; the user-created scripts was an AS thing. I haven't started the ffc refactor because I haven't had time... :/ ..Also it's possible I might break the build for a day or two once I actually start it, so I'll start a thread about it first, depends on how long it takes.



Well, the present struct changes broke the build on more than one occasion. That is clearly nothing new. That's why I make changes in a branch, and wait until they are complete before merging them.

If you give ffcs memory management, and their UIDs for each of them is stored in a vector, then adding to a pool > 32 should not be too hard. At least, I would not think it to be hard. That meaning, a user creating an ffc should work well with that system.



Surely "isValid(ffcref)" is like the most horrible last-resort anti-solution for the problem, isn't it? I mean it has to be valid for it to run in the first place, right?

It does not need to be valid at present to try loading it. LoadFFC(120) actually flipping permits loading junk data, and writing to goodness knows what memory area. I have in fact known users who have done this to try to create temporary scratch vars, and I have scolded them for it. The LoadFFC block, or soemewhere in the pipe, we need a routine that clamps the range to whatever is valid. I don;t ultimately care how this looks, but if we allow user-created ffcs, just like any other pointer, then a validity check will be needed for them, just as with any of the other objects connected to a screen.

DarkDragon
07-23-2017, 03:22 PM
The interpreter checking that the FFC is valid before trying to do anything to it makes the most sense if you imagine in the future there might be need for more or less than exactly 32 FFCs per screen, or FFCs dynamically added/removed from the screen. (Compare to the item/npc/weapon ZASM commands, which already validate their references).