Author Topic: Don't exit on losing all lemmings (feature development)  (Read 4883 times)

Offline WillLem

  • Posts: 3384
  • Unity isn't sameness, it's togetherness
Re: Don't exit on losing all lemmings (feature development)
« Reply #90 on: April 03, 2024, 01:25:39 PM »
Fixed another bug today - we need the Mass Replay Check to finish checking the current level when the unplayable state is reached. Wouldn't have spotted this if I wasn't releasing SLX 2.7 later today! ;P

NL 12.13 Commit 8b2e5a66c applies this fix

Offline Simon

  • Administrator
  • Posts: 3876
    • Lix
Re: Don't exit on losing all lemmings (feature development)
« Reply #91 on: April 06, 2024, 09:50:48 AM »
It's now nearer the time. Tonight, Saturday, 19:00 UTC is good.

Quote from: WillLem
Fixed another bug today - we need the Mass Replay Check to finish checking the current level when the unplayable state is reached.

Please build NL with this for me. I'll run 2 or more of Icho's packs through both the stable 12.12.5 and this release, and look for different behavior.

-- Simon

Offline WillLem

  • Posts: 3384
  • Unity isn't sameness, it's togetherness
Re: Don't exit on losing all lemmings (feature development)
« Reply #92 on: April 08, 2024, 02:12:53 PM »
Quote from: Simon
Please build NL with this for me

Here you go.

Offline Simon

  • Administrator
  • Posts: 3876
    • Lix
Re: Don't exit on losing all lemmings (feature development)
« Reply #93 on: April 18, 2024, 04:13:39 AM »
Thanks. I'll call that executable nl-2024-04-08.

Icho has sent me his replay coverage for Lemmings United. I've looked at his 208 replays for the levels that are not in United's bonus rank. These 208 replays are mostly 1 per level, sometimes 2 per level.

I've run those 208 Lemmings United replays separately through each of the following:
  • NL 12.12.5 stable,
  • nl-2024-04-08 with the option: Always Exit to Postview,
  • nl-2024-04-08 with the option: Exit if Save Requirement Met,
  • nl-2024-04-08 with the option: Never Exit to Postview.
Find attached NL's text output. First findings:

All 208 replays pass (solve their level) in all of the 4 runs.

Your nl-2024-04-08 produces output independent of the 3-way option. In more detail: When I run mass replay verification in the nl-2024-04-08 with the option to Exit if Save Requirement Met, I get the same output (i.e., identical text file) as when I run mass replay verification in the nl-2024-04-08 with the option Never Exit to Postview, or with the option Always Exit to Postview. This is good.
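
One way to confirm that the three output files really are identical, as a minimal sketch: hash each file and check that only one distinct digest comes out. The file names in the usage comment are hypothetical; substitute whatever names the three runs were saved under.

```python
import hashlib
from pathlib import Path


def digest(path):
    """Return the SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def all_identical(paths):
    """True if every file in `paths` has exactly the same bytes."""
    return len({digest(p) for p in paths}) == 1


# Hypothetical usage, assuming one saved output file per option:
# all_identical(["always-exit.txt", "exit-if-met.txt", "never-exit.txt"])
```

Comparing raw bytes is deliberately strict: even a changed line ending or a trailing space counts as a difference.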

Your nl-2024-04-08 produces different output than NL 12.12.5. In NL 12.12.5, most (all?) replays run for exactly 1 physics update longer than in nl-2024-04-08. It's drudgework to check this claim for all 208 lines, and I haven't written a script to verify it for me.

For example, NL 12.12.5 produces:
Code:
Pacifism 8:  10000 B.C..nxrp   (1882 frames) LvV 0000000000000000 / RpV: 0000000000000000
nl-2024-04-08 produces:
Code:
Pacifism 8:  10000 B.C..nxrp   (1881 frames) LvV 0000000000000000 / RpV: 0000000000000000
The only difference here is that NL 12.12.5 runs for 1882 physics updates ("frames") and nl-2024-04-08 runs for 1881 physics updates before both conclude that the replay passes its level.
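
A minimal sketch of how the frame count could be pulled out of such a line. The line shape is an assumption inferred only from the single sample above (replay name, then the frame count in parentheses); other lines in NL's output may look different, in which case `parse_line` returns `None`. The names `LINE_RE` and `parse_line` are made up for this example.

```python
import re

# Assumed shape, taken from the one sample line quoted above:
#   <replay file name>   (<N> frames) LvV <hex> / RpV: <hex>
LINE_RE = re.compile(r"^(?P<name>.+?)\s+\((?P<frames>\d+) frames\)")


def parse_line(line):
    """Return (replay_name, frame_count), or None if the line doesn't match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return m.group("name"), int(m.group("frames"))


old = parse_line("Pacifism 8:  10000 B.C..nxrp   (1882 frames) LvV 0000000000000000 / RpV: 0000000000000000")
new = parse_line("Pacifism 8:  10000 B.C..nxrp   (1881 frames) LvV 0000000000000000 / RpV: 0000000000000000")
print(old[1] - new[1])  # prints 1
```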

We can explain this with the reworked end-of-level behavior: NL 12.12.5 must start a new physics update before it can conclude that the map is over. nl-2024-04-08 can test for completion in between two physics updates. Therefore, nl-2024-04-08 needs one physics update fewer. Thus, I believe: There is nothing to worry about here. Do you agree?
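
The explanation above can be illustrated with a toy model. This is not NL's actual game loop, only a sketch under the assumption that the sole difference is where the end-of-level check sits; `updates_until_done` is a made-up stand-in for "how many physics updates the replay needs before the level is over".

```python
def frames_old_style(updates_until_done):
    """Toy model of 12.12.5: the check runs inside a freshly started update."""
    frames = 0
    remaining = updates_until_done
    while True:
        frames += 1              # a new physics update has begun
        if remaining == 0:       # only now do we notice the level is over
            return frames
        remaining -= 1


def frames_new_style(updates_until_done):
    """Toy model of nl-2024-04-08: the check runs between two updates."""
    frames = 0
    remaining = updates_until_done
    while remaining > 0:         # test for completion before the next update
        frames += 1
        remaining -= 1
    return frames


print(frames_old_style(1881), frames_new_style(1881))  # prints 1882 1881
```

In this model the old style always counts exactly one update more, whatever the replay length.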

The high-level result (pass or fail) is identical across NL 12.12.5 and nl-2024-04-08. All levels pass.

This doesn't conclude the testing of mass replay verification yet. Reason: I've only tested solving replays. I haven't tested a big bucket of failing replays, of indeterminate replays (that run for too long after final skill), of weird nukes in the replays, ...

And I have replays for 4 more Icho packs. That's for next week.

-- Simon

Offline WillLem

  • Posts: 3384
  • Unity isn't sameness, it's togetherness
Re: Don't exit on losing all lemmings (feature development)
« Reply #94 on: April 18, 2024, 11:47:38 AM »
Quote from: Simon
Therefore, nl-2024-04-08 needs one physics update fewer. Thus, I believe: There is nothing to worry about here. Do you agree?

Agreed, this looks so far like everything is behaving as expected; I appreciate how thorough you've been :thumbsup:

I first spotted the MRC bug when implementing this feature in SLX 2.7 - since I'd committed to actually releasing this, I did an MRC check as standard. I checked using a batch of replays, some of which aren't level-solvers (I have these ready to go for quick MRC checks), and MRC is behaving as expected in SLX.

Note, though, that the SLX version of this feature doesn't include the 3-way option. "Exit if level passed" is the general standard behaviour, with Classic Mode providing "always exit to postview" behaviour as its own standard. Not that this should matter for MRC, but it might be worth noting anyway.

Note also, there may well be things that you would know to look out for that I don't. As far as I'm concerned, if the MRC checks every replay successfully and spits out a result, then it's working. I don't tend to check any further than that simply because I wouldn't know what else to look for.

We should catch up again properly about this soon, let me know when you're available.

Offline Simon

  • Administrator
  • Posts: 3876
    • Lix
Re: Don't exit on losing all lemmings (feature development)
« Reply #95 on: April 18, 2024, 08:43:36 PM »
If you have failing replays, they too should run until they reach unplayable state, then exit. Look for identical high-level results: You want to see the replay pass in your work-in-progress NL if and only if it passes in NL 12.12.5.

It's theoretically possible for the same replay to be indeterminate in one NL and to clearly fail in the other NL. I don't expect this to happen often, maybe 1 case in 1,000. Even then, both NLs would agree that it's not a winning replay, and that's good. If it happens, I'd examine it case-by-case.
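
The comparison rule from the two paragraphs above, as a sketch. The three outcome names are assumptions made for this example; how NL actually labels results in its text output isn't shown in this thread.

```python
from enum import Enum


class Outcome(Enum):
    # Hypothetical labels, not NL's real output strings.
    PASS = "pass"
    FAIL = "fail"
    INDETERMINATE = "indeterminate"


def same_high_level_result(old, new):
    """True if both versions agree on whether the replay wins."""
    return (old is Outcome.PASS) == (new is Outcome.PASS)


def worth_a_manual_look(old, new):
    """True for any mismatch, including the harmless fail/indeterminate swap."""
    return old is not new


# Fail in one version, indeterminate in the other: still "not a winning replay".
print(same_high_level_result(Outcome.FAIL, Outcome.INDETERMINATE))  # prints True
print(worth_a_manual_look(Outcome.FAIL, Outcome.INDETERMINATE))     # prints True
```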

On most (not all) replays (passing or failing), I expect the number of physics updates to be 1 smaller in your WIP NL than in NL 12.12.5. But it's not a necessity; I realize now: When you nuke zombies, we watch the nuke animation in WIP NL, and then the WIP NL won't take 1 fewer, but instead many more physics updates. Thus: If you see a wild difference in physics updates, look in the level for zombies, and look in the replay for nukes.

That's practically what I will do with the remaining packs next week: Compare the high-level results, and investigate by hand the cases where the high-level result differs or where the number of physics updates has not shrunk by exactly 1. Maybe I'll write a script for that.
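
Such a script could be as small as this sketch. It assumes the frame counts have already been extracted into two dicts keyed by replay name (one dict per NL version); the replay names in the example are invented.

```python
def triage(old_frames, new_frames):
    """List replays whose frame count did not shrink by exactly 1.

    old_frames, new_frames: dict mapping replay name -> physics updates.
    Returns (name, old, new) tuples; a missing replay shows up as None.
    """
    flagged = []
    for name in sorted(old_frames.keys() | new_frames.keys()):
        old = old_frames.get(name)
        new = new_frames.get(name)
        if old is None or new is None or old - new != 1:
            flagged.append((name, old, new))
    return flagged


# Invented example data: the second replay runs much longer in the new build,
# so it is the one to check for zombies in the level and nukes in the replay.
old_run = {"level-a.nxrp": 1882, "level-b.nxrp": 900}
new_run = {"level-a.nxrp": 1881, "level-b.nxrp": 1240}
print(triage(old_run, new_run))  # prints [('level-b.nxrp', 900, 1240)]
```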

Quote from: WillLem
We should catch up again properly about this soon, let me know when you're available.

Yes, for code review and to plan the pull request.

I'll be busy this weekend. Which of the following suits you best?

Monday, April 22, 15:00 UTC
Monday, April 22, 20:00 UTC
Wednesday, April 24, 15:00 UTC
Wednesday, April 24, 20:00 UTC

-- Simon
« Last Edit: April 18, 2024, 09:00:15 PM by Simon »