Who has a Mac? I'd like to debug #431: Magic Pink not Transparent

Started by Simon, January 22, 2023, 03:36:11 PM

Simon

Hi,

In 2022, naymapl and cameo69 reported:
github #431: Magic pink won't become transparent on macOS (both Intel and ARM64)
I'd like to debug this, but I don't have a Mac. Thus:

Who has a Mac? Would you like to compile and run Lix on it, and are you up for some testing?

I'll walk you through everything. I'll answer all of your questions. No need to hurry: It's okay if you can spare a few hours here and there. It's okay if you take several days off before you can test a new theory that I'll get.

Let's see if the bug still manifests in 2023. Since naymapl's and cameo69's reports, there has been at least one new release of Allegro 5, Lix's graphics library, and it's possible that the bug is already gone. And if the bug reproduces on your Mac, I'll create some minimal example programs for you to try, to see if the bug is in Lix or in Allegro 5.

-- Simon

Dullstar

If you can't find anyone who has one and is willing to test, I *do* have an old one, but there's a reasonably high chance that it's much too old, especially considering Apple's record regarding backwards compatibility (for reference, I believe it's on Snow Leopard). Feel free to reach out if you want me to try anyway, but definitely wait at least a week for other people to respond. Because it's so old that compatibility for the relevant software could very feasibly be a major issue, I will not try to set it up unless you explicitly ask me specifically to try it.




That said, my hunch is that it's probably an Allegro bug, though I am aware of at least one way to make an Allegro application that runs properly in Linux but not Windows (but still compiles on both), so it's not infeasible that there could be another mac-related issue. I once made something on my Linux machine while my Windows one was broken (hardware issue), and once I got the parts to get the Windows one working again, it turned out the Windows version didn't work because there was a missing addon init function call (I think it was al_init_primitives_addon). On Linux it didn't matter, at least not for what I was doing, but on Windows it turned out to be important. I doubt it's what's happening here, but I can produce a minimal example of it if you want (and/or possibly look for other combinations with similar effects). In any case, while I highly doubt one of them could have been missing this long without getting noticed, it could be worth double checking, since the library doesn't seem to define what should happen if you try to use things without calling those initialization functions first.

Simon

Thanks for the offer! Right, let's see if others have a more recent Mac.

I've looked at the source, and it looks like I call the al_init_* before creating color constants and before creating the A5 display. The calls are in src/basics/init.d: al_init_image_addon(); al_init_font_addon(); al_init_ttf_addon(); al_init_primitives_addon();

The bug report suggests wrong colorings after al_convert_mask_to_alpha, which also gets called after the al_init_*; it looks like that function failed to recognize the magic pink in the images. I also do further recoloring with handwritten pixel-by-pixel code that uses al_get_pixel and al_put_pixel, e.g., to replace the greyscale GUI icons with the GUI hues based on the user's options. Those handwritten recolorings (in src/graphic/recol.d, here be dragons) also appear to fail. The screenshots on github have grey icons on the GUI buttons, not pale blue icons.

I'm not 100 % sure about the ordering of the various calls, other than that the al_init_* are called early. Reducing Lix or having minimal A5 examples still looks helpful.

-- Simon

Simon

Digging around the cutting edge A5 source of al_convert_mask_to_alpha: This function is OS-independent, and it is, like my handwritten Lix recoloring, a high-level pixel-by-pixel loop.

al_convert_mask_to_alpha, C source

void al_convert_mask_to_alpha(ALLEGRO_BITMAP *bitmap, ALLEGRO_COLOR mask_color)
{
   ALLEGRO_LOCKED_REGION *lr;
   int x, y;
   ALLEGRO_COLOR pixel;
   ALLEGRO_COLOR alpha_pixel;
   ALLEGRO_STATE state;

   if (!(lr = al_lock_bitmap(bitmap, ALLEGRO_PIXEL_FORMAT_ANY, 0))) {
      ALLEGRO_ERROR("Couldn't lock bitmap.");
      return;
   }

   al_store_state(&state, ALLEGRO_STATE_TARGET_BITMAP);
   al_set_target_bitmap(bitmap);

   alpha_pixel = al_map_rgba(0, 0, 0, 0);

   for (y = 0; y < bitmap->h; y++) {
      for (x = 0; x < bitmap->w; x++) {
         pixel = al_get_pixel(bitmap, x, y);
         if (memcmp(&pixel, &mask_color, sizeof(ALLEGRO_COLOR)) == 0) {
            al_put_pixel(x, y, alpha_pixel);
         }
      }
   }

   al_unlock_bitmap(bitmap);

   al_restore_state(&state);
}


One possible method of attack: Investigate how C memcmp can fail to recognize two pixel structs as identical even though both are pink. A5 color structs contain 4 floats without padding:

typedef struct ALLEGRO_COLOR ALLEGRO_COLOR;
struct ALLEGRO_COLOR
{
   float r, g, b, a;
};


Now, for IEEE floats, +0 == −0 is true when we compare them with ==, but memcmp will not consider +0 and −0 equal because their sign bits differ. (Perhaps even +1 won't memcmp-equal some other representation of +1, but I doubt it.) If a color component is −0 on one side and +0 on the other, memcmp will fail to match two colors that are both pink. I'll ask the A5 developers what they think about memcmp here. Comparing floats for equality is a can of worms in general; I have no good immediate fix to offer.




I don't want to scare away other Mac users with technicalities. Who has a Mac? :lix-grin:

-- Simon

Dullstar

Does the C standard say anything about padding, i.e. is the compiler allowed to insert whatever padding it wants? If so it could explain a system-dependent memcmp result discrepancy.

Simon

Padding in a struct of only floats: When I read the C11 standard on structs, it looks like it allows padding between any two members, and at the end, but not at the beginning. Nonetheless, in practice, the floats will be packed without any padding, like an array of 4 floats.

Yesterday, I wrote a custom version of al_convert_mask_to_alpha and asked the Mac users in the github issue to try that. I'll see if anything comes up.

In Allegro's IRC channel, nobody can tell for sure if −0 is a valid color component within 0 and 1. I'd like to see what comes back from the Mac testing before I possibly open a pull request to change the memcmp to four ==.

-- Simon

Simon

Dullstar, I haven't heard from anybody else with a Mac. Is it time you dust off yours? >_>

In #431 are my first ideas to start debugging; if you're fine with github, we can continue to post stuff right in that github bug.

-- Simon

Dullstar

I should acknowledge that I've at least seen this: just haven't gotten around to it.

I made sure to check the toolchain, though, and at the very least I didn't find anything that explicitly says it wouldn't work on such outdated MacOS versions.

Simon

Thanks! Yes, take your time. The bug is niggling, but it's not urgent; naymapl and cameo69 haven't posted in half a year.

Yeah, the Mac/Linux build notes for Lix should work for you.

-- Simon

Dullstar

I tried but wasn't able to install the dependencies.

The mac was too old to install Homebrew. There's a fork called Tigerbrew that can run on older machines, but while I was able to install it, I was unable to get it to work.

I mentioned this in IRC, but I wonder if clang vs. gcc has anything to do with it: if you could compile the dependencies on Linux using clang instead of gcc, and then link against those instead of the system libraries, does that replicate the magic pink bug? My best guess is that, while we would expect the ALLEGRO_COLOR struct to be basically more or less equivalent to a float[4], clang may have for some reason decided that it's better to pad it to 64 bit boundaries as if it were doubles or something. That, or it's comparing a +/- 0 for some reason (but why/how?)

Dullstar

A few more thoughts on the Magic Pink issue:

First: it sounded from the initial bug reports that the workaround to bulk remove Magic Pink from the files worked. Might this be worth doing out of the box, and then culling Magic Pink support from Lix? Common image formats and editors have supported transparency for a long time; does Magic Pink still make sense in 2023? Does it save storage space compared to transparency after compression? It can't be saving CPU time since we have to replace it with transparency at runtime. And while it's not a very common color to actually want to use, it does mean that sets can't make use of it because it will always be changed to transparent. It's also a value that people have to memorize (even if it's not that hard) in applications where Magic Pink is required although my understanding is that Lix supports actually using transparency just fine so that's not as much of a problem as it could be.

Second: it seems likely that other recoloring is affected too, although it wasn't noted in the original report (while you noted that some UI elements wouldn't be recolored, we don't have any confirmation that this is the case). Knowing exactly how other recoloring is affected could be useful. If it's a +/-0 issue, for instance, I suspect recoloring would likely function properly when we're replacing a color that contains no 0 components. If it's something else (perhaps padding, perhaps some other issue we haven't thought of), then it's more likely to break consistently. But we can't know for sure without being able to test.

Simon

Thanks for digging, and you have some excellent ideas for further debugging.

There is a 99 % chance that their float in C is 32-bit, but the C standard doesn't guarantee it AFAIK. And there can be padding in the struct.

Then there are D bindings that define ALLEGRO_COLOR with D floats, which are guaranteed to be IEEE 32-bit floats, but Lix calls Allegro 5's recoloring function and that should ignore the D bindings. I think I can ignore the D bindings for now and should investigate what happens in Allegro 5 itself. Still, to be safe, I should probably write example programs both in C and in D.

First example program should be a program that prints sizeof(ALLEGRO_COLOR). I'll see when I get around to post in the bug thread again.

Magic Pink to transparent: I'll reply to you these days.

-- Simon

Simon

Dullstar, on your Mac, can you compile plain C? Please compile the program below. What does it print?

It prints 16 for me, and it should really print 16 everywhere, but, as we know, the C standard doesn't guarantee that. Even if you get 16, the padding theory isn't necessarily ruled out: Allegro might compile its own code with different settings, or naymapl and cameo69 might not get 16.

-- Simon




#include <stdio.h>

struct ALLEGRO_COLOR { float r, g, b, a; };

int main(void)
{
    /* sizeof yields a size_t, which needs %zu, not %u. */
    printf("%zu\n", sizeof(struct ALLEGRO_COLOR));
    return 0;
}

Dullstar

It prints the expected 16.

I also went a bit further and grabbed the values it would insert if asked to initialize the struct with 0.0f and -0.0f, which results in 00 00 00 00 for 0.0f and 00 00 00 80 for -0.0f.

A possible next step might just be to grab the source for al_map_rgba, compile just that, and then give magic pink as the arguments and then just see what the underlying bytes are. Maybe for some weird reason converting likes 0.0f and loading likes -0.0f or vice versa.

Hmmm, if plain C works it's probably possible to manually compile all the dependencies (or at the very least get a more specific reason why it doesn't work) but that sounds like quite the pain.

Simon

2023-04-10, I asked Dominator_101:

Quote from: Simon on April 10, 2023, 08:56:21 PM
Hi Dominator,

in 2022, I had two Mac users, naymapl and cameo69, who reported on github bug #431: Magic pink won't become transparent on macOS (both Intel and ARM64)

Does this bug manifest on your Mac, too? If you don't see that bug in the newest Lix, do you see the bug in Lix tag v0.10.6? (Afterwards, I wrote my own recoloring in hopes of alleviating #431.)

What kind of Mac do you have?

On your Mac, what is your complete output text of:
dub build -f

-- Simon

Dominator_101 replied:

Quote from: Dominator_101 on April 11, 2023, 04:28:54 PM
So uh, I had seen that ticket before but hadn't even seen the behavior myself. Decided to boot Lix for the heck of it and suddenly everything is pink. Version is 0.9.44, I haven't updated recently.

Intel Proc

Output:
Performing "debug" build using ldc2 for x86_64.
allegro 4.0.4+5.2.0: building configuration "no-libs"...
derelict-util 3.0.0-beta.2: building configuration "library"...
derelict-enet 4.2.0: building configuration "library"...
enumap 0.4.2: building configuration "library"...
bolts 1.3.1: building configuration "library"...
optional 1.3.0: building configuration "library"...
taggedalgebraic 0.11.22: building configuration "library"...
sdlang-d 0.10.6: building configuration "library"...
lix 0.9.44: building configuration "application"...
Compiling Lix 0.9.44 with LDC for macOS, 64-bit...
Linking...


I dunno the last time I actually ran Lix on this machine, so I'm not sure if I've updated the OS since then (I tend to drag my feet on that), but it's possible that updates have happened in the background, not sure.

I answered:

Quote from: Simon on April 11, 2023, 08:57:32 PM
Thanks for the quick reply!

It's really helpful to hear you seeing pink out of the blue, when Lix didn't have that bug for you before.

Quote from: Dominator_101 on April 11, 2023, 04:28:54 PM
Performing "debug" build using ldc2 for x86_64.


Thanks, that disproves a theory with floating point hardware; the only other build log of that bug had "using ldc2 for aarch64, arm_hardfloat". You have an Intel Mac and the compiler considers it a perfectly normal x64.




Next theory: The pink bug might be particular to Allegro 5.2.7, which you probably have installed, judging from the date of your mouse bug report last year.

Please run, without upgrading anything (neither Lix nor Allegro):

$ bin/lix --allegro-version

Does it print "Allegro DLL version 5.2.7.something"?




Only afterwards, update Allegro to the current stable 5.2.8 (released June 6th, 2022), which is available in Homebrew:
https://formulae.brew.sh/formula/allegro

Rebuild Lix:
$ dub -f

Is your Lix 0.9.44 still pink after upgrading to Allegro 5.2.8?




If it's still pink, pull the current stable Lix 0.10.7, and rebuild again. Is Lix 0.10.7 also pink?

-- Simon

Dominator_101 answered:

Quote from: Dominator_101
Running the allegro version gave me 5.2.8.0, so already upgraded (I might've upgraded to test the new release for the mouse issue).

Not sure if Lix was built with this or the older version, or whether that matters (if it's always just getting the allegro at runtime or something). I have not rebuilt lix yet, would rebuilding the same version still be a valid test or should I just try building latest?

I haven't answered since.

-- Simon

WillLem

Sorry, only just seen this thread. I have a 2012 Mac running Catalina if that's of any use...?

Simon

WillLem: I think it's still a good test to see what happens on your Mac, even though the Macs of naymapl, cameo69, and Dominator_101 sound newer than yours (naymapl's M1 should be from 2020 or newer). Reason: Most likely, it's an Allegro bug. I already know that the bug affects both Intel and ARM Macs, and I hope that the bug doesn't depend too much on OS version. In particular, Dominator had no problems in 2021, and then in 2023, things were pink.

Mac build instructions for Lix, let me know if you run into problems, or if you want to do it on Mumble.

If you see the pink, that would be ideal, and we can continue to remote-troubleshoot.

If you don't see the pink, it'll be nice, but it'll be a weak result. Anything might have circumvented the bug then: Your Mac can be too old, current Allegro can have a fix, or Lix's custom recoloring function can have fixed it.

-- Simon