Simon blogs

Simon · March 19, 2022, 10:44:22 PM

Unix Wizard

This is a poster from 1985. What do we learn from it?

You can do everything in the shell.
You have many separate ingredients to mix and match.
Interesting data structures arise from such mixing.
Unneeded leftovers, pipe them into null.
The shell script is not an ingredient; it's a suggestion for how to combine ingredients.
By that time, C had superseded B (the programming language) since the B pot has a crack.
Cats follow those proficient in the mystic arts.

The strangest thing in the picture is the pot of oregano; there is no classic Unix tool called oregano. This oregano is presumably a joke.

The same picture as a 32 MB .png.

-- Simon

Simon · June 09, 2022, 02:40:26 AM

Mother of all Principles

Many software development principles, e.g., many of the object-oriented guidelines, follow from DRY.

DRY (Don't Repeat Yourself): Every piece of knowledge should have a single unambiguous authoritative representation in the system. Antonym: WET, Write Everything Twice, We Enjoy Typing.

Corollaries:

Repetition in logic calls for abstraction.
Repetition in process calls for automation.

Examples:

Extract duplicated logic into a new function/class/template/..., and call it from the original places.
Extract interface from several similar classes, and call the original functionality through the now-common interface.
Interface Segregation Principle. Have concise interfaces so that usercode can apply them where they fit; then usercode doesn't have to reinvent the wheel -- now you've dried usercode that you didn't even write. Also, test code doesn't need to mock too many unneeded methods.
Group closely related fields into a structure. It starts with struct Point { int x; int y; }; or struct Version { int major; int minor; int patch; };. At minimum, you stay DRY whenever functions pass, re-assign, or forward the whole group as one value instead of listing several separate fields every time.
Write automated tests.
Write scripts or use test/build/packaging servers for common tasks.
Document solutions whenever people ask the same question more than once. I.e., have FAQs to avoid answering the same question every time.
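
Two of the examples above (extract duplicated logic, group related fields) can be sketched in a few lines of C++. This is only an illustration: Point comes from the list above, while distanceSquared and isWithinRadius are hypothetical names invented for the sketch.

```cpp
// Closely related fields travel together as one value.
struct Point {
    int x;
    int y;
};

// The distance arithmetic lives in exactly one place.
int distanceSquared(Point a, Point b)
{
    const int dx = a.x - b.x;
    const int dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Forwarding passes two Points instead of four loose ints, and it
// reuses distanceSquared instead of repeating the arithmetic.
bool isWithinRadius(Point a, Point b, int radius)
{
    return distanceSquared(a, b) <= radius * radius;
}
```

If Point later grows a third coordinate, only the struct and distanceSquared change; the signature of isWithinRadius and all its callers stay as they are.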

About FAQs. Continuously convert FAQs into proper documentation, or your FAQ page will become the only important documentation, which is bad for the bigger information architecture.

A grotesque disease in modern documentation culture is to design and write FAQ sections upfront, before the release of the product, with questions that the authors would like to be asked, not with questions that really came up during support. You can often detect these anticipated questions from self-indulgent wording in the question.

Bad! FAQs are frequently asked questions, not frequently anticipated questions. Anticipated questions go into the normal documentation. If you're so terrible that all your documentation is bullet-point style without bigger structure, you don't solve the problem by calling it FAQ, you solve the problem by hiring better technical documentors, better tech support, and flattening your processes to connect the two teams.

Documentation can violate DRY if you can't easily fix it. E.g., if people don't read the manual and only look in some other place, sure, put the answer in two places. But let this insight guide you to refactor the documentation into where people really look.

So much for DRY, the mother of most principles.

But some principles don't taste like DRY, no matter how long we chew. Why is a long, deeply-nested method bad? It doesn't necessarily repeat any logic, and it automates everything. It can be 100 % DRY and it still doesn't feel nice to edit/understand.

I believe that nearly all principles follow from either DRY or from SLAP.

SLAP (Single Layer of Abstraction Principle): Each piece of code should be at a single level of abstraction. Don't mix the high-level structure with the low-level details.

Examples:

Refactor the long function into several shorter functions, then either call them from the original one after another, or add even more structure that became apparent during the split.
Don't nest too deeply. Don't mix complex preparation/iteration in a single function with the payload work per element. Describe the work per element in one method and the iteration in another. Bonus: The work per element, as a function, becomes a nice argument to higher-order functions (filter, map, fold).
Single Responsibility Principle. Make a class do one thing well. Split big classes with multiple responsibilities into several classes. The original class can still stay as a (now much shorter) mediator to tie the several classes together for the original functionality.
Classes capture responsibilities, not necessarily real-world objects. Reason: Real-world objects can have too many responsibilities.
Sometimes, real-world objects appear to change their own type, a behavior that you can't capture in classical OO when classes correspond to real-world objects. Solutions like the strategy pattern, the state pattern, e.g., Lix jobs, ..., all have the same fundamental idea: A small class with the fixed high-level structure calls into other small classes with the payload work or the dynamic replaceable behavior. A very nice nontrivial example of SLAP.
Ask your caller for a favor if it helps you design a cleaner interface. Don't do his dirty work and the task; do only the task. Accept non-null of the correct type. Let the caller type-convert, let the caller skip calling you if it's null.
Null Object Pattern! Let the polymorphism handle the no-op case, and call stuff unconditionally. Easier code both on the caller and on the callee!
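
The last two items (accept non-null, Null Object Pattern) can be sketched together in C++. All names here (Logger, MemoryLogger, NullLogger, sumAndLog) are hypothetical and exist only for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical interface for the sketch.
class Logger {
public:
    virtual ~Logger() = default;
    virtual void log(const std::string& message) = 0;
};

// Real behavior: remembers every message.
class MemoryLogger : public Logger {
public:
    void log(const std::string& message) override { lines.push_back(message); }
    std::vector<std::string> lines;
};

// Null object: same interface, deliberately does nothing.
class NullLogger : public Logger {
public:
    void log(const std::string&) override {}
};

// The callee takes a reference, so it can never receive null and never
// branches on "is there a logger?". It returns the sum of the values.
int sumAndLog(const std::vector<int>& values, Logger& logger)
{
    int total = 0;
    for (const int v : values) {
        total += v;
        logger.log("added " + std::to_string(v));
    }
    return total;
}
```

A caller who wants no logging passes a NullLogger; neither side needs an if.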

The only thing I don't see is how the Liskov Substitution Principle fits into either DRY or SLAP. Maybe not breaking users' expectations is a third fundamental idea that neither DRY nor SLAP covers?

-- Simon

Simon · June 11, 2022, 11:37:16 PM

Guinea Pig Diet #4

I weigh 89 kg and I'm trying to lose weight again.

I resisted the urge to order pizza. Instead, had an entire cucumber and two bananas. There are still bell peppers and apples around. Nonetheless, very hard to resist the pizza.

The new and exciting idea is to replace snacking (outside proper meals) with entire cucumbers. Instead of putting the little cucumber slices on bread, I'll have the entire cucumber as a raw snack, gnawing away over time like a true guinea pig. Only 40 kcal a pop.

I don't know if I want another bet with LF hosting costs on the line. Last year, it didn't spur me enough to hit 80 kg. I even feel that everybody else will be discouraged from donating if I do it.

They'll think: Why donate 5 dollars if Simon will donate 60 anyway, what good does a small extra donation do at all... In hindsight, maybe it was even counterproductive to publicize my 2021 weight bet in this way.

I'll find a nice way to get my bikini body.

Does somebody want to join the weight loss? Who's up to pursue a goal together until end of 2022?

My favorite animal is still the African crested porcupine, I like giant anteaters too, and I've recently grown to like the manul, a.k.a. Pallas's cat. International Pallas's Cat Day is April 23, so it'll be a while until I write an entire topic on manuls like my 2020 topic on the giant anteater.

But in the Lix release thread, I like to post an animal picture with each release, ideally a picture that fits the feeling of the release. Expect some manuls for the next couple releases!

-- Simon

Simon · June 26, 2022, 08:46:03 AM

Simon Rhymes

filter, map, fold,
das Glück ist dir hold.
Doch std::transform,
das wurmt dich enorm.

I wrote this cheesy poem in 2016. Detailed explanation follows -- not all readers are proficient both in software development and in German here. A typical piece of code that appears every single time in an introduction to functional pipelines is:

import std.algorithm : filter, map, sum;

const int y = [1, 2, 3, 4, 5]
    .filter!(x => x > 2)  // keep 3, 4, 5
    .map!(x => x*x)       // 9, 16, 25
    .sum();

This sets y to 50 because 50 is 3×3 + 4×4 + 5×5.

Now for the explanation of the poem. {

Functional pipelines for data are nice: Each step relies only on the output of the previous step, not on the properties of a publicly visible buffer that holds the original input, nor on earlier mutations or side effects applied to that buffer. Fewer bugs because we avoided mutable data.

Even though you can chain higher-order functions such as filter, map, fold in any sensible order, e.g., map first, then filter some, then map some more, then filter some more, ..., the most common order is exactly the order of the poem:

filter: First discard some elements that we want to ignore.
map: Transform the remaining elements.
fold: Collect the mapped elements into a final result.

These functional pipelines are notoriously hard to express in C++ because you can't write them as pipes even with the C++17 standard library. You don't pass the collection itself as a single argument; instead, you pass two arguments, a start iterator and an end iterator. Therefore, you can't chain the calls à la source.f().g().h(); but have to put every call into a separate statement, introducing a new name for every intermediate result.

In particular std::transform is the equivalent of map in that world, and the annoyance (having to put every step of the functional chain into a separate statement) feels already big enough here that it's nicer to write a conventional loop over the collection instead of calling std::transform.

You're annoyed enough in C++17 that you might consider using the range-v3 library for C++17 or moving your project to C++20, where these pipeline-friendly ranges are part of the standard library.
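
For comparison, here is a sketch of the same filter-map-sum computation with only the C++17 standard library. The function name sumOfSquaresAboveTwo and the intermediate names kept and squared are made up for this illustration; the point is merely that each step becomes its own statement with its own named buffer.

```cpp
#include <algorithm>
#include <iterator>
#include <numeric>
#include <vector>

// Same computation as the D pipeline: keep x > 2, square, sum.
int sumOfSquaresAboveTwo(const std::vector<int>& input)
{
    // filter: needs a named intermediate container.
    std::vector<int> kept;
    std::copy_if(input.begin(), input.end(), std::back_inserter(kept),
                 [](int x) { return x > 2; });

    // map: std::transform, again with begin/end and another named buffer.
    std::vector<int> squared;
    std::transform(kept.begin(), kept.end(), std::back_inserter(squared),
                   [](int x) { return x * x; });

    // fold: std::accumulate over the second intermediate.
    return std::accumulate(squared.begin(), squared.end(), 0);
}
```

For the input 1, 2, 3, 4, 5, this returns 50 like the D one-liner, at the cost of three statements and two throwaway vectors.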

}

Back from C++ to D. When I write functional pipelines in D, I rarely call an explicit fold() in the last step. It's much more common to call one of these:

join() some mapped strings into a single long string, possibly join(" ") or join(", ").
array() to keep the results in an eagerly-allocated random-access array, usually because we'll only store the result for now.
sum() on a list of numbers.
any() or all() on a list of bools.

All of these are (semantically, not implementation-wise) special cases of fold().
For example, list.sum() is
list.fold!((a, b) => a + b)(0).
The better name goes a long way though to make the pipeline more readable, especially if the more specialized function comes directly from the standard library.

Even map(list, func) and filter(list, predicate) are special cases of fold(), as explained in the blog article The Universal Properties of Map, Fold, and Filter with the maximum-possible amount of category theory. The gist is: filter() accumulates (folds) the list into a new list, possibly doing nothing instead of appending the next element. Similar for map().
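
The gist can be sketched in C++ with std::accumulate playing the role of fold. This is an illustration only, not code from the linked article; sumViaFold, filterViaFold, and mapViaFold are hypothetical names, and the sketch is restricted to lists of int to stay short.

```cpp
#include <numeric>
#include <vector>

// sum as a fold: start at 0, combine with +.
int sumViaFold(const std::vector<int>& list)
{
    return std::accumulate(list.begin(), list.end(), 0,
                           [](int acc, int x) { return acc + x; });
}

// filter as a fold: the accumulator is the output list; each step either
// appends the element or hands the accumulator on unchanged.
// (Semantics only: before C++20, accumulate copies the vector per step.)
template <typename Pred>
std::vector<int> filterViaFold(const std::vector<int>& list, Pred keep)
{
    return std::accumulate(list.begin(), list.end(), std::vector<int>{},
        [&](std::vector<int> acc, int x) {
            if (keep(x)) acc.push_back(x);
            return acc;
        });
}

// map as a fold: every step appends the transformed element.
template <typename Func>
std::vector<int> mapViaFold(const std::vector<int>& list, Func f)
{
    return std::accumulate(list.begin(), list.end(), std::vector<int>{},
        [&](std::vector<int> acc, int x) {
            acc.push_back(f(x));
            return acc;
        });
}
```

As with sum() versus fold(), nobody would write filter or map this way in production; the sketch only shows that the specialized names are folds underneath.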

-- Simon

Simon · June 27, 2022, 06:38:05 PM

Writing

A job very hard.

English and German grammar suck. German sucks more than English because in German subordinate clauses, you the verb in last place, where the reader the verb won't find before he already the meaning of the sentence has forgotten, put must. But English isn't so much better in other regards, either.

Topic of the sentence, I want to put it in first place and not risk the sentence looking odd.

Noun proper should always come before their adjectives explaining, like in French, a langue très bèttre in regard this. Sometimes, I've written adjectives and forgot to add the noun.

Lots of parentheses, I want to add them everywhere. And I want to write in a two-dimensional way, e.g., I want to write a sentence and then add bubbles (with more information) around the sentence, tied to specific (parts in the sentence). The bubbles contain elaborations/definitions/... and you should be able to show (or hide) them (on demand).

C++ uniform initialization syntax, maybe it should become standard in English, too. I have Car{nice, black, Diesel} and I drive{with Car from earlier in this sentence, to Friend, Reason{visit, Reason{he is nice, I haven't seen the friend for a while}}} this afternoon. And now the braced initializer list for drive became too long, I want to refactor it; only the friend should be visible by default, and everything else from the list goes in a bubble{hidden by default, tied to drive}.

Edit Simon 2023: My immediate next post was only about C++ initialization; I've split it into:
C++ with Simon

-- Simon

Simon · June 29, 2022, 11:38:43 PM

Summer

Summer is too warm. When will the next wonderful thunderstorm come?

Upside of the heat: Worse mood yields more Simon rants. The problem is finding a peaceful moment to write them down here; I'm striving for a certain minimum quality. Careless prose isn't fun to read.

One of my strategies to counter the heat is to open the windows wide at night, to let the cool air come in -- well, the relatively cool air. At night, it's still warm enough to walk outside in shorts and T-shirt. Plainly, it doesn't get properly cold these weeks. I have flyscreen in practically every window. The flyscreen is worth as much as all the gold in Fort Knox, i.e., as much as performant backwards framestepping. I don't know what people without flyscreen do.

Then again, people also play framestep-less Lemmings without raging. I guess you can treat the experience as a kind of meditation, hmm, even I like to play L1 in Dosbox occasionally for ye olde AdLib nostalgia.

Loap and L2Player, these projects must grow. Custom L2 packs are nice already in DOS L2, but Kieran and Ste had to hold back on puzzle difficulty lest they introduce accidental execution difficulty. I deem level design more an art than a science. Let's not curb the designers' creativity, let's offer them the best tooling we can.

-- Simon

Simon · August 14, 2022, 12:10:21 PM

UI with Simon

You're designing a highly interactive program, e.g., a game.

Heuristic: The best place to configure a value is where you display the value.

Reason: Before you can edit a value, you must know that the value exists. Thus, once people want to edit the value, they already know where in the UI they see the value.

Mistake in Lix: The editor offers grid sizes 1, 2, 8, 16, but the 8 is really a user-configurable size. On the editor button, I print the number 8. You click the button to pick the 8 (instead of 1, 2, 16). But to edit the user-configurable grid size, you cannot click that 8; you must leave the editor and go to the options menu. In the editor tooltip, I tell people to go to the options menu, but it's really bad UI. You should be able to click on/near/rightclick the number 8 to change it.

Refinement of the heuristic: If the value's normal display has another function, e.g., it's a button to do stuff with the already-picked value, or it's a long text that people want to select with the mouse, don't turn the entire display into the change-the-value widget. Consider adding something to the side instead.

Bad example: Jira bug tracker. When you click into the main description (a long text) of a bug, e.g., to highlight something in it, the text turns into a text-entry field, and the text jumps around to fit the slightly changed boundaries of the field. Bad bad bad! I believe any sane person would rage here. Feels as if people at Atlassian didn't use their own products themselves.

Better: Add a pencil icon next to the huge text.

-- Simon

Simon · August 14, 2022, 01:19:49 PM

All-in Poker

You're playing a slightly modified version of heads-up Texas hold'em. Rules: You and your opponent each bring N chips to the table. (Assume both players have practically infinite bankrolls that can supply any number of rebuys.) Each hand proceeds as follows: Each player gets 2 hole cards, then you bet 1 chip, then your opponent raises all-in regardless of holdings. You may now call or fold (depending on your 2 hole cards). If you call, the dealer deals 5 community cards, highest hand wins. After each hand, the loser restocks to N chips from his bankroll (i.e., whether he had 0 or N − 1 left, he'll have N for the next hand), and the winner must take south any excess over N.

To be most profitable, what is your calling range for N = 10? For N = 100? How high must N be so that the calling range is only the pair of aces? (Clarification: We're looking for optimum profitability, not merely for the loosest range that is still profitable.)

Nontrivial questions, and I don't have poker simulation code ready, thus I won't promise answers. I believe that for N = 10, the optimal calling range must be reasonably loose, considerably more than (any ace, any pair).

Pondering some more. I believe that the ranges for N = 10 and N = 100 are quite similar, that the range remains reasonably loose even for N → ∞, and that tightening the range to (only pair of aces) is wrong for any N. Reason: We can reword the stakes. To avoid wagering 1 unit that our hand is better, you can pay 1/N units (the price of folding). If your hand beats two random cards on average, you should call regardless of the price of folding.

-- Simon

namida · August 14, 2022, 08:09:53 PM

Quote
Pondering some more. I believe that the ranges for N = 10 and N = 100 are quite similar, that the range remains reasonably loose even for N → ∞, and that tightening the range to (only pair of aces) is wrong for any N. Reason: We can reword the stakes. To avoid wagering 1 unit that our hand is better, you can pay 1/N units (the price of folding). If your hand beats two random cards on average, you should call regardless of the price of folding.

From memory, Q7o is the hand considered to be closest to a 50/50 chance of winning in such a situation.

Ste Woz Ere · August 15, 2022, 04:19:54 PM

Quote from: Simon on June 29, 2022, 11:38:43 PM

Loap and L2Player, these projects must grow. Custom L2 packs are nice already in DOS L2, but Kieran and Ste had to hold back on puzzle difficulty lest they introduce accidental execution difficulty. I deem level design more an art than a science. Let's not curb the designers' creativity, let's offer them the best tooling we can.

Agreed. ToS would benefit so much more from L2Player, especially on the last level of each tribe - those are a crazy concept that probably does need a bit more in the player-aid department. (though savestate DOSBox is adequate to some degree, like it was with L3)

Simon · September 19, 2022, 10:34:20 PM

Quote from: namida on August 14, 2022, 08:09:53 PM
Q7o is the hand considered to be closest to a 50/50 chance of winning in such a situation.

I ran a simulation:

Q5o is still minimally better than 50 % against a random hand, but Q4o is not.
Any suited queen is better than 50 % against a random hand.
J8o is better than 50 % against a random hand, and J7o has less than 50 %.
J6s is better than 50 % against a random hand, and J5s has minimally less than 50 %.
22 has 50.3 % against a random hand; I haven't run the other pairs but they should all be even higher.
Small suited connectors are less than 50 %, e.g., 45s has 41.5 %. We want to be dealt high hole cards more than anything else.

I still think that we should call with every 50-%-or-better hand in the given scenario regardless of N, and, if N is small, with even more hands. This is plausible with regard to the extreme cases: For N = 1, calling and losing costs the same as folding, therefore we will call with any two.

Originally, it surprised me that every 50-%-or-better hand is a call, but this is merely because under normal poker rules, for big N, no opponent would open-shove any two.
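
The claim above can be checked with a small expected-value sketch. The assumptions are mine, not part of the original problem statement: "most profitable" means maximizing expected chips per hand (reasonable because both stacks reset to N after every hand), and equity counts a tie as half a win. The function names are hypothetical.

```cpp
// Expected chips won (negative: lost) per hand, from our point of view.
// equity: probability that we win the showdown, a tie counted as half
//         a win (assumption of this sketch).
// n:      the stack size N that both players bring to every hand.
double evCall(double equity, int n)
{
    // We win the opponent's N chips or lose our own N chips.
    return equity * n - (1.0 - equity) * n;
}

// Folding forfeits exactly the 1 chip that we were forced to bet.
double evFold()
{
    return -1.0;
}

// Solving evCall == evFold for equity gives 1/2 - 1/(2N).
double breakEvenEquity(int n)
{
    return 0.5 - 1.0 / (2.0 * n);
}

bool shouldCall(double equity, int n)
{
    return evCall(equity, n) >= evFold();
}
```

Under these assumptions, the break-even equity is 45 % for N = 10 and 49.5 % for N = 100, and it approaches 50 % from below as N grows, so every 50-%-or-better hand is a call for every N. With the simulated 41.5 % from above, 45s would be a fold for N = 10 but a call for N = 5.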

Quote from: Ste Woz Ere on August 15, 2022, 04:19:54 PM
Agreed. ToS would benefit so much more from L2Player, especially on the last level of each tribe - those are a crazy concept that probably does need a bit more in the player-aid department. (though savestate DOSBox is adequate to some degree, like it was with L3)

Hmmm, I can imagine, and I haven't even seen most of the final levels.

I'd like to return to "normal" participation in the Lemmings culture, with more level solving, even if it's in regular Dosbox. I remember how happy I was to playtest the early Highland from Tribes of Steel in Summer of 2021. In 2022, I didn't find the time for serious level solving. Other hobbies got in the way, and whenever I found an evening for Lemmings, it always went to the Lix networking and physics.

-- Simon

Simon · September 20, 2022, 09:33:03 PM

Random things I take for granted:

Everybody is a good enough Lemmings-1-level solver to be able to complete all of DOS Lemmings 1 and DOS ONML in a modern engine.
Everybody understands most of the mechanics from the other licensed Lemmings games until the year 2000 inclusive, i.e., including Lemmings Revolution.
Everybody knows how to program in at least one popular common language.
Everybody can read C source code.
Everybody knows the rules of popular classical board/card games.
Everybody is fit with high-school mathematics, in particular, high-school level calculus.
Everybody knows some mathematics beyond the high-school level, whether from formal education or by self-teaching. Hard to define how far I assume this; let's say, complex numbers, some graph theory, linear algebra, elementary set theory, elementary topology.

I do not assume that people use version control.

I do not assume that everybody is good enough to solve all of Genesis Lemmings in a modern engine. In particular, I forgot the solution to Lemmings' Ark, and geoo will hit me over the head for this the moment he reads it. Either because it should be common knowledge, or because he's envious that I get to re-solve from scratch a classic level that he likes so much.

Even though I re-solved Lemmings' Ark in Redux, I don't count that because this particular solution doesn't work in vanilla Lemmix, thus I assume it won't work on Genesis. And thus, I still don't assume that everybody is proficient in Genesis Lemmings! Edit: Solved Lemmings' Ark in vanilla Lemmix.

-- Simon

Proxima · September 21, 2022, 06:52:14 AM

Quote from: Simon on September 20, 2022, 09:33:03 PM
Solved Lemmings' Ark in vanilla Lemmix.

Congratulations!

How does it feel to have conquered it at last?

Simon · September 21, 2022, 08:52:19 PM

The main trick is cute, well-hidden and inspiring; it hits the core ideas of modern Lemmings.

The pure solving experience was then marred by the many engine differences: The first NL solution didn't replicate in vanilla Lemmix, and I remembered solving it years ago with vanilla L1 physics. The research that resulted from it was very valuable and makes up for the hassle.

-- Simon

DireKrow · September 25, 2022, 04:52:39 AM

Quote
Everybody can read C source code.
Everybody knows the rules of popular classical board/card games.
Everybody is fit with high-school mathematics, in particular, high-school level calculus.
Everybody knows some mathematics beyond the high-school level, whether from formal education or by self-teaching. Hard to define how far I assume this; let's say, complex numbers, some graph theory, linear algebra, elementary set theory, elementary topology.

These ones aren't true for me, and not for most people I see in at least some of the communities I frequent. It does feel like it's true for most people I hang out with in the puzzler community, though.