
Acetaminophen - RISC OS 256 bytes

640x480 version

My first Acorn Archimedes / RISC OS intro, which comes in two versions.
The first version works best on ARM3, where the CPU cache probably helps a lot; it also works fine on ARM2, albeit slower.

The initial idea was sparked by Optimus's 2023 512-byte Ibuprofen entry for Acorn machines, which shows multiple circles circling around and producing a Moiré pattern; the effect originates from the Paracetamol Amiga demo (1990).

I thought it would be cool to do that effect with the Minsky display hack (an algorithm I have been experimenting with for some time), so I built a prototype in JavaScript but had no plan to do anything with it. I then stumbled on Optimus's entry again and took up the challenge of doing it in 256 bytes on Acorn hardware.

I was mainly motivated by the thought that the ARM architecture, with its barrel shifter and 3-operand instructions, would be at ease with the Minsky algorithm even though each instruction takes 4 bytes. I had also wanted to do code golfing on late-80s hardware for a long time, so this was the perfect opportunity!

The Minsky display hack is also used for the circle animation; it emulates sin / cos in ~4 instructions!
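As a rough illustration of why this is so cheap, here is a minimal Python sketch of the Minsky circle recurrence using only shifts, adds and subtracts. The function name, the fixed-point scale and the `shift` value are assumptions made for the example, not values taken from the intro; the ARM mnemonics in the comments only show how each line could map to a single shifted-operand instruction.

```python
import math

def minsky_circle(radius, shift=4, steps=None):
    """Trace a near-circle with the Minsky recurrence (integers only)."""
    x, y = radius, 0
    if steps is None:
        # Each step turns by roughly 1 / 2**shift radians, so about
        # 2*pi * 2**shift steps are needed for one full turn.
        steps = int(2 * math.pi * (1 << shift)) + 1
    points = []
    for _ in range(steps):
        x -= y >> shift  # e.g. SUB x, x, y, ASR #shift
        y += x >> shift  # e.g. ADD y, y, x, ASR #shift (uses the NEW x)
        points.append((x, y))
    return points

# Hypothetical usage: 8 fractional bits, so the on-screen radius is 100 pixels.
R = 100 << 8
points = minsky_circle(R)
pixels = [(px >> 8, py >> 8) for px, py in points]
```

Using the freshly updated `x` in the second line is what keeps the orbit closed; the result is a slightly tilted ellipse rather than a perfect circle, which is hard to notice at small step sizes.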

I started with high-resolution modes (1 bpp / 2 bpp) and a buffer (instead of computing the circles directly), but it was uglier and slow, so I switched to mode 13 and computing the circles in real time. Lower bits per pixel cost bytes to handle pixel-perfect plotting: you either have to hold four copies of the buffer or compute the adjustments directly, which also costs some instructions.
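The extra cost of packed pixels can be sketched in Python as follows. This only illustrates the "compute the adjustments directly" option; the helper names are hypothetical, the buffer width is assumed to be a multiple of 8, and the bit order (leftmost pixel in the least significant bit) is an assumption about the screen layout.

```python
def plot_8bpp(buf, width, x, y, colour):
    # One pixel per byte: a single address computation and store.
    buf[y * width + x] = colour

def plot_1bpp(buf, width, x, y):
    # Eight pixels per byte: find the byte, then build a bit mask and OR it in.
    # Assumes width is a multiple of 8 and the leftmost pixel is bit 0.
    index = (y * width + x) >> 3
    buf[index] |= 1 << (x & 7)

# Hypothetical usage on a tiny 16x4 screen.
screen8 = bytearray(16 * 4)
screen1 = bytearray(16 * 4 // 8)
plot_8bpp(screen8, 16, 3, 1, 200)
plot_1bpp(screen1, 16, 3, 1)
```

The shift, mask and read-modify-write in `plot_1bpp` are the kind of per-pixel adjustments that each cost an instruction or more in a size-limited intro.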

Since mode 13 is not supported on modern monitors, I did a second version in mode 28 so the intro could work on VGA monitors and I could record the output as well, since I don't have any Acorn machines.

The two versions differ slightly in style: the mode 28 version draws the circles with some sort of additive blending, and the mode 13 version draws them with a "drop shadow". The drop shadow was easily done by modulating the color of the circle "atoms" (pixel blocks) as the height increases; this is a bit faster on early ARM because there are two RAM lookups instead of three.

320x256 version

84 bytes are taken by RISC OS API calls for screen setup, double buffering setup, vsync and a clean exit with the Escape key. This doesn't help for code golfing; it can be reduced by not using double buffering or vsync, for example, but the result will look uglier.

This intro could probably be done with the RISC OS API by using it to draw the circles and do the masking; I don't know how much code that would need and I am not sure about the speed either.

The code is tight on registers: the circle loop uses 14 registers and the remaining registers are used for calls.

It doesn't use compression, but CodePressor (an ARM code compressor) can be used to gain some bytes.

The mode 28 version is slightly lighter because I removed the two instructions which allocate screen memory for double buffering; this seems unneeded on RO5 / RPI Zero, and pre-RiscPC machines can't double buffer mode 28 anyway due to VIDC limits, so I guess it is an okay trick.

I had plenty of fun in the end, and ARM proved to be adequate for code golfing (despite the sometimes high byte cost of the RISC OS API / ARM). What helped a lot:
It could probably be done much smaller on a newer ARM CPU with Thumb mode.

An advantage of this method is easy real-time adjustment of the circle radii, which can lead to other effects (without much cost) such as morphing them into a ball.

I am not sure how this one would compare on the x86 architecture. It could perhaps reach 128 bytes on DOS (less?) if the FPU / SQRT is used, although it would be slower since one would iterate over the whole screen (unless buffering is added). Minsky circles can also be done very efficiently (~8 bytes!) on x86 by using the MOVSX instruction.
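For comparison, the per-pixel square-root approach mentioned above could look roughly like the Python sketch below. It is only an illustration: the function name, the ring width (`ring_shift`) and the use of XOR to combine the ring patterns are assumptions for the example, not a description of any existing DOS intro.

```python
import math

def moire_frame(width, height, centres, ring_shift=3):
    """Visit every pixel and XOR the concentric-ring patterns of each centre."""
    frame = []
    for py in range(height):
        row = []
        for px in range(width):
            bit = 0
            for cx, cy in centres:
                # Integer distance to the centre; this is the SQRT per pixel.
                d = int(math.sqrt((px - cx) ** 2 + (py - cy) ** 2))
                # Alternate 0/1 every 2**ring_shift pixels of distance.
                bit ^= (d >> ring_shift) & 1
            row.append(bit)
        frame.append(row)
    return frame

# Hypothetical usage: two ring sets on a tiny 32x16 frame.
frame = moire_frame(32, 16, [(8, 8), (24, 8)])
```

Every pixel is visited once per frame regardless of how many circle points are actually visible, which is why this variant is expected to be slower than tracing the circles directly.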

2024 update: the thickness loop is entirely avoidable by doing the circle thickness with the circle algorithm itself (see this; the idea is to slightly perturb the stability of the algorithm to trace outside of the usual orbit). This saves some bytes and may be a bit faster overall because the thickness loop is gone (one branch less, which is at least 4 cycles per iteration); the thickness then depends on the number of iterations. Another optimization that I see now is using a two-instruction Minsky instead of a three-instruction one, with a downscale step to ramp up the precision if needed; this saves 4 bytes.
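One way to picture the perturbation idea is the Python sketch below. It is an assumption, not the exact change used in the intro: it swaps in a plain forward-Euler update (the second line uses the old `x` instead of the new one), which is known to spiral slowly outward, so the traced band gets thicker the longer the loop runs. The function name and parameters are hypothetical.

```python
import math

def trace(radius, shift=4, steps=101, stable=True):
    """Return the radius after each step of a stable or perturbed recurrence."""
    x, y = radius, 0
    radii = []
    for _ in range(steps):
        old_x = x
        x -= y >> shift
        # Stable Minsky uses the new x; the perturbed variant uses the old x
        # and drifts outward a little on every iteration.
        y += (x if stable else old_x) >> shift
        radii.append(math.hypot(x, y))
    return radii

# Hypothetical usage with 8 fractional bits (radius of 100 pixels).
R = 100 << 8
stable_radii = trace(R)
drifting_radii = trace(R, stable=False)
```

In this sketch the outward drift grows with the number of steps, which matches the remark that the thickness becomes iteration dependent.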

The ARM code for the RPI version (I use the BBC BASIC assembler)

Licence Creative Commons