Some background on the Spry implementation may be interesting. Spry is implemented in Nim as a direct AST interpreter (no JIT) in only about 2000 lines of code. It has a classic recursive "naive" design and uses a spaghetti stack of activation records, all allocated on the heap, relying fully on Nim's GC to do its work. It also relies on Nim's method-based dynamic dispatch in the interpreter loop for dispatching on the different AST nodes. Blocks are true closures, and control structures like `timesRepeat:` are implemented as primitives, normally not cheating. Suffice to say, there are LOTS of things we can do to make Spry run faster!
The implementation philosophy is to keep Spry very small and "shallow", which means we rely as much as possible on the shoulders of others - in this case primarily Nim and its superb features, performance and standard library.
Enough jibbering, let's do some silly damn lies - ehrm, I mean silly tests!
```smalltalk
| b r |
b := OrderedCollection new.
r := Random new.
2000000 timesRepeat: [b add: (r nextInt: 10)].
[b select: [:x | x > 8]] timeToRun
```
The above snippet runs in around 40 ms in latest Squeak 5.1. Nippy indeed! Ok, so in Spry then, with a recently added primitive for `select:`:
```
b = []
2000000 timesRepeat: [b add: (10 random)]
[b select: [:x > 8]] timeToRun
```
First of all, if this is the first Spry code you have seen, I hope you can tell it's Smalltalk-ish, and it's even shorter. :) A few notes to make it clearer:

- Assignment uses `=` and equality uses `==`, no big reason, just aligning slightly with other languages.
- `[]` is the syntax for creating a Block (at parse time), which is the workhorse dynamic array (and code block), like an OrderedCollection.
- Spry needs no period at the end of a statement, nor does it rely on line endings or indentation; the same snippet can be written exactly the same on a single line.
- Parentheses are needed around `(10 random)` because Spry evaluates strictly from left to right.
- Block arguments are written inline as `:foo`. So blocks are often shorter than in Smalltalk, like `[:x > 8]` or even `[:a < :b]`.

Spry runs this in 1000 ms. Not that shabby, but of course about 25x slower than Squeak. However... I think I can double the Spry speed, and if so, then we are in "can live with that" country.
Just to prove Spry is just as dynamic and cool as Smalltalk (even more so actually in many parts), we can also implement `select:` in Spry itself (and for the more savvy out there, yes, `detect:` can also be implemented using the same non-local return trick as Smalltalk uses):
```
select: = method [:pred
  result = ([] clone)
  self reset
  [self end?] whileFalse: [
    n = (self next)
    do pred n then: [result add: n]]
  ^result]
```
Without explaining that code, how fast is the same test using this variant implemented in Spry itself? 8.8 seconds, not horrible, but... I think we prefer the primitive :)
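For readers who want the idea without the Spry syntax, here is a rough Python analogue of that Spry-level `select:` (the function name and shape are mine, purely illustrative): explicit iteration over the receiver, collecting the elements that satisfy the predicate.

```python
def select(block, pred):
    """Illustrative Python analogue of the Spry-level select: above."""
    result = []          # like 'result = ([] clone)'
    for n in block:      # like the reset / end? / next cursor loop
        if pred(n):      # like 'do pred n'
            result.append(n)
    return result

print(select([3, 9, 12, 5], lambda x: x > 8))  # prints [9, 12]
```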
Now... let's pretend this particular case is an important bottleneck in our 20 million dollar project. We just need to be faster! The Spry strategy is then to drop down to Nim and make a primitive that does everything in Nim. Such a 7-line primitive could look like this:
```nimrod
nimMeth("selectLarger8"):
  # evalArgInfix(spry) pulls in the receiver on the left.
  # We convert it to the type we expect - SeqComposite is a super type
  # of Block, Paren, Curly.
  let self = SeqComposite(evalArgInfix(spry))
  # We create a new empty Block to put the selected nodes in.
  let returnBlok = newBlok()
  # We perform a regular Nim iteration. self.nodes is a Nim seq[Node].
  for each in self.nodes:
    # For each element we convert to IntVal, which is the node type for a Spry int.
    # This will cause a catchable exception if it's not an IntVal.
    if IntVal(each).value > 8:
      returnBlok.add(each)
  # A primitive always returns a Node or subclass thereof, like in this case a Blok.
  return returnBlok
```

Then it runs in 10 ms!
Yup, it's cheating, but the 20 million dollar project wouldn't care... The thing to realize here is that it's MUCH easier to cheat in Spry than it is in Squeak/Pharo. But... yes, you would need to know how to make a primitive; as a primitive it's compiled code, so you can't mess with it live; and it also presumes that each node is an IntVal. However, Spry (when I fix error handling) should gracefully handle the case where a node isn't an IntVal: that will trigger a Nim exception that the Spry interpreter should catch.
If you have made primitives in Squeak/Pharo you know it's much more complicated. You need to take great care with allocation since the GC can move things under your feet, you must convert things to and from C, and building the stuff is messy. Spry, on the other hand, shares its underlying data structures with Nim. In other words, Spry nodes are Nim objects. It's trivial to work with them and to allocate new ones, like `newBlok()` above creating a new block, and so on. This is a huge deal! Recently when I started integrating libui with Spry (a pretty slick GUI library) I got callbacks from libui back into Spry working in like... 30 minutes of thinking. That's HUGE! Doing callbacks from C or C++ back into Squeak has been a really messy and complicated thing for YEARS. Not sure if it's any better today.
Also, going pure Nim would be much faster still, since it would use a `seq[int]` and not a `seq[Node]` (boxed ints) - a vast difference. So if we really wanted to work with large blocks of integers, a special node type could easily be made that exposes primitives for it. Kinda like the FloatArray thing in Squeak, etc.
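To illustrate the boxed-versus-unboxed point in a language-neutral way, here is a hypothetical Python sketch: a `seq[Node]` corresponds to a list of wrapper objects, while a `seq[int]` corresponds to a flat list of plain ints. The `IntVal` class below only mirrors the name of the Spry node type; it is a toy, not the real implementation.

```python
class IntVal:
    """Toy stand-in for a boxed Spry int node (not the real implementation)."""
    __slots__ = ("value",)
    def __init__(self, value):
        self.value = value

boxed = [IntVal(i) for i in range(1000)]  # seq[Node] style: one heap object per element
raw = list(range(1000))                   # seq[int] style: no per-element wrapper

# Same result, but the boxed version pays an extra indirection per element.
assert sum(n.value for n in boxed) == sum(raw) == 499500
```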
Let's look at another example where Spry actually beats Squeak. And by beat I mean really beat, by a factor of 4x! The test is to use `findString:startingAt:` in a fairly large string, to find a match, 2 million times.
```smalltalk
| s time |
"A file with mixed text, 15712 bytes, hit is at 11499, 6 partial hits before that."
s := (StandardFileStream oldFileNamed: 'string.txt') contentsOfEntireFile.
time := [2000000 timesRepeat: [
	s findString: 'native threads and super high performance garbage' startingAt: 12
]] timeToRun
```
This snippet runs in 135 seconds in Squeak 5.1. The corresponding Spry code is:
```
s = readFile "string.txt"
time = ([2000000 timesRepeat: [
	s findString: "native threads and super high performance garbage" startingAt: 12
]] timeToRun)
```
Again, note how Smalltalkish the code looks - and... you know, come on Smalltalk... reading a file? It shouldn't need to be `(StandardFileStream oldFileNamed: 'string.txt') contentsOfEntireFile` for such a common and mundane task!
You gotta admit, `readFile "string.txt"` is nicer. But hey, says the careful reader, what the heck is that? Yes, Spry supports "prefix functions" that take arguments from the right, Rebol style. It isn't used much in Spry code, but for some things it really reads better. For example, in Spry we do `echo "hey"` instead of `Transcript show: 'hey'`. That's another thing that is overly verbose in Smalltalk and should IMHO be fixed, at least to save poor newbies their fingers. Anyway (end of rant)...
...Spry runs that in 33 seconds! And just to get a sense for how large the primitive is in Spry, it's exactly 5 lines of code:
```nimrod
nimMeth("findString:startingAt:"):
  let self = StringVal(evalArgInfix(spry)).value
  let sub = StringVal(evalArg(spry)).value
  let start = IntVal(evalArg(spry)).value
  newValue(find(self, sub, start))
```
It's quite easy to follow. We just pull in the arguments and unbox them into a Nim string, int, int - then we call Nim's find, and we finish by using `newValue()` to box the answer as a Spry IntVal again. This shows how easily - no... trivially - we can map Spry behaviors to Nim library code which runs at the speed of C/C++.
Given all this, it would still be nice to improve Spry to come, say, within 10x of Cog for general code, perhaps in this case shaving it down from 1000 ms to around 300 ms. The things that I do know I should do to improve speed in general are the following:
I hope this got you interested in Spry!
But the last few years, finally, I have started to feel the "burn"... as in "Let's burn our disk packs!". And last year I started doing something about it - and the result is Spry. Spry is only at version 0.break-your-hd and several key parts are still missing, but it's getting interesting already.
Now... is Spry a Smalltalk? And what would that even mean?
I think the reason I am writing this article is because I am feeling a slight frustration that not more people in the Smalltalk community find Spry interesting. :)
And sure, who am I to think Spry is anything remotely interesting... but I would have loved more interest. It may of course change when Spry starts being useful... or perhaps the lack of interest is because it's not "a Smalltalk"?
The Smalltalk family of languages has a fair bit of variation, for example Self is clearly in this family, although it doesn't even have classes, but it maintains a similar "feel" and shares several Smalltalk "values". There have been a lot of Smalltalks over the years, even at PARC they made different variants before releasing Smalltalk-80.
So... if we look at Spry, can it be considered a member of the Smalltalk family?
There is an ANSI standard for Smalltalk - but not many people care about it, except for some vendors perhaps. I should note however that Seaside apparently (I think) has brought about a certain focus on the ANSI standard, since every Smalltalk implementation on earth wants to be able to run Seaside, and Seaside tries to rely only on the ANSI standard (correct me if I am wrong).
Most Smalltalk implementations share a range of characteristics, and a lot of them also follow the ANSI standard, but they can still differ on pretty major points.
My personal take on things in Smalltalk that are pretty darn important and/or unique are:
Not all Smalltalks cover all 10. For example, there are several Smalltalks without the image model and without a browser based IDE. Self and Slate and other prototypical derivatives don't have classes. Some Smalltalks have much less evolved class libraries for sure, and some are more shallow in the "turtle department".
In Spry we are deviating on a range of these points, but we are also definitely matching some of them!
So Spry scores 5/10. Not that shabby! And I am aiming for 3 more (#3, #5, #10) getting us up to 8/10. The two bullets that I can't really promise are #1 and #7, but I hope the alternative approach in Spry for these two bullets still reaches similar effects.
Let's look at #1, #2 and #6 in more detail. The other bullets can also be discussed, but ... not in this article :)
In Smalltalk everything is an object, there are no "fundamental datatypes". Every little thing is an instance of a class which makes the language clean and powerful. There are typically some things that the VM treats differently under the hood, like SmallInteger and BlockClosure etc, but the illusion is quite strong.
Spry on the other hand was born initially as a "Rebol incarnation" and evolved towards Smalltalk given my personal inclination. Rebol, like Spry, is homoiconic, and when I started building Spry it felt very natural to simply let the AST be the fundamental "data is code and code is data" representation. This led to the atomic building block in Spry being the AST Node. So everything is an AST node (referred to simply as a "node" from here on), but there are different kinds of nodes, especially for fundamental datatypes like string, int and float, and they are explicitly implemented in the VM as "boxed" Nim types.
In Smalltalk, being an object implies that we can refer to it and pass it around; objects have a life cycle and are garbage collected, they have an identity, and they are instantiated from classes which describe what messages can be sent to them.
In Spry the same things apply to nodes, except that they are not instantiated from classes. Instead, nodes are either created by the parser through explicit syntax in the parse phase, or they are created during evaluation by cloning already existing ones.
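The clone-instead-of-instantiate idea can be sketched in a few lines of Python (a hypothetical illustration, not Spry code): new "nodes" come from copying an existing one, and no class hierarchy is involved.

```python
import copy

# A prototype "node": just data, with no class describing it.
point = {"x": 0, "y": 0}

# New nodes are made by cloning an existing one and mutating the copy.
p2 = copy.deepcopy(point)
p2["x"] = 5

# The original is untouched; the clone diverged.
assert point["x"] == 0 and p2["x"] == 5
```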
An interesting aspect of Spry's approach is that we can easily create new kinds of nodes as extensions to the Spry VM. And these nodes can fall back on types in the Nim language that the VM is implemented in. This means we can trivially reuse the math libraries, string libraries and so on already available in Nim! In essence, the Spry VM and the Spry language are much more integrated with each other, and since the VM is written in Nim, Nim and Spry live in symbiosis.
Using Spry it should be fully normal and easy to extend and compile your own Spry VM, instead of having to use a downloaded binary VM or learn Black Magic in order to make a plugin for it, as it may feel in the Squeak/Pharo world.
Finally, just as with Smalltalk the meta level is represented and manipulated using the same abstractions as the language offers.
Conclusion? Spry is different but achieves something very similar in practice.
But what kinds of behaviors are associated with a particular node then? In Spry I am experimenting with a model where all nodes can be tagged, and these tags are the basis for polymorphism and dynamic function lookup. You can also avoid tagging and simply write regular functions and call them purely by name, making sure you feed them the right kinds of nodes as arguments; then we have a pure functional model with no dynamic dispatch being performed.
In Spry we have specific node types for the fundamental datatypes int, float, string and a few other things. But for "normal" objects that have instance variables we "model objects as Maps". JavaScript is similar, it has two fundamental composition types - the "array" and the "object" which works like a Map. In Spry we also have these two basic structures but we call them Block and Map. This means we can model an object using a Map, we don't declare instance variables - we just add them dynamically by name to the map.
But just being a Map doesn't make an object - because it doesn't have any behaviors associated with it! In Smalltalk objects know their class which is the basis for behavior dispatch and in Spry I am experimenting with opening up that attribute for more direct manipulation, a concept I call tags:
The net effect of this is that we end up with a very flexible model of dispatch. This style of overloading is a tad similar to structural pattern matching in Erlang/Elixir.
One can easily mimic a class by associating a bunch of functions with a specific tag. The tags on a node have an ordering, which means we also get the inheritance effect: we can inherit a bunch of functions (by adding a tag for them) and then override a subset using another tag, by putting that tag first in the tag collection of the node. Granted, this is all experimental and we will see how it plays out. It does however have a few interesting advantages over class-based models:
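A minimal Python sketch of this tag idea (all names are hypothetical, not Spry's actual machinery): functions are registered per tag, and lookup walks a node's tags in order, so putting a tag first overrides functions inherited via later tags.

```python
# Functions registered per tag; a node carries an ordered list of tags.
funcs = {
    "animal": {"speak": lambda: "..."},
    "dog": {"speak": lambda: "woof"},
}

def lookup(tags, name):
    """Walk the node's tags in order; the first tag providing `name` wins."""
    for tag in tags:
        impl = funcs.get(tag, {})
        if name in impl:
            return impl[name]
    raise LookupError(name)

assert lookup(["dog", "animal"], "speak")() == "woof"  # "dog" first: override
assert lookup(["animal"], "speak")() == "..."          # plain inheritance
```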
I am just starting to explore how this works, so the jury is still out.
Spry supports infix and prefix functions and additionally keyword syntax using a simple parsing transformation. The following variants are available:
```
root
echo "Hey"
if (3 < 4) [echo "yes"]
[1 2 3] size
3 + 4
[1] at: 0 put: 2
1 foo 3 4 5
loadFile: "amodule.sy"
```
This means Spry supports the classic Smalltalk message syntax (unary, binary, keyword) in addition to prefix syntax, which sometimes is quite natural, as for `echo`. Currently there is no syntactic support for cascades, but I am not ruling out introducing something like it down the road.
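The keyword part of that parsing transformation can be illustrated with a small Python sketch (my own toy, token-level model, not Spry's actual parser): keyword parts ending in `:` are folded into one selector, and the interleaved expressions become its arguments.

```python
def fold_keyword(tokens):
    """Fold ['recv', 'at:', '0', 'put:', '2'] into ('at:put:', ['recv', '0', '2'])."""
    receiver, rest = tokens[0], tokens[1:]
    selector = "".join(rest[i] for i in range(0, len(rest), 2))    # 'at:' + 'put:'
    args = [receiver] + [rest[i] for i in range(1, len(rest), 2)]  # receiver first
    return selector, args

assert fold_keyword(["[1]", "at:", "0", "put:", "2"]) == ("at:put:", ["[1]", "0", "2"])
```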
Spry is very different from Smalltalk and I wouldn't call it "a Smalltalk", but rather "Smalltalk-ish". I hope Spry can open up new exciting programming patterns and abilities we haven't seen yet in Smalltalk country.
Hope you like it!
At the moment he is rewriting the parser and code generator parts in the language itself, following a similar bootstrapping style as Ian Piumarta's idst. For example, here is the method parsing keyword messages.
At the moment Fowltalk is nowhere near usefulness but its fun stuff!
It's interesting to look at these `bootstrap*` files - we can immediately notice some syntactic differences from Smalltalk-80:
Block arguments are written like `[| :x :y | ... ]` and you can mix both locals and params there: `[| :aParam aLocalHasNoColon | ... ]`. Instinctively I can agree with the combination, but I would probably then make the first `|` optional.
Some messages have been changed; for example, `ifTrue:ifFalse:` is instead `ifTrue:else:`. I have done similar simplifications in Spry. And just like in Spry, ivars are referenced using `@myIvar`.
There isn't any documentation on Fowltalk yet, but it's clearly a rather elaborate implementation. It compiles to bytecodes, uses numbered primitives (I think) and there is an image mechanism.
It was also quite easy to get the REPL up and running, but just as with Spry, it's hard to know how to use it! On Ubuntu I installed boost with `sudo apt-get install libboost1.58-dev`, and then it was easy to get it running following the instructions, as long as you change `setup-linenoise.sh` to `setup_linenoise.sh`.
The image constructed by the bootstrap process is 67Mb in size. Then we can do the canonical Smalltalk test in the REPL:
```bash
gokr@yoda:~/fowltalk/idk$ ./bin/oop -i oop.img --mmap --repl
(3 + 4) print
7
!quit
gokr@yoda:~/fowltalk/idk$
```
Fowl mentioned that the new parser can be loaded using `!read bootstrap.1`, but... at the moment that causes errors.
It will be interesting to see where this goes! Fowltalk is very early in its evolution, and it's not a JIT, but it's a real bytecode VM with an image and we can never have enough Smalltalk-like languages! :)
In this article I do some silly experiments around interpreter startup time and fooling around with 40 million element arrays. As usual, I am fully aware that the languages (Pharo Smalltalk, NodeJS, Python) I compare with a) have lots of other ways to do things b) may not have been used exactly as someone else would have done it. A truck load of salt required. Now... let's go!
Spry is pretty fast starting up which obviously has to do with Spry not doing much at all when starting :)
So a trivial hello world run via hashbang, executed 1000 times from another bash script, takes substantially less time than the same in Python. A useful benchmark? Not really, but obviously we can do scripting with Spry and at least we are not paying much for startup time! Here are the two trivial scripts and the bash script running them 1000 times:
```
echo "Hello world"
```

```python
print "Hello World"
```
```sh
for run in {1..1000}
do
  ./hello.sy
done
```
If we run the above, first for `hello.sy` and then `hello.py`, as reported by `time`:
```
real	0m4.071s
user	0m0.740s
sys	0m0.428s

real	0m13.812s
user	0m8.904s
sys	0m2.324s

real	0m2.505s
user	0m0.024s
sys	0m0.176s
```
Hum! So a trivial Spry script is 3-10x quicker depending on what you count (real clock vs cpu time etc), and... no, it's not output to stdout that is the issue, even a "silent" program that just concatenates "hello" with "world" suffers similarly in Python.
We can of course also compile this into a binary by embedding the Spry source code in a Nim program - it's actually trivial to do. The 5th line below could of course be a full script. Since the Spry interpreter is modular we can pick some base modules to include; in this case the IO module is needed for `echo` to work, so we add it to the interpreter on line 3:
```nimrod
import spryvm, modules/spryio
let spry = newInterpreter()
spry.addIO()
discard spry.eval """[
echo "Hello World"
]"""
```
..and then we build a binary using `nim c -d:release hello.nim`, and if we run that instead from the same bash loop we get:
```
real	0m0.840s
user	0m0.028s
sys	0m0.096s
```
Of course Python can do lots of similar tricks, so I am not making any claims! But still very neat. And oh, we didn't even try comparing to Pharo here :) Startup time is definitely not a strength of Smalltalk systems in general, typically due to the lack of minimal images etc.
I wanted to create some fat collection and do some loops over it. Spry has a universal ordered collection called a `Block`. Smalltalk has its workhorse `OrderedCollection`. NodeJS has an `Array`. Let's stuff one with 40 million integers and then sum them up!
NOTE: The first numbers published were a bit off and I also realized an issue with Cog and LargeIntegers so this article is adjusted.
Pharo 4 with the Cog VM:
NodeJS 4.4.1:
Python 2.7.10:
Spry:
Spry with activation record reuse:
Ehum...
NOTES
If we spend some time profiling Spry we can quickly conclude that the main bottleneck is the lack of a binding phase in Spry - or in other words, every time we run a block, we look up all words! Unless I am reading the profile wrong, I think the endless lookups make up almost half the execution time. So that needs fixing. I will also move to a stackless interpreter down the line, and that should give us a bit more.
And what about Python's `sum` function that did it in a whopping 0.3 seconds? Yep, an optimized primitive function is definitely the way to go for this, which brings me to...
The secret weapon of Spry!
One core idea of Spry is to make a Smalltalk-ish language with its inner machinery implemented in Nim using Nim data types. So the collection workhorse in Spry, the `Block`, is just a Nim `seq` under the hood. This is very important.
Combined with a very simple way of making Nim primitives, we can quickly cobble up a 6-line primitive word called `sum` that will sum up the ints in a block. We simply use the fact that we know the block consists only of integers. I am guessing the `sum` function of Python does something similar.
Here is the code heavily commented:
```nimrod
nimPrim("sum", true, 1):
  # The local variable spry refers to the Interpreter.
  # evalArgInfix(spry) is a function call that returns
  # the infix argument evaluated at the call site.
  # We then cast this to a SeqComposite which is the
  # abstract super type of Blocks.
  let blk = SeqComposite(evalArgInfix(spry))
  var sum = 0
  # This is Nim iteration over the nodes member
  # of the SeqComposite, a seq of Nodes.
  for each in blk.nodes:
    # We cast the node to IntVal since we know
    # it's an int, and then we can get the value member
    # which is a regular Nim int.
    sum = sum + IntVal(each).value
  # All Spry functions return Nodes, so we wrap the int
  # as a Node using newValue() which will wrap it as an
  # IntVal.
  return newValue(sum)
```

It's worth noting that almost all primitive words in Spry are written using this same pattern - so there are lots of examples to look at! Of course this is a bit of "cheating", but it's also interesting to see how easy it is for us to drop down to Nim in Spry. We create a new word bound to a primitive function in exactly 6 lines of code.
So how fast is Spry using this primitive word? It sums up in blazing 0.15 seconds, about 100x faster than Cog and 10x faster than NodeJS for summing up. And yeah, even 2x faster than Python!
And yes, we can easily make this primitive smarter so it handles blocks with a mix of ints and floats, and raises a proper exception if there is something else in there - then it ends up being 17 lines of code, and still almost as fast: 0.17-0.18 seconds! I love you, Nim.
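As a rough sketch of what that smarter primitive does (in Python rather than Nim, with invented names): accept ints and floats, and raise a proper error for anything else.

```python
def sum_numbers(block):
    """Sketch of a mixed int/float sum that rejects non-numbers."""
    total = 0
    for each in block:
        # bool is technically an int in Python, so exclude it explicitly.
        if isinstance(each, (int, float)) and not isinstance(each, bool):
            total += each
        else:
            raise TypeError("not a number: %r" % (each,))
    return total

assert sum_numbers([1, 2, 3.5]) == 6.5
```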
In summary, Cog - which is what I am most interested in comparing with - is fast, but my personal goal is to get Spry within 5x of it in general speed - and that would be a good number for a pure interpreter vs an advanced JIT. And if we throw in primitive words - which is not hard - Spry can be very fast!
That's obviously hard to do, but I am trying a little bit by questioning every little thing that many consider not even relevant or possible to question. Some examples are:
Everyone is so busy "doing stuff" that no one takes the time to actually reflect. Can we really not create a development system in which I can see exactly what is going on? Are there really no more powerful ways to do debugging?
So I may not look into the future, but most good ideas come from someone doing something unexpected, weird, impossible or downright stupid. In Spry I want us to try a few of those :)
I don't really dare, but I think it's safe to say that Virtual Reality is probably going to be accessible everywhere. JavaScript has hopefully waned, but left behind a much lower threshold to programming, making it the norm rather than the exception. Everyone wants to be able to program. Hardware is basically free, very capable and everywhere. People tend to think that the web is taking over everything, but I don't think it's that simple - I think diversity is going to be much higher due to new companies creating new kinds of devices. Many more devices.
How does this affect choices in Spry? Well, I tend to not let performance considerations hinder various ideas. I also focus pretty hard on mobility of code and data, since I think we should be able to find a lot more models of computing in the area of distributed systems.
Finally I do think DSLs in different shapes or forms will play a big part in the future - so Spry should have excellent capabilities for that.
I also want Spry to be modular on most levels, while still being fairly simple.
The Smalltalk team was focused on user interfaces, education and children. With Spry "people" means primarily "developers".
I don't think 20 years will remove the need for writing code, but the pressure for fast results will be immensely higher. I also think the boundaries of computing will be much fuzzier and that we will need to have more advanced tools to create and mold code into doing what we want. Things will run on many devices, distributed in novel ways reaching places in our lives we can not really imagine.
I want to create and modify systems live as they run, as they are being used. Not just run locally, or as prototypes, but as they run live in deployment. Continuous deployment will probably evolve into 100% live online development. How will that affect developers? What tools do we need? How can we evolve a live system with confidence?
This implies we will have to create much more powerful ways to create, debug and modify code. We need to raise the abstraction levels, but perhaps a key to that is to create a homoiconic language that lends itself to introspection and self reference. Smalltalk didn't do that (only to some extent), nor did JavaScript. The Lisp family of languages did to some extent, but for various reasons never really took off. Hard to say why.
One vision is a globally shared live system of cooperating Spry objects. Like GemStone/S, but on a global scale and taken even further to the extreme. Today developers share code - dead code - via various package catalogs and copy/paste forums. SaaS and PaaS platforms etc. are trying to create shared platforms, but it's still very much centered around the same coding model, where we don't share actual functionality but merely code and libraries to recreate the functionality on our own.
To be concrete: instead of downloading a library and creating a small service that consumes a live feed of data and produces a stream of Spry objects, in Spry we would find not a library but an existing, live running service that we just hook into. The module is not dead code, but an actual live, running service.
This is homoiconicity driven all the way! Over the years, sharing of objects has been tried via various RPC-ish standards like CORBA or RMI, but those standards have always revolved around static early binding and separate specifications, and have thus later been completely run over by late-binding, self-describing technologies like REST-ful APIs using JSON and similar "soft" formats. Late binding and self description are key to how modern development is done to a large extent - experimentation.
Another vision is Spry being a language to serve as a new foundation for transferable portable active code. Kinda like a JavaScript that doesn't suck and that is homoiconic and thus easy to make tools for.
But in the end... I don't have any grand delusions about Spry - it's all for fun and I just hope some of us will find it useful!
Obviously I don't have this. Yet. I hope that if I can make enough progress on my own, then people will join. And I stand on firm shoulders in the form of Nim, which makes Spry suffer less from NIH (no pun intended). I hope that some Smalltalkers will eventually join, but I need a good solid language manual and perhaps even a reasonably interesting IDE to get any real traction.
Spry has almost reached the point where we can start working on the fun stuff. The module system, serialization mechanisms and lossless AST improvements are all crucial steps towards this. The next step is getting the OO model working and doing the Sophia integration to get a working image based system. After that I suspect it's time to make a first IDE. I do have some plans and ideas for that too :)
I think this is definitely important. The existing REPL is just a crude first trivial step. But it will get better!
I personally want to apply Spry in the domains of VR and IoT. Web systems are no longer that interesting to me, but if someone would like to evolve Spry compiled to JS, I would be very grateful.
Hopefully, eventually! :)