Smalltalk


8
May 12

Literal arrays vs JSON vs STON vs Tirade

Recently there was a range of threads on the pharo-dev mailing list discussing the textual format to use for Smalltalk source code metadata. The discussion veered off from the specific use case, but basically four different formats were discussed and compared, one of which I authored. And oh, sorry for the formatting of this article – I need to change theme on this blog for better readability.

JSON

The first format is JSON, JavaScript Object Notation. JSON is a simple, language-neutral (despite its name), readable format that is very small to implement. It is a restricted variant of the native JavaScript literal syntax for objects (dictionaries) and arrays. It excels in simplicity and lacks a bit in features, but people tend to ignore those shortcomings due to its widespread adoption. I will not go into describing it, json.org does a very good job and there are TONS of JSON implementations around.

STON

Sven Van Caekenberghe recently created a variation on JSON he calls STON, Smalltalk Object Notation. STON is basically JSON plus the following:

  • Object references, the concept of being able to refer to other previously described arrays/objects in the STON file. This is done by number using the @-sign, so “@2” refers to the second array/object in the file.
  • Class prefixing, the idea of annotating arrays and objects (JSON terminology) so that one can instantiate a reasonable class when reading.
  • Symbols, simply adding support for a primitive data type for Smalltalk symbols, although I do note – a limited form of Symbols not allowing the same range of characters in them as Squeak/Pharo does.

Then there are a few subtle differences from JSON, like using $’ instead of $” as string delimiter and nil instead of null, but not much else that I can see. Numbers seem to be exactly the same as in JSON, and escape codes inside strings are also the same, obviously by design.

First I admit that I have not played with STON, my comparison is purely in theory. STON has the same basic positive notes that JSON has, it is small, simple and well defined. But are the differences worth it?

JSON is everywhere and there are already tons of parsers for it, probably in every Smalltalk on earth, and of course all other languages too. STON on the other hand is Smalltalk only, and as of this writing probably Pharo only, although I admit it must be simple to port.

It boils down to whether the additions are worth it, and I don’t think they are. Embedding class names, if needed, could be done in JSON, although slightly inelegantly of course. One approach would be to wrap each “typed” object/array in an object like this:

ByteArray [1, 2, 3] ==> {"type": "ByteArray", "data": [1, 2, 3]}

I agree, clunky, but on the other hand I tend to think that the parsing end needs to know the semantics and construction of the JSON anyway – JSON is too “simplistic” to be used as a true generic serialization mechanism and trying to turn it into such a beast by adding types and references, like STON does, is IMHO not that useful.
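To make the wrapping idea concrete, here is a minimal sketch in Python (chosen only because it ships with a JSON library). The helper names `wrap` and `unwrap` are hypothetical, and mapping “ByteArray” to Python `bytes` is an assumption made for illustration. Note how the reading end has to know the convention, which is exactly the point made above:

```python
import json

def wrap(obj):
    """Recursively replace bytes values with a {"type", "data"} wrapper object."""
    if isinstance(obj, (bytes, bytearray)):
        return {"type": "ByteArray", "data": list(obj)}
    if isinstance(obj, dict):
        return {key: wrap(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [wrap(value) for value in obj]
    return obj

def unwrap(obj):
    """Reverse of wrap: the reader must know the wrapper convention."""
    if isinstance(obj, dict):
        if set(obj) == {"type", "data"} and obj["type"] == "ByteArray":
            return bytes(obj["data"])
        return {key: unwrap(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [unwrap(value) for value in obj]
    return obj

text = json.dumps(wrap({"payload": bytes([1, 2, 3])}))
restored = unwrap(json.loads(text))
```

The file itself stays plain JSON that any parser can read; only the `unwrap` step is specific to this convention.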

STON looks neat, but in practice I don’t think the benefits outweigh the ubiquity and availability of JSON. Had it been even more different it might have been another story. But if we don’t think we will use type annotations and circular references – then why not simply use JSON?

Literal Smalltalk arrays

The simplest notation of all in the lineup is the literal array syntax in Smalltalk. The example below covers all its capabilities AFAIK (in Pharo/Squeak), please tell me if I missed anything:

#(4711 3.4 16r3F 'string' #symbol #'another-symbol' (nested array) #(one more) true false nil $x #[12 32])

So we have space separated elements and arrays that can nest, with or without #-prefix inside the array. Primitive literals are numbers (full numeric Smalltalk parser, not as limited as JSON/STON), strings (no escape codes, single quotes need to be doubled), symbols (can handle more characters than STON symbols), character literals, byte array literals and true/false/nil.

Literal arrays are quite nice but they lack the concept of “associations” and thus no simple readable way to represent a Dictionary. And that is a BIG negative. Funny enough, if we added support for literal dictionaries to Smalltalk then literal arrays would match JSON, with a few extras on the side!

Amber has recently added support for dynamic literal HashedCollections using this syntax:

#{'hey'->12 . aString->'123123'}

It is simply a dynamic {} array (originally introduced in Squeak, I believe) but with the assumption that the expressions all evaluate to Associations that are limited to a string as key. This is because it will be turned into a HashedCollection, which is the Amber counterpart of a JavaScript object, and JavaScript objects are limited to having strings as keys (Sidenote: Amber also has a generic Dictionary without that limitation).

Without a syntax for dictionaries, literal arrays, although nifty and syntactically quite compact, are still limited in expression. And of course, while Smalltalk literals are fairly simple to parse, other languages do not typically know how to do it – and when it comes to numbers, the Smalltalk full range of syntax is perhaps a bit of an overkill if we aim at cross language portability. Having literal syntax for Characters is also clearly of less value, ByteArrays on the other hand are obviously useful.

Sidestory: Adding literal Dictionaries to Smalltalk?

Smalltalk only evolves in micro steps every other 10 years, but with the current onslaught of Pharo perhaps there is an opportunity to actually take a few more such steps.

We will see below that Tirade has added support for “->” as a literal syntax instead of being a message send and as I mentioned above Amber has added a special syntax for dynamic Dictionaries, and that was actually done in order to more easily match JavaScript object syntax when interacting with JavaScript.

So perhaps the Smalltalk/Pharo community could decide to add literal Dictionaries to Smalltalk using the Amber “#{” syntax? In such a syntax the separators between Associations can probably not be spaces, it gets confusing to read:

#{ key -> value key2 -> value2 }

A separator is clearly needed and since we use periods generally for that in Smalltalk it’s a good choice. Syntactically it could lead people to think it’s a dynamic Dictionary, but let’s continue the thought experiment. How would it look? As is customary for #() we can omit the # inside the array:

#(123 'hey' {key -> value. key2 -> value} 456)

It looks fairly nice. However I do admit that we probably should take a long hard look at all our syntaxes and try to bring some harmony to them. Currently, due to legacy, we have literal and dynamic Arrays using #() and {}. A bit unfortunate since we then use both $( and ${ as delimiters for Arrays and make it harder to find good characters for Dictionaries.

It would be nice to have a symmetric syntax. Ideally the leading # could indicate “literalness” – and perhaps we could use another character to indicate dynamic evaluation? Again, just a thought:

  • #() – literal Array
  • §() – dynamic Array, expressions separated by periods.
  • #{} – literal Dictionary, literals separated by periods, support for associations as literals.
  • §{} – dynamic Dictionary, expressions separated by periods, associations created as usual using sends.

Yeah, right, how would we ever be able to reach consensus on a leading dynamic character? :) Also, I do think it is wise to syntactically indicate literal vs dynamic, heuristics only lead to developer traps. Better to clearly indicate intention.

Tirade

Tirade is a format I created for Deltas (ChangeSets improved) and I have written three articles about it earlier. Now, if I were to subjectively rank the formats at this point along a few axes, it could look like this:

  • Interoperability
    • JSON: 100% (all languages have it)
    • STON: 70% (one could probably tweak a JSON parser in any language to work)
    • Litarrays: 30% (could get higher score if we limit them, a parser would still have to be written)
    • Tirade: 20% (same problem as with literal arrays, but even more advanced to parse)
  • Capability
    • Tirade: 100% (has the most features and options, by some margin)
    • STON: 60% (second best, still not much better than JSON)
    • JSON: 50%
    • Litarrays: 40% (severely limited by lack of associations but has some features to compensate)
  • Grokkability
    • JSON: 100% (well documented, we all know it and so does the rest of the world)
    • STON: 90% (rides on JSON)
    • Litarrays: 80% (not hard but has quite a few quirks)
    • Tirade: 70% (more or less as hard as literal arrays, but with a few more concepts added)

Conclusions from the above? Before looking at Tirade I think we can safely say that JSON is a strong choice. STON is IMHO in limbo, I can’t see picking it instead of any of the others in a given situation, sorry. Literal arrays could easily become the obvious “JSON for Smalltalk” if they had associations/literal dictionaries, but they suck for interoperability.

Tirade on the other hand has associations (on two levels one could even claim) so it can be viewed as “JSON++ for Smalltalk”. But with more features comes a slightly higher learning curve and a penalty in interoperability. We now have set the scene for the last section about Tirade.

Tirade

Obviously I am partial, since I created Tirade. But let me try to contrast Tirade to all the others. Note that Tirade was never meant to be interoperable with other languages, it was however designed to be interoperable between different Smalltalk implementations, or at least all Squeak derivatives.

A stream of messages

First of all, Tirade is slightly different than the others. They describe a single structure. A valid Tirade “document” on the other hand, is a series of “records” terminated by periods. Each such “record” looks like a Smalltalk message (but without a receiver on the left side), either a unary or a keyword message, like this:

unaryMessage.
key: 'Hello' word: 'world' message: 4711.

This high level view as a “stream of messages” gives us several nice properties:

  • The selector of the Tirade message is a kind of record “type”. It normally maps to a method on the receiving end that handles this record. That method then knows what to do with the arguments, and thus we don’t need to hard code class names into Tirade, like STON does. NOTE: This is not a security problem. There is nothing forcing the parsing end to just blindly perform these messages. In fact, there is nothing forcing the parsing end to be specific at all, it could just be a generic Tirade parser.
  • If we look at a keyword message we realize that it is very similar to a JSON object, it is basically a “naked dictionary” where each key word is… right, a key! :) So for simple data we need perhaps not make it more complicated than this.
  • It makes it very easy to extend a Tirade format by simply adding new message selectors that the receiving end can ignore if it wants to.
  • Since Tirade is a flow of messages instead of a single, potentially quite large, structure like the other three formats, we can naturally stream it and handle each message one by one.
  • And since we have this flow we can also use “control messages” that can instruct the receiving end on how to receive the messages coming next in the flow. One could even use Tirade over a bidirectional link (a SocketStream for example) and do handshaking and client server communication with it.
  • Finally, in between Tirade messages one can add Smalltalk style comments which are simply skipped by the parser. JSON and STON have no concept of comments.
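The “selector as record type” idea can be sketched outside Smalltalk too. The Python below is not the Tirade implementation. It assumes the records have already been parsed into (selector, arguments) pairs, and the function name `apply_records` and the handler table are hypothetical. It only illustrates that the receiving end decides which selectors it accepts and can skip the rest, one record at a time:

```python
def apply_records(records, handlers):
    """Dispatch each (selector, args) record to a whitelisted handler.

    Records are consumed one by one, so `records` can be any iterable,
    including a generator reading from a stream. Unknown selectors are
    never executed; they are collected and returned instead.
    """
    ignored = []
    for selector, args in records:
        handler = handlers.get(selector)
        if handler is None:
            ignored.append(selector)
            continue
        handler(*args)
    return ignored

greetings = []
handlers = {
    # The keyword selector acts as the record "type".
    "key:word:message:": lambda key, word, number: greetings.append((key, word, number)),
}
records = [
    ("unaryMessage", []),
    ("key:word:message:", ["Hello", "world", 4711]),
]
ignored = apply_records(records, handlers)
```

Because only selectors present in `handlers` are ever called, nothing in the input can trigger arbitrary code on the receiving side.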

Smalltalk literals

The next level of Tirade is what kind of arguments we are allowed to put in between the keywords. Basically it’s most kinds of Smalltalk literals with some additional constructs. I would also like to point out that this part is not carved in stone, I am still contemplating the best mix of literal support here. But the main point is that we only allow literals – no expressions, so there is no generic “eval” going on here.

Notable differences again compared to JSON/STON on the atomic level are just like with literal arrays:

  • Strings are Smalltalk strings, no escape codes except for double single quote for single quote.
  • Numbers are Smalltalk literal numbers, in fact we rely on the number parser of Pharo/Squeak. This gives us a rich notation for numbers, at the expense of possible portability issues with other Smalltalks.

NOTE: Tirade doesn’t currently implement Character literals nor ByteArrays, both can of course be added.

Let’s continue with the added features for literals.

Literal feature: Verbatim strings

A problem with JSON for dealing with readability is that JSON strings can’t have newlines in them! So if you want to store source code in JSON it will end up as a single very long line.
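A quick way to see this, sketched with Python’s standard `json` module (the method source used here is just a made-up sample):

```python
import json

# A two-line method body with a newline and a TAB in it.
source = "at: pos put: arg\n\t^array at: pos put: arg"

# Encoding turns the newline and TAB into escape sequences on one line.
encoded = json.dumps(source)

# A raw newline inside a JSON string is rejected by a strict parser.
try:
    json.loads('"line one\nline two"')
    raw_newline_accepted = True
except json.JSONDecodeError:
    raw_newline_accepted = False
```

The round trip is lossless, but the stored form is a single line full of `\n` and `\t` escapes, which is hard to read and hard to edit by hand.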

Smalltalk strings like in Tirade can have newlines in them, but they suffer from double quoting of single quotes and the problem that the single quotes surrounding the string need to be first on the first line and last on the last line, which makes it less readable.

This is why I came up with verbatim strings in Tirade, specifically for being able to contain unmodified source code in a readable way with no escapes whatsoever. I am not sure if this is the best approach, perhaps here-docs would be a simpler approach, but currently a verbatim string looks like this:

some: 1 message: 'hey' withVerbatimStringForCode: [
 This is untouched, perfectly unescaped source code, ANY character combinations will work!
 Tirade will split the input on each CR (byte = 13) and then prepend each line with a TAB character.
 This means that the parser can detect the end by looking for the first line starting with "]",
 that must be the end of the verbatim string since all other lines start with TAB.
 Copy paste will work but you will need to care for the TAB indentation, but most editors
 can do that easily. Also, right before and after the string there is a newline added to improve readability.
].

Literal feature: Associations

Since we really want to be able to do dictionaries I first added literal support for Associations. This means “->” is a literal syntax for creating an Association, it doesn’t need to be in a Dictionary, you can use them wherever you like and the key and value can be ANY literal construct allowed by Tirade, even an Association!

Note though that we do not have parentheses in Tirade (no expressions at all) and the current Tirade parser is a recursive descent bottom up parser, so the code below will produce an Association with key #key and value an Association 123->'123'. In Smalltalk, where #-> is a message, this is instead executed from left to right, creating a different result.

cool: #key->123->'123'.
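The difference between the two readings can be shown with nested pairs. This is only an illustration in Python, with tuples standing in for Associations and the helper names `nest_right` and `nest_left` made up for the purpose:

```python
from functools import reduce

def nest_right(items):
    """Tirade-style reading: a -> b -> c becomes a -> (b -> c)."""
    head, *rest = items
    return head if not rest else (head, nest_right(rest))

def nest_left(items):
    """Smalltalk message sends: a -> b -> c evaluates as (a -> b) -> c."""
    return reduce(lambda acc, item: (acc, item), items)

parts = ["#key", 123, "'123'"]
tirade_reading = nest_right(parts)
smalltalk_reading = nest_left(parts)
```

In the first reading the key is #key and the value is itself a pair; in the second the key is the pair and the value is the string.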

This also means that Tirade can have associations inside literal arrays, which is not syntactically possible in Squeak/Pharo:

cool: #(12->'123').

Finally, since Amber lately added #{} syntax for Dictionaries I think it could be a worthwhile addition to Tirade also.

Literal feature: Dynamic arrays as literal

Tirade supports {} style arrays, but doesn’t allow expressions so they are very much like normal arrays except they do not remove #-prefixes from nested arrays/symbols and they look more natural to Squeakers since Squeak allows Association literals inside them:

cool: {12->'123'. 'banana'->true}.

Is it worth supporting both kinds of arrays? It depends, either Tirade defines a literal subset that is as small as possible, or Tirade tries to cover all literals of Pharo. I was leaning towards a subset but perhaps a super set is more attractive to people.

Ending thoughts

I hope this article explained a few things and made at least Tirade a bit clearer. There are several things not fully settled in Tirade and if anyone wants to dig in and tweak it, feel free to email me.

regards, Göran


7
Feb 12

Current Smalltalk obsessions…

These days I am, as usual, torn between several interesting technical projects.

Amber

The new Smalltalk called Amber (by Nicolas Petton) that compiles to JavaScript is pretty awesome and there are tons of interesting things one can do with it. My contributions so far include the beginning of a package model, a faster simpler chunk format exporter/importer, a command line compiler, a Makefile system so that Amber can be built fully from the command line, a bunch of examples running on top of Nodejs and webOS, and a few other odds and ends.

I would like to port Deltas to Amber in order to create a powerful toolset for managing code changes. Using local storage it would among other things enable undo and change logging to prevent accidental code loss. It could also easily form the basis for a “commit tool”, similar functionality that git stash offers etc.

Another thing I would like to build is a dead simple public shared package repository. And play with Socket.IO, or just fool around with the compiler trying to add optimizations like various type inferencing, optimizing self and super sends etc :) . So much fun stuff to do!

STOMP and Apollo

For a personal “secret project X” I need scalability so it is being designed with lots of daemons each taking care of a specific task. I want to be able to implement these daemons primarily in either Nodejs (in plain js or using Amber) or Pharo Smalltalk, but also in any other language that fits.

This requires some kind of messaging infrastructure to tie them together. So… after looking hard and long and reading a lot about messaging, job scheduling, AMQP, 0MQ, STOMP, Beanstalkd, RabbitMQ, ActiveMQ Apollo (and tons of other things) I decided to try to use the new ActiveMQ Apollo together with STOMP 1.1 (which should also be supported by the STOMP plugin for RabbitMQ etc).

The new Apollo implementation is written in Scala using HawtDispatch so the architecture seems modern and the JVM of course has very good performance these days. So, while I generally am very tired of Java and its ecosystem, this actually seems like a solid product and has already shown very impressive numbers in benchmarks.

So a sound asynchronous architecture with good performance is nice but the other thing I like with ActiveMQ is their focus on STOMP. Since I intend to use Pharo as one major component I need to be able to hook it into the messaging backbone. And sure, Tony Garnock Jones – one of the main developers behind RabbitMQ – actually has an AMQP client library written for Squeak 3.9, so I could probably use AMQP, but I somehow foresee a “world of hurt” in the complexity given that AMQP is an order of magnitude more complex than STOMP.

I have already implemented STOMP 1.0 for Pharo, actually tried it with RabbitMQ at the time, so I am now upgrading that library to work with 1.1 of the specification.

Riak

The other important piece of the puzzle for true “Internet scalability” is of course the choice of persistence. I am a long time fan of the new NoSQL databases and having played with a few of them, implemented a C# binding for CouchDB, hacked some bindings in Squeak for both CouchDB and Tokyo Tyrant… I now have decided to focus on Riak. Riak is IMHO the most interesting NoSQL database out there right now, at least for worry free ultra scaling. Sure, it may not be the fastest on a single box – but if you are really serious about scaling – one box is totally uninteresting. :)

Runar Jordahl had already started a Riak binding in Pharo, I took it and changed quite a lot of it – not really because it was “bad” or anything, I just have a different style of coding I guess. So I decided to fork because I didn’t feel comfortable – thus Phriak was born. Now Nicolas Petton is getting hard into Riak too and has pushed Phriak forward quite a LOT in the last few days, much further than I had time to do. It now has a clean command style protocol implementation, an object model similar to the one in Ripple (Ruby Riak client) and initial working code for secondary indexing, link walking and map/reduce! Quite impressive stuff.

Nicolas is also experimenting with writing an “OODB-ish” database using Fuel called Oak and after I managed to get him hooked on Riak he has been moving that codebase over onto Phriak. The initial experience we have with Phriak and Oak is extremely promising and who knows where this will lead.

Happy coding, Göran


25
Aug 11

ESUG day 4

This day started with some stress, Nicolas and I whipped up the last details of our co-presentation on Jtalk (Nicolas decided to skip Iliad) – and my Eris demo suddenly got b0rken. But I managed to fix it and our presentation was very well received – it was great fun!

Nicolas managed to do quite a few “on the fly” demonstrations of various Jtalk snippets etc, and running the slides in Jtalk was of course a killer thing. I explained how jtalkc is being run on top of Node.js and quickly proceeded into showing the TrivialServer demo in Node.js – when Apache benchmark showed 1800 requests/second there was spontaneous applause. :)

Now we can relax and talk to all people about Jtalk – and now in fact the web panel starts with Nicolas on the panel. Unfortunately the panel discussion didn’t play out that well, it needs some entertainment and also at least one or two that disagree :)

Later tonight and tomorrow we will probably keep on hacking Jtalk like mad. So much fun stuff to play with! We intend to “finish” the first stab at so called “speculative inlining” that we started earlier this week, and try to do some profiling on it to verify the gains. Using the Compiler is actually a good candidate for a reasonable benchmark.

The evening ended with the usual pubs and hacking and chatting about cool things people are doing.


24
Aug 11

ESUG day 3

Suddenly it is Wednesday and we are already on day three at ESUG - a superb software developer conference focused on Smalltalk. Time flies. Yesterday I mainly hacked together with Nicolas Petton on Jtalk, really fun, unfortunately I missed a few interesting presentations, like Fuel and Bifrost etc.

This day starts with Stéphane presenting “Humane assessment”. Mmm, got distracted by my Touchpad, but Stéphane is showing some cool visualizations right now, clearly useful for large systems and organisations that need to understand their own “huge legacy software”. Hehe, the browser shows visual cues on “bad designs” like marking methods as “BrainMethod” or marking a class as “God Class” – that is indeed very slick!

All in all it looks like a very useful tool – I should probably try it out on some codebase. In fact, this tool is a really good “added value” tool that can be offered to customers when helping them. I have at least one client that really could make some good use of a tool like this.

Next up before coffee is Arden Thomas from Cincom (hehe, that was funny, the Touchpad wanted to correct “Cincom” to “Condom”…) presenting what is new in their products / ObjectStudio and VisualWorks. These are really mature and amazing Smalltalk tools, but of course they also cost money, money, money. But VisualWorks is accessible in a non commercial full version, which is quite nice if it fits your needs. Cincom is also quite active in a bunch of open source Smalltalk projects like for example GLORP (think “Hibernate” for all you non-Smalltalkers) and Seaside (the most outstanding web framework in the world).

After running around flaunting the Touchpad :) – I came slightly late to Igor Stasenko’s presentation on NativeBoost. I have worked with Igor and he has this refreshing “fearlessness” so diving into assembler is not a problem for him. So NativeBoost is an extension to the Squeak VM (and the new Cog VM) that enables dynamic machine code generation – and execution – directly from Smalltalk using just Smalltalk. So it includes a DSL for writing assembler (a port of AsmJit) and mechanisms to access memory etc etc. The machine code needs to be relocation agnostic since it is actually stored directly in a Smalltalk object (the method) and will be moving around due to the garbage collector moving things around. Another interesting issue is that if the machine code calls into the VM in order to create a Smalltalk object, it will need to be aware of the fact that this can trigger GCs and move things around – but this is just the same for building VM plugins. Of course, Igor’s stuff is very impressive and you can make very fast code using it.

The day then ended with the social event and announcing the winners of the awards and a nice dinner followed up with some beer and endless “Why doesn’t everyone use Smalltalk?” discussions – as is customary.

Over and out, Goran “typing this in on my Touchpad using the bluetooth keyboard”


23
Aug 11

Touchpad finally in my hands, first day

Sooo…. I actually managed to order a HP Touchpad 32Gb here in Edinburgh to be picked up at Comet within 48 hours. I ordered when it was still a whopping 429£, but when I went to pick it up I got it at the UK discount price of 115, and I will get the VAT back too.

The first hours were frustrating because I was in the ESUG conference and we only had a WiFi with a so called “captive portal” with a login form – and the first time you power up a Touchpad it wants to hook up to a Palm Profile, and does not want to do that using a WiFi with a captive portal.

The Montague pub to the rescue later that evening, an open wifi. I am currently writing this post using the Bluetooth keyboard (so nice) while the TP is snugly positioned on the Touchstone inductive charger. Both these are great accessories. I have also managed to do Skype with my wife, really easy and worked well, hook up the calendar to Google with perfect sync, and in fact it synched over all my contacts etc from my Palm Profile for my Palm Pre 2 – just works!

I have done the OTA 3.0.2 update (in the pub while eating) and I have installed a bunch of apps, like the one I am typing in now – for WordPress. I have also activated the included 50Gb of free cloud space from box.net – brilliant.

Email app is running fine, Facebook app is very good, tons of other little nifty things – I am a happy camper! Is it just as “smooth as silk” as the iPad? No, but it excels in other areas like true multitasking, a real Linux beneath (bonus for me as a developer), synergy, full flash, 50Gb cloudspace for life included, really good virtual keyboard (multiple sizes even) etc etc. Sure, slightly thicker and slightly heavier – but…. BUT…. It cost me around 85£ with 32Gb of storage. That argument is a killer.

Day after tomorrow I will be demonstrating apps written in Jtalk running on it – yiha!


11
Aug 11

ESUG 2011 in Edinburgh

Each year I try to attend at least one developer conference. Earlier OOPSLA was a given but it lost its appeal quite a few years back and now it is not even called OOPSLA anymore. As a die hard Smalltalker I instead attended the ESUG conference in Brest 2009 and it was easily the most rewarding conference I have ever attended! Missed last year in Barcelona but this year I am going to Edinburgh for a week of Smalltalking.

I am not presenting anything but I hope I will get my HP Touchpad from Amazon before it starts so that I can demonstrate a WebOS app running on it written in Jtalk.

If you are going too, see you there!


15
Apr 11

Tirade, supporting embedded text

Two years ago I ended up creating Tirade – a new “file format” for Smalltalkers. Or rather, a way to serialize stuff into a sequence of Smalltalk messages with literals as arguments. I have written a few blog articles about Tirade so I will not go into details in this one.

One thing that has been bothering me with Tirade is that I wanted it to be the main format for serializing Deltas, the new implementation of “21st Century ChangeSets”. This means I want Tirade to handle Smalltalk source code in the best possible way. Ideally I would want the Tirade file to be editable in a text editor, without the edit breaking it.

So, what properties do we want:

  1. No escaping of special characters. Regular Tirade strings (just like in Smalltalk) need to escape the single quote as a doubled single quote, and that would suck for Smalltalk code of course.
  2. No length encoding. One way to avoid escaping is to store the length of the data before the actual data – like a Netstring for example. This prohibits easy editing in a text editor though, since that would change the length.
  3. A reasonable syntax. Tirade so far has been a subset of Smalltalk (disregarding lack of receiver to the left), but I think we might have to break that a bit here.
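To illustrate why property 2 matters, here is a small netstring sketch in Python. The helper names `to_netstring` and `from_netstring` are hypothetical, and the format is the usual `<length>:<data>,` layout. A hand edit of the payload leaves the length prefix stale and the reader rejects the result:

```python
def to_netstring(text):
    """Encode text as <byte length>:<data>, (a netstring)."""
    data = text.encode("utf-8")
    return str(len(data)).encode("ascii") + b":" + data + b","

def from_netstring(raw):
    """Decode a single netstring, checking the declared length."""
    length, _, rest = raw.partition(b":")
    size = int(length)
    if len(rest) != size + 1 or rest[size:] != b",":
        raise ValueError("length prefix does not match payload")
    return rest[:size].decode("utf-8")

stored = to_netstring("hello world!")

# Simulate editing the payload in a text editor without fixing the prefix.
edited = stored.replace(b"hello", b"hello big")
try:
    from_netstring(edited)
    edit_survived = True
except ValueError:
    edit_survived = False
```

No escaping is needed inside the payload, but every edit requires recounting the bytes, which is exactly what rules this approach out here.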

After pondering this for a while I have come up with this solution which feels kinda nice, but if someone has an even better idea I am all ears. This is how it could look embedding a method source in Tirade:

class: #MyClass selector: #at:put: source: [
      at: pos put: arg
      "Put something here"

      ^array at: pos put: arg
].

So what gives here?  We are reusing the syntax for Smalltalk blocks without arguments. Simply [...content...]. The content will be delivered as a String and the guarantee is that it will be received exactly as sent. There is a trick here – this is what Tirade will do:

  1. Write the starter $[ and then a CR
  2. Before each line in the string (a line being all characters up to and including the next CR or up to end) we insert a TAB. This means that the String begins on the line after the opening $[ and all lines will be prefixed with a TAB.
  3. Then, regardless if the last line ended with a CR or not - we add a CR before the closing $]. This makes sure the closing $] ends up on its own line.

The above trick gives us the ability to detect the end of the string because if a line starts with something else than a TAB then we have reached the end. Thus we do not have to escape the $] inside the string and we still don’t need to do length encoding. We DO however need to make sure all lines begin with a TAB, but if you are editing a Tirade file you should just learn that fact. :)
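The three steps and the matching reader can be sketched in a few lines of Python. This is not the Tirade code, just an illustration under assumptions: the names `encode_verbatim` and `decode_verbatim` are made up, CR is taken to be the byte 13 character, and the reader assumes that an empty line right before the closing bracket means the original text ended with a CR:

```python
import re

CR, TAB = "\r", "\t"

def encode_verbatim(text):
    """Write '[' + CR, prefix every line with TAB, then CR + ']'."""
    # A line is everything up to and including the next CR, or up to the end.
    lines = re.findall(r"[^\r]*\r|[^\r]+", text)
    return "[" + CR + "".join(TAB + line for line in lines) + CR + "]"

def decode_verbatim(encoded):
    """Read TAB-prefixed lines until a line starts with something else."""
    if not encoded.startswith("[" + CR):
        raise ValueError("verbatim string must start with '[' and a CR")
    pos = 2
    parts = []
    while encoded.startswith(TAB, pos):
        end = encoded.index(CR, pos) + 1
        parts.append(encoded[pos + 1:end])  # strip the TAB, keep the CR
        pos = end
    text = "".join(parts)
    if encoded.startswith("]", pos):
        return text[:-1]  # the final CR was added by the writer
    if encoded.startswith(CR + "]", pos):
        return text  # the original text itself ended with a CR
    raise ValueError("missing closing ']'")

method = "at: pos put: arg\r\t\"Put ] here\"\r\r\t^array at: pos put: arg"
round_trip = decode_verbatim(encode_verbatim(method))
```

Note that the sample contains a `]`, an empty line and lines that already start with a TAB, and none of them need escaping.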

I am not sure if the above is a good solution, but it is ONE solution and I can’t come up with a better one, unless we would use a really “odd” marker at the end in order to not have to escape it, but that feels “dirty” to me.


8
Apr 11

Preaching Smalltalk inside a nuclear reactor

…is what I did yesterday. It was the Stockholm GTUG group having a loose and laid back meetup in a rather special venue – R1, Sweden’s first nuclear reactor! 27 meters below ground, kinda… funky.

Anyway, I tried doing an ultra compact version of several of my other presentations around Smalltalk and Seaside – didn’t really go 100% since I both had some technical issues (keyboard problems and projector issues too) and ended up taking more time than was planned. Hopefully no one got upset about that.

In about 60 minutes I taught the whole language in 5 slides (the language is very small from a semantic and grammatical view), a bit of the amazing history behind Smalltalk, some of the traditional tools in a classic Smalltalk environment – and finally, a quick whipup of a Todolist app in Seaside, including support for the back button! I used Squeak and Pharo and I hope I get to do a similar presentation some other time – then I will polish it and try to keep the “blitz” tempo. Shock and awe. :)

I think the audience appreciated it (always hard to tell), would have been nice to show more of course – I could easily spend a whole day teaching Smalltalk and various mind blowing aspects around the environment, the language, cool libraries and techniques – and of course Seaside.

Then I spent some time chatting with my friend Mikael Kindborg (also a Smalltalker at heart) and a couple of his colleagues from Mosync who were actually sponsoring the event with beer, wine and sandwiches. It was a nice evening and it’s always fun to show Smalltalk to people who have never seen it.


14
Mar 11

Node.js vs Nginx/Squeak, part 1

Hmmm, after seeing the Node.js presentation at Dyncon 2011 I couldn’t help installing Nginx and Blackfoot (SimpleCGI) in a Squeak 4.2 image running on the Cog VM to make some performance tests! In fact I started doing that during the presentation. :)

My first run on Nginx/Squeak looked quite unimpressive. Well, one client doing 1300 req/s to a small helloworld was decent, although Node.js handled approximately 2x that. With Nginx we have a two tier solution so a factor of 2 is not really surprising in this trivial case. Top showed similar load: both solutions only seem to consume 8-9% of my CPU power on this box, but the Nginx/Squeak solution of course spreads the load between the two processes, with approximately 1/3 or 1/4 of it on Nginx.

But jacking up concurrent clients really destroys Nginx/Squeak! How come? I was surprised because my memory of this when I wrote Blackfoot was that it was handling that fairly ok. Trying 50 concurrent clients with Node.js pushes it up to almost 8000 req/s! Quite impressive and it still only uses about 9% of my CPU power. Blackfoot ends up serving less than 1/10th of that. Now, thinking and looking more closely it is quite obvious – SCGI opens a new connection for each request… ouch. Why on earth did they design SCGI like that? So basically Nginx will hammer Squeak just like we hammer Nginx I guess, and Squeak doesn’t deal with that too nicely.

A small experiment with firing up 3-5 Squeak backends and letting Nginx load balance over them (really simple to do) shows that we can get around this somewhat and scare Blackfoot into serving over 3000 req/s while still not going over 30% CPU. Not that shabby, although still not in the same league as Node.js. But now we know why – we need a solution that holds the connection open between Nginx and the backend.
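
The difference between a connection per request and a connection that is held open can be made visible with a toy experiment. The sketch below is plain Python and speaks neither SCGI nor AJP: the one-line "protocol", the function names and the request count are all assumptions for illustration. It measures no throughput. It only counts how many connections a backend has to accept for the same number of requests.

```python
import socket
import threading


def handle(conn):
    """Answer every received line with b"ok\n" until the client disconnects."""
    with conn, conn.makefile("rwb") as stream:
        for _ in stream:
            stream.write(b"ok\n")
            stream.flush()


def serve(listener, accepted):
    """Accept connections forever, recording one entry per accepted connection."""
    while True:
        conn, _ = listener.accept()
        accepted.append(1)
        threading.Thread(target=handle, args=(conn,), daemon=True).start()


def request(sock):
    """Send one line and wait for the one-line reply."""
    sock.sendall(b"hello\n")
    reply = b""
    while not reply.endswith(b"\n"):
        chunk = sock.recv(64)
        if not chunk:
            raise ConnectionError("backend closed the connection")
        reply += chunk
    return reply


def connections_needed(n_requests, keep_open):
    """Return how many connections the toy backend accepted for n_requests."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: any free port
    accepted = []
    threading.Thread(target=serve, args=(listener, accepted), daemon=True).start()
    address = listener.getsockname()
    if keep_open:
        # One connection, reused for every request.
        with socket.create_connection(address) as sock:
            for _ in range(n_requests):
                request(sock)
    else:
        # A fresh connection for every single request.
        for _ in range(n_requests):
            with socket.create_connection(address) as sock:
                request(sock)
    return len(accepted)


per_request = connections_needed(20, keep_open=False)
persistent = connections_needed(20, keep_open=True)
```

With `keep_open=False` the backend accepts one connection per request, with `keep_open=True` it accepts a single connection for the whole run. Every accept means extra work in the backend's socket layer, which is the part that seems to hurt Squeak here.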

At this point I wanted to try three things:

  1. What numbers can Nginx on its own produce, just returning a small HelloWorld file?
  2. What numbers can plain KomHttpServer running on Cog produce?
  3. And finally, how does Nginx/AJP/Squeak behave? AJP does keep the connection open I think.

Let’s guess first. Plain Nginx should beat Node.js. Kom with Cog is probably not much faster than on the regular Squeak VM, since I believe the issues are in the Socket plugin (and we saw that it didn’t like getting hammered by Nginx). Finally, I am hoping AJP puts Squeak at, say, half of Node.js even with 50 concurrent clients; that would be 4000 req/s and I would be darn happy. With a load balancer on top we could get even more, but that can be done with Node.js too of course.
So more on that next time…


13
Mar 11

Dyncon 2011, day 2

Day one must have ended with lots of beer because people were quite late for day two. 15 minutes late, Carl Lerche finally started his Ruby presentation. One thing I found interesting was Ruby Modules vs Monticello extension methods (in some ways I presume this is how Modules are often used – to extend other classes with behavior). Evidently “method extensions” to the class side in Ruby don’t work like extensions to the instance side, which they do in Smalltalk of course :) . Then Carl described ways to still do this, but it looked complex, and he also explained that there are lots of “hooks” when messing with the MOP. Is that a good thing? If Rubyists use this a lot, then I presume utterly hopeless confusion might occur.

Obviously there is a difference here I think between the Smalltalk and Ruby mindset – in Smalltalk we are always in runtime, but that doesn’t mean we go crazy on the dynamic axis – that would pull the rug out from under the development environment and its capabilities in navigating and describing the code base, in much the same way as a macro system does in C/C++.

The next part was about various techniques using Ruby blocks, like for example messages taking optional blocks… hmmm, trying to figure out what I think of that. “File.open” was shown as an example. Reading on the net shows that there is a lot of… complexity regarding blocks in Ruby, which doesn’t look nice. I hate needless complexity. Evidently Ruby 1.9 is cleaning up blocks – curious if anyone could elaborate on that compared to Smalltalk.

Next up was Tom Hughes-Croucher from Joyent presenting Node.js. In essence this is about hard scaling of network applications. He began by talking about scaling issues with regular forking architectures, and about V8, the AreWeFastYet website etc. Javascript is indeed getting an awful lot of performance attention these days and that of course makes it the “assembler of the Internet” and a compelling base platform for more and more things.

Although I do understand this, I still don’t really see how Node.js would be dramatically better than, say, Nginx + a backend that doesn’t allocate an insane amount of memory per user session. For example, using Nginx (or Cherokee) with a Squeak backend running AJP would be fun to compare performance wise.

Sergi Mansilla then presented Cloud9, a web based IDE for Javascript. Mmmmm, well, I don’t get excited about that because I am a Smalltalker and the things I can do with Squeak/Pharo … sorry, I don’t want it in the browser! What about interactive graphics visualization? Really good browsers? Sure, it can be done, and compared to vim/emacs it might be cool. But I want to be able to hack my IDE for example. It is clearly a really ambitious project though and worth keeping track of since they are pushing the boundaries of what you can do on the web.

Robert Virding then did another presentation around Erlang, similar to the one Joe did yesterday but still different, focusing more on principles but also with some code examples. Somewhat interesting, but I didn’t follow it too closely.

After lunch Björn Eiderbäck started his presentation on Smalltalk directly in the latest version of VisualWorks from Cincom. The style of Björn’s presentation is to use the Smalltalk environment interactively, trying to quickly “dig into” code using its tools. This style easily gets side tracked, but in order to make non Smalltalkers understand the “beauty” of Smalltalk it might be the way to do it in a short period of time. Shock and Awe. :)

The day continues but I am posting this now anyway.