Roads Less Taken

A blend of programming, boats and life.

Nim Seq

| Comments

One of the most important aspects in a language is how powerful and easy it is to use its collection/container types. At least that is my feeling coming from Smalltalk where the Collection classes and their rather rich protocols are used extensively and also cover String and Array and much more. If you peek into the current leading open source Smalltalk - Pharo - you can see Collection allSubclasses size evaluate to 78. Of course, lots of those are special subclasses and not for general use, but fact remains that a strong language needs a strong library of good collections.

In Smalltalk I can for example run this:

1
#(1 2 3 4 5) select: [:x | x isOdd ] thenCollect: [:x | x * 3 ]

This is actually a single method call taking two closures, one to perform a “filtering”, and one to perform a “map”. This is not a standard message in “good old” Smaltalk-80, but it has eventually been added since the usage pattern is so common. To me personally a modern language needs similar power or I would go nuts. :)

Nim has some fundamental Collection types too - and tries deliberately to keep this number small - which I also sympathize with. Nim is meant to be lean and mean, so being minimal in the core is the Right Thing To Do. The ones in the system library are:

  • array - Can use any ordinal type as index type, but are fixed size specified at compile time. So fairly restricted use cases I think.
  • set - high performance bit vector, base type can only be ordinals. Again a fairly specialized container, given restriction to ordinals.
  • seq - dynamically resizable array. Index type is integer.

I am not listing range, string and cstring types, although especially string/cstring can be viewed as collections, and in Smalltalk they in fact are Collections.

So… in Nim we have seq as the single most important general purpose dynamically sized collection. Now… this is in the Nim language, if we extend our view to also include the standard library, we find lots more.

In order to first learn all you can do with a seq…. ah. Not so easy it turns out! Nim is not structured with “everything you can do with type X here”. This fact actually screams for better tooling, but that is another story. One way is basically to search for “seq” in the documentation of the system module. I did that and also searched through the source :)

I ended up writing code exploring what I found and more, so put on your Nim glasses, read the code and comments and join the ride. If you want to compile it yourself here is the code, just do “nim c seqtest.nim” and run it.

Code exploring seq
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
import fruitbase, future, sequtils, algorithm

# Collections are very important in Smalltalk, and they offer a huge protocol
# for working with them. The workhorse in Smalltalk is OrderedCollection - which
# together with Dictionary probably covers 90% of all collection use.
# In Nim the equivalent of an OrderedCollection is the type `seq`.
#
# This is a little rundown of what you can do with seq using stdlib.
# For fun we also add some Smalltalk behaviors, to learn how.

# Create an empty collection of Fruit. We can also use:
# var s: seq[Fruit] = @[]
var s = newSeq[Fruit]()

# Some Fruit objects to play with, borrowed from earlier articles on OOP
var b = newBanana(size = 0, origin = "Liberia", price = 20.00222.Dollar)
var p = newPumpkin(weight = 5.2.Kg, origin = "Africa", price = 10.00111.Dollar)

# First we have add(), contains() and len()
s.add(b)
echo("Seq is " & $s.len & " elements long and contains banana: " & $s.contains(b))
echo("Seq is " & $s.len & " elements long and contains pumpkin: " & $s.contains(p))

# We can add with `&` too, but it will copy s into a new seq with at least 1 more slot.
# This needs discard or assignment since it returns a new seq.
# So this is actually concatenation via "copy into a new seq" but for a single element.
s = s & p
echo("Seq is " & $s.len & " elements long and contains pumpkin: " & $s.contains(p))

# We can access using [index] and also let the whole seq represent itself as a string,
# The implementation of `$` of course calls `$` for the contained elements.
echo("First fruit: " & $s[0])
echo("The whole seq: " & $s)

# Simple iteration, uses the items() iterator under the hood. Iterators are normally inlined
# by the compiler and thus cost basically nothing. Calling "items(s)" is not needed,
# the for loop will do that automatically and also infer the type of i for us.
# Smalltalk: s do: [:i | ... ]
for i in s:
  echo(i.origin)

# Insert an element at a given index
s.insert(p, 1)

# Print countries and index, uses pairs() instead of items() to iterate over both index and value.
# Equicalent to Smalltalk: s keysAndValuesDo: [:i :v | ... ]
for i,v in pairs(s):
  echo("index: " & $i & " country: " & v.origin)

# Convert iterator to a seq of key/val tuples, just stumbled over it.
# Like calling Dictionary>>associations in Smalltalk, but here its
# generalized for any kind of iterator.
echo(toSeq(pairs(s)))

# It should produce a copy, so should hold true
assert(toSeq(items(s)) == s)

# Print in string form, neat for debugging.
echo("repr:" & repr(s))

# Delete at index. Here is a little issue with system.nim and sequtils.nim overlapping slightly.
# Should be fixed. The following will not compile since its ambiguous:
#   s.delete(0)
# Instead we need to explicitly call it in the system module.
system.delete(s, 0)

# Get and remove last element.
echo s.pop
echo("The whole seq: " & $s)
echo("Seq is " & $s.len & " elements long and contains banana: " & $s.contains(b))

# Concatenate two seqs copying into a new longer seq. Like in "," in Smalltalk.
s = s & s
echo("After concat seq is " & $s.len)

# Throw in a banana again for tests below
s.add(b)

# Sort using proc syntax
s.sort(proc(x, y: Fruit) :int =
  cmp(x.origin, y.origin))

# Sort using do syntax
s.sort do (x, y: Fruit) -> int:
  cmp(x.origin, y.origin)

# Sort using new lambda syntax
s.sort((x, y: Fruit) =>
  cmp(x.origin, y.origin))


# NOTE: In the following three calls to map we needed to wrap with $(...) to avoid
# an internal error in Compiler for Nim 0.10.1.
#
# Smalltalk's #collect: is Nim's `map`, here using proc syntax.
# In Smalltalk: s collect: [:x | x origin ]
echo($(s.map(proc(x:Fruit): string = x.origin)))
# ...and future lambda syntax, basically as short and clean as Smalltalk!
echo($(s.map((x:Fruit) => x.origin)))
# ...or using mapIt template variant, but needs a type arg for the new seq
echo($(s.mapIt(string, it.origin)))

# Smalltalk's #select: is Nim's `filter`, here using proc syntax.
# In Smalltalk: s select: [:x | x origin = 'Liberia' ]
echo(s.filter(proc(x:Fruit): bool = x.origin == "Liberia"))
# ...and future lambda syntax, almost as short and clean as Smalltalk!
echo(s.filter((x:Fruit) => (x.origin == "Liberia")))
# ...and filterIt template variant, muahaa, shorter than Smalltalk!
echo(s.filterIt(it.origin == "Liberia"))

# But Smalltalk has a HUGE amount of more behaviors like:
#   detect:ifNone:, allSatisfy:, anySatisfy:, reject:, inject:into:
#   ...well, there are TONS we would like :)
#
# Here is allSatisfy: and anySatisfy:, easily implemented.

proc allSatisfy*[T](seq1: seq[T], pred: proc(item: T): bool {.closure.}): bool =
  ## Returns true if all the items fulfill the predicate.
  ##
  ## Example:
  ##
  ## .. code-block::
  ##   let numbers = @[10, 20, 30]
  ##   assert numbers.allSatisfy(proc(x: int): bool = x > 5)
  result = true
  for i in seq1:
    if not pred(i): return false

proc anySatisfy*[T](seq1: seq[T], pred: proc(item: T): bool {.closure.}): bool =
  ## Returns true if any of the items fulfill the predicate.
  ##
  ## Example:
  ##
  ## .. code-block::
  ##   let numbers = @[10, 20, 30]
  ##   assert numbers.anySatisfy(proc(x: int): bool = x > 25)
  result = false
  for i in seq1:
    if pred(i): return true

let numbers = @[10, 20, 30]
assert(numbers.allSatisfy(proc(x: int): bool = x > 5))
assert(numbers.anySatisfy(proc(x: int): bool = x > 25))

echo($s & " all come from Africa: " & $s.allSatisfy((x:Fruit) => (x.origin == "Africa")))

# We can also check if s isNil
echo(s.isNil)
s = nil
echo(s.isNil)

# Let's see what happens if we try to port a few random Smalltalk snippets I wrote
# down just perusing the Smalltalk protocol for OrderedCollection:

# #(1 2 3 4 5) select: [:x | x odd ] thenCollect: [:x | x * 3 ]
# Above we do a filter on an array of Numbers getting those that are odd
# and then we map them by multiplying with 3.

# We need odd, an immediate template should inline it I think.
# SomeInteger is a type class matching all integer types in Nim.
template odd(x: SomeInteger): bool {.immediate.} =
  (x and 1) == 1

# And then we can write it like this using the lambda macro.
# For filter we don't need to declare type for x, but for map we do.
# I suspect this is because we start out with a literal from which
# type is inferenced one level.
var q = @[1,2,3,4,5].filter((x) => x.odd).map((x:int) => x * 3)
echo(q)

# Or procs style, slightly longer and also needing return type of procs.
# Here we get away without declaring x:int, turning the code into generic procs.
# Andreas would call it "bad style" I think.
q = @[1,2,3,4,5].filter(proc(x): bool = x.odd).map(proc(x): int = x * 3)
echo(q)

# #(1 2 3 4) sum
# Sum is already defined in math module
from math import sum
echo(@[1,2,3,4,5].sum)


# #(one two three two four) copyWithoutAll: #(one two)
# The above creates a new collection without any ones or twos.
# One and two are symbols (canonical strings)

# We need copyWithoutAll:, let's call it filterWithout.
# In Nim we don't need "All" to imply that the argument
# should be a collection, we can overload for other types.

# accumulateResult is a template taking an iterator.
# sequtils both has a filter iterator and a filter proc, here
# we use the iterator. This means both the accumulation and the iteration
# will be inlined. I think :)
#
# `notin` is a template which is sugar for `not seq2.contains(x)`
# It is also marked as a keyword so `x.notin(seq2)` does not compile, it needs
# to be `x notin seq2`
proc filterWithout*[T](seq1, seq2: seq[T]): seq[T] =
  accumulateResult(filter(seq1, proc(x:T):bool = x notin seq2))

var qq = @["one", "two", "three", "two", "four"].filterWithout(@["one", "two"])
echo($qq)

# #(1 2 3 4) includesAny: #(1 2)
# #(1 2 3 4) includesAll: #(1 2)
# Both the above can be added easily similar to allSatisfy, anySatisfy:

proc containsAny*[T](seq1, seq2: seq[T]): bool =
  ## Returns true if any of the items in seq2 is in seq1.
  ##
  ## Example:
  ##
  ## .. code-block::
  ##   let numbers = @[10, 20, 30]
  ##   assert numbers.containsAny(@[5, 10])
  result = false
  for i in seq2:
    if i in seq1: return true

proc containsAll*[T](seq1, seq2: seq[T]): bool =
  ## Returns true if all of the items in seq2 are in seq1.
  ##
  ## Example:
  ##
  ## .. code-block::
  ##   let numbers = @[10, 20, 30]
  ##   assert numbers.containsAll(@[10, 20])
  result = true
  for i in seq2:
    if i notin seq1: return false

assert(@[1,2,3,4].containsAll(@[1,2]))
assert(@[1,2,3,4].containsAny(@[1,2,20]))

Conclusion

The seq type has a reasonable minimal set of behavior, but we should add more into sequtils IMHO. Make it really strong, really smart and convenient. This is extremely important for developer productivity. I can gladly sit down and look through the Smalltalk Collection hierarchy to find more behavior we could implement.

Happy hacking!

Comments