Comparison of OrderedCollections in Smalltalk, Java, Objective-C, Ruby, Python, and C#

In my last post, I compared the collection classes at a high level. Now I’m diving into the one that is used most often, the OrderedCollection.

What’s an OrderedCollection? It is the collection class in Smalltalk that keeps its items in order. Seems straightforward, but why use the Smalltalk name for it? Why not the more popular List or Array? Because Smalltalk and its collection classes have been around for much longer (over 25 years) than the other languages I’m comparing. All of the other languages borrow from Smalltalk in some fashion.

I haven’t done much Python or C#, but I figured I should check them out while I am at it. I have done quite a bit with the others, and I wanted to take a step back and look at how they compare.

Here’s a table that compares the API for each language:
(Note that I’ll use “list” since it is shorter than “OrderedCollection” so it can fit nicely in this table.)

| Smalltalk | Java | Objective-C | Ruby | Python | C# |
| --- | --- | --- | --- | --- | --- |
| OrderedCollection | List – ArrayList | NSArray / NSMutableArray | Array | list | IList – List |
| list := OrderedCollection new | List<Type> list = new ArrayList<Type>() | NSMutableArray *list = [NSMutableArray array]; | list = or list = [] | list = [] | IList<Type> list = new List<Type>(); |
| list size | list.size(); | [list count]; | list.length | len(list) | list.Count |
| list isEmpty | list.isEmpty(); | not included | list.empty? | ? | ? |
| list add: item | list.add(item); | [list addObject: item]; | list.push(item) | list.append(item) | list.Add(item); |
| list addAll: anotherList | list.addAll(anotherList); | [list addObjectsFromArray: anotherList]; | list = list + another_list | list.extend(anotherList) | list.AddRange(anotherList); |
| list remove: item | list.remove(item); | [list removeObject: item]; | list.delete(item) | list.remove(item) | list.Remove(item); |
| list indexOf: item | list.indexOf(item); | [list indexOfObject: item]; | list.index(item) | list.index(item) | list.IndexOf(item); |
| list includes: item | list.contains(item); | [list containsObject: item]; | list.include?(item) | item in list | list.Contains(item); |
| list at: index | list.get(index); | [list objectAtIndex: index]; | or list[index] | list[index] | list[index]; |
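
To make the Python column concrete, here is a minimal sketch that runs each operation from the table in turn. The variable name items is my own choice (it avoids reusing the name of Python's built-in list type), and the "not items" emptiness check is a common Python idiom rather than something taken from the table.

```python
# Python column of the table, with the Smalltalk equivalent in each comment.
items = []                      # list := OrderedCollection new
items.append("a")               # list add: item
items.extend(["b", "c"])        # list addAll: anotherList
size = len(items)               # list size
is_empty = not items            # list isEmpty (an empty list is falsy)
position = items.index("b")     # list indexOf: item (0-based here, 1-based in Smalltalk)
has_c = "c" in items            # list includes: item
first = items[0]                # list at: index (again 0-based)
items.remove("a")               # list remove: item (removes the first match only)

print(items, size, is_empty, position, has_c, first)
```

One difference worth knowing: items.index(x) raises ValueError when x is missing, whereas Smalltalk's indexOf: answers 0.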

Now I’m going to argue against using Array as a name, because people tend to confuse it with C-style arrays. Most of these languages can interface with C (usually to optimize certain sections of code or to interoperate with legacy libraries) and so it is nice to not have two concepts referred to via the same name. So ideally I’d like to see Ruby change Array to List and Obj-C/Cocoa use NSList and NSMutableList. I doubt I will make this happen, however. :)

Next, more heresy: Java and C# have backwards syntax for constructors. Why don’t they just make construction look like any other method call, with a “new” method at the class level, as Smalltalk, Objective-C, and Ruby do? (Not sure if Python’s list is actually a class.) Hmm… maybe someone can contribute this for Java 6?

Nobody seems to agree on whether to call it size, count, or length. Note that Python doesn’t seem to have a method for this, but rather a built-in len() function. Python also doesn’t have an includes: method, but an “in” language operator.
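
As far as I can tell, len() and “in” are not as un-object-oriented as they look: Python forwards them to the specially named methods __len__ and __contains__ on the object. Here is a small sketch; the Playlist class and its contents are made up purely for illustration.

```python
class Playlist:
    """Hypothetical collection class, used only to show the protocol."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # Invoked by len(playlist)
        return len(self._songs)

    def __contains__(self, song):
        # Invoked by `song in playlist`
        return song in self._songs


playlist = Playlist(["So What", "Blue in Green"])
print(len(playlist))           # goes through Playlist.__len__
print("So What" in playlist)   # goes through Playlist.__contains__

# The built-in list follows the same protocol.
numbers = [1, 2, 3]
print(len(numbers) == numbers.__len__())
```

So the function and the operator are really a thin layer over ordinary method calls, which is closer to the Smalltalk model than the table suggests.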

Java seems to mirror Smalltalk nicely, with a few wording changes and a C++-like syntax. I used the Java 5 generics syntax for the Java constructor to show how it buys you compile-time type safety, at the cost of some clutter to spell out the type. Note that the clutter is paid back later, because you no longer need casts when you read items out.

Objective-C also mirrors Smalltalk, but seems to add “Object” everywhere, which makes the method names verbose. I wouldn’t mind if this got replaced by a more bare protocol later on.

Actually, Obj-C has a language feature called “categories”, which lets you extend an existing class without creating a subclass. Ruby can do the same by reopening a class or mixing in a module, and possibly Python can too. Java does not let you do this. Not sure about C#. I think I will try creating an Obj-C category and a Ruby mix-in that let me use the Smalltalk-canonical method names, to see whether that is easier, since I would only have to remember one set of method names.
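
For Python, the built-in list type cannot be extended in place (assigning a new attribute to list raises a TypeError), so the closest equivalent I can sketch is a subclass. The class below is hypothetical, and its method names are my own snake_case renderings of the Smalltalk selectors.

```python
class OrderedCollection(list):
    """Hypothetical list subclass exposing Smalltalk-style method names."""

    def size(self):
        return len(self)

    def is_empty(self):
        return len(self) == 0

    def add(self, item):
        self.append(item)
        return item            # Smalltalk's add: answers the added item

    def add_all(self, other):
        self.extend(other)
        return other           # addAll: answers its argument

    def includes(self, item):
        return item in self

    def at(self, index):
        # Smalltalk indexes from 1, Python from 0.
        if index < 1:
            raise IndexError("index must be 1 or greater")
        return self[index - 1]


oc = OrderedCollection()
oc.add("a")
oc.add_all(["b", "c"])
print(oc.size(), oc.is_empty(), oc.includes("b"), oc.at(1))
```

Because it is still a list, everything in the Python column of the table keeps working on it; the Smalltalk names are just an extra layer.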

Note also that Objective-C has to wrap method calls in square brackets.

Also, Objective-C (actually Cocoa’s Foundation Kit) has both an immutable and a mutable version. The immutable version can be shared between threads safely and can be faster. I addressed this in the earlier post about collections. Note that Java’s ArrayList is not thread-safe by default; you have to wrap it with Collections.synchronizedList() or synchronize access yourself.

Ruby has an interesting convention of ending predicate method names with ? instead of prefixing them with “is” (empty? rather than isEmpty). Camel-casing is discouraged in Ruby; the preferred convention is a_long_method_name rather than aLongMethodName. Ruby also seems to draw on Python as well as Smalltalk, but is more object-oriented. While it has the [] operator like Python, it uses methods where Python uses the len() function and the in keyword.

C# looks just like Java. Note that C# 2.0 has generics support, like Java 5. The only odd thing is its casing convention: method names start with an uppercase letter (Add rather than add).

Next: iterating over OrderedCollections.

2 Replies to “Comparison of OrderedCollections in Smalltalk, Java, Objective-C, Ruby, Python, and C#”

  1. (disclaimer: my experience is in Java, ObjC, Smalltalk, and Python, and not C# or Ruby.)

    Typically, in Python you just use a list as a boolean if you want to test if it’s empty, so I guess you’d put “not list” in your table.

    Also, the list class/type in python is actually called ‘list’, so when you say ‘list = []’, you’re overwriting the class. “list = list()” would create a list too, although that syntax isn’t typically very useful, unless you want to convert a random iterable into a list (in which case something goes in the parentheses). Python lets you do that (so does Smalltalk for that matter, you can assign to OrderedCollection or Array), but it might come as a surprise if you’re used to all the types being capitalized.

    Over time I’ve come to prefer the ObjC method naming to the Smalltalk style, whereas initially I hated it; part of the problem with a lot of Smalltalk code I find is that the method names are so short (e.g., so many convenience constructors are just “SomeFoo with: aBar”) that it’s often hard to see what’s going on.

    Lists are classes in recent Python versions; there’s been a class/type unification so while they used to be different entities, they aren’t any more. (This was one of the most annoying things that I found about Python when I first learned it; thankfully it’s gone.) In particular, it means you can subclass built-in types. You can now do, for example:

    In [3]: [1, 2, 3].__class__
    Out[3]: <type 'list'>

    In [4]: type([1, 2, 3])
    Out[4]: <type 'list'>

    whereas the first would previously fail. Operations like len(x), x in y, etc., are mapped to specially named methods, e.g. __len__, __contains__. I do see some value in Smalltalk’s minimalistic syntax, but this kind of direct access is easy to get used to, and not too hard to implement in your own classes.

    Smalltalk and Python both have fixed-size counterparts to their list classes: Smalltalk has Array (what you get from literal array syntax like #(1 2 3)), which cannot grow or shrink, and Python has immutable tuples like (1, 2, 3). I love Smalltalk’s collection classes (and to a lesser extent Java’s), where Array and OrderedCollection are both SequenceableCollections, but often “duck typing” a la Python is good enough.

    Just subscribed to your weblog, so I won’t be posting on old news for long :)
