[Dojo-interest] Dojo pub/sub topic and event names with commas in them

Paul D. Fernhout pdfernhout at kurtz-fernhout.com
Mon May 4 08:44:16 EDT 2015


In a Dojo application I'm writing, I experienced silent failures of topic publication/subscription for some topics. I finally tracked it down to the fact that the topic names had commas in them, and Dojo splits up topic subscriptions into multiple different subscriptions if they have commas in them in "on.parse". 

Here is a JSFiddle that shows the basic comma-in-topic-name issue:

  "Dojo topic with comma test"
  http://jsfiddle.net/pdfernhout/wuwv9hz5/

While I have a workaround (to encode the commas in the topic names using encodeURIComponent), I just wanted to bring this issue up so other users might be aware of it. Also, I wanted to suggest a couple of improvements to the Dojo documentation and/or code that might help people understand and deal with this issue more easily, to avoid having to debug what seems like a silent failure of the Dojo topic system. The most important of those suggestions, beyond improving the documentation, is to consider having topic.publish throw an exception if a topic has a comma in it.

More details are below.

--Paul Fernhout

==== More details

I've found the Dojo pub/sub topic approach to be an elegant way to make applications more modular, and I've been trying to use it in more ways. I've also been using JSON.stringify more and more to support extendable applications where internal symbols can be arbitrary JSON objects as a collection of data stored together in a consistent way rather than just ad-hoc strings with ad-hoc delimiters. Unfortunately, those two good things have collided in a Dojo app I'm working on right now to create this issue I am bringing up.

I am dynamically creating Dojo topics based on user-supplied content and other application information using JSON.stringify. These topics are used to support updating a GUI which tracks data defining a semantic network of RDF-like triples that changes over time. So, those topic names could have a comma either based on what the user entered in a string or in the JSON-ified objects themselves. Here is a snippet from the code just to illustrate what I am talking about:

  makeTopicKey({type: "TripleStore.addForAB", a: triple.a, b: triple.b})

Perhaps arbitrary dynamically-created topics and events were not a use case consideration in the design of Dojo's pub/sub approach. Nonetheless, it would be nice if they were supported somehow, or alternatively, if they failed faster. That would help expand the usefulness of Dojo's topic system further.

The ultimate workaround I used once I figured this out was to encode the topic names with encodeURIComponent, which encodes commas (and other things):

  encodeURIComponent(JSON.stringify(object))

A drawback of this approach is that the encoded topic strings are harder to read by a person, although only a developer would be looking at them. Still, encodeURIComponent is overkill, as it is just the commas that are an issue. However, otherwise I'd have to escape the escaping sign itself for the strings to remain consistent -- otherwise if I just replaced commas with another character those replaced strings could collide with other strings. I may do a more limited escaping eventually, but using a standard JavaScript call was easier at first.
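To make the two options above concrete, here is a minimal sketch. The body of makeTopicKey is an assumption (the real one is not shown in this message), and escapeCommas / unescapeCommas are hypothetical helper names for the "more limited escaping" idea: escape the escaping sign "%" first, then the comma, so escaped names cannot collide with names that already contain "%2C".

```javascript
// Hypothetical body for makeTopicKey; the original implementation is not shown.
function makeTopicKey(object) {
  // encodeURIComponent turns every "," into "%2C" (and encodes much else besides).
  return encodeURIComponent(JSON.stringify(object));
}

// Narrower alternative (hypothetical helper): escape "%" first, then ",".
function escapeCommas(name) {
  return name.replace(/%/g, "%25").replace(/,/g, "%2C");
}

// Inverse of escapeCommas. Done in a single pass so that "%252C" decodes
// to "%2C" rather than to ",".
function unescapeCommas(name) {
  return name.replace(/%(25|2C)/g, function (match, code) {
    return code === "25" ? "%" : ",";
  });
}
```

With the narrower version, a topic string stays mostly readable: only "%" and "," are changed, and unescapeCommas(escapeCommas(s)) gives back s.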

If someone could suggest a better workaround, I'd appreciate it. While I haven't tried it, perhaps I might be able to pass in a custom function of some sort as a topic subscription -- although that perhaps seems more kludgy to me than encodeURIComponent?

I tried passing in an array of subscription strings to "topic.subscribe", but it still had the issue, presumably because on.parse parses an array of strings recursively. By the way, that code seems to assign to the variable "events" before it is declared, which will work because of JavaScript "hoisting", but still looks confusing in the code and makes me wonder if that part of the Dojo code could use further review. I have included that snippet of code below at the end for reference.

In Dojo 1.10.14, the file dojo/on.js has a function on.parse which splits event names based on commas. It does this recursively, so even if you pass in an array of event names, it will split the strings in those array elements on commas and so on. This functionality is called by dojo/Evented.js which is called from dojo/topic.js.
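The effect of that splitting can be sketched without loading Dojo. The function below is a simplified stand-in, not Dojo code: expandTopicNames is a hypothetical name, and it only computes which names would end up registered, using the same split expression as the on.parse source quoted at the end of this message.

```javascript
// Simplified stand-in for the comma handling in on.parse; NOT Dojo code.
// It returns the list of names a subscription would be registered under.
function expandTopicNames(type) {
  if (Array.isArray(type)) {
    // Each array element is parsed again, so commas inside elements still split.
    return type.reduce(function (names, item) {
      return names.concat(expandTopicNames(item));
    }, []);
  }
  if (type.indexOf(",") > -1) {
    // Same regular expression as the quoted source: split on commas, trimming spaces.
    return type.split(/\s*,\s*/).reduce(function (names, item) {
      return names.concat(expandTopicNames(item));
    }, []);
  }
  return [type];
}

var topicName = JSON.stringify({ type: "TripleStore.addForAB", a: "x", b: "y" });
var asString = expandTopicNames(topicName);
var asArray = expandTopicNames([topicName]);

console.log(asString.length);                  // 3
console.log(asString.indexOf(topicName) > -1); // false
console.log(asArray.length);                   // 3
```

The subscription ends up under three fragments of the JSON string, none of which equals the full name. A publish under the full name would then find no listener registered under that exact name, which would be consistent with the silent failure I saw. Wrapping the name in an array makes no difference.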

The rationale for the comma parsing given here is:
http://dojotoolkit.org/reference-guide/1.10/dojo/on.html
"You can listen to multiple event types with a single call by comma-delimiting the event names. Then we can listen for multiple events (with delegation) with one call. For example, we can listen for touchend and dblclick: ..."

There is no mention of commas here regarding topics:
http://dojotoolkit.org/reference-guide/1.10/dojo/topic.html

I can understand the syntactic sugar of parsing topics on commas -- even if for my specific use case it would be better if that was not done. However, the motivation for the recursive parsing of topic names passed in an array to "topic.subscribe" or "on" is harder for me to understand. Otherwise I could just have passed in an array of subscriptions with just one string in it (with commas in the string) for my use case.

As a design concept, to be weighed against other considerations, in general, I've been learning that in JavaScript (especially compared to other languages), when you think about doing any sort of ad hoc parsing of configuration information, it is best to first think about whether a JavaScript array or object could do the job instead (using JSON or YAML or whatever to encode that object as a string if needed). In the case of topics, for example, "dblclick, touchend" as shown in the documentation for "on" may indeed be quick to write as code and even easy to read in some ways. However, there is another aspect: an arbitrarily parsed string is harder for a developer to work with. Compare "dblclick, touchend" with ["dblclick", "touchend"]. The array notation is not that much harder to type, and with that approach, a developer could search in code for the quoted string "dblclick" and find every use of that event as a string (rather than every other mention of dblclick). The array notation makes it clearer these are separate subscriptions. The internal code for handling subscriptions would also be simpler and easier to understand and maintain. Still, I appreciate that there are other considerations as well, including backward compatibility, so some of my preference there is subjective.

Obviously, I doubt the Dojo pub/sub API for parsing strings with commas is changeable at this point because so many people use it as-is for "on" events. However, I wonder if the recursive parsing of strings in an array of subscriptions could be removed as excessive. Do people really expect an array of event subscriptions to be parsed further for each string? Is that a realistic use case, or is it surprising behavior? Perhaps if that recursive parsing was turned off, then to subscribe to topics with commas in them, you could pass in an array with just that one string in it as a workaround. And if you passed in a plain string with commas (to support the current common use case), then an error could be thrown? But perhaps there is some other Dojo internal need to do recursive parsing, for example if sets of accumulated events are passed in from other parts of the Dojo infrastructure? I don't understand the Dojo internals well enough to know if that is the case or not.
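A rough sketch of what that non-recursive handling might look like for topics follows. This is only my reading of the idea, not current Dojo behavior, and literalTopicNames is a hypothetical name: array elements are taken literally, and a bare string containing a comma is rejected instead of being split.

```javascript
// Hypothetical variant of the name handling; NOT current Dojo behavior.
function literalTopicNames(type) {
  if (Array.isArray(type)) {
    // Elements are used exactly as given; no further parsing.
    return type.slice();
  }
  if (type.indexOf(",") > -1) {
    // Fail fast instead of silently splitting the name.
    throw new Error("Topic name contains a comma; wrap it in an array: " + type);
  }
  return [type];
}
```

With this, literalTopicNames(['{"a":1,"b":2}']) would keep the single name intact, while literalTopicNames('{"a":1,"b":2}') would throw right away.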

Alternatively, maybe there could be some way to pass in some other sort of object when doing a subscription, where the strings in that object would not be parsed? Or perhaps there could be some non-parsing variant of the functions for "topic.subscribe" or "on" (but that seems ugly from a design standpoint leading to a set of parallel functions across "topic", "Evented", and "on")?

I know it's hard to maintain backward compatibility while moving forwards. I'd suggest, at the least, the Dojo "topic" help documentation have some indication that if you have a comma in your topic identifier that things will not work the way you expect and instead will probably silently fail. Also, perhaps topic.publish could throw an exception if you try to publish a topic with a comma in it, if it turns out there is no other workaround? Then I would have had some early indication that my topic name choices were problematical, rather than have to debug a silent failure. A silent failure is one of the harder things to debug, as it means tracing through a lot of Dojo internal code -- although silent failures are at least easier to debug than rare failures. :-) Also, changing the handling of arrays of topic strings, to not parse them recursively, might be something to consider, at least for Dojo 2.0.
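Until something like that exists in Dojo itself, an application can get the same early warning with a small wrapper. This is a sketch under assumptions: makeStrictPublish is a hypothetical name, and the publish function is passed in (for example, the publish function from dojo/topic) so the snippet stays self-contained.

```javascript
// Hypothetical wrapper: `publish` is whatever function actually delivers the
// message, passed in by the caller.
function makeStrictPublish(publish) {
  return function strictPublish(topicName) {
    if (typeof topicName === "string" && topicName.indexOf(",") > -1) {
      // Fail at the call site instead of silently reaching no subscribers.
      throw new Error("Topic name must not contain a comma: " + topicName);
    }
    return publish.apply(null, arguments);
  };
}
```

The same check could be applied on the subscribe side, where the splitting actually happens.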

Here is a more radical pub/sub API change to consider for Dojo 2.0. I mention it mostly to push the conceptual envelope of pub/sub; it is probably not worth the computation needed for this change for most use cases. Still, with that disclaimer, how about supporting using any arbitrary JSON-stringifiable object as a Dojo pub/sub topic? Dojo topic.publish and topic.subscribe could just stringify a canonical version of an object passed in as a topic by the user (at least, when passed in as an array or other special object). Granted, that might be overkill for most use cases, with extra computation including to canonicalize a JSON object with sorted keys. It also might hide some of the most common user errors of putting in an object for a topic when they actually meant a string (so, that violates the design principle that defaults should be tuned for the novice). Still, it would be great for my particular use case. :-) The general issue is that if you are creating pub/sub topics derived from several pieces of data, including perhaps a type and a data item which itself may be a string or an object, you have to figure out some way to convert that collection of data into a single topic string to publish or subscribe with. You can put the data together yourself with some arbitrary internal delimiter. But a more general way is to use JSON.stringify (ideally, a canonicalizing version) on some JavaScript object which holds the data. However, it is not that hard to do that stringification yourself for such applications, so it might not be worth the internal cost for most users to support this feature, given most users are just using plain strings like with button.on("click", ...). And unless you stringify all objects used in pub/sub, you would run into confusion between whether a topic was a plain string or whether it was a stringified string. However, stringifying all pub/sub topics would also have the benefit of protecting against topics with names like "__proto__" or other internal JavaScript identifiers. Related to that last point:
  http://www.2ality.com/2012/01/objects-as-maps.html
  http://www.devthought.com/2012/01/18/an-object-is-not-a-hash/
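For what it is worth, a canonicalizing stringify for plain JSON data is short. The sketch below is my own illustration, not anything in Dojo, and canonicalize / canonicalStringify are hypothetical names. It assumes the input holds only plain objects, arrays, strings, numbers, booleans, and null (no Dates or other objects with custom serialization).

```javascript
// Rebuild a JSON-compatible value so that object keys are inserted in sorted order.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return value.map(canonicalize);
  }
  if (value !== null && typeof value === "object") {
    // Null-prototype object, so a key such as "__proto__" is stored as
    // ordinary data rather than changing the prototype.
    var sorted = Object.create(null);
    Object.keys(value).sort().forEach(function (key) {
      sorted[key] = canonicalize(value[key]);
    });
    return sorted;
  }
  return value;
}

// Two objects with the same content give the same string, whatever order
// their keys were originally written in.
function canonicalStringify(value) {
  return JSON.stringify(canonicalize(value));
}
```

The resulting string would still contain commas, so it would still need the escaping workaround before being used as a topic name with the current API.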

=== For reference, from dojo/on.js; to see the recursive parsing, look at the line with handles.push(on.parse...)

	on.parse = function(target, type, listener, addListener, dontFix, matchesTarget){
		if(type.call){
			// event handler function
			// on(node, touch.press, touchListener);
			return type.call(matchesTarget, target, listener);
		}

		if(type instanceof Array){
			// allow an array of event names (or event handler functions)
			events = type;
		}else if(type.indexOf(",") > -1){
			// we allow comma delimited event names, so you can register for multiple events at once
			var events = type.split(/\s*,\s*/);
		} 
		if(events){
			var handles = [];
			var i = 0;
			var eventName;
			while(eventName = events[i++]){
				handles.push(on.parse(target, eventName, listener, addListener, dontFix, matchesTarget));
			}
			handles.remove = function(){
				for(var i = 0; i < handles.length; i++){
					handles[i].remove();
				}
			};
			return handles;
		}
		return addListener(target, type, listener, dontFix, matchesTarget);
	};

