performance


In my quest for cleaner code in Kamaloka-js I have been working on simplifying the dispatch model. AMQP has some interesting features built in to facilitate real-time functionality along with message prioritization. To accomplish this, messages can be sent on different queues and tracks, and can also be broken up into segments, which can in turn be broken up into frames.

Frames

Frames are the basic building blocks of the AMQP data stream. They contain complete headers that describe the channel, track and segment currently being constructed. The payload (the segment being built) can be broken up into arbitrarily sized byte arrays, one per frame, which are then reassembled based on the channel and track they are sent on. In this way, applications with memory constraints can request that frames be no bigger than what they can fit in memory. A typical frame header looks something like:

{
  channel: 0,
  track: 1,
  is_first_frame: 1,
  is_last_frame: 1,
  is_first_segment: 1,
  is_last_segment: 1
}
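As a rough illustration of how a size limit turns one segment into several frames, here is a minimal sketch. The function name `splitSegmentIntoFrames`, the `opts` fields and the plain-array payload are assumptions made for this example, not the Kamaloka-js API; only the header field names come from the header shown above.

```javascript
// Hypothetical helper: split one segment's payload (an array of bytes)
// into frames whose payloads are at most maxFrameSize bytes each.
function splitSegmentIntoFrames(payload, maxFrameSize, opts) {
  if (maxFrameSize <= 0) {
    throw new RangeError("maxFrameSize must be positive");
  }
  var frames = [];
  var total = payload.length;
  var offset = 0;
  // do/while so that an empty segment still produces exactly one frame
  do {
    var end = Math.min(offset + maxFrameSize, total);
    frames.push({
      header: {
        channel: opts.channel,
        track: opts.track,
        is_first_frame: offset === 0 ? 1 : 0,
        is_last_frame: end === total ? 1 : 0,
        is_first_segment: opts.isFirstSegment ? 1 : 0,
        is_last_segment: opts.isLastSegment ? 1 : 0
      },
      payload: payload.slice(offset, end)
    });
    offset = end;
  } while (offset < total);
  return frames;
}
```

With a five-byte payload and a limit of two bytes this yields three frames; only the first carries `is_first_frame` and only the last carries `is_last_frame`, while the segment flags are the same on every frame of the segment.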

Segments

Segments are like frames, but instead of an arbitrary split on data size, each segment is split on struct boundaries. That is to say, when you receive a complete segment you can be sure that it can be fully parsed. There are currently four segment types: control, command, header and body. Command messages are currently the only type that can contain header and body segments. For instance, the transfer command is used to send messages to and from a queue. It would contain header segments, which could be used to route the entire message, and a body segment containing arbitrary data the user application cares about. Each segment is broken up into at least one frame. Multiple segments are never sent in a single frame.

Channels and Tracks

A channel is just an integer that denotes related frames and segments. One can think of each channel as a list of incoming frames, kept in the correct order. Once the last frame on a channel is seen, the message is constructed and the channel is flushed, ready to receive a new message. In this way, multiple messages can be received at the same time by using different channels, but only one message can be sent on a single channel at a time.

Tracks are an exception to the one-message-per-channel rule. There are two tracks in the current spec: the control track (track 0) and the command track (track 1). Controls preempt commands on a channel, so you can be in the middle of receiving frames on the command track when a control comes in on the control track, and you must respond to that first.

This all sounds complicated, but you can just think of the channel/track combination as being one entry in a hash. For instance, frames coming in on channel 0 and track 1 would be given the hash key "0.1":

message_channels["0.1"].add_frame(incoming)
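To make the lookup concrete, here is a small sketch of how such a hash could be filled lazily. `FrameBuffer`, `channelKey`, `bufferFor` and `routeFrame` are hypothetical names invented for this example; only `message_channels` and `add_frame` appear in the line above, and the frame shape (`header` plus `payload`) is an assumption.

```javascript
// Hypothetical buffer that collects the frames for one channel/track pair.
function FrameBuffer() {
  this.frames = [];
}
FrameBuffer.prototype.add_frame = function (frame) {
  this.frames.push(frame);
};

// One entry per channel/track combination, keyed as "channel.track".
var message_channels = {};

function channelKey(header) {
  return header.channel + "." + header.track;
}

// Create the buffer on first use so callers never see a missing entry.
function bufferFor(header) {
  var key = channelKey(header);
  if (!Object.prototype.hasOwnProperty.call(message_channels, key)) {
    message_channels[key] = new FrameBuffer();
  }
  return message_channels[key];
}

function routeFrame(incoming) {
  bufferFor(incoming.header).add_frame(incoming);
}
```

Because the track is part of the key, a control arriving on track 0 lands in its own buffer ("0.0") and does not disturb a half-received command on "0.1".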

First pass – Frame dispatching

At first it was easier to think of the frame and segment issues as different layers. At the lowest layer I would decode and dispatch each frame and then pass it off to the segment layer once a complete segment had been decoded. The segment layer would then collect the segments, relate them to each other, and then construct the full message. The frame layer looked something like this:

Flowchart showing the frame decoding layer of the kamaloka-js AMQP bindings


The dispatcher would then pass the completed segment off to the segment decoder.

This became overly complicated because each segment had varying degrees of metadata and the body and header segments didn’t map to the message object very well. I could have gone ahead and created a segment object but I wanted to simplify the code.

Flattening the frame and segment layers

As it turned out, flattening the model added only a couple more steps to the existing frame layer. Since frames and segments are just two different ways of breaking up a message for transfer over the wire, combining the two in the same layer made sense. What I ended up with was this:

Flowchart showing how the Kamaloka-js AMQP bindings decode frame and segments into a message


Notice that I now create a new message only when a frame is both the first frame and the first segment on a channel. When I see the last frame of a segment I decode that segment incrementally, but I dispatch the message only once the last segment has been seen. In the end, these minor adjustments allowed me to strip out a whole layer of redundant code. They also simplified the low-level event code, as I used to have to manage callbacks for each segment in order to construct a message. Now events trigger only once a full message is received, not when each frame or segment is received.
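The flattened flow can be sketched roughly as follows. This is an illustration under stated assumptions, not the Kamaloka-js code: `createDispatcher`, `onMessage` and the message shape are hypothetical, frame payloads are assumed to be plain arrays of bytes, and "decoding" a segment is reduced to storing its bytes.

```javascript
// Hypothetical flattened frame/segment handler. onMessage is called once
// per complete message, never per frame or per segment.
function createDispatcher(onMessage) {
  var pending = {}; // in-progress messages keyed by "channel.track"

  return function handleFrame(frame) {
    var h = frame.header;
    var key = h.channel + "." + h.track;

    // A new message starts only on the first frame of the first segment.
    if (h.is_first_frame && h.is_first_segment) {
      pending[key] = { segments: [], current: [] };
    }
    var msg = pending[key];
    if (!msg) {
      throw new Error("frame on " + key + " arrived before a message was started");
    }

    // Append this frame's bytes to the segment being built.
    msg.current = msg.current.concat(frame.payload);

    if (h.is_last_frame) {
      // The segment is complete; a real decoder would parse it here.
      msg.segments.push(msg.current);
      msg.current = [];

      if (h.is_last_segment) {
        // Flush the channel/track entry, then fire the single event.
        delete pending[key];
        onMessage({ channel: h.channel, track: h.track, segments: msg.segments });
      }
    }
  };
}
```

Feeding it a header segment split over two frames followed by a one-frame body segment triggers `onMessage` exactly once, after the third frame, with both reassembled segments attached.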


While I do my development on Firefox, today I tested and fixed some issues under Chromium (WebKit based), Opera (Opera based) and Konqueror (KHTML based). My pub/sub demo worked the same in all four browsers (and at the same time, I might add). We tried with IE8 on Luke’s machine but the error messages were cryptic. I’ll have to get my hands on a Windows box at some point, but I’m guessing the biggest issues are going to be trailing commas or names in the AMQP spec that collide with JavaScript keywords (e.g. I ran into void and class keyword issues with the WebKit browsers).
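For readers who have not hit the keyword problem, here is a minimal sketch of the usual workaround. It assumes an engine that rejects reserved words as bare property names; the `field` object, its values and the `getField` helper are made up for illustration and are not taken from the bindings.

```javascript
// Spec-derived names that collide with reserved words (illustrative values).
var field = {
  "class": "message", // quoted key: accepted even where a bare class: is rejected
  "void": false       // last property, so no trailing comma after it
};

// Bracket access avoids writing a reserved word after the dot.
function getField(obj, name) {
  return obj[name];
}
```

Quoting the keys and using `field["class"]` rather than `field.class` keeps the same source parseable on both stricter and more lenient engines.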

One thing to note is that Chromium’s integrated debugger is a lot faster than Firebug. I haven’t used it extensively, but if it can break at breakpoints and introspect objects without getting screwy I might end up switching. I really like Firebug but lately it has become dog slow. It usually takes a couple of minutes of waiting for each frame of data to be processed and displayed. Under Chromium with its debugger on, the slowdown is noticeable but not significant compared to when the debugger is off.

UPDATE: Chromium’s debugger is probably faster because it doesn’t actually do anything that I can see. It displays scripts but doesn’t actually break into the code.


I feel I owe this blog post to Chris, given that I’ve been cited as one of the catalysts for some in the GNOME community aligning themselves with WebKit. Not that I think competition in the browser market is bad (competition is one thing, but a line in the sand is just counterproductive here), but my original intent was merely to ask what our priorities are and which projects align most closely with them.

In any case it was reported on Slashdot that according to an article at Dot Net Perls, Firefox is now one of the most efficient browsers when it comes to memory usage.  This meshes with the internal tests Mozilla was doing and Chris blogged about.  It was one of my main gripes with Firefox when using the XULRunner and Gecko engine as the basis for an embedded browser.  At the time I was a bit nonplussed as the work that was being done to make Firefox better revolved around blaming and removing important libraries instead of fixing the root causes.

If the data is to be believed (and is transferable to Linux, as the tests were run on Windows) then it does point to significant improvements in Firefox, and I thank the Mozilla community for listening and dealing with the issues head on. Software is hard and we shouldn’t turn our backs on a friend of the Linux community even when they might not be walking in lockstep with us. The flip side is that Mozilla does need to be conscious of the needs of downstream developers and not use its market position as a bludgeon to get its way. To that end there are still the issues of a stable embedded API and better platform integration. I hear those are being worked on, so hopefully they won’t be an issue going forward.

Again, I would like to thank the Mozilla community for putting out a great browser that is a serious competitor with Internet Explorer. I would also like to thank the Mozilla Foundation for helping fund accessibility work in GNOME. By working with each other instead of butting heads, as happens every once in a while, the ecosystem grows and benefits both communities.


…and say hello to Cappuccino in a Cloud.

The Red Hat “Boston” office just moved into new digs down the street from our old office space. This is the second move we have made since I got here four years ago, and a needed one as the company continues to grow at a steady pace. Inevitably the discussion of coffee makers comes up every time we make a move (and quite frequently in the interim too), with a new coffee gadget showing up shortly after. We opted for the Flavia drink station this time around. This brings up the issue that any new gadget presented to a large audience will inevitably see high traffic for the first few days, before the novelty wears off and the traffic drops to a steady level of consumers.

There are many questions that need to be considered here. Will the machine stand up to the first few weeks of abuse? If it was engineered for a high peak capacity, is it still economical to run once that traffic has fallen off? Do we just accept that the first few weeks will see some breakdowns and some pissed customers who will not come back because of the failed experience, and keep on chugging with the knowledge that our initial costs were low? If coffee making could be parallelized, could it scale up and down economically and efficiently?

This is the Cappuccino in a Cloud problem. How do you make processes efficient and scalable for both high peak load and the inevitable lower day-to-day traffic? The travelling salesman problem dealt with the efficiency of one single entity (the salesman) finding the most efficient (read: cheapest) single-threaded route through a number of destinations. In today’s world the consumer comes to the business or service, sometimes all at once, and it is important to figure out the most efficient way (measured in the consumer’s satisfaction and the producer’s bottom line) to handle that load.


I got back from the BOSSA conference last week. It still is my favorite conference for just talking to people and getting things rolling. The organizers keep it small and limit the number of talks so that people can get to what they do best and just talk to each other. While I am no longer directly involved in embedded development I still feel this should be the target of most developers. A win on current generation embedded devices means improvements across the board from those devices to the desktop and even the server room. Most of my debugging and optimization techniques for D-Bus, the focus of my BOSSA talk, were gained from working with it on embedded devices. I continue to gear my work with the idea that in the near future a large portion of the people consuming those technologies will be doing so on devices that are considered to be “embedded”.


For those of you in Fedora land who don’t know, Matthew Garrett has just accepted an offer from Red Hat. If that name doesn’t ring a bell, it should. Matthew is one of the reasons Linux works on laptops. Being one of the few people who truly understand Linux from the hardware all the way up to the desktop, he will be spending his time working on power management in both kernel land and userspace.

It is great to see my company recognize the need for such improvements and hire top-notch people to get it done. Welcome aboard, Matthew.
