.NET TCP Binary Protocol
Why Bother?
.NET is a great server platform: it is feature-rich and has a great toolset, comprehensive documentation, and a friendly developer community. That’s why I’ve chosen it for my site, which is coded in C# and hosted by ASP.NET.
If you’re going to create a web service, the .NET platform brings a lot of value to the project.
.NET Platform Benefits
Toolset
I’m a software developer, which is why the toolset is listed first.
It’s much easier to develop software using Visual Studio.
For example, in most cases F5 = “run with the debugger”. Even if running implies something as complex as “launch a specially crafted version of the web server, load a web page that references a specific Silverlight application, initialize a Silverlight runtime, then attach the debugger to that”, or “connect to some ARM-based piece of hardware plugged into a USB port, install a debugging agent on the device, deploy the target binaries to the device, launch the target on the device, then transparently use remote debugging functionality” – you just hit the F5, and it just works while you step over your lines of code.
I have never seen that kind of integration in any non-Microsoft IDE I’ve used.
Administration Tools
Just compare the look and feel of a typical custom web service administration tool with the AppFabric Management UI.
Of course you can take your time and implement your own custom administration UI. However, for a typical company developing a web service application, the administration tools are basically internal software, where the approach “Crash on invalid input? Run it again, and this time watch your {${?'s!” works OK.
Scalability
OK, the Java platform is more or less on par, or even better (take a look at Terracotta). However, if you're using any other platform, IMO you’ll spend a lot of time trying to achieve something similar.
Why Not Use It Everywhere?
The HTTP transport protocol has its limitations. In our case, the showstopper was the inability to receive callbacks from the server.
The only remaining option is using the net.tcp transport.
Of course you can implement a .NET client that consumes a web service exposed over the net.tcp protocol. However, there are cases where a modern version of the .NET Framework is unavailable or undesired.
We’re creating a web browser-based Unity 3D game: the .NET runtime there is Mono, and its functionality is somewhere between .NET 2.0 and 3.0.
I can imagine other possible use cases: something large written in C++, something small and targeted at a broad audience (like a browser toolbar), something running on a non-Windows platform (e.g. embedded software), etc…
That’s why I’ve spent some time (about 1.5 weeks, to be precise) implementing the .NET TCP binary protocol on top of a TCP socket.
Google found surprisingly few articles on the subject, which is why I’m writing this one.
The Protocol Stack
The net.tcp protocol can be thought of as two additional layers of abstraction on top of the TCP protocol stack.
The lower layer is the .NET Message Framing Protocol.
It defines the handshake procedure, optional stream-level encryption, and the way the stream is split into individual messages.
The higher layer is the binary XML format.
Framing Protocol
Google found an excellent series of articles about it, written by Nicholas Allen.
Here’s the link to the last part #7, which contains the links to the other articles in the series.
And of course you should read the official spec, see [MC-NMF] in my “Relevant Links” section.
The framing protocol can transfer messages in various formats, see [MC-NMF] section 2.2.3.4 “Envelope Encoding Record”. This article, however, focuses only on the 0x08 “Binary with in-band dictionary” message format, which net.tcp WCF services use by default.
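To make the framing layer a bit more concrete, here is a minimal sketch of how I read the [MC-NMF] spec: the variable-length record size encoding, the preamble a client sends to open a duplex session with the 0x08 encoding, and a sized envelope record. The `Framing` class and its method names are made up for this illustration; stream upgrades (encryption), fault records and the end record are left out.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not a WCF API): builds [MC-NMF] framing records by hand.
static class Framing
{
    // Record type bytes as listed in [MC-NMF].
    const byte VersionRecord = 0x00;
    const byte ModeRecord = 0x01;
    const byte ViaRecord = 0x02;
    const byte KnownEncodingRecord = 0x03;
    const byte SizedEnvelopeRecord = 0x06;
    const byte PreambleEndRecord = 0x0C;

    // Sizes take 7 bits per byte, least significant group first;
    // a set high bit means "another byte follows".
    public static void WriteSize(Stream s, int size)
    {
        if (size < 0) throw new ArgumentOutOfRangeException("size");
        uint v = (uint)size;
        while (v >= 0x80)
        {
            s.WriteByte((byte)((v & 0x7F) | 0x80));
            v >>= 7;
        }
        s.WriteByte((byte)v);
    }

    public static int ReadSize(Stream s)
    {
        int result = 0;
        for (int shift = 0; shift <= 28; shift += 7)
        {
            int b = s.ReadByte();
            if (b < 0) throw new EndOfStreamException();
            if (shift == 28 && b > 0x07) break; // would not fit into 31 bits
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
        }
        throw new InvalidDataException("Malformed record size");
    }

    // Preamble sent by the client right after the TCP connection is opened.
    public static byte[] BuildDuplexPreamble(string via)
    {
        var ms = new MemoryStream();
        ms.WriteByte(VersionRecord);
        ms.WriteByte(1);                // major version
        ms.WriteByte(0);                // minor version
        ms.WriteByte(ModeRecord);
        ms.WriteByte(0x02);             // 2 = duplex
        byte[] uri = Encoding.UTF8.GetBytes(via);
        ms.WriteByte(ViaRecord);
        WriteSize(ms, uri.Length);
        ms.Write(uri, 0, uri.Length);
        ms.WriteByte(KnownEncodingRecord);
        ms.WriteByte(0x08);             // binary with in-band dictionary
        ms.WriteByte(PreambleEndRecord);
        return ms.ToArray();
    }

    // Wraps one encoded message (StringTable + binary XML) into a sized envelope record.
    public static byte[] BuildSizedEnvelope(byte[] payload)
    {
        var ms = new MemoryStream();
        ms.WriteByte(SizedEnvelopeRecord);
        WriteSize(ms, payload.Length);
        ms.Write(payload, 0, payload.Length);
        return ms.ToArray();
    }
}
```

After sending the preamble, the client should wait for the server's preamble acknowledgement (a single 0x0B byte, per the spec) before sending the first envelope. The via string, e.g. `net.tcp://host:port/path`, is whatever address your service listens on.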
Binary XML Format
Usually, web services speak SOAP.
SOAP uses XML for its messages. XML is a human-friendly format, which means it’s text-based and extremely verbose. While it’s technically possible to use gzip to compress individual XML messages, this approach is less than optimal.
Every SOAP envelope starts with something similar to
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing" >
Microsoft designed its own format for serializing SOAP messages. It’s called “.NET Binary Format” and is documented in the [MC-NBFX], [MC-NBFS] and [MC-NBFSE] specs.
The above string “<s:Envelope xmlns…” (110 characters long) is encoded with the following 10 bytes: 56 02 0B 01 73 04 0B 01 61 06. Here’s why:
56 = PrefixDictionaryElementS record from [MC-NBFX]: an element with the "s" prefix whose name is a dictionary key
02 = framework-defined dictionary key for "Envelope"
0B = eNodeType.DictionaryXmlsAttribute
0173 = "s" string
04 = framework-defined dictionary key for "http://www.w3.org/2003/05/soap-envelope"
0B = eNodeType.DictionaryXmlsAttribute
0161 = "a" string
06 = framework-defined dictionary key for "http://www.w3.org/2005/08/addressing"
GZip compression is great; however, it will rarely compress XML data to 9% of its original size, which is what the binary format achieves here (10 bytes instead of 110 characters).
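The byte breakdown above can be checked with a few lines of code. This is only a sketch: the `NbfxDemo` class is made up for this illustration, it handles just the two record kinds that occur in the ten-byte example (the 0x44..0x5D prefix-plus-dictionary element records and 0x0B), and the dictionary holds only the three framework-defined entries the example needs.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical mini-decoder: covers only the records used in the example,
// not the full [MC-NBFX] format.
static class NbfxDemo
{
    // Excerpt of the framework-defined dictionary from [MC-NBFS].
    static readonly Dictionary<int, string> StaticDictionary = new Dictionary<int, string>
    {
        { 0x02, "Envelope" },
        { 0x04, "http://www.w3.org/2003/05/soap-envelope" },
        { 0x06, "http://www.w3.org/2005/08/addressing" },
    };

    // MultiByteInt31: 7 bits per byte, low group first, high bit = "more follows".
    static int ReadMultiByteInt31(byte[] data, ref int pos)
    {
        int result = 0;
        for (int shift = 0; shift <= 28; shift += 7)
        {
            byte b = data[pos++];
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
        }
        throw new FormatException("MultiByteInt31 is longer than 5 bytes");
    }

    // A string is a MultiByteInt31 length followed by UTF-8 bytes.
    static string ReadString(byte[] data, ref int pos)
    {
        int length = ReadMultiByteInt31(data, ref pos);
        string s = Encoding.UTF8.GetString(data, pos, length);
        pos += length;
        return s;
    }

    // Decodes one start tag together with its namespace declarations.
    public static string DecodeStartTag(byte[] data)
    {
        int pos = 0;
        byte type = data[pos++];
        if (type < 0x44 || type > 0x5D)
            throw new NotSupportedException("Only PrefixDictionaryElementA..Z is handled");
        char prefix = (char)('a' + (type - 0x44));   // 0x56 maps to 's'
        string name = StaticDictionary[ReadMultiByteInt31(data, ref pos)];

        var sb = new StringBuilder();
        sb.Append('<').Append(prefix).Append(':').Append(name);
        while (pos < data.Length && data[pos] == 0x0B)   // DictionaryXmlnsAttribute
        {
            pos++;
            string nsPrefix = ReadString(data, ref pos);
            string uri = StaticDictionary[ReadMultiByteInt31(data, ref pos)];
            sb.Append(" xmlns:").Append(nsPrefix).Append("=\"").Append(uri).Append('"');
        }
        return sb.Append('>').ToString();
    }
}
```

Feeding it the bytes `56 02 0B 01 73 04 0B 01 61 06` should reproduce the opening `<s:Envelope …>` tag with both namespace declarations. A real decoder needs the complete record table from [MC-NBFX] and the complete dictionary from [MC-NBFS].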
At this point you’re probably scratching your head and thinking “How about the message payload, with lots of my own ‘http://tempuri.org/iMyContract/MyMethod’ strings?”, which leads us to the next section.
User Dictionary
The answer is “user dictionary”, defined in [MC-NBFSE] specification.
When the message encoding is 0x08 (binary with in-band dictionary), every message begins with the StringTable record (see [MC-NBFSE], section 2.1), immediately followed by the message payload.
User-defined dictionaries persist for the lifetime of the connection: the dictionary entries accumulate as the party receives messages. That’s why the StringTable record will usually contain dictionary entries for the first several messages. For the subsequent messages, the StringTable record will be empty, which is encoded as a single 0x00 byte.
Please keep in mind that your implementation must maintain two separate user-defined dictionaries:
- Incoming dictionary. New entries are added as new envelopes arrive from the remote party.
- Outgoing dictionary. You add the entries yourself. When sending an envelope, you must prepend it with the dictionary entries that were added since the previous message you sent.
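The two dictionaries can be sketched as follows. This is not my production code: all class and method names are made up for this illustration, and it relies on my reading of [MC-NBFSE], namely that the StringTable starts with the total byte size of its strings and that user dictionary entries get odd IDs (1, 3, 5, …) in order of arrival, while even IDs refer to the framework-defined dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// All names here are hypothetical; this is neither the article's code nor a WCF API.
static class Int31
{
    // MultiByteInt31: 7 bits per byte, low group first, high bit = "more follows".
    public static void Write(Stream s, int value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException("value");
        while (value >= 0x80)
        {
            s.WriteByte((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }
        s.WriteByte((byte)value);
    }

    public static int Read(Stream s)
    {
        int result = 0;
        for (int shift = 0; shift <= 28; shift += 7)
        {
            int b = s.ReadByte();
            if (b < 0) throw new EndOfStreamException();
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
        }
        throw new InvalidDataException("MultiByteInt31 is longer than 5 bytes");
    }
}

class IncomingDictionary
{
    readonly Dictionary<int, string> entries = new Dictionary<int, string>();
    int nextId = 1; // assumption: user dictionary IDs are odd, 1, 3, 5, ...

    // Consumes the StringTable at the start of a message held in a seekable
    // stream (e.g. MemoryStream) and leaves the stream at the payload.
    public void ReadStringTable(Stream message)
    {
        int size = Int31.Read(message); // total byte size of the strings
        long end = message.Position + size;
        while (message.Position < end)
        {
            byte[] utf8 = new byte[Int31.Read(message)];
            if (message.Read(utf8, 0, utf8.Length) != utf8.Length)
                throw new EndOfStreamException();
            entries[nextId] = Encoding.UTF8.GetString(utf8);
            nextId += 2;
        }
        if (message.Position != end)
            throw new InvalidDataException("StringTable size mismatch");
    }

    public string Lookup(int id) { return entries[id]; }
}

class OutgoingDictionary
{
    readonly Dictionary<string, int> ids = new Dictionary<string, int>();
    readonly List<string> pending = new List<string>();
    int nextId = 1;

    // Returns the ID of a string, registering it for the next StringTable if new.
    public int GetOrAdd(string value)
    {
        int id;
        if (ids.TryGetValue(value, out id)) return id;
        id = nextId;
        nextId += 2;
        ids.Add(value, id);
        pending.Add(value);
        return id;
    }

    // Builds the StringTable to prepend to the next message: only the entries
    // added since the previous call. With nothing pending it is a single 0x00.
    public byte[] TakeStringTable()
    {
        var strings = new MemoryStream();
        foreach (string s in pending)
        {
            byte[] utf8 = Encoding.UTF8.GetBytes(s);
            Int31.Write(strings, utf8.Length);
            strings.Write(utf8, 0, utf8.Length);
        }
        pending.Clear();
        var table = new MemoryStream();
        Int31.Write(table, (int)strings.Length);
        strings.WriteTo(table);
        return table.ToArray();
    }
}
```

Keeping the two classes separate mirrors the rule above: the incoming dictionary grows only from what the remote party sends, and the outgoing one only from what you write. One instance of each should live exactly as long as the connection.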
Implementation Details
I’ve implemented something resembling MSXML DOM or System.Xml.Linq. Here’s a piece of the interface:
class Node
{
    /// <summary>Node type</summary>
    public readonly eNodeType type;
    /// <summary>Node dictionary key.</summary>
    public int? dictionaryKey { get; protected set; }
    /// <summary>Get the node's immediate children.</summary>
    public virtual IEnumerable<Node> getChildren() { return s_EmptyNodeSet; }
    /// <summary>Parse the search expression, return the functor that selects the node.</summary>
    public static Func<Node, Node> selectSingleNodeFunc( string path ) { /* body omitted */ }
}
class Text : Node
{
    /// <remarks>The type of this object may be
    /// bool, short, int, long, float, double, decimal, DateTime, string, byte[], Guid, TimeSpan or ulong,
    /// depending on the attribute type.</remarks>
    public object val { get; private set; }
}
abstract class Entity : Node
{
    /// <summary>Full name, e.g. "s:Envelope"</summary>
    public string FullName { get { /* body omitted */ } }
}
class Attribute : Entity { }
class Element : Entity { }
Since I’m currently a corporate employee, I’m not going to post the source code of my implementation.
You may want to design your own implementation differently, e.g. create a streaming API similar to MSXML SAX or the .NET XmlReader / XmlWriter classes, or something completely different.
Relevant Links
[MC-NMF]: .NET Message Framing Protocol Specification
[MC-NBFX]: .NET Binary Format: XML Data Structure
This is the main binary XML format specification.
[MC-NBFS]: .NET Binary Format: SOAP Data Structure
This one specifies the framework-defined dictionary entries, such as “mustUnderstand”, “http://www.w3.org/2003/05/soap-envelope” and “RelatesTo”. I've just copy-pasted the entries into the Dictionary.txt plain text file, compressed it with GZip, added Dictionary.gz to my DLL as a manifest resource, and wrote a dozen lines of code to load the entries from the resource and store them in a Dictionary<int, string>.
[MC-NBFSE]: .NET Binary Format: SOAP Extension
This one specifies the user dictionary format.
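For the framework-defined dictionary mentioned under [MC-NBFS] above, the loading code could look roughly like this. It is a sketch under assumptions: the `StaticDictionaryLoader` name is made up, and it assumes Dictionary.txt holds one entry per line in spec order, so that line N (counting from zero) gets the even key 2 * N. Adjust the parsing if your text file is laid out differently.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Reflection;
using System.Text;

// Hypothetical loader for the framework-defined dictionary.
static class StaticDictionaryLoader
{
    // Assumption: one entry per line, in spec order; line N gets key 2 * N.
    public static Dictionary<int, string> Load(Stream gzipped)
    {
        var result = new Dictionary<int, string>();
        using (var gz = new GZipStream(gzipped, CompressionMode.Decompress))
        using (var reader = new StreamReader(gz, Encoding.UTF8))
        {
            int key = 0;
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                result.Add(key, line);
                key += 2;
            }
        }
        return result;
    }

    // Loads the dictionary from a manifest resource embedded into the given assembly.
    public static Dictionary<int, string> LoadFromResource(Assembly assembly, string resourceName)
    {
        using (Stream s = assembly.GetManifestResourceStream(resourceName))
        {
            if (s == null) throw new FileNotFoundException("Resource not found", resourceName);
            return Load(s);
        }
    }
}
```

The resource name passed to `LoadFromResource` normally includes the default namespace of the project, so check `Assembly.GetManifestResourceNames()` if the stream comes back null.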