diff --git a/index.html b/index.html index ba15181..5cb0be3 100644 --- a/index.html +++ b/index.html @@ -1,6 +1,6 @@ -
UBF, a framework for Getting Erlang to talk to the outside world. +
UBF, a framework for Getting Erlang to talk to the outside world. This document and the corresponding open-source code repositories are based on Joe Armstrong’s original UBF site and code with an MIT license file added to the distribution. Since then, a large number of diff --git a/ubf-user-guide.en.html b/ubf-user-guide.en.html index e9971a9..97a68cf 100644 --- a/ubf-user-guide.en.html +++ b/ubf-user-guide.en.html @@ -1,10 +1,10 @@ -
Table of Contents
UBF is a framework that permits the Erlang to talk to the outside +
Table of Contents
UBF is a framework that permits the Erlang to talk to the outside world [UBFPAPER]. The acronym "UBF" stands for "Universal Binary Format", designed and implemented by Joe Armstrong.
This document and the corresponding open-source code repositories -hosted on github are based on Joe Armstrong’s original UBF site -[UBFSITE] and UBF code with an MIT license file added to the +hosted on github [UBF] are based on Joe Armstrong’s original UBF +site [UBFSITE] and UBF code with an MIT license file added to the distribution. Since then, a large number of enhancements and improvements have been added.
UBF is a language for transporting and describing complex data structures across a network. It has three components:
While the XML series of languages had the goal of having a human readable format the UBF languages take the opposite view and provide a -"machine friendly" format. UBF is designed to be easy to implement.
Figure 1. Programming By Contract
UBF(a) is a transport format. UBF(a) was designed to be easy to parse +to the original UBF(a) implementation.
UBF(a) is a transport format. UBF(a) was designed to be easy to parse and to be easy to write with a text editor. UBF(a) is based on a byte encoded virtual machine, 26 byte codes are reserved. Instead of allocating the byte codes from 0, the printable character codes are @@ -115,10 +115,10 @@ which might know what it means - finally the end application is expected to know what to do with an object of type "jpg", it might for example know that this represents an image. UBF(a) will just encode the tag, UBF(b) will type check the tag, and the application should be -able to understand the tag.
This feature of integrating a "tag" in UBF(a) for the purpose -of a "type" in UBF(b) is currently not implemented. Tags can be -specified in UBF(a) but there is currently no way for the application -to act upon this semantic information. |
So far, exactly 26 control characters have been used, namely: +able to understand the tag.
Currently, this feature of integrating a "tag" in UBF(a) for +the purpose of a "type" in UBF(b) is not implemented. Tags can be +specified in UBF(a) but there is no way for the application to act +upon this semantic information. |
So far, exactly 26 control characters have been used, namely: %"~'`{}#&\s\n\t\r,-01234567890
This leaves us with 230 unallocated byte codes. These are used as follows:
>C
Where C is not one of the reserved byte codes, > means store the top of the recognition stack in the register C and pop the recognition @@ -269,12 +269,12 @@ to use either of these two tuples. This limitation introduced unintentionally after the original UBF implementation may be removed in the future.
"Contracts" and "Plugins" are the basic building blocks of an Erlang UBF server. Contracts are a server’s specifications. Plugins are a -server’s implementations.
A contract is a UBF(b) specification stored to a file. By convention, +server’s implementations.
A contract is a UBF(b) specification stored to a file. By convention, a contract’s filename has ".con" as the suffix part. Since all sections of a UBF(b) specification are optional except for the "+NAME" and "+VERSION" sections, it is possible to have "+TYPES" only contracts, "+STATE" only contracts, "+ANYSTATE" only contracts, or any -combination of such contracts.
For example, a "+TYPES" ony contract having the filename +combination of such contracts.
For example, a "+TYPES" only contract having the filename "irc_types_plugin.con" is as follows:
+NAME("irc_types").
+VSN("ubf1.0").
@@ -326,7 +326,7 @@ changeNameEvent() = {changesName, oldnick(), newnick(), group()}.For ex +ANYSTATE info() => string(); description() => string(); - contract() => term().
A plugin is just a "normal" Erlang module that follows a few simple + contract() => term().
A plugin is just a "normal" Erlang module that follows a few simple rules. For a "+TYPES" only contract, the plugin contains just the name of its contract. Otherwise, the plugin contains the name of its contract plus the necessary Erlang "glue code" needed to bind the @@ -334,12 +334,16 @@ UBF server to the server’s application. In either case, a plugin can also import all or a subset of "+TYPES" from other plugins. This simple yet powerful import mechanism permits sharing and re-use of types between plugins and servers.
The necessary Erlang "glue code" is presented later in the -Section 6, “Servers” section. |
For example, the plugin for the "+TYPES" only contract having the -filename "irc_types_plugin.erl" is as follows:
-module(irc_types_plugin). +Section 6, “Servers” section.
For the full example IRC contract described in a previous section, the +plugin having the filename "irc_plugin.erl" is as follows:
-module(irc_plugin).
-compile({parse_transform,contract_parser}).
--add_contract("irc_types_plugin").For example, the plugin for the "+STATE" and "+ANYSTATE" contract -having the filename "irc_fsm_plugin.erl" is as follows:
-module(irc_fsm_plugin).
+-add_contract("irc_plugin").The plugin for the "+TYPES" only contract having the filename +"irc_types_plugin.erl" is as follows:
-module(irc_types_plugin).
+
+-compile({parse_transform,contract_parser}).
+-add_contract("irc_types_plugin").The plugin for the "+STATE" and "+ANYSTATE" contract having the +filename "irc_fsm_plugin.erl" is as follows:
-module(irc_fsm_plugin).
-compile({parse_transform,contract_parser}).
-add_types(irc_types_plugin).
@@ -350,15 +354,10 @@ for this directive imports a subset of "+TYPEs" from the plugin named
'elsewhere' into the containing plugin. Multiple import directives
of either syntax can be freely declared as long as the "-add_types"
directives are listed before the "-add_contract" directive. A plugin
-can have only one "-add_contract" directive.Similarly for the full example IRC contract described in a previous
-section, the plugin having the filename "irc_plugin.erl" is as
-follows:
-module(irc_plugin).
-
--compile({parse_transform,contract_parser}).
--add_contract("irc_plugin").By using this Erlang "parse transform", the contract is parsed and the
+can have only one "-add_contract" directive.
By using this Erlang "parse transform", the contract is parsed and the
imported types (if any) are processed during the compilation of the
plugin’s Erlang module. The normal search path used by Erlang’s
-compiler to locate modules is used to import types from other plugins.
The plugin will fail to compile if the plugin’s contract cannot be
+compiler to locate modules is used to import types from other plugins.
The plugin will fail to compile if the plugin’s contract cannot be found, cannot be parsed properly, or if one of the following errors occurs:
where L is an Erlang list.
As a by-product of a plugin’s compilation and if one or more "record" +
where L is an Erlang list.
As a by-product of a plugin’s compilation and if one or more "record" or "extended record" types were declared in a plugin’s contract, an Erlang "header" file containing the plugin’s record definitions is automatically created. This Erlang "header" file can be included by the plugin module itself or by other Erlang modules used by the server’s application. By convention, this Erlang "header" file has the same base filename as the plugin but having a ".huc" as the suffix -part.
There are 2 experimental prototypes for extending UBF’s type and +plugin framework. [UBF_ABNF] is a framework for integrating UBF and +ABNF specifications. [UBF_EEP8] is a framework for integrating UBF +and EEP8 types. |
The original "UBF" network transport is UBF(a) over TCP/IP. Since +then, a number of new transports not based on UBF(a) and not based +on TCP/IP have been added. Nevertheless, these transports are still +considered part of the overall UBF framework. Most importantly, +applications can share and re-use the same UBF contracts and plugins +regardless of the network transport.
The name "UBF" is short for "Universal Binary Format". UBF is +commonly used to refer to the network transport based on UBF(a) and to +the overall UBF framework.
See Section 3.1, “UBF(a)” for further details.
EBF is an implementation of UBF(b) but it does not use UBF(a) for the +client and server communication. Instead, Erlang-style conventions +are used:
+Terms are framed using the gen_tcp {packet, 4} format: a 32-bit + unsigned integer in big-endian byte order specifies packet length. +
+-------------------------+-------------------------------+ +| Packet length (32 bits) | Packet data (variable length) | ++-------------------------+-------------------------------+
The name "EBF" is short for "Erlang Binary Format".
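The framing described above can be sketched in a few lines of Erlang. This is a minimal illustration rather than the EBF implementation itself: the module name `ebf_frame_sketch` is hypothetical, and the use of `term_to_binary/1` for the packet data is an assumption made for the example. On a real socket opened with `{packet, 4}`, gen_tcp adds and strips the length header automatically.

```erlang
-module(ebf_frame_sketch).
-export([frame/1, unframe/1]).

%% Serialize a term (assumed here to use term_to_binary/1) and prepend
%% a 32-bit big-endian unsigned length header, as {packet, 4} would.
frame(Term) ->
    Data = term_to_binary(Term),
    <<(byte_size(Data)):32/big-unsigned-integer, Data/binary>>.

%% Split one complete frame off the front of a buffer.
%% Returns {more, Buffer} when the buffer does not yet hold a full frame.
unframe(<<Len:32/big-unsigned-integer, Data:Len/binary, Rest/binary>>) ->
    {ok, binary_to_term(Data), Rest};
unframe(Buffer) when is_binary(Buffer) ->
    {more, Buffer}.
```

For example, `ebf_frame_sketch:unframe(ebf_frame_sketch:frame({foo, 42}))` round-trips the term and leaves an empty remainder.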
JSF is an implementation of UBF(b) but it does not use UBF(a) for the +client and server communication. Instead, JSON [RFC4627] is used +as the wire format. The name "JSF" is short for "JavaScript +Format".
There is no generally agreed upon convention for converting Erlang +terms to JSON objects. JSF uses the convention set forth by +MochiWeb’s JSON library [MOCHIJSON2]. In addition, there are a +couple of other conventions layered on top of MochiWeb’s +implementation.
This extra convention creates something slightly messy-looking, if you +look at the raw JSON passed back-and-forth. The examples of the +Erlang record {foo, 42} and the general tuple {bar, 42} would look +like this:
record (defined in the contract as "foo() = #foo{attribute1 = term()};")
+
+ {"$R":"foo", "attribute1":42}
+
+ general tuple
+
+ {"$T":[{"$A":"bar"}, 42]}However, it requires very little JavaScript code to convert objects +with the "$R", "$T", and "$A" notation (for records, tuples, and +atoms) into whatever object is most convenient.
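As a rough illustration of the "$R", "$T", and "$A" notation, the sketch below builds the MochiWeb-style `{struct, Proplist}` terms that correspond to the two JSON examples above. The module and function names are hypothetical and this is not JSF's actual encoder. Any other special cases JSF may apply (for instance to booleans or strings) are deliberately left out, and the record's field names are passed in explicitly because they come from the contract.

```erlang
-module(jsf_convention_sketch).
-export([encode/1, encode_record/2]).

%% Atoms become {"$A": Name}.
encode(A) when is_atom(A) ->
    {struct, [{<<"$A">>, atom_to_binary(A, utf8)}]};
%% General tuples become {"$T": [Element, ...]}.
encode(T) when is_tuple(T) ->
    {struct, [{<<"$T">>, [encode(E) || E <- tuple_to_list(T)]}]};
%% Everything else (e.g. integers) is passed through unchanged.
encode(Other) ->
    Other.

%% Records become {"$R": Name, Field1: Value1, ...}.
%% Fields is the list of field names declared for the record.
encode_record(Record, Fields)
  when is_tuple(Record), tuple_size(Record) =:= length(Fields) + 1 ->
    [Name | Values] = tuple_to_list(Record),
    Pairs = [{atom_to_binary(F, utf8), encode(V)}
             || {F, V} <- lists:zip(Fields, Values)],
    {struct, [{<<"$R">>, atom_to_binary(Name, utf8)} | Pairs]}.
```

Here `encode_record({foo, 42}, [attribute1])` yields the term for `{"$R":"foo", "attribute1":42}`, and `encode({bar, 42})` yields the term for `{"$T":[{"$A":"bar"}, 42]}`.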
See [UBF_JSONRPC] for further details.
Gemini Mobile Technologies, Inc. has implemented and open-sourced +a module for classifying the input character set to detect non-UTF8 +JSON inputs [GMTCHARSET]. |
TBF and NTBF are implementations of UBF(b) but they do not use UBF(a) +for the client and server communication. Instead, Thrift [THRIFT] +is used as the wire format. The name "TBF" is short for +"Thrift Binary Format". The name "NTBF" is short for "Native Thrift +Binary Format".
TBF follows the conventions set forth by the Thrift community by +re-using Thrift’s binary wire-protocol with the following +exceptions:
TBF can encode and decode all UBF(b) objects. Synchronous calls are +implemented as Thrift T-CALL and T-REPLY message pairs. +Asynchronous casts are implemented as Thrift T-ONEWAY messages.
TBF is not compatible with standard Thrift clients and +servers. |
NTBF follows all of the conventions set forth by the Thrift community +by re-using Thrift’s binary wire-protocol. A standard Thrift client +can communicate with a UBF "NTBF" server and a UBF "NTBF" client can +communicate with a standard Thrift server.
NTBF cannot encode and decode all UBF(b) objects. There is no +straightforward convention for converting Erlang terms to Thrift +messages. Synchronous calls are implemented as Thrift T-CALL and +T-REPLY message pairs or T-CALL and T-EXCEPTION message pairs. +Asynchronous casts are implemented as Thrift T-ONEWAY messages.
The NTBF transport is under active development to enhance, improve, +and simplify the integration of Thrift into the UBF framework. The +impedance mismatch between the two approaches of Thrift and UBF can +only be addressed by further development.
Currently, NTBF only implements the encoding and decoding of +Thrift’s binary wire-protocol. Unlike standard Thrift clients and +servers, an NTBF client and server must "manually" implement the +features provided by the Thrift IDL. |
See [UBF_THRIFT] for further details.
It is worth mentioning that two new TCP/IP transports, namely +PBF and ABF, are under investigation. The name "PBF" is short for +"Google’s Protocol Buffers Format" [PROTOBUF]. The name "ABF" is +short for "Avro Binary Format" [AVRO].
JSON-RPC [JSONRPC] is a lightweight remote procedure call protocol +similar to XML-RPC. The UBF framework implementation of JSON-RPC +brings together JSF’s encoder/decoder, UBF(b)'s contract checking, and +an HTTP transport.
As previously stated, central to UBF is the idea of a "Contract" which +regulates the set of legal conversations that can take place between a +client and a server. The client-side is depicted in "red" and the +server-side is depicted in "blue". The client and server communicate +with each other via TCP/IP and/or HTTP.
Central to UBF is the idea that contract(s) can be shared and re-used by +multiple transports. Any data that violates the same contract(s) is +rejected regardless of the transport.
See [UBF_JSONRPC] for further details.
Several transports that do not require an explicit network socket have +been added to the UBF framework. These transports permit an +application to call a plugin directly without the need for TCP/IP or +HTTP.
The concept "ETF" was added to the UBF framework. This transport +relies on Erlang’s Native Distribution for synchronous calls and +asynchronous casts.
The name "ETF" is short for "Erlang Term Format".
The concept "LPC" was added to the UBF framework. This transport is a +"non-transport" that invokes synchronous calls directly to a plugin. +Support for asynchronous casts has not been added (or designed) yet.
The name "LPC" is short for "Local Procedure Call".
LPC is used to implement the JSON-RPC transport. |
The UBF framework provides two types of Erlang servers: "stateless" +and "stateful". The stateless server is an extension of Joe +Armstrong’s original UBF server implementation. The "stateful" server +is Joe Armstrong’s original UBF server implementation.
UBF servers are introspective - which means the servers can describe +themselves. The following commands (described in UBF(a) format) are +always available:
The "ubf_server" Erlang module implements most of the commonly-used +server-side functions and provides several ways to start a server. +Configuration options for both types of servers are the same. +However, the plugin callback API is different.
-module(ubf_server).
+
+-type name() :: atom().
+-type plugins() :: [module()].
+-type ipport() :: pos_integer().
+-type options() :: [{atom(), term()}].
+
+-spec start(plugins(), ipport()) -> true.
+-spec start(name(), plugins(), ipport()) -> true.
+-spec start(name(), plugins(), ipport(), options()) -> true.
+
+-spec start_link(plugins(), ipport()) -> true.
+-spec start_link(name(), plugins(), ipport()) -> true.
+-spec start_link(name(), plugins(), ipport(), options()) -> true.
The start/{2,3,4} and start_link/{2,3,4} functions start a registered +server and a TCP listener on ipport() and register all of the protocol +implementation modules in the plugins() list. If name() is undefined, +the server is not registered. The list of supported options() is as +follows:
The "ubf_server" Erlang module doesn’t provide a "stop" function. To +stop the server, instead stop the TCP listener that controls it. See +the "proc_socket_server" Erlang module for extra details.
The NTBF transport protocol is indirectly enabled by specifying +the following options: [{'proto', 'tbf'}, {'serverhello', +'undefined'}, {'simplerpc', 'true'}]. |
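Putting the start/4 signature and the NTBF options together, a server might be started along the following lines. This is only a sketch: the wrapper module `ntbf_server_sketch` is hypothetical, and it assumes the "ubf_server" module from the UBF distribution is on the code path when start/3 is actually called.

```erlang
-module(ntbf_server_sketch).
-export([ntbf_options/0, start/3]).

%% The option list quoted above for indirectly enabling NTBF.
ntbf_options() ->
    [{proto, tbf}, {serverhello, undefined}, {simplerpc, true}].

%% Start a UBF server with the NTBF options.
%% Name, Plugins, and Port follow the name(), plugins(), and ipport()
%% types of ubf_server:start/4.
start(Name, Plugins, Port) ->
    ubf_server:start(Name, Plugins, Port, ntbf_options()).
```

A call such as `ntbf_server_sketch:start(my_ntbf_server, [my_plugin], 7580)` would then start a registered server. The server name, plugin module, and port number here are placeholders chosen for illustration.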
+ +[AVRO] "Avro is a serialization system.", + http://avro.apache.org/. + +
+ +[GMTCHARSET] "Gemini Mobile Technologies, Inc. Charset Module", + http://github.com/norton/gmt-util/blob/master/src/gmt_charset.erl. + +
+ +[JSONRPC] "A lightweight remote procedure call protocol similar + to XML-RPC", http://json-rpc.org/. + +
+ +[MOCHIJSON2] "MochiWeb is an Erlang library for building + lightweight HTTP servers.", + http://github.com/mochi/mochiweb/blob/master/src/mochijson2.erl. + +
+ +[PROTOBUF] "Protocol buffers are Google’s language-neutral, + platform-neutral, extensible mechanism for serializing structured + data.", http://code.google.com/apis/protocolbuffers/. + +
+ +[RFC4627] D. Crockford, "The application/json Media Type for + JavaScript Object Notation (JSON)", RFC4627, July 2006. + +
[RFC5234] D. Crocker, Ed. Brandenburg, "Augmented BNF for Syntax Specifications: ABNF", RFC5234, January 2008. -
+ +[THRIFT] "A software framework for scalable cross-language + services development.", http://incubator.apache.org/thrift/. + +
+ +[UBF] "Universal Binary Format", http://github.com/norton/ubf/. + +
+ +[UBF_ABNF] "Universal Binary Format and Augmented Backus-Naur + Form", http://github.com/norton/ubf-abnf/. + +
+ +[UBF_EEP8] "Universal Binary Format and Erlang Enhancement + Proposal 8", http://github.com/norton/ubf-eep8/. + +
+ +[UBF_JSONRPC] "Universal Binary Format and JavaScript Object + Notation RPC", https://github.com/norton/ubf-jsonrpc/. + +
+ +[UBF_THRIFT] "Universal Binary Format and Thrift", + https://github.com/norton/ubf-thrift/. + +
[UBFPAPER] Joe Armstrong, "Getting Erlang to talk to the outside world", Proceedings of the 2002 ACM SIGPLAN workshop on Erlang, pages 64-72, ACM Press, 2002. -
[UBFSITE] Joe Armstrong, http://www.sics.se/~joe/ubf/site/home.html, March 2003.