Learn SURF

The primary purpose of a computer program is to process data, usually in the form of objects. This data processing usually occurs in the computer's memory, but at some point the program needs to serialize the objects, turning them into a series of bytes, so that they can be stored in a file or transferred to another system. Serialization is used for everything from configuration files to instant message transport. The Simple URF (SURF) format strives to be a serialization format for the reactive Internet era, striking a balance between simplicity (being almost as simple as JSON, and slightly more compact) and expressiveness (having more types than JSON, extensibility through vocabularies, and compatibility with semantic frameworks).

History

Binary Serializations

Early serialization approaches used binary formats that were not easily read by humans, as they used arbitrary numbers as delimiters and represented data in the same binary form in which it was stored in memory.

Good

Binary formats are very compact.

Bad

Binary formats are hard to debug, as they are not readily human-readable.
Many binary formats are proprietary, impeding interoperability.

Markup Languages

Early markup languages such as the Standard Generalized Markup Language (SGML) from the 1980s weren't originally meant to store objects but to add annotations to text data. Markup tag pairs such as “<para>…</para>” (indicating a paragraph) were inserted around text to allow it to be styled for publishing. In the early 1990s Tim Berners-Lee invented the HyperText Markup Language (HTML), an application of SGML that provided simple markup for web pages using tag pairs such as “<p>…</p>”. (The latest version, HTML5, is the current best-practices format for creating web content.)

HTML was not meant for serializing general data such as objects, and SGML was too complex, so in the later 1990s the World Wide Web Consortium (W3C) created the Extensible Markup Language (XML), a simplified version of SGML. XML provided a simple way to define the structure of any data that could be stored in text form, yet did not define the actual tag names used or their semantics (that is, how they were to be interpreted). A typical XML document might appear like the one below:

Example XML file for storing information about a user.

<?xml version="1.0" encoding="UTF-8"?>
<user authenticated="true" sort="d">
  <name>Jane Doe</name>
  <id>bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832</id>
  <email>jane_doe@example.com</email>
  <phone>+1-201-555-0123</phone>
  <usernames>
    <username>jdoe</username>
    <username>janed</username>
  </usernames>
  <homePage>http://www.example.com/jdoe/</homePage>
  <salt>Zm9vYmFy</salt>
  <joined>2016-01-23</joined>
  <credits>123</credits>
</user>

Later versions of XML allow tags to be grouped into namespaces so that vocabularies from different subject matters can be mixed without tag name clashes. Each namespace is identified by some standardized URI and associated with some “prefix” at the top of the XML document. This prefix is then used for all relevant tags. The namespace for user definitions, for example, might be identified with the URI http://example.com/ns/users and given the prefix u, so that tags would appear as e.g. <u:user> and <u:name>.

Good

XML is human-readable.
XML is standardized and ubiquitous.
XML works well for annotating text data.
XML provides a mechanism for mixing vocabularies.

Bad

XML is verbose, sometimes tripling the size of the data being encoded.
XML arbitrarily distinguishes two types of data: “attributes” (e.g. the authenticated attribute) and “child elements” (e.g. the <name> tag).
XML itself has no way to specify the types of data (e.g. that true is a Boolean value, jdoe is a string, 123 is a number, and 2016-01-23 is a date).
XML retains some complicated features that are becoming seldom used yet all parsers must support (e.g. DTDs, character references, CDATA sections).
XML is only a syntax and has no means for referring to other things as in a general graph.

JSON

For almost two decades XML has been the default serialization format for many kinds of data, but it is now being supplanted by newer, simpler formats. Web developers in particular desired a format that was smaller for transferring data between the browser and a server. In the early 2000s the JavaScript Object Notation (JSON) format was formalized, using a subset of the syntax JavaScript used for declaring objects. As JavaScript was the primary language used in the browser, JSON could be parsed directly by evaluation as if it were a JavaScript program.

The central serialized type in JSON is an “object”, a comma-separated mapping of string keys to values inside brace { and } characters, such as {"foo" : "bar"}. A value can be a string; a number; a comma-separated array inside bracket [ and ] characters; a Boolean value true or false; the value null; or another object. The same information above could be stored in JSON as follows:

Example JSON file for storing information about a user.

{
  "authenticated" : true,
  "sort" : "d",
  "name" : "Jane Doe",
  "id" : "bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832",
  "email" : "jane_doe@example.com",
  "phone" : "+1-201-555-0123",
  "usernames" : ["jdoe", "janed"],
  "homePage" : "http://www.example.com/jdoe/",
  "salt" : "Zm9vYmFy",
  "joined" : "2016-01-23",
  "credits" : 123
}

Good

JSON is more compact than XML.
JSON has no arbitrary distinction of child information as did XML with its “attributes”.
JSON is more than a delimiter syntax; it encodes values.
JSON is standardized and becoming ubiquitous.
The JSON specification is smaller, easier to understand, and quicker to implement.

Bad

JSON is still somewhat verbose, requiring quotation " characters for object keys and comma , characters between key-value pairs.
JSON has a very limited set of types; single characters, dates, URLs and the like must be represented by strings and parsed separately later by the consumer.
JSON has no inherent way for objects to refer to other objects as in a graph.
JSON has no mechanism for mixing vocabularies defined by independent parties.
JSON does not support document-level metadata, and otherwise reflects its origin as a subset of JavaScript.

In short JSON was not created from scratch to be a generic serialization format, and it shows. JSON was a convenient extraction of part of the JavaScript language which has proven to be very useful as an alternative to XML.
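The limited type set can be seen with a few lines of Python. The sketch below uses only the standard json and datetime modules and a shortened version of the user record above; it illustrates the consumer-side parsing step and is not part of any SURF tooling.

```python
import json
from datetime import date

# A shortened version of the user record from the JSON example.
doc = json.loads('{"joined": "2016-01-23", "credits": 123, "authenticated": true}')

# JSON delivers the date as a plain string; the consumer has to know,
# from outside the document, that this field should be parsed as a date.
joined_raw = doc["joined"]
joined = date.fromisoformat(joined_raw)

print(type(joined_raw).__name__)  # the parser only knows it is a string
print(joined.year)                # the date is usable only after the extra parsing step
```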

Semantic Frameworks

The W3C has since the 1990s been formulating the Resource Description Framework (RDF), a semantic framework for describing resources (objects or even abstract ideas) and the meaning of relationships between them. These relationships or properties are defined in a very formal sense, allowing reasoning via propositional logic. (For example, if Bob is the manager of Jane, and Jill is the manager of Bob, then Jill is the manager of Jane because the “manager” property is transitive.) In RDF, even the properties themselves are resources and can be described with their own properties.

RDF is for the most part a semantic framework independent of any serialization. Two popular serializations are RDF/XML (using special XML tags); and Turtle, a syntax that uses triples of subject, property, and object to represent semantic propositions.

Example RDF/XML file for storing information about a user.

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:user="http://example.com/ns/user#" xmlns:crypto="http://example.com/ns/cryptography#">
  <rdf:Description rdf:about="urn:uuid:bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832">
    <rdf:type rdf:resource="http://example.com/ns/user#User"/>
    <user:authenticated rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</user:authenticated>
    <user:sort>d</user:sort>
    <user:name>Jane Doe</user:name>
    <user:email>jane_doe@example.com</user:email>
    <user:phone>+1-201-555-0123</user:phone>
    <user:usernames rdf:parseType="Collection">
      <rdf:Description>
        <rdf:value>jdoe</rdf:value>
      </rdf:Description>
      <rdf:Description>
        <rdf:value>janed</rdf:value>
      </rdf:Description>
    </user:usernames>
    <user:homePage rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">http://www.example.com/jdoe/</user:homePage>
    <crypto:salt rdf:datatype="http://www.w3.org/2001/XMLSchema#base64Binary">Zm9vYmFy</crypto:salt>
    <user:joined rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2016-01-23</user:joined>
    <user:credits rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">123</user:credits>
  </rdf:Description>
</rdf:RDF>

Example Turtle file for storing information about a user.

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix user: <http://example.com/ns/user#> .
@prefix crypto: <http://example.com/ns/cryptography#> .

<urn:uuid:bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>
  a <http://example.com/ns/user#User> ;
  user:authenticated "true"^^xsd:boolean ;
  user:sort "d" ;
  user:name "Jane Doe" ;
  user:email "jane_doe@example.com" ;
  user:phone "+1-201-555-0123" ;
  user:usernames (
    _:jdoe
    _:janed
  ) ;
  user:homePage "http://www.example.com/jdoe/"^^xsd:anyURI ;
  crypto:salt "Zm9vYmFy"^^xsd:base64Binary ;
  user:joined "2016-01-23"^^xsd:date ;
  user:credits 123 .

_:jdoe rdf:value "jdoe" .
_:janed rdf:value "janed" .

Note that RDF, being a web semantic framework, has a special facility to universally identify resources by URI. Rather than simply saying that the user has an id property, RDF can express the user's identity as the IRI urn:uuid:bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832 at the framework level. SURF, based on the semantic framework URF, has this ability as well.

Good

RDF allows objects
RDF provides semantic representations among objects.
RDF allows a reasonable set of types, leveraging XML Schema Datatypes (even in the Turtle syntax).
RDF allows mixing of vocabularies.
RDF allows references for graphs of objects.

Bad

Neither RDF/XML nor Turtle can be used without understanding the RDF data model specification, which is dense, complicated, and academic.
RDF provides several redundant representations of even simple things such as strings.
The RDF/XML serialization is extremely verbose.
The Turtle serialization is verbose, unwieldy, and opaque.

URF, TURF, and SURF

In 2007, in an effort to create a simpler, more elegant, and more consistent semantic framework, Garret Wilson and GlobalMentor, Inc. started work on the Uniform Resource Framework (URF). The TURF syntax for URF is meant to provide a comprehensive, text-based format for data archival, while also satisfying the academic requirements of a rigorous semantic framework. The goal for SURF, however, is to provide a simple serialization format that is as easy to use as JSON, while still maintaining URF semantics and compatibility with TURF.

SURF does not require any knowledge of semantic frameworks in using the syntax. In other words, SURF is an improved JSON that brings more types and more features while hardly increasing the complexity—and brings the rigor of a semantic framework for free.

Example SURF file for storing information about a user.

|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>|*User:
  authenticated = true
  sort = 'd'
  name = "Jane Doe"
  email = ^jane_doe@example.com
  phone = +12015550123
  usernames = ("jdoe", "janed")
  homePage = <http://www.example.com/jdoe/>
  salt = %Zm9vYmFy
  joined = @2016-01-23
  credits = 123
;

This SURF example uses a different type than JSON for usernames (a set rather than a list) in order to better indicate that the order of the usernames is unimportant.

Good

SURF is almost as simple as JSON.
SURF is small; a SURF document can be more compact than JSON.
SURF can be more readable than JSON, dispensing with unnecessary syntax such as commas between items in multi-line lists.
SURF provides many more types than JSON.
SURF allows mixing of vocabularies, yet without the need to think of bulky URIs.
SURF allows references for graphs of objects.

Bad

No one has adopted SURF yet.

SURF

In addition to the benefits listed above, SURF has several useful characteristics:

All JSON documents are also valid SURF documents.
All SURF documents are also valid TURF documents.

JSON Compatibility

Any JSON document can be parsed as a valid SURF document! But SURF can be more expressive; compare the SURF and JSON documents presented earlier in this lesson. The additional features will be explained further in the following sections.

Although SURF recognizes the JSON null value, it will be ignored. Put another way, SURF interprets null to mean “no value is present”.
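As a rough illustration of “no value is present”, the Python sketch below drops null-valued properties from a parsed JSON object. The function name drop_nulls is hypothetical, and the sketch only covers object properties; how a null inside a list is treated is not described in this lesson, so list members are left untouched.

```python
import json

def drop_nulls(value):
    """Recursively remove null-valued properties from parsed JSON objects."""
    if isinstance(value, dict):
        return {key: drop_nulls(item) for key, item in value.items() if item is not None}
    if isinstance(value, list):
        # Assumption: list members are kept as they are; only properties are dropped.
        return [drop_nulls(item) for item in value]
    return value

parsed = json.loads('{"name": "Jane Doe", "phone": null, "address": {"city": null}}')
cleaned = drop_nulls(parsed)
print(cleaned)
```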

Comments

A comment may be added using the exclamation ! mark. The comment will continue until the end of the line.

Simple SURF document with a comment.

* !The asterisk represents an object.

Handles

SURF places the usual restrictions on identifiers found in many other formats and languages, except that SURF is fully Unicode aware. A SURF name must begin with a letter, followed by any number of letters, digits, and connectors such as the underscore _ character. Examples include Vehicle, color, and foo_bar.

SURF provides a way to prevent clashes by placing names in informal namespaces. A name may begin with one or more prefixes, delimited by the hyphen - character. These prefixes indicate a hierarchy of informal namespaces. Various parties may supply defined vocabularies which may be freely mixed in a SURF document. In such a case the namespaces indicated by the prefixes will prevent the identifiers from overlapping, even if they would otherwise have the same name. A name along with its prefix(es) is referred to as a handle.

For example, the name “salt” could denote the additional cryptographic input to a password hash function (see Salt (cryptography)), or it could identify an ionic chemical compound (see Salt (chemistry)). SURF allows these two names to be distinguished by the addition of a namespace prefix. One might use the “crypto-” prefix, while another might use the “chem-” prefix, producing the handles crypto-salt and chem-salt, respectively.

Handles in SURF and JSON.

	SURF	JSON
Handles	`foo` `fooBar` `foo_bar` `touché` `काम` `chem-salt` `crypto-salt` `User` `chem-Molecule`	N/A

SURF recommends that if you use an informal namespace prefix, you use the second-level domain of an Internet domain you control. For example, if you own the example.com domain, you could use example as a namespace prefix, resulting in e.g. example-foo.
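The handle rules above can be approximated with a regular expression. The following Python sketch is only an approximation under stated assumptions: it uses Python's \w class to stand in for “letters, digits, and connectors”, which may reject names in scripts that rely on combining marks, and the function name split_handle is hypothetical.

```python
import re

# One name: a letter followed by letters, digits, or underscores.
# Assumption: Python's Unicode-aware \w approximates the SURF name characters.
NAME = r"[^\W\d_]\w*"
# A handle: zero or more hyphen-delimited prefixes followed by a name.
HANDLE = re.compile(rf"(?:{NAME}-)*{NAME}")

def split_handle(handle):
    """Return (prefixes, name) for a handle such as crypto-salt."""
    if not HANDLE.fullmatch(handle):
        raise ValueError(f"not a valid handle: {handle!r}")
    *prefixes, name = handle.split("-")
    return prefixes, name

print(split_handle("crypto-salt"))
print(split_handle("foo_bar"))
```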

Resource Descriptions

SURF is all about describing “resources”, which (in URF as in RDF) is anything that can be identified for discussion—that is, anything you can talk about. There are three types of resource representations in SURF:

object: A description of some resource, such as a web site.
literal: A representation of a resource that can be identified by a lexical form, such as a number or a string.
collection: A resource that aggregates other resources.

Objects

SURF objects are analogous to the objects described by JSON. An anonymous object in SURF is denoted by the asterisk * character, while in JSON it is represented by opening and closing brace {} characters.

Objects in SURF and JSON.

	SURF	JSON
Object	`*`	`{}`

A bare object is somewhat boring, so both SURF and JSON allow objects to be described. SURF uses the colon : and semicolon ; characters as delimiters for an object description block. Each element of a SURF object description consists of a property, which is a SURF identifier, and its associated value, which is any SURF resource. The property and its value are separated using the equals = sign, unlike JSON which uses a colon : character.

Object descriptions in SURF and JSON.

	SURF	JSON
Resource Description	`* : color = "blue" foo = "bar" ;`	`{ "color" : "blue", "foo" : "bar" }`

Note that JSON requires keys such as "color" and "foo" to have quotation " characters, because JavaScript “objects” are actually associative arrays of string key-value pairs. The resource descriptions of SURF are true property value assignments and thus the property names need not be quoted. JSON objects therefore more closely resemble maps, although keys are restricted to string values. SURF thus considers {} to indicate a map, and moreover allows a wider variety of key types. See the section on Maps to understand how SURF can leverage the true map class available in ECMAScript 6.

If you place property value assignments on separate lines as in the example above, there is no need to separate them with the comma , character unlike JSON object descriptions.

You may also optionally specify your own custom type for any object by indicating the type name after the asterisk, as long as the type name follows the rules described for Handles, above. Examples include *User and *chem-Molecule.

Custom type indication in SURF.

	SURF	JSON
Custom Type	`*User: … ;`	N/A

Literals

Strings

Strings appear the same in SURF and in JSON.

Strings in SURF and JSON.

	SURF	JSON
Strings	`""` `"abc\ndef"` `"Hindi letter \u092E represents the \"m\" sound."`	`""` `"abc\ndef"` `"Hindi letter \u092E represents the \"m\" sound."`

Numbers

Numbers are represented the same in SURF and in JSON, but SURF comes with two number types that JSON doesn't have. The first is an integer type: SURF allows you to indicate that a numerical value is restricted to whole numbers by leaving off the fractional part and exponent. While JSON considers both 5 and 5.0 to be general “numbers”, SURF considers the former to be specifically an integer.

It is sometimes forgotten that IEEE 754 floating point numbers store only approximations of some numbers, 0.3 being one example. Most libraries use floating point for parsing and serializing JSON, and SURF allows this as well. But because approximated fractions are undesirable when dealing with things such as money, SURF provides a decimal type, the second additional number type, which guarantees that the fractional part will be represented exactly. Decimals are indicated by the dollar sign $ prefix. The prefix does not make this a currency type, although the decimal type is often used to represent money.

Numbers in SURF and JSON.

	SURF	JSON
Numbers	`123.0` `0.123` `-1.2e+3`	`123` `123.0` `0.123` `-1.2e+3`
Integers	`123` `0` `-321`	N/A
Decimals	`$123` `$123.0` `$0.123` `$-1.23`	N/A
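The distinction between the three number types can be sketched in Python, whose int, float, and decimal.Decimal types happen to line up with integers, general numbers, and decimals. The function parse_number is a hypothetical helper written for this illustration; it does not validate its input and is not a SURF parser.

```python
from decimal import Decimal

def parse_number(text):
    """Map a numeric literal onto int, float, or Decimal (illustrative only)."""
    if text.startswith("$"):
        # Decimal: the fractional part is kept exactly as written.
        return Decimal(text[1:])
    if any(ch in text for ch in ".eE"):
        # General number: parsed as IEEE 754 floating point.
        return float(text)
    # No fractional part and no exponent: an integer.
    return int(text)

# Binary floating point only approximates 0.1, 0.2, and 0.3 ...
print(0.1 + 0.2 == 0.3)  # False
# ... while decimals represent them exactly.
print(parse_number("$0.1") + parse_number("$0.2") == parse_number("$0.3"))  # True
```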

Boolean

Both SURF and JSON have the same Boolean values, true and false.

Booleans in SURF and JSON.

	SURF	JSON
Boolean	`false` `true`	`false` `true`

Dates and Times

Although JSON does not support dates, the SURF Instant format is compatible with the output of JavaScript's Date.prototype.toJSON().

SURF comes with extensive date and time handling, addressing one of the most glaring deficiencies of JSON. SURF supports the most common representations specified by ISO 8601 for absolute instants in time. Moreover, SURF supports local date+time specifications, durations, and other temporal concepts present in modern date/time libraries.

Dates and times in SURF.

	SURF	JSON
Instant	`@2017-02-12T23:29:18.829Z`	N/A
ZonedDateTime	`@2017-02-12T15:29:18.829-08:00[America/Los_Angeles]`	N/A
OffsetDateTime	`@2017-02-12T15:29:18.829-08:00`	N/A
OffsetDate	`@2017-02-12-08:00`	N/A
OffsetTime	`@15:29:18.829-08:00`	N/A
LocalDateTime	`@2017-02-12T15:29:18.829`	N/A
LocalDate	`@2017-02-12`	N/A
LocalTime	`@15:29:18.829`	N/A
YearMonth	`@2017-02`	N/A
MonthDay	`@--02-12`	N/A
Year	`@2017`	N/A
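Because these literals follow ISO 8601, the part after the @ sign can often be handed to an existing date library. The Python sketch below (standard library only, Python 3.7 or later) handles just the Instant and OffsetDateTime forms from the table; the function name parse_surf_datetime is hypothetical. It shows that the Instant and OffsetDateTime rows above denote the same moment.

```python
from datetime import datetime, timedelta

def parse_surf_datetime(literal):
    """Parse an Instant or OffsetDateTime literal such as @2017-02-12T23:29:18.829Z."""
    if not literal.startswith("@"):
        raise ValueError("temporal literals start with @")
    text = literal[1:]
    # Older Python versions do not accept a trailing Z, so spell out the UTC offset.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)

instant = parse_surf_datetime("@2017-02-12T23:29:18.829Z")
offset = parse_surf_datetime("@2017-02-12T15:29:18.829-08:00")

print(instant == offset)                          # True: same moment, different offsets
print(offset.utcoffset() == timedelta(hours=-8))  # True
```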

Regular Expressions

SURF supports regular expressions as a first-class type along with strings, dates, etc. They are surrounded by the slash / delimiter character, as they are in JavaScript, which makes it somewhat curious that JSON does not support them.

Regular expressions in SURF.

	SURF	JSON
Regular Expression	`/a?b+c*/`	N/A

IRIs

SURF can represent an Internationalized Resource Identifier (IRI) (RFC 3987) as its own type rather than as a string. A URL used to identify web addresses is the most well-known form of IRI. An IRI in SURF is surrounded by less than < and greater than > signs.

IRIs in SURF.

	SURF	JSON
IRI	`<http://example.com/>`	N/A

UUIDs

SURF can also represent a different type of identifier known as a Universally Unique Identifier (UUID) (RFC 4122). This identifier is a 128-bit value that can be generated on separate systems according to a certain algorithm, yet with an extremely small chance of two UUIDs being identical across all systems. As with IRIs, SURF represents a UUID as a separate type, not as a string.

The SURF UUID representation begins with the ampersand & character (think of the reference operator in C/C++) followed by the canonical UUID representation of groups of hexadecimal octets in the form xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx. That is, 32 hexadecimal digits appear as a total of 36 characters in the form 8-4-4-4-12.

UUIDs in SURF.

	SURF	JSON
UUID	`&5623962b-22b1-4680-ae1c-7174a46144fc`	N/A
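A UUID literal can be read with an ordinary UUID library once the & prefix is removed. The Python sketch below uses the standard uuid module; the function name parse_surf_uuid is hypothetical, and comparing case-insensitively is an assumption, since this lesson does not say whether uppercase hexadecimal digits are allowed.

```python
import uuid

def parse_surf_uuid(literal):
    """Parse a literal such as &5623962b-22b1-4680-ae1c-7174a46144fc."""
    if not literal.startswith("&"):
        raise ValueError("UUID literals start with &")
    body = literal[1:]
    value = uuid.UUID(body)
    # uuid.UUID also accepts braces and unhyphenated input, so insist on
    # the canonical 8-4-4-4-12 form by comparing with its own rendering.
    if str(value) != body.lower():
        raise ValueError("UUID is not in canonical 8-4-4-4-12 form")
    return value

user_id = parse_surf_uuid("&5623962b-22b1-4680-ae1c-7174a46144fc")
print(user_id.version)   # 4
print(len(str(user_id))) # 36
```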

Email Addresses

A common way of uniquely identifying people is by email address. SURF comes with a special type to represent emails as opposed to general strings. SURF emails follow the address specification in RFC 5322, with the addition of the circumflex accent or “caret” ^ character at the beginning. (Think of the paper airplane symbol representing “send email” in many user interfaces.)

Email addresses in SURF.

	SURF	JSON
Email Address	`^jdoe@example.com`	N/A

Telephone Numbers

Another popular identifier is the telephone number, and SURF has a literal representation for those as well. A SURF telephone number follows the syntax for “global numbers” (i.e. those that begin with the plus + sign) described in RFC 3966. The visual separators allowed by RFC 3966 are optional. SURF does not allow telephone number parameters.

Telephone numbers in SURF.

	SURF	JSON
Telephone Number	`+12015550123`	N/A

Binary Values

SURF allows you to represent a series of bytes. The percent % character is used as the beginning delimiter (think of the percent symbol as representing 1s and 0s), with the following binary data encoded as Base64 (RFC 4648) using the “base64url” alphabet. SURF does not allow Base64 padding.

Binary values in SURF.

	SURF	JSON
Binary	`%Zm9vYmFy`	N/A
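The encoding described above can be sketched with Python's standard base64 module. The function names are hypothetical; the only subtlety is that Python's decoder expects padding, so the sketch removes it when encoding and restores it before decoding.

```python
import base64

def encode_surf_binary(data):
    """Encode bytes as a % literal: base64url alphabet, no padding."""
    encoded = base64.urlsafe_b64encode(data).decode("ascii")
    return "%" + encoded.rstrip("=")

def decode_surf_binary(literal):
    """Decode a % literal back into bytes."""
    if not literal.startswith("%"):
        raise ValueError("binary literals start with %")
    body = literal[1:]
    # Restore the padding that Python's decoder requires.
    padded = body + "=" * (-len(body) % 4)
    return base64.urlsafe_b64decode(padded)

print(encode_surf_binary(b"foobar"))    # %Zm9vYmFy
print(decode_surf_binary("%Zm9vYmFy"))  # b'foobar'
```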

Collections

SURF collections are represented by the appropriate high-level abstract data type implementation in whatever language is used to process them. In Java, for instance, a SURF list will be represented by a java.util.List<Resource> instead of a simple array. Likewise in JavaScript a SURF map will not be represented by a JavaScript object but instead the more semantically appropriate ECMAScript 6 Map type.

Lists

Lists in SURF are presented like arrays in JSON using the bracket [ and ] characters. An added benefit over JSON is that you do not need the comma , separator if you place the list items on separate lines.

Lists and nested lists in SURF.

	SURF	JSON
List	`[ "foo" [1, 2, 3] 1.23 ]`	`[ "foo", [1, 2, 3], 1.23 ]`

Sets

Sets are unordered collections that do not allow duplicates. They are formatted the same as lists, except that they use parentheses ( and ) characters.

Sets in SURF.

	SURF	JSON
Set	`("red", "green", "blue")`	N/A

Maps

Maps are associations of values with unique key values. SURF represents map key+value entries inside brace { and } characters, with each key and value separated by a colon : character. Unlike the keys of JSON associative arrays, SURF map keys do not have to be strings. The comma , separator is not needed if the map entries appear on separate lines.

Maps in SURF.

	SURF	JSON
Map	`{ "pi" : 3.14159 1 : "first" 2 : "second" 3 : "third" true : "yes" false : "no" }`	N/A

When representing resources or objects in SURF, be careful not to use the SURF map notation even though it resembles the JSON notation for an “object”. If the key values represent object properties, use the *:…; notation for a resource. Only use the {…} notation for a true collection of key+value pairs.
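The restriction on JSON keys is easy to demonstrate. In the Python sketch below a dict stands in for a map with mixed key types (an assumption made for illustration; it is not how any particular SURF library represents maps). Sending it through JSON silently turns the integer keys into strings.

```python
import json

# A map with a string key and integer keys, as in the SURF map example.
entries = {"pi": 3.14159, 1: "first", 2: "second"}

# JSON object keys must be strings, so the integer keys are converted on output
# and come back as strings on input.
round_tripped = json.loads(json.dumps(entries))

print(1 in entries)          # True
print(1 in round_tripped)    # False
print("1" in round_tripped)  # True
```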

Labels

In most programming languages, an in-memory graph of objects may have several references to the same instance. Many serialization formats, such as JSON, do not have built-in mechanisms for referencing nodes serialized elsewhere in the document. Other formats such as XML require special processors that understand additional specifications such as XML Schema, or rely on proprietary application-level interpretations completely outside the format specifications. SURF allows you to reference any resource within a document by using a label, which consists of an identifier inside vertical line | characters and placed in front of a resource. SURF supports three types of labels, which differ in their resolving power, that is, how broadly the label will identify a unique resource.

Aliases

At the finest granularity, you can create an alias by placing any SURF name inside the label delimiters, such as |foo|. An alias can be given to any resource. After assigning an alias to a resource, the alias can then be used anywhere in the SURF document that the original resource could have been used. The SURF parser will recognize all occurrences of an alias as referring to the same resource instance. An alias only exists as syntax within a SURF document; an alias itself is not present in the object graph a SURF parser returns.

In a typical web authentication scenario, several web pages may indicate that only a user with the “administrator” role may have access. Without a built-in mechanism for referencing resources, the usual workaround is to create an extra identifier field and require the processing application to understand that these references should create links, as in the following example:

Referencing resources in SURF without using aliases requires special application knowledge.

*WebApplication:
  roles = [
    *Role:
      id = "userRole"
      name = "Normal User"
    ;
    *Role:
      id = "adminRole"
      name = "Administrator"
    ;
  ]
  pages = [
    *WebPage:
      title = "Home Page"
      url = <http://example.com/>
      allow = "userRole"
    ;
    *WebPage:
      title = "Maintenance"
      url = <http://example.org/maintain/>
      allow = "adminRole"
    ;
    *WebPage:
      title = "Edit Users"
      url = <http://example.org/edit-users/>
      allow = "adminRole"
    ;
  ]
;

Adding SURF aliases moves the linking semantics into the document itself, relieving the application from the need to manually make connections based upon some proprietary scheme. In the example below, a SURF parser will automatically create references to the Role resources; no extra work is required on the part of the application. This example removes the role id property to indicate it is no longer needed for referencing within the SURF document.

Referencing resources in SURF using aliases.

*WebApplication:
  roles = [
    |userRole|*Role:
      name = "Normal User"
    ;
    |adminRole|*Role:
      name = "Administrator"
    ;
  ]
  pages = [
    *WebPage:
      title = "Home Page"
      url = <http://example.com/>
      allow = |userRole|
    ;
    *WebPage:
      title = "Maintenance"
      url = <http://example.org/maintain/>
      allow = |adminRole|
    ;
    *WebPage:
      title = "Edit Users"
      url = <http://example.org/edit-users/>
      allow = |adminRole|
    ;
  ]
;
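The difference between the two documents above can be sketched in Python. The classes and the roles_by_id table are hypothetical application code, not part of any SURF library: they show the manual linking step an application needs when roles are referenced by an id string, and the shared-instance graph that results, which is what a parser resolving aliases would be expected to hand back directly.

```python
from dataclasses import dataclass

@dataclass
class Role:
    name: str

@dataclass
class WebPage:
    title: str
    allow: object  # a role id string before linking, a Role instance afterwards

# Without aliases: the application keeps its own id table ...
roles_by_id = {
    "userRole": Role("Normal User"),
    "adminRole": Role("Administrator"),
}
pages = [
    WebPage("Home Page", "userRole"),
    WebPage("Maintenance", "adminRole"),
    WebPage("Edit Users", "adminRole"),
]

# ... and must replace each id string with the object it refers to.
for page in pages:
    page.allow = roles_by_id[page.allow]

# Both administrator pages now share one Role instance rather than two copies.
print(pages[1].allow is pages[2].allow)  # True
```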

IDs

SURF provides one more kind of reference identifier called an ID. Like a tag, an ID is assigned to the object and exists outside the SURF document. Unlike a tag, which provides globally unique identification, an ID uniquely identifies a resource only among the resources of the indicated type. For this reason, if you indicate an ID for a resource, you must also indicate a resource type. You can specify an ID by providing a string such as "user" (note the surrounding double quote " characters) as the label identifier.

Using global ID tags.

*WebApplication:
  roles = [
    ! same as |<https://urf.name/Role#user>|
    |"user"|*Role:
      name = "Normal User"
    ;
    ! same as |<https://urf.name/Role#admin>|
    |"admin"|*Role:
      name = "Administrator"
    ;
  ]
  pages = [
    |<http://example.com/>|*WebPage:
      title = "Home Page"
      allow = |userRole|
    ;
    |<http://example.org/maintain/>|*WebPage:
      title = "Maintenance"
      allow = |adminRole|
    ;
    |<http://example.org/edit-users/>|*WebPage:
      title = "Edit Users"
      allow = |adminRole|
    ;
  ]
;

In the underlying URF model on which SURF is based, an ID actually represents a tag IRI, determined from the type handle, resolved to the IRI https://urf.name/, using the ID string as an IRI fragment identifier. If the example above were parsed by an URF processor, the ID |"user"|, because it is used with the Role type, is equivalent to indicating the tag |<https://urf.name/Role#user>|.

The TURF format provides a shorthand representation of ID tags using the type name with the ID as a fragment, such as Role#user to indicate |"user"|*Role.

URF

On its own the SURF format provides a concise, elegant, and flexible syntax for data storage. Yet unlike other formats that only address syntax, SURF is built on a rigorous semantic model named the Uniform Resource Framework (URF). No understanding of semantic frameworks is needed to use SURF; the information presented here merely provides a taste of the consistency URF provides, along with some of the ways URF makes it possible to access and process data.

While providing a better data storage format, SURF sneaks in a rigorous data model that is as capable as RDF yet simpler and more consistent. Although a SURF parser that is not aware of URF can still extract an object graph from the SURF syntax, a SURF parser that is also an URF processor can produce additional knowledge and make semantic inferences.

Tag IRIs

Every SURF handle is in fact represented by a unique tag IRI. Related handles belong to a namespace, which is also identified by an IRI. An URF processor will automatically map SURF handles to URF tag IRIs:

If a SURF handle has no namespace prefix, it is placed in the ad-hoc namespace identified by https://urf.name/. Thus the SURF handle foo is identified in URF by the tag IRI https://urf.name/foo.
If a SURF handle has one or more namespace prefixes, a namespace IRI is formed relative to https://urf.name/ using those prefixes, with prefix delimiters replaced by the slash / character. Thus the SURF handle example-foo is identified in URF by the tag IRI https://urf.name/example/foo.
If a SURF object has an ID, its encoded ID is added as the fragment of the type tag IRI. Thus the object |"bar"|*example-foo is identified in URF by the tag IRI https://urf.name/example/foo#bar, of the type identified by the tag IRI https://urf.name/example/foo.
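The three mapping rules above are mechanical enough to sketch in a few lines of Python. The function name handle_to_tag_iri is hypothetical, and percent-encoding the ID with urllib.parse.quote is an assumption; the exact encoding of IDs is defined by the URF specification, not by this lesson.

```python
from urllib.parse import quote

AD_HOC_NAMESPACE = "https://urf.name/"

def handle_to_tag_iri(handle, object_id=None):
    """Map a handle, and optionally an ID, to a tag IRI."""
    # Prefix delimiters become path separators under the ad-hoc namespace.
    iri = AD_HOC_NAMESPACE + handle.replace("-", "/")
    if object_id is not None:
        # Assumption: the ID is percent-encoded before being used as the fragment.
        iri += "#" + quote(object_id, safe="")
    return iri

print(handle_to_tag_iri("foo"))                 # https://urf.name/foo
print(handle_to_tag_iri("example-foo"))         # https://urf.name/example/foo
print(handle_to_tag_iri("example-foo", "bar"))  # https://urf.name/example/foo#bar
```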

Resources

In URF everything that can be described is a resource. Even simple value types such as strings or integers are also resources, each conceptually identified by a unique tag IRI. For more information on forming tag IRIs for common value resources, see the URF specification.

Statements

SURF documents processed by an URF processor are equivalent to a set of logical propositions or statements. Similar to those in RDF, URF statements consist of a subject, a property, and a value resource. Additionally each URF resource may have an associated type. The example at the beginning of this lesson would be understood by an URF processor as representing the following statements:

Example propositions derived from a SURF file by an URF processor.

Subject	Property	Value
`\|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>\|` (`User`)	`authenticated` (`urf-Property`)	`true` (`urf-Boolean`)
`\|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>\|` (`User`)	`sort` (`urf-Property`)	`'d'` (`urf-Character`)
`\|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>\|` (`User`)	`name` (`urf-Property`)	`"Jane Doe"` (`urf-String`)
`\|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>\|` (`User`)	`email` (`urf-Property`)	`^jane_doe@example.com` (`urf-EmailAddress`)
`\|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>\|` (`User`)	`phone` (`urf-Property`)	`+12015550123` (`urf-TelephoneNumber`)
`\|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>\|` (`User`)	`usernames` (`urf-Property`)	`\|accountList\|` (`urf-Set`)
`\|accountList\|` (`urf-Set`)	`urf-member+` (`urf-Property`)	`"jdoe"` (`urf-String`)
`\|accountList\|` (`urf-Set`)	`urf-member+` (`urf-Property`)	`"janed"` (`urf-String`)
`\|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>\|` (`User`)	`homePage` (`urf-Property`)	`<http://www.example.com/jdoe/>` (`urf-Iri`)
`\|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>\|` (`User`)	`salt` (`urf-Property`)	`%Zm9vYmFy` (`urf-Binary`)
`\|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>\|` (`User`)	`joined` (`urf-Property`)	`@2016-01-23` (`urf-LocalDate`)
`\|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>\|` (`User`)	`credits` (`urf-Property`)	`123` (`urf-Integer`)
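
To make the subject/property/value shape of the table concrete, here is a minimal Python sketch that flattens a description into statements. It is not part of any URF library: Statement, LabeledSet, and to_statements are hypothetical names, values are kept as plain Python objects, and type information is left out for brevity. The only detail taken from the table is that a set appears once as a value and then as the subject of one urf-member+ statement per member.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One proposition: a subject, a property, and a value resource."""
    subject: str
    property: str
    value: object


class LabeledSet(NamedTuple):
    """Hypothetical stand-in for a labeled set such as |accountList|."""
    label: str
    members: frozenset


def to_statements(subject: str, description: dict) -> list:
    """Flatten a property/value description into a list of statements."""
    result = []
    for prop, value in description.items():
        if isinstance(value, LabeledSet):
            # The set itself is the value of the property...
            result.append(Statement(subject, prop, value.label))
            # ...and each member is stated with the set as subject.
            for member in sorted(value.members):
                result.append(Statement(value.label, "urf-member+", member))
        else:
            result.append(Statement(subject, prop, value))
    return result


user = "|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>|"
stmts = to_statements(user, {
    "name": "Jane Doe",
    "usernames": LabeledSet("|accountList|", frozenset({"jdoe", "janed"})),
})
for s in stmts:
    print(s.subject, s.property, s.value, sep="\t")
```

Members are sorted only to make the output order predictable; a set has no inherent order.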

Review

Summary

Example object comparison between SURF and JSON. The JSON version loses semantics compared to the SURF version.

SURF

|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>|*User:
  authenticated = true
  sort = 'd'
  name = "Jane Doe"
  id = &bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832
  email = ^jane_doe@example.com
  phone = +12015550123
  usernames = ("jdoe", "janed")
  homePage = <http://www.example.com/jdoe/>
  salt = %Zm9vYmFy
  joined = @2016-01-23
  credits = 123
;

JSON

{
  "authenticated" : true,
  "sort" : "d",
  "name" : "Jane Doe",
  "email" : "jane_doe@example.com",
  "phone" : "+1-201-555-0123",
  "usernames" : ["jdoe", "janed"],
  "homePage" : "http://www.example.com/jdoe/",
  "salt" : "Zm9vYmFy",
  "joined" : "2016-01-23",
  "credits" : 123
}
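
The loss of semantics in the JSON version can be seen by reading part of it with an ordinary JSON parser. The sketch below uses Python's standard json module on an excerpt of the example; the variable names are illustrative only. The email address, binary value, and date all arrive as plain strings, so the consuming program must already know which strings to reinterpret and how.

```python
import base64
import json
from datetime import date

# Excerpt of the JSON example above.
JSON_TEXT = """
{
  "email" : "jane_doe@example.com",
  "salt" : "Zm9vYmFy",
  "joined" : "2016-01-23",
  "credits" : 123
}
"""

user = json.loads(JSON_TEXT)

# Everything except the number is an undifferentiated string.
string_valued = sorted(key for key, value in user.items() if isinstance(value, str))
print(string_valued)

# The types must be restored by hand, using knowledge that is not in the file.
joined = date.fromisoformat(user["joined"])
salt = base64.b64decode(user["salt"])
print(joined, salt)
```

In the SURF version the ^, % and @ delimiters carry that type information in the document itself.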

Gotchas

JSON does not distinguish between general numbers and integers, but SURF does. Parsing a JSON document as SURF therefore adds semantics, perhaps unintended ones: a number serialized without a decimal point or exponent becomes an integer, even if the JSON author made no such distinction.
While a SURF parser recognizes the JSON object syntax, it means a slightly different thing in SURF. JSON objects are interpreted as maps in SURF. If you want to define an object in SURF, use the SURF object syntax.
SURF parsers recognize JSON null, but it is discarded and does not appear in the resulting SURF object graph.
SURF aliases are syntax only; they do not become part of the parsed model.
Don't confuse an IRI such as <http://www.example.com/> with a resource identified by that IRI using the tag |<http://www.example.com/>|.
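
The first and third gotchas can be illustrated by analogy with Python's standard json module, which also chooses integer or floating-point based purely on how a number was written. This is not a SURF parser: load_like_surf is a hypothetical helper, and dropping null-valued entries from the resulting map is an assumption about what "discarded" looks like in practice.

```python
import json


def load_like_surf(text: str) -> dict:
    """Parse a JSON object, dropping null-valued entries (an assumption)."""
    parsed = json.loads(text)
    return {key: value for key, value in parsed.items() if value is not None}


result = load_like_surf(
    '{"credits": 123, "balance": 123.0, "scaled": 1e2, "nickname": null}'
)

# The lexical form of each number decides its type after parsing.
kinds = {key: type(value).__name__ for key, value in result.items()}
print(kinds)
```

All three numbers are whole, yet only the one written without a decimal point or exponent is treated as an integer, and the null entry does not survive.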

In the Real World

A SURF parser can process any JSON text, with the understanding that JSON objects will be stored in SURF maps, and that numbers without decimal points or exponents will be considered integers.

Think About It

Is the object you are describing uniquely identified by some identifier, such as a UUID or an email address? Rather than simply adding a proprietary property name to your object, it would be more useful semantically to use a tag for universal references recognized across vocabularies.

Self Evaluation

What are the three types of resource representations available in SURF?
Do the three resource representations result in different “types” of resources in the underlying URF data model?
What is the difference between SURF objects and JSON objects?
When do you need a comma to separate properties in SURF object descriptions?
Which SURF types cannot be represented in JSON?
In what cases would you want to use the TURF format instead of SURF?
