Learn SURF

The primary purpose of a computer program is to process data, usually in the form of objects. This processing usually occurs in the computer's memory, but at some point the program needs to serialize the objects, turning them into a series of bytes so that they can be stored in a file or transferred to another system. Serialization is used for everything from configuration files to instant message transport. The Simple URF (SURF) format strives to be a serialization format for the reactive Internet era, striking a balance between simplicity (being almost as simple as JSON and slightly more compact) and expressiveness (having more types than JSON, extensibility through vocabularies, and compatibility with semantic frameworks).

History

Binary Serializations

Early serialization approaches used binary formats that were not easily read by humans, as they used arbitrary numbers as delimiters and represented data in the same binary form in which it was stored in memory.

Good
  • Binary formats are very compact.
Bad
  • Binary formats are hard to debug, as they are not readily human-readable.
  • Many binary formats are proprietary, impeding interoperability.

Markup Languages

Early markup languages such as the Standard Generalized Markup Language (SGML) from the 1980s weren't originally meant to store objects but to add annotations to text data. Markup tag pairs such as “<para>…</para>” (indicating a paragraph) were inserted around text to allow it to be styled for publishing. In the early 1990s Tim Berners-Lee invented the HyperText Markup Language (HTML), an application of SGML that provided simple markup for web pages using tag pairs such as “<p>…</p>”. (The latest version, HTML5, is the current best-practices format for creating web content.)

HTML was not meant for serializing general data such as objects, and SGML was too complex, so in the later 1990s the World Wide Web Consortium (W3C) created the Extensible Markup Language (XML), a simplified version of SGML. XML provided a simple way to define the structure of any data that could be stored in text form, yet did not define the actual tag names used or their semantics (that is, how they were to be interpreted). A typical XML document might appear like the one below:

Example XML file for storing information about a user.
<?xml version="1.0" encoding="UTF-8"?>
<user authenticated="true" sort="d">
  <name>Jane Doe</name>
  <id>bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832</id>
  <email>jane_doe@example.com</email>
  <phone>+1-201-555-0123</phone>
  <usernames>
    <username>jdoe</username>
    <username>janed</username>
  </usernames>
  <homePage>http://www.example.com/jdoe/</homePage>
  <salt>Zm9vYmFy</salt>
  <joined>2016-01-23</joined>
  <credits>123</credits>
</user>
Good
  • XML is human-readable.
  • XML is standardized and ubiquitous.
  • XML works well for annotating text data.
  • XML provides a mechanism for mixing vocabularies.
Bad
  • XML is verbose, sometimes tripling the size of the data being encoded.
  • XML arbitrarily distinguishes two types of data: “attributes” (e.g. the authenticated attribute) and “child elements” (e.g. the <name> tag).
  • XML itself has no way to specify the types of data (e.g. that true is a Boolean value, jdoe is a string, 123 is a number, and 2016-01-23 is a date).
  • XML retains some complicated features that are seldom used yet that all parsers must support (e.g. DTDs, character references, CDATA sections).
  • XML is only a syntax and has no means for referring to other things as in a general graph.

JSON

For almost two decades XML has been the default serialization format for many kinds of data, but it is now being supplanted by newer, simpler formats. Web developers in particular desired a smaller format for transferring data between the browser and a server. In the early 2000s the JavaScript Object Notation (JSON) format was formalized, using a subset of the syntax JavaScript used for declaring objects. As JavaScript was the primary language used in the browser, JSON could be parsed directly by evaluation as if it were a JavaScript program.

The central serialized type in JSON is an “object”, a comma-separated mapping of string keys to values inside brace { and } characters, such as {"foo" : "bar"}. A value can be a string; a number; a comma-separated array inside bracket [ and ] characters; a Boolean value true or false; the value null; or another object. The same information above could be stored in JSON as follows:

Example JSON file for storing information about a user.
{
  "authenticated" : true,
  "sort" : "d",
  "name" : "Jane Doe",
  "id" : "bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832",
  "email" : "jane_doe@example.com",
  "phone" : "+1-201-555-0123",
  "usernames" : ["jdoe", "janed"],
  "homePage" : "http://www.example.com/jdoe/",
  "salt" : "Zm9vYmFy",
  "joined" : "2016-01-23",
  "credits" : 123
}
Good
  • JSON is more compact than XML.
  • JSON has no arbitrary distinction of child information as did XML with its “attributes”.
  • JSON is more than a delimiter syntax; it encodes values.
  • JSON is standardized and becoming ubiquitous.
  • The JSON specification is smaller, easier to understand, and quicker to implement.
Bad
  • JSON is still somewhat verbose, requiring quotation " characters for object keys and comma , characters between key-value pairs.
  • JSON has a very limited set of types; single characters, dates, URLs and the like must be represented by strings and parsed separately later by the consumer.
  • JSON has no inherent way for objects to refer to other objects as in a graph.
  • JSON has no mechanism for mixing vocabularies defined by independent parties.
  • JSON does not support document-level metadata, and otherwise reflects its creation as a subset of JavaScript.

In short JSON was not created from scratch to be a generic serialization format, and it shows. JSON was a convenient extraction of part of the JavaScript language which has proven to be very useful as an alternative to XML.
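The limited type set noted above can be seen with any JSON library. The following is a minimal sketch using Python's standard json module on an abbreviated copy of the user record; the variable names are only illustrative. Booleans and numbers come back typed, but the date and the URL arrive as plain strings that the consumer must parse separately.

```python
import json
from datetime import date

# Abbreviated copy of the JSON user record shown earlier.
document = """
{
  "authenticated" : true,
  "homePage" : "http://www.example.com/jdoe/",
  "joined" : "2016-01-23",
  "credits" : 123
}
"""

user = json.loads(document)

# Record the Python type that the parser produced for each value.
parsed_types = {key: type(value).__name__ for key, value in user.items()}
print(parsed_types)

# The date is only a string; the consumer has to know to convert it.
joined = date.fromisoformat(user["joined"])
print(joined.year)
```

Nothing in the JSON document itself says that "joined" is a date; that knowledge lives only in the consuming application.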

Semantic Frameworks

The W3C has since the 1990s been formulating the Resource Description Framework (RDF), a semantic framework for describing resources (objects or even abstract ideas) and the meaning of relationships between them. These relationships or properties are defined in a very formal sense, allowing reasoning via propositional logic. (For example, if Bob is the manager of Jane, and Jill is the manager of Bob, then Jill is the manager of Jane because the “manager” property is transitive.) In RDF, even the properties themselves are resources and can be described with their own properties.

RDF is for the most part a semantic framework independent of any serialization. Two popular serializations are RDF/XML, which uses special XML tags, and Turtle, a syntax that uses triples of subject, property, and object to represent semantic propositions.

Example RDF/XML file for storing information about a user.
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:user="http://example.com/ns/user#" xmlns:crypto="http://example.com/ns/cryptography#">
  <rdf:Description rdf:about="urn:uuid:bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832">
    <rdf:type rdf:resource="http://example.com/ns/user#User"/>
    <user:authenticated rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</user:authenticated>
    <user:sort>d</user:sort>
    <user:name>Jane Doe</user:name>
    <user:email>jane_doe@example.com</user:email>
    <user:phone>+1-201-555-0123</user:phone>
    <user:usernames rdf:parseType="Collection">
      <rdf:Description>
        <rdf:value>jdoe</rdf:value>
      </rdf:Description>
      <rdf:Description>
        <rdf:value>janed</rdf:value>
      </rdf:Description>
    </user:usernames>
    <user:homePage rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">http://www.example.com/jdoe/</user:homePage>
    <crypto:salt rdf:datatype="http://www.w3.org/2001/XMLSchema#base64Binary">Zm9vYmFy</crypto:salt>
    <user:joined rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2016-01-23</user:joined>
    <user:credits rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">123</user:credits>
  </rdf:Description>
</rdf:RDF>
Example Turtle file for storing information about a user.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix user: <http://example.com/ns/user#> .
@prefix crypto: <http://example.com/ns/cryptography#> .

<urn:uuid:bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>
  a <http://example.com/ns/user#User> ;
  user:authenticated "true"^^xsd:boolean ;
  user:sort "d" ;
  user:name "Jane Doe" ;
  user:email "jane_doe@example.com" ;
  user:phone "+1-201-555-0123" ;
  user:usernames (
    _:jdoe
    _:janed
  ) ;
  user:homePage "http://www.example.com/jdoe/"^^xsd:anyURI ;
  crypto:salt "Zm9vYmFy"^^xsd:base64Binary ;
  user:joined "2016-01-23"^^xsd:date ;
  user:credits 123 .

_:jdoe rdf:value "jdoe" .
_:janed rdf:value "janed" .
Good
  • RDF allows objects
  • RDF provides semantic representations among objects.
  • RDF allows a reasonable set of types, leveraging XML Schema Datatypes (even in the Turtle syntax).
  • RDF allows mixing of vocabularies.
  • RDF allows references for graphs of objects.
Bad
  • Neither RDF/XML nor Turtle can be used without understanding the RDF data model specification, which is dense, complicated, and academic.
  • RDF provides several redundant representations of even simple things such as strings.
  • The RDF/XML serialization is extremely verbose.
  • The Turtle serialization is verbose, unwieldy, and opaque.

URF, TURF, and SURF

In 2007, in an effort to create a simpler, more elegant, and more consistent semantic framework, Garret Wilson and GlobalMentor, Inc. started work on the Uniform Resource Framework (URF). The TURF syntax for URF is meant to provide a comprehensive, text-based format for data archival, while also satisfying the academic requirements of a rigorous semantic framework. The goal for SURF, however, is to provide a simple serialization format that is as easy to use as JSON, while still maintaining URF semantics and compatibility with TURF.

SURF does not require any knowledge of semantic frameworks in using the syntax. In other words, SURF is an improved JSON that brings more types and more features while hardly increasing the complexity—and brings the rigor of a semantic framework for free.

Example SURF file for storing information about a user.
|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>|*User:
  authenticated = true
  sort = 'd'
  name = "Jane Doe"
  email = ^jane_doe@example.com
  phone = +12015550123
  usernames = ("jdoe", "janed")
  homePage = <http://www.example.com/jdoe/>
  salt = %Zm9vYmFy
  joined = @2016-01-23
  credits = 123
;
Good
  • SURF is almost as simple as JSON.
  • SURF is small; a SURF document can be more compact than JSON.
  • SURF can be more readable than JSON, dispensing with unnecessary syntax such as commas between items in multi-line lists.
  • SURF provides many more types than JSON.
  • SURF allows mixing of vocabularies, yet without the need to think of bulky URIs.
  • SURF allows references for graphs of objects.
Bad
  • No one has adopted SURF yet.

SURF

In addition to the benefits listed above, SURF has several useful characteristics:

JSON Compatibility

Any JSON document can be parsed as a valid SURF document! But SURF can be more expressive; compare the SURF and JSON documents presented earlier in this lesson. The additional features will be explained further in the following sections.

Comments

A comment may be added using the exclamation mark ! character. The comment continues until the end of the line.

Simple SURF document with a comment.
* !The asterisk represents an object.

Handles

SURF places the same restrictions on identifiers as many other formats and languages, except that SURF is fully Unicode aware. A SURF name must begin with a letter, followed by any number of letters, digits, and connectors such as the underscore _ character. Examples include Vehicle, color, and foo_bar.

SURF provides a way to prevent clashes by placing names in informal namespaces. A name may begin with one or more prefixes, delimited by the hyphen - character. These prefixes indicate a hierarchy of informal namespaces. Various parties may supply defined vocabularies which may be freely mixed in a SURF document. In such a case the namespaces indicated by the prefixes will prevent the identifiers from overlapping, even if they would otherwise have the same name. A name along with its prefix(es) is referred to as a handle.

For example, the name “salt” could denote the additional cryptographic input to a password hash function (see Salt (cryptography)), or it could identify an ionic chemical compound (see Salt (chemistry)). SURF allows these two names to be distinguished by the addition of a namespace prefix. One might use the “crypto-” prefix, while another might use the “chem-” prefix, producing the handles crypto-salt and chem-salt, respectively. 

Handles in SURF and JSON.
Handles
SURF:
  • foo
  • fooBar
  • foo_bar
  • touché
  • काम
  • chem-salt
  • crypto-salt
  • User
  • chem-Molecule
JSON: N/A
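The naming rules above can be checked mechanically. The following is a minimal sketch, not the SURF reference implementation: the function names are hypothetical, and it assumes that Unicode combining marks are accepted after the first letter, since otherwise the handle काम (which contains a vowel sign) would be rejected by a strict "letters, digits, and connectors" test.

```python
import unicodedata


def is_surf_name(name: str) -> bool:
    """Return True if name starts with a letter and continues with letters,
    digits, connectors, or (by assumption) combining marks."""
    if not name or not unicodedata.category(name[0]).startswith("L"):
        return False
    for character in name[1:]:
        category = unicodedata.category(character)
        # L* = letters, M* = combining marks (assumed), Nd = decimal digits,
        # Pc = connector punctuation such as the underscore.
        if not (category[0] in "LM" or category in ("Nd", "Pc")):
            return False
    return True


def is_surf_handle(handle: str) -> bool:
    """A handle is one or more names joined by hyphens; every name before
    the last one acts as a namespace prefix."""
    return all(is_surf_name(segment) for segment in handle.split("-"))


for candidate in ["foo_bar", "touché", "काम", "chem-salt", "9lives", "chem--salt"]:
    print(candidate, is_surf_handle(candidate))
```

Using Unicode general categories rather than an ASCII-only pattern is what lets the same check accept names in any script.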

Resource Descriptions

SURF is all about describing “resources”. A resource (in URF as in RDF) is anything that can be identified for discussion, that is, anything you can talk about. There are three types of resource representations in SURF:

object
A description of some resource, such as a web site.
literal
A representation of a resource that can be identified by a lexical form, such as a number or a string.
collection
A resource that aggregates other resources.

Objects

SURF objects are analogous to the objects described by JSON. An anonymous object in SURF is denoted by the asterisk * character, while in JSON it is represented by opening and closing brace {} characters.

Objects in SURF and JSON.
SURF JSON
Object * {}

A bare object is somewhat boring, so both SURF and JSON allow objects to be described. SURF uses the colon : and semicolon ; characters as delimiters for an object description block. Each element of a SURF object description consists of a property, which is a SURF identifier, and its associated value, which is any SURF resource. The property and its value are separated using the equals = sign, unlike JSON which uses a colon : character.

Object descriptions in SURF and JSON.
Resource Description
SURF:
* :
  color = "blue"
  foo = "bar"
;
JSON:
{
  "color" : "blue",
  "foo" : "bar"
}

You may also optionally specify your own custom type for any object by indicating the type name after the asterisk, as long as the type name follows the rules described for Handles, above. Examples include *User and *chem-Molecule.

Custom type indication in SURF.
SURF JSON
Custom Type
*User:
  …
;
N/A

Literals

Strings

Strings appear the same in SURF and in JSON.

Strings in SURF and JSON.
Strings
SURF:
  • ""
  • "abc\ndef"
  • "Hindi letter \u092E represents the \"m\" sound."
JSON:
  • ""
  • "abc\ndef"
  • "Hindi letter \u092E represents the \"m\" sound."
Numbers

Numbers are represented the same in SURF and in JSON, but SURF comes with two number types that JSON doesn't have. The first is an integer type: SURF allows you to indicate that a numerical value is restricted to whole numbers by placing the number sign # in front of the value.

The second is a decimal type. It is sometimes forgotten that IEEE 754 binary floating point numbers store only approximations of some numbers, 0.3 being one example. Most libraries use floating point for parsing and serializing JSON, and SURF allows this as well. But because fractional approximations are not desired when dealing with some things such as money, SURF provides a decimal type which guarantees that the fractional part will be represented exactly. Decimals are indicated by the dollar sign $ prefix. The prefix does not make this a currency type, although the decimal type is often used to represent money.

Numbers in SURF and JSON.
Numbers
SURF:
  • 123
  • 0.123
  • -1.2e+3
JSON:
  • 123
  • 0.123
  • -1.2e+3
Integers
SURF:
  • #123
  • #0
  • #-321
JSON: N/A
Decimals
SURF:
  • $123
  • $0.123
  • $-1.23
JSON: N/A
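The difference between floating point and decimal values is easy to demonstrate. The following is a minimal sketch using Python's standard decimal module; the to_surf_decimal helper is a hypothetical name that simply adds the $ prefix described above.

```python
from decimal import Decimal

# Binary floating point stores approximations, so this sum is not exactly 0.3.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)

# Decimal arithmetic keeps the fractional part exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))


def to_surf_decimal(value: Decimal) -> str:
    """Format a Decimal as a SURF decimal literal (hypothetical helper).
    Fixed-point formatting avoids exponent notation in the output."""
    return "$" + format(value, "f")


print(to_surf_decimal(Decimal("-1.23")))
```

Constructing the Decimal from a string matters here: Decimal(0.1) would inherit the floating point approximation.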
Boolean

Both SURF and JSON have the same Boolean values, true and false.

Booleans in SURF and JSON.
Boolean
SURF:
  • false
  • true
JSON:
  • false
  • true
Dates and Times

SURF comes with extensive date and time handling, addressing one of the most glaring deficiencies of JSON. SURF supports the most common representations specified by ISO 8601 for absolute instants in time. Moreover, SURF supports local date+time specifications, durations, and other temporal concepts present in modern date/time libraries.

Dates and times in SURF.
SURF JSON
Instant @2017-02-12T23:29:18.829Z N/A
ZonedDateTime @2017-02-12T15:29:18.829-08:00[America/Los_Angeles] N/A
OffsetDateTime @2017-02-12T15:29:18.829-08:00 N/A
OffsetDate @2017-02-12-08:00 N/A
OffsetTime @15:29:18.829-08:00 N/A
LocalDateTime @2017-02-12T15:29:18.829 N/A
LocalDate @2017-02-12 N/A
LocalTime @15:29:18.829 N/A
YearMonth @2017-02 N/A
MonthDay @--02-12 N/A
Year @2017 N/A
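Several of the forms in the table correspond directly to types in common date/time libraries. The following is a minimal sketch using Python's standard datetime module; to_surf_temporal is a hypothetical helper, and writing a zero offset as Z is an assumption based on the Instant row above.

```python
from datetime import date, datetime, timedelta, timezone


def to_surf_temporal(value) -> str:
    """Format a date or datetime as a SURF temporal literal (hypothetical helper)."""
    if isinstance(value, datetime):  # check datetime first; it subclasses date
        text = value.isoformat(timespec="milliseconds")
        if value.utcoffset() == timedelta(0):
            # Assumption: a zero offset is written in the "Z" form.
            text = text[:-6] + "Z"
        return "@" + text
    return "@" + value.isoformat()


pacific = timezone(timedelta(hours=-8))
print(to_surf_temporal(date(2017, 2, 12)))
print(to_surf_temporal(datetime(2017, 2, 12, 15, 29, 18, 829000)))
print(to_surf_temporal(datetime(2017, 2, 12, 15, 29, 18, 829000, tzinfo=pacific)))
print(to_surf_temporal(datetime(2017, 2, 12, 23, 29, 18, 829000, tzinfo=timezone.utc)))
```

A naive datetime has no offset and so comes out in the local form, while an aware datetime keeps its offset.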
Regular Expressions

SURF supports regular expressions as a first-class type along with strings, dates, etc. They are surrounded by the slash / delimiter character, as they are in JavaScript, which makes it somewhat curious that JSON does not support them.

Regular expressions in SURF.
SURF JSON
Regular Expression /a?b+c*/ N/A
IRIs

SURF can represent an Internationalized Resource Identifier (IRI) (RFC 3987) as its own type rather than as a string. A URL used to identify web addresses is the best-known form of IRI. An IRI in SURF is surrounded by less-than < and greater-than > signs.

IRIs in SURF.
SURF JSON
IRI <http://example.com/> N/A
UUIDs

SURF can also represent a different type of identifier known as a Universally Unique Identifier (UUID) (RFC 4122). This identifier is a 128-bit value that can be generated on separate systems according to a certain algorithm, yet with an extremely small chance of two UUIDs being identical across all systems. As with IRIs, SURF represents a UUID as a separate type, not as a string.

The SURF UUID representation begins with the ampersand & character (think of the reference operator in C/C++) followed by the canonical UUID representation of groups of hexadecimal digits in the form xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx. That is, 32 hexadecimal digits and four hyphens appear as a total of 36 characters in the form 8-4-4-4-12.

UUIDs in SURF.
SURF JSON
UUID &5623962b-22b1-4680-ae1c-7174a46144fc N/A
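The 8-4-4-4-12 form described above is the same one most UUID libraries produce. The following is a minimal sketch using Python's standard uuid module; both helper names are hypothetical, and the strict pattern is there because uuid.UUID on its own also accepts other spellings such as braces or missing hyphens.

```python
import re
import uuid

# & followed by the canonical 8-4-4-4-12 grouping of hexadecimal digits.
SURF_UUID_PATTERN = re.compile(
    r"&[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def to_surf_uuid(value: uuid.UUID) -> str:
    """Format a UUID as a SURF UUID literal (hypothetical helper)."""
    return "&" + str(value)


def parse_surf_uuid(literal: str) -> uuid.UUID:
    """Parse a SURF UUID literal, rejecting anything but the canonical form."""
    if not SURF_UUID_PATTERN.fullmatch(literal):
        raise ValueError(f"not a SURF UUID literal: {literal!r}")
    return uuid.UUID(literal[1:])


identifier = parse_surf_uuid("&5623962b-22b1-4680-ae1c-7174a46144fc")
print(len(str(identifier)))
print(to_surf_uuid(identifier))
```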
Email Addresses

A common way of uniquely identifying people is by email address. SURF comes with a special type to represent emails as opposed to general strings. SURF emails follow the address specification in RFC 5322, with the addition of the circumflex accent or “caret” ^ character at the beginning. (Think of the paper airplane symbol representing “send email” in many user interfaces.)

Email addresses in SURF.
SURF JSON
Email Address ^jdoe@example.com N/A
Telephone Numbers

Another popular identifier is the telephone number, and SURF has a literal representation for those as well. A SURF telephone number follows the syntax for “global numbers” (i.e. those that begin with the plus + sign) described in RFC 3966. The visual separators allowed by RFC 3966 are optional. SURF does not allow telephone number parameters.

Telephone numbers in SURF.
SURF JSON
Telephone Number +12015550123 N/A
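Because the visual separators are optional, the same number can be written in more than one way. The following is a minimal sketch of normalizing a global number to its compact form; to_surf_telephone is a hypothetical helper, and the separator set (hyphen, period, and parentheses) is the one RFC 3966 defines.

```python
import re


def to_surf_telephone(number: str) -> str:
    """Strip RFC 3966 visual separators from a global telephone number
    (hypothetical helper)."""
    compact = re.sub(r"[-.()]", "", number)
    # A global number is a plus sign followed only by digits.
    if not re.fullmatch(r"\+[0-9]+", compact):
        raise ValueError(f"not a global telephone number: {number!r}")
    return compact


print(to_surf_telephone("+1-201-555-0123"))
```

The same function rejects local numbers and numbers carrying parameters, since neither reduces to a plus sign followed by digits.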
Binary Values

SURF allows you to represent a series of bytes. The percent % character is used as the beginning delimiter (think of the percent symbol as representing 1s and 0s), with the following binary data encoded as Base64 (RFC 4648) using the “base64url” alphabet. SURF does not allow Base64 padding.

Binary values in SURF.
SURF JSON
Binary %Zm9vYmFy N/A
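The encoding described above is available in most standard libraries. The following is a minimal sketch using Python's base64 module; both helper names are hypothetical. Since the library always emits padding and expects it when decoding, the sketch strips the = characters on output and restores them on input.

```python
import base64


def to_surf_binary(data: bytes) -> str:
    """Encode bytes as a SURF binary literal: base64url without padding."""
    return "%" + base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")


def parse_surf_binary(literal: str) -> bytes:
    """Decode a SURF binary literal back into bytes."""
    if not literal.startswith("%"):
        raise ValueError(f"not a SURF binary literal: {literal!r}")
    encoded = literal[1:]
    # Restore the padding that SURF omits so the decoder accepts the input.
    padding = "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(encoded + padding)


print(to_surf_binary(b"foobar"))
print(parse_surf_binary("%Zm9vYg"))
```

The value %Zm9vYmFy used throughout this lesson is the encoding of the six ASCII bytes of "foobar".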

Collections

Lists

Lists in SURF are presented like arrays in JSON using the bracket [ and ] characters. An added benefit over JSON is that you do not need the comma , separator if you place the list items on separate lines.

Lists and nested lists in SURF.
List
SURF:
[
  "foo"
  [1, 2, 3]
  1.23
]
JSON:
[
  "foo",
  [1, 2, 3],
  1.23
]
Sets

Sets are unordered collections that do not allow duplicates. They are formatted the same as lists, except that they use parentheses ( and ) characters.

Sets in SURF.
SURF JSON
Set
("red", "green", "blue")
N/A
Maps

Maps are associations of values with unique keys. SURF represents map key+value entries inside brace { and } characters, with each key and value separated by a colon : character. Unlike the keys of JSON objects, SURF map keys do not have to be strings. The comma , separator is not needed if the map entries appear on separate lines.

Maps in SURF.
SURF JSON
Map
{
  "pi" : 3.14159
  1 : "first"
  2 : "second"
  3 : "third"
  true : "yes"
  false : "no"
}
N/A
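The restriction to string keys is easy to overlook until data makes a round trip. The following is a minimal sketch using Python's standard json module: a map with integer keys is serialized to JSON and parsed back, and the keys return as strings.

```python
import json

# A map with integer keys, like part of the SURF example above.
ordinals = {1: "first", 2: "second", 3: "third"}

# JSON object keys must be strings, so the serializer converts them.
serialized = json.dumps(ordinals)
round_tripped = json.loads(serialized)

print(serialized)
print(list(round_tripped))
```

The original key type is lost, so the consumer has to know to convert the keys back to numbers.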

Labels

In most programming languages, an in-memory graph of objects may have several references to the same instance. Many serialization formats, such as JSON, do not have built-in mechanisms for referencing nodes serialized elsewhere in the document. Other formats such as XML require special processors that understand additional specifications such as XML Schema, or rely on proprietary application-level interpretations completely outside the format specifications. SURF allows you to reference any resource within a document by using a label, which consists of an identifier inside vertical line | characters and placed in front of a resource. SURF supports three types of labels, which differ in their resolving power, that is, how broadly the label will identify a unique resource.

Aliases

At the finest granularity, you can create an alias by placing any SURF name inside the label delimiters, such as |foo|. An alias can be given to any resource. After assigning an alias to a resource, the alias can then be used anywhere in the SURF document that the original resource could have been used. The SURF parser will recognize all occurrences of an alias as referring to the same resource instance. An alias only exists as syntax within a SURF document; an alias itself is not present in the object graph a SURF parser returns.

In a typical web authentication scenario, several web pages may indicate that only a user with the “administrator” role may have access. Without a built-in mechanism for referencing resources, the usual workaround is to create an extra identifier field and require the processing application to understand that these references should create links, as in the following example:

Referencing resources in SURF without using aliases requires special application knowledge.
*WebApplication:
  roles = [
    *Role:
      id = "userRole"
      name = "Normal User"
    ;
    *Role:
      id = "adminRole"
      name = "Administrator"
    ;
  ]
  pages = [
    *WebPage:
      title = "Home Page"
      url = <http://example.com/>
      allow = "userRole"
    ;
    *WebPage:
      title = "Maintenance"
      url = <http://example.org/maintain/>
      allow = "adminRole"
    ;
    *WebPage:
      title = "Edit Users"
      url = <http://example.org/edit-users/>
      allow = "adminRole"
    ;
  ]
;
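The extra work this workaround pushes onto the application can be sketched as follows. This is only an illustration: plain Python dictionaries stand in for whatever object graph a parser would return, and link_roles is a hypothetical application-level function that knows, by private convention, that each allow value names a role id.

```python
# Plain dictionaries standing in for the parsed document above.
application = {
    "roles": [
        {"id": "userRole", "name": "Normal User"},
        {"id": "adminRole", "name": "Administrator"},
    ],
    "pages": [
        {"title": "Home Page", "allow": "userRole"},
        {"title": "Maintenance", "allow": "adminRole"},
        {"title": "Edit Users", "allow": "adminRole"},
    ],
}


def link_roles(app: dict) -> dict:
    """Replace each page's role id string with the role object itself."""
    roles_by_id = {role["id"]: role for role in app["roles"]}
    for page in app["pages"]:
        page["allow"] = roles_by_id[page["allow"]]
    return app


link_roles(application)
maintenance, edit_users = application["pages"][1:]

# Both pages now share one Role instance instead of holding equal strings.
print(maintenance["allow"] is edit_users["allow"])
```

Every application consuming the document has to repeat this step and agree on which properties are references.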

Adding SURF aliases moves the linking semantics into the document itself, relieving the application from the need to manually make connections based upon some proprietary scheme. In the example below, a SURF parser will automatically create references to the Role resources; no extra work is required on the part of the application. This example removes the role id property to indicate it is no longer needed for referencing within the SURF document.

Referencing resources in SURF using aliases.
*WebApplication:
  roles = [
    |userRole|*Role:
      name = "Normal User"
    ;
    |adminRole|*Role:
      name = "Administrator"
    ;
  ]
  pages = [
    *WebPage:
      title = "Home Page"
      url = <http://example.com/>
      allow = |userRole|
    ;
    *WebPage:
      title = "Maintenance"
      url = <http://example.org/maintain/>
      allow = |adminRole|
    ;
    *WebPage:
      title = "Edit Users"
      url = <http://example.org/edit-users/>
      allow = |adminRole|
    ;
  ]
;
Tags

If you are describing a resource such as a web site (or in the semantic world, even people or abstract ideas) that is already identified by an IRI, you can assign a tag to the resource. A tag is the identifying IRI of the object and is placed inside the label along with IRI delimiters. For example the tag |<http://example.com/>| is used to identify the web site <http://example.com/>. A tag becomes the resource's official global identifier, and will continue to exist outside the SURF document.

You could improve the SURF document above by identifying the two page resources by their IRIs. After parsing the document an application can ask each of the page resources for its identifier. Moreover you could reference resources within the SURF document by using their tags, just as you can do with aliases. This example removes the page url property to indicate it is no longer needed, as the page objects themselves are identified by IRIs.

Referencing resources in SURF using tags.
*WebApplication:
  roles = [
    |userRole|*Role:
      name = "Normal User"
    ;
    |adminRole|*Role:
      name = "Administrator"
    ;
  ]
  pages = [
    |<http://example.com/>|*WebPage:
      title = "Home Page"
      allow = |userRole|
    ;
    |<http://example.org/maintain/>|*WebPage:
      title = "Maintenance"
      allow = |adminRole|
    ;
    |<http://example.org/edit-users/>|*WebPage:
      title = "Edit Users"
      allow = |adminRole|
    ;
  ]
;
IDs

SURF provides one more kind of reference identifier called an ID. Like a tag, an ID is assigned to the object and exists outside the SURF document. Unlike a tag, which provides globally unique identification, an ID uniquely identifies a resource only among the resources of the indicated type. For this reason if you indicate an ID for a resource, you must also indicate a resource type. You can specify an ID by providing a string such as "user" (note the surrounding double quote " characters) as the label identifier.

Using global ID tags.
*WebApplication:
  roles = [
    ! same as |<https://urf.name/Role#user>|
    |"user"|*Role:
      name = "Normal User"
    ;
    ! same as |<https://urf.name/Role#admin>|
    |"admin"|*Role:
      name = "Administrator"
    ;
  ]
  pages = [
    |<http://example.com/>|*WebPage:
      title = "Home Page"
      allow = |"user"|*Role
    ;
    |<http://example.org/maintain/>|*WebPage:
      title = "Maintenance"
      allow = |"admin"|*Role
    ;
    |<http://example.org/edit-users/>|*WebPage:
      title = "Edit Users"
      allow = |"admin"|*Role
    ;
  ]
;
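The comments in the example above suggest how an ID and its type combine into a tag. The following is a minimal sketch of that mapping, inferred only from those comments: the function name is hypothetical, the base IRI is copied from the example, and the sketch covers only a type name without a namespace prefix. The URF specification remains the authority on the actual rules.

```python
# Base IRI as it appears in the comments of the example above.
URF_NAMESPACE = "https://urf.name/"


def id_tag_iri(type_name: str, resource_id: str) -> str:
    """Combine a type name and an ID into a tag IRI (hypothetical helper,
    inferred from the example comments, unprefixed type names only)."""
    return f"{URF_NAMESPACE}{type_name}#{resource_id}"


print(id_tag_iri("Role", "user"))
print(id_tag_iri("Role", "admin"))
```

Because the type is part of the resulting IRI, the same ID string can be reused for resources of different types without conflict.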

URF

On its own the SURF format provides a concise, elegant, and flexible syntax for data storage. Yet unlike other formats that only address syntax, SURF is built on a rigorous semantic model named the Uniform Resource Framework (URF). No understanding of semantic frameworks is needed to use SURF; the information presented here merely provides a taste of the consistency URF provides, along with some of the ways URF makes it possible to access and process data.

While providing a better data storage format, SURF sneaks in a rigorous data model that is as capable as RDF yet simpler and more consistent. Although a SURF parser that is not aware of URF can still extract an object graph from the SURF syntax, a SURF parser that is also an URF processor can produce additional knowledge and make semantic inferences.

Tag IRIs

Every SURF handle is in fact represented by a unique tag IRI. Related handles belong to a namespace, which is also identified by an IRI. An URF processor will automatically map SURF handles to URF tag IRIs.

Resources

In URF everything that can be described is a resource. Even simple value types such as strings or integers are also resources, each conceptually identified by a unique tag IRI. For more information on forming tag IRIs for common value resources, see the URF specification.

Statements

SURF documents processed by an URF processor are equivalent to a set of logical propositions or statements. Similar to those in RDF, URF statements consist of a subject, a property, and a value resource. Additionally each URF resource may have an associated type. The example at the beginning of this lesson would be understood by an URF processor as representing the following statements:

Example propositions derived from a SURF file by an URF processor.
Subject Property Value
|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>| (User) authenticated (urf-Property) true (urf-Boolean)
|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>| (User) sort (urf-Property) 'd' (urf-Character)
|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>| (User) name (urf-Property) "Jane Doe" (urf-String)
|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>| (User) email (urf-Property) ^jane_doe@example.com (urf-EmailAddress)
|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>| (User) phone (urf-Property) +12015550123 (urf-TelephoneNumber)
|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>| (User) usernames (urf-Property) |accountList| (urf-Set)
|accountList| (urf-Set) urf-member (urf-Property) "jdoe" (urf-String)
|accountList| (urf-Set) urf-member (urf-Property) "janed" (urf-String)
|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>| (User) homePage (urf-Property) <http://www.example.com/jdoe/> (urf-Iri)
|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>| (User) salt (urf-Property) %Zm9vYmFy (urf-Binary)
|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>| (User) joined (urf-Property) @2016-01-23 (urf-LocalDate)
|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>| (User) credits (urf-Property) 123 (urf-Integer)
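The way a description unfolds into statements can be sketched in a few lines. This is only an illustration and not an URF processor: a Python dictionary stands in for the parsed object, to_statements is a hypothetical function, type annotations are left out, and the generated set label is an arbitrary internal name like |accountList| in the table above.

```python
def to_statements(subject: str, description: dict) -> list:
    """Flatten one object description into (subject, property, value) tuples."""
    statements = []
    for property_name, value in description.items():
        if isinstance(value, (set, frozenset)):
            # A set becomes its own subject with one urf-member statement
            # per element; the label is an arbitrary internal name.
            label = f"|{property_name}Set|"
            statements.append((subject, property_name, label))
            for member in sorted(value):
                statements.append((label, "urf-member", member))
        else:
            statements.append((subject, property_name, value))
    return statements


user = {
    "authenticated": True,
    "name": "Jane Doe",
    "usernames": {"jdoe", "janed"},
    "credits": 123,
}

statements = to_statements("|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>|", user)
for statement in statements:
    print(statement)
```

Each property contributes one statement, and the set contributes one additional statement per member.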

Review

Summary

Example object comparison between SURF and JSON. The JSON version loses semantics compared to the SURF version.
SURF:
|<&bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832>|*User:
  authenticated = true
  sort = 'd'
  name = "Jane Doe"
  id = &bb8e7dbe-f0b4-4d94-a1cf-46ed0e920832
  email = ^jane_doe@example.com
  phone = +12015550123
  usernames = ("jdoe", "janed")
  homePage = <http://www.example.com/jdoe/>
  salt = %Zm9vYmFy
  joined = @2016-01-23
  credits = 123
;
JSON:
{
  "authenticated" : true,
  "sort" : "d",
  "name" : "Jane Doe",
  "email" : "jane_doe@example.com",
  "phone" : "+1-201-555-0123",
  "usernames" : ["jdoe", "janed"],
  "homePage" : "http://www.example.com/jdoe/",
  "salt" : "Zm9vYmFy",
  "joined" : "2016-01-23",
  "credits" : 123
}

Gotchas

In the Real World

Think About It

Self Evaluation

Task

TODO

See Also

References

Resources