Taglist

The taglist is a list with all the tags and relationships within a Silk site. It is important in understanding the structure and contents of a site.

About tags

We changed domains: please change any mentions of silkapp.com in the code examples to silk.co.

Tags are a very important concept within Silk. Tags are key-value pairs that tell us something about a fact/concept within Silk. For example, look at the Monaco page below from world.silk.co.



It shows a page with the category 'country' and characteristics of this country (government type, language, etc.). All these concepts (category, government type, language, Head of State, etc.) are tags, and all of them are found in the taglist.

You won't see the term 'category' in the taglist; instead, the tag is 'country' and its value is the title of the page (here: Monaco).

Get the taglist

The taglist is XML that you can GET from this url

The taglist is automatically regenerated every time the tags of a site change. You can only GET the taglist, not modify or delete it; it serves as a data summary of the Silk site.
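As a rough sketch of retrieving the taglist, the snippet below issues a plain HTTP GET and parses the response as XML. The function names and the taglist_url parameter are placeholders introduced here for illustration; pass the taglist URL of your own site. It assumes the response is a well-formed XML document.

```python
import urllib.request
import xml.etree.ElementTree as ET


def parse_taglist(xml_data):
    """Parse taglist XML (str or bytes) and return the root element."""
    return ET.fromstring(xml_data)


def fetch_taglist(taglist_url):
    """GET the taglist (it is read-only) and return the parsed root element.

    taglist_url is a placeholder: pass the taglist URL of your own site.
    """
    with urllib.request.urlopen(taglist_url) as response:
        return parse_taglist(response.read())
```

Keeping the parsing separate from the download makes it possible to work with a saved copy of the taglist as well.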

Structure of the taglist

The taglist has the following XML structure (this example shows the taglist of world.silk.co)

Tags

Every tag can be found in the taglist, both categories (such as City or Country) and tags within pages (such as Birthdate and Capital). Each tag has a URI.

Tag

If we look at the <tag> element, we see this structure (we use the country tag as an example):

Let's go through these one by one:

Name

<name> ... </name>

This is the name of the tag. It can differ from the tag URI: for example, the tag 'Borders With' in the taglist for world.silk.co has the URI http://world.silk.co/tag/Borders%20With.
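The difference between name and URI in this example comes down to percent-encoding of the space. The sketch below reproduces it in Python. The tag_uri helper is hypothetical, and the http://<site>/tag/<name> pattern is inferred from this single example rather than from a documented rule.

```python
from urllib.parse import quote


def tag_uri(site, tag_name):
    """Build a tag URI by percent-encoding the tag name.

    The http://<site>/tag/<name> pattern is an assumption inferred from
    the 'Borders With' example, not a documented rule.
    """
    return "http://" + site + "/tag/" + quote(tag_name)


print(tag_uri("world.silk.co", "Borders With"))
# prints http://world.silk.co/tag/Borders%20With
```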

Total

<total> ... </total>

This is the total number of occurrences of the tag within the site, counted across all pages. When a page has this tag twice, it is counted as 2.

Doclevel

<doclevel> ... </doclevel>

This is the total number of times the tag occurs as a category. From this you can conclude that in the above example there are 192 pages with category 'Country'. You can also conclude that the tag 'Country' is used as a tag within pages 3 times (the <total> of 195 minus the <doclevel> of 192).
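The subtraction can be sketched in code as follows. The <tag> fragment is a hypothetical reconstruction: the element names and the numbers 195 and 192 come from the text, but the exact layout of the real taglist is assumed, and usage_split is a name made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical <tag> fragment: only the element names and the two numbers
# come from the documentation; the exact layout is assumed.
SAMPLE = """<tag>
  <name>Country</name>
  <total>195</total>
  <doclevel>192</doclevel>
</tag>"""


def usage_split(tag_element):
    """Return (uses as category, uses within pages) for one <tag> element."""
    total = int(tag_element.findtext("total"))
    as_category = int(tag_element.findtext("doclevel"))
    return as_category, total - as_category


print(usage_split(ET.fromstring(SAMPLE)))
# prints (192, 3)
```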

Ctxlevel

<ctxlevel> ... </ctxlevel>

Please ignore this tag.

Arity

<arity> ... </arity>

Indicates how many times, on average, the tag is used within a page. For category tags this will mostly be 1.0. For tags such as 'borders with' (which indicates the countries that surround a country) it will be higher than 1.0, because a country (the type of page on which this tag occurs) usually shares its border with more than one other country.

Context

<context> ... </context>

This indicates in which category this tag is used. For example the tag 'birthdate' is a tag that is only present in pages with the category 'person':

Revcontext

<revcontext> ... </revcontext>

This indicates which tags are present within documents with this category. For example, the tag 'person' is used as a category (see <doclevel>), so its <revcontext> lists all the tags that are found within documents with category 'person'.

In this example, we see that for Person pages, 335 contain a birthdate and 359 contain a Country of residence.

Outtypes

<outtypes> ... </outtypes>

The <outtypes> element shows the category of the page that this tag links to. Take, for example, the Netherlands page from world.silk.co. It says that Amsterdam is the capital (Amsterdam is tagged as Capital), and Amsterdam is linked to its own page. The Amsterdam page has the category 'City'. In the taglist for the tag 'Capital' we see:

So we can conclude that values of the tag 'Capital' link to pages of category 'City'.

Types

<types> ... </types>

Silk analyzes the tags to see whether a tag is a currency, date, year, numeric, geo or textual value. This is expressed in the <types> element. Each <count> element states the number of occurrences and the certainty (from 0 to 1), separated by a space.

In the example above (for the tag GDP) we see that there are 567 GDP tags, all of which are a currency with 100% certainty. 91 are a geo location with 57% certainty and 567 are a textual value with 100% certainty (all tags are 100% text by default).
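A minimal sketch of reading one such pair is shown below. The parse_count helper is hypothetical, and it assumes the certainty is written as a decimal such as 0.57; how the type name itself is attached to each <count> element is not covered here.

```python
def parse_count(text):
    """Split the text of a <count> element into (occurrences, certainty)."""
    occurrences, certainty = text.split()
    return int(occurrences), float(certainty)


print(parse_count("91 0.57"))
# prints (91, 0.57)
```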

Metadata

<metadata> ... </metadata>

The metadata tag holds information about the tag, such as namestyle and board image (image used in the component 'tag board'). 

Fuzzy

<fuzzy>...</fuzzy>

This is the name of the tag with some textual variations, which makes it easier to search for a tag name. For example, tag names are transformed to lowercase, and accented letters in tag names (e.g. ü è î ø) are transformed to 'normal' letters (e.g. u e i o).
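The two transformations mentioned above can be approximated as in the sketch below. This is not Silk's actual algorithm, only an illustration of lowercasing and accent stripping; fuzzy_name and the EXTRA table are names made up for this example.

```python
import unicodedata

# Letters such as ø have no Unicode decomposition, so map them explicitly.
EXTRA = str.maketrans({"ø": "o", "Ø": "O"})


def fuzzy_name(tag_name):
    """Lowercase a tag name and replace accented letters with plain ones."""
    decomposed = unicodedata.normalize("NFKD", tag_name.translate(EXTRA))
    # Drop the combining accent marks that the decomposition split off.
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.lower()


print(fuzzy_name("Üèîø"))
# prints ueio
```

Applying the same function to a search term before comparing it with the stored value lets 'Borders With' match 'borders with'.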

Enum

This element gives a list of values that occur multiple times. If all values of a tag are unique, the <enum> element will be empty, but a value that occurs more than a few times is listed here, along with the count for that value. This can be used, for example, to make suggestions when a tag is filled in.

The example above shows the <enum> values for the tag Government Type. It shows that this tag has the value 'republic' 65 times and the value 'parliamentary democracy' 16 times.
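The suggestion use case could look roughly like the sketch below. It assumes the <enum> values and counts have already been read into a dictionary; the suggest function and its limit parameter are hypothetical, and only the two counts come from the Government Type example.

```python
def suggest(prefix, enum_counts, limit=5):
    """Return known values starting with prefix, most frequent first."""
    prefix = prefix.lower()
    matches = [v for v in enum_counts if v.lower().startswith(prefix)]
    return sorted(matches, key=lambda v: -enum_counts[v])[:limit]


# Counts taken from the Government Type example.
government_types = {"republic": 65, "parliamentary democracy": 16}

print(suggest("re", government_types))
# prints ['republic']
```

Sorting by count puts the most common value first, which is usually the most likely completion.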