- Independence from natural language
- Conceptual enrichment by conceptually merging natural languages
- Explicit, self-describing formal data model
- Reusable consented and structured domain knowledge
- Machine-interpretable semantics
The advantages of implementing SWT with formal semantics are multifold.
Independence from natural language
By translating data models and data into a formal lingua franca, i.e. the unifying standard language of logic mentioned in the introduction, the meaning of concepts becomes independent of natural language. Of course this also implies that modeling the concepts requires a minimal international consensus between domain specialists speaking different languages, or resources that describe the concepts well enough to safely abstract the semantics.
Unifying formal semantics leads directly to semantic interoperability, which strongly enhances data availability: very disparate data sources become usable together and comparable, opening up unprecedented research possibilities.
Conceptual enrichment by conceptually merging natural languages
Since some concepts are specific to a natural language and exist in only one language or in languages of the same group, a formal vocabulary means a conceptual enrichment compared to a domain specialist’s natural language. In the end, domain specialists will mutually benefit from using formal vocabularies based on different natural languages.
Similarly, there will be semantic interoperability and conceptual enrichment between different domains of discourse in the same natural language, e.g. between different scientific domains, between a scientific and an industrial domain, and even between different groups within the same domain.
Explicit, self-describing formal data model
For a machine to deal with formal semantics, all intended meaning has to be stated explicitly (no hidden assumptions). At first this can be confronting for researchers, who realize that their own data are not transparent, contain gaps, and are well understood only by the creator(s) of the data model.
This need for explicitness creates a quality control feedback loop for the source data model and data.
Easier data management
More transparent data models also lead to easier data management that depends less on specific persons. In this context it is worth mentioning that OWL-ontologies are self-describing formal vocabularies: for humans through the labels and comments (definitions) of ontological elements (classes and properties) and through the documentation in queries and rules, and for both humans and machines through all the other formal statements about these ontological elements.
Reusable consented and structured domain knowledge
From a purely content-oriented point of view, domain knowledge becomes highly exchangeable and reusable when it is expressed in formal domain ontologies and N3-rules.
The built-in logic of the W3C SW standards also makes the semantics machine-interpretable, enabling automated semantic interoperability and machine reasoning.
Automated semantic interoperability
Automated interoperability at the semantic level is the highest form of interoperability, crossing the borders of natural languages, knowledge domains and disparate databases. It has to be pointed out, however, that adding new knowledge to ontologies and rules still mostly requires human intervention.
The biggest advantage, building further on the previous one, is machine reasoning: making inferences of all kinds by deriving new data from existing data.
Semantic conversion and enrichment with inferred knowledge
For example, data expressed with a simpler model can be deductively converted to data expressed with more complex domain ontologies (two-step formalization), unfolding the implicit assumptions into a multidimensional semantic space. To give a first impression: a plain literal value that implicitly indicates a year (e.g. “1789”) can be converted to a one-year period whose start and end are explicitly expressed as datatyped literal dates in the Gregorian calendar (e.g. “1789-01-01”^^xsd:date and “1789-12-31”^^xsd:date), which can then be used in calculations as time indicators. Likewise, measurements of a certain quantity can be formally expressed in a certain unit (e.g. length in foot) and converted to an SI unit (e.g. meter). The SI unit together with the foot-to-meter scale factor (0.3048) is then the formal built-in domain knowledge in this case, stored in an ontology and used in an N3-rule, which can automatically enrich the processed data. Such formal knowledge can also come from linked (external) RDF-data sources.
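The two conversions described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the N3-rule implementation itself. The dictionary `UNIT_TO_SI` is a hypothetical stand-in for the scale factors that would be stored in an ontology, and the function names `year_to_period` and `to_si` are assumptions made for this example.

```python
from datetime import date
from typing import Tuple

# Hypothetical stand-in for ontology knowledge: unit -> (SI unit, scale factor).
UNIT_TO_SI = {"foot": ("meter", 0.3048)}


def year_to_period(year_literal: str) -> Tuple[str, str]:
    """Expand a plain year literal into explicit start and end xsd:date literals."""
    year = int(year_literal)
    start = date(year, 1, 1)
    end = date(year, 12, 31)
    return (
        f'"{start.isoformat()}"^^xsd:date',
        f'"{end.isoformat()}"^^xsd:date',
    )


def to_si(value: float, unit: str) -> Tuple[float, str]:
    """Convert a measurement to its SI unit using the stored scale factor."""
    si_unit, factor = UNIT_TO_SI[unit]
    return value * factor, si_unit


if __name__ == "__main__":
    print(year_to_period("1789"))
    print(to_si(10.0, "foot"))
```

In a rule-based setting the same knowledge (the calendar boundaries and the scale factor) would be stated declaratively, so that a reasoner rather than hand-written code derives the enriched data.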
Data analysis, research, data mining and decision support
Rules can also be applied to analyse data, e.g. text. All kinds of research questions can be answered, providing new domain knowledge. Further advanced applications are data mining and decision support, requiring implementation of probability theory (e.g. Bayesian inference).