International Journal on New Computer Architectures and Their Applications (IJNCAA) 2(1): 167-182
The Society of Digital Information and Wireless Communications, 2012 (ISSN: 2220-9085)
knowledge of data acquisition and ontology modeling but experts in financial trading. The main guideline in constructing the ontology was to develop it to a level that enables direct employment in an application. This differs from the majority of existing approaches, where ontologies are developed mainly to formally define the conceptualization of the problem domain.
The remainder of this paper is structured as follows. Section 2 presents related work, with emphasis on ontology development methodologies and applications of financial ontologies. Section 3 introduces our approach for facilitating the construction of Semantic Web applications. Section 4 presents the details of a case study from the domain of financial instruments and trading strategies: first the FITS ontology, then the semantic integration of data sources, and finally the technological details of the prototype. Section 5 gives conclusions and future work.
2 RELATED WORK
An ontology is a vocabulary that is used for describing and representing a domain, together with the meaning of that vocabulary. The definition of ontology can be approached from several aspects: taxonomy [1-3], as knowledge with minimal hierarchical structure; vocabulary [4, 5], with words and synonyms; topic maps [6, 7], with support for traversing large amounts of data; conceptual model [8, 9], which emphasizes more complex knowledge; and logic theory [1, 10, 11], with very complex and consistent knowledge.
Ontologies are used for various purposes such as natural language processing [12], knowledge management [13], information extraction [14], intelligent search engines [15], digital libraries [16] and business process modeling [17-19]. While ontologies were at first used primarily in academia, the situation is now improving with the advent of several methodologies for ontology manipulation. Existing methodologies for ontology development in general try to define the activities for ontology management, activities for ontology development and support activities. Several such methodologies exist and are briefly presented below.
CommonKADS [20] is in fact not a methodology for ontology development, but is focused on knowledge management in information systems with analysis, design and implementation of knowledge. CommonKADS puts an emphasis on the early stages of software development for knowledge management. Enterprise Ontology [21] recommends three simple steps: definition of intention; capturing concepts, mutual relations and expressions based on concepts and relations; and persisting the ontology in one of the languages. This methodology is the groundwork for many other approaches and is also used in several ontology editors. METHONTOLOGY [22] is a methodology for ontology creation from scratch or by reusing existing ontologies. The framework enables building an ontology at the conceptual level, and this approach is very close to prototyping. Another approach is TOVE [23], where the authors suggest using questionnaires that describe the questions the ontology should answer. That can be very useful in environments where domain experts have very little expertise in knowledge modeling. Moreover, the authors of HCONE [24] present a decentralized approach to ontology development by introducing regions where the ontology is saved during its lifecycle. OTK Methodology [25] defines the steps in ontology development in detail and introduces two processes: Knowledge Meta Process and Knowledge Process. The steps are also supported by a tool. UPON [26] is an interesting methodology that is based on the Unified Software Development Process and is supported by UML, but it has not yet been fully tested. The latest proposal is DILIGENT [13], which is focused on different
approaches to distributed ontology development.
In the domain of finance several ontologies and implementations of Semantic Web based applications exist. Finance ontology [27] follows ISO standards and covers several aspects (classification of financial instruments, currencies, markets, parties involved in financial transactions, countries etc.). Suggested Upper Merged Ontology (SUMO) [28] also includes a subset related to the finance domain, which is richly axiomatized: it contains not just taxonomic information but also formally defined terms. There are also several contributions in financial investments and trading systems [29-31]. Several authors deal with the construction of expert and financial information systems [32-35].
3 FACILITATING SEMANTIC WEB APPLICATIONS CONSTRUCTION
3.1 Problem and proposal for solution
This paper describes the construction of semantic mashup applications based on ontologies. The process is supported by continuous evaluation of the ontology, where the developer is guided throughout the development process and constantly aided by recommendations to progress to the next step and improve the quality of the final result. Our main objective is to combine dynamic (Web) data sources with minimal effort required from the user. The results of this process are data sources that are later used together with the ontology and rules to create a new application. This final result includes an ontology that not only represents the common understanding of a problem domain but is also executable and directly used in the semantic mashup application.
Existing approaches for ontology development and semantic mashup application construction are complex and require technical knowledge that business users and developers do not possess. As mentioned in section 2, the vast majority of ontology development methodologies define a complex process with a steep learning curve. The required level of technical knowledge is very high, which makes ontology development very difficult for non-technically oriented developers. In addition, the majority of reviewed methodologies include very limited support for evaluating the developed ontologies, and where this support exists it is limited to the later stages of development rather than being included throughout the process, as is the case with our approach. Another problem is that the development process of the ontology is completed after the first cycle and not much attention is given to the applicability of the ontology in an application.
3.2 Rapid Ontology Development
The process for ontology development, ROD [36], that we follow in our approach is based on existing approaches and methodologies but is enhanced with continuous ontology evaluation throughout the complete process. Developers start by capturing concepts, mutual relations and expressions based on concepts and relations. This task can include reusing elements from various resources or defining them from scratch. When the model is defined, the schematic part of the ontology has to be bound to existing instances of that vocabulary. This includes data from relational databases, text files, other ontologies etc. The last step in bringing the ontology into use is creating a functional component for employment in other systems.
Fig. 1: Process of Rapid Ontology Development (ROD)
[Fig. 2 diagram: for each step defined by the process of the selected methodology (Step a, ..., Step m, Step n, each with tasks 1..p), the developed ontology is evaluated. The ontology completeness (OC) calculation combines the weights of semantic checks (description, partition, redundancy, consistency, anomaly; example values shown in the figure: 79%, 75%, 43%, 100%, 33%) into an OC price (59% in the example), which is compared with a threshold (OC price ≥ threshold). The evaluation also yields a list of ontology recommendations, each with a description and an estimated gain (+3%, +5%, +1%, +2% in the example).]
Fig. 2: OC calculation
The ROD development process can be divided into the following stages: pre-development, development and post-development, as depicted in Fig. 1. Every stage delivers a specific output with the common goal of creating a functional component based on the ontology that can be used in several systems and scenarios.

[Fig. 1 diagram labels: stages Pre-development (1), Development (2) and Post-development (3). Step labels: feasibility study; business vocabulary acquisition; enumeration of concepts' and properties' examples; taxonomy identification; ad hoc binary relations identification; describe concepts' attributes and relations; add complex restrictions and rules; essential model definition; vocabulary linking with data; functional component composition; target identification; concepts, relations, restrictions and rules selection; I/O definition; implementation model definition; use. Step numbers shown: 2.1 to 2.6, 3.1, 3.2 and 3.2.1 to 3.2.3. Evaluation accompanies the steps.]

The role of constant evaluation, as depicted in Fig. 2, is to guide the developer in progressing through the steps of the ROD process; it can also be used independently of the ROD process. In the latter case, based on a semantic review of the ontology, enhancements are offered to the developer in the form of multiple improvement actions, sorted by their impact. Besides the actions and their impacts, a detailed explanation of each action is also available (see Fig. 3). When the OC measurement reaches a threshold (e.g. 80%), the developer can progress to the following step. The adapted OC value for every phase is calculated on-the-fly, and whenever a threshold value is crossed, a recommendation for progressing to the next step is generated. In this way the developer is aided in progressing through the steps of the ROD process from business vocabulary acquisition to functional component composition.
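The guidance loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the ROD tool itself: the `Recommendation` type and both function names are hypothetical, the 0.8 default mirrors the 80% example threshold, and the impact values reuse the sample gains shown in Fig. 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    """Hypothetical record for one improvement action."""
    description: str
    impact: float  # estimated OC gain, e.g. 0.05 for +5%


def rank_recommendations(recommendations):
    """Return the recommendations sorted by estimated impact, largest first."""
    return sorted(recommendations, key=lambda r: r.impact, reverse=True)


def can_progress(oc, threshold=0.8):
    """True when the OC value has reached the threshold of the current step."""
    return oc >= threshold


# Sample gains taken from the Fig. 2 illustration; descriptions are placeholders.
recs = [
    Recommendation("Recommendation 1", 0.03),
    Recommendation("Recommendation 2", 0.05),
    Recommendation("Recommendation 3", 0.01),
]
ranked = rank_recommendations(recs)

current_oc = 0.59  # the example OC value shown in Fig. 2
if can_progress(current_oc):
    print("Threshold reached: progress to the next step")
else:
    print("Most valuable action first:", ranked[0].description)
```

Sorting by impact keeps the most valuable action at the top of the list, which is the ordering the text describes for the developer.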
Fig. 3: Ontology completeness and improvement recommendation
The ontology completeness (OC) indicator, used for guiding the developer through the steps of the ROD process and for ensuring the required quality level of the developed ontology, is defined as

$$OC = f(C, P, R, I) \in [0, 1]$$

where $C$ is the set of concepts, $P$ the set of properties, $R$ the set of rules and $I$ the set of instances. Based on these inputs, an output value in the interval $[0, 1]$ is calculated. The higher the value, the more complete the ontology is. OC is a weighted sum of semantic checks, and the weights are dynamically altered when moving from one phase of the ROD process to another. OC can be further defined as

$$OC = \sum_{i=1}^{n} w'_i \cdot \mathit{leafCondition}_i$$

where $n$ is the number of leaf conditions and $\mathit{leafCondition}_i$ is a leaf condition in which a semantic check is executed. For the relative weights and the leaf condition calculation the following restrictions apply:

$$\sum_{i=1}^{n} w'_i = 1, \quad \forall w'_i \in [0, 1] \quad \text{and} \quad \forall\, \mathit{leafCondition}_i \in [0, 1].$$

The relative weight $w'_i$ denotes the global importance of $\mathit{leafCondition}_i$ and is dependent on all weights from the leaf to the root concept.
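As a hedged illustration of the weighted sum, the following Python sketch computes OC from relative weights and leaf condition values while checking the restrictions stated above. The function name and the sample numbers are assumptions made for this example only.

```python
import math


def ontology_completeness(relative_weights, leaf_values):
    """Weighted sum of leaf condition values, validated against the restrictions."""
    if len(relative_weights) != len(leaf_values):
        raise ValueError("one relative weight is needed per leaf condition")
    if any(not 0.0 <= w <= 1.0 for w in relative_weights):
        raise ValueError("every relative weight must lie in [0, 1]")
    if any(not 0.0 <= v <= 1.0 for v in leaf_values):
        raise ValueError("every leaf condition value must lie in [0, 1]")
    if not math.isclose(sum(relative_weights), 1.0, abs_tol=1e-9):
        raise ValueError("relative weights must sum to 1")
    return sum(w * v for w, v in zip(relative_weights, leaf_values))


# Illustrative values only: three leaf checks with made-up weights and results.
oc = ontology_completeness([0.5, 0.25, 0.25], [1.0, 0.5, 0.0])
print(f"OC = {oc:.3f}")
```

Because the weights sum to 1 and every leaf value lies in [0, 1], the result is guaranteed to stay inside [0, 1] as well.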
The tree of conditions in the OC calculation is depicted in Fig. 4 and contains semantic checks that are executed against the ontology. The top level is divided into TBox, RBox and ABox components. Subsequent levels are further divided based on ontology error classification [37]. The aforementioned sublevels are description, partition, redundancy, consistency and anomaly.
This proposed structure can be easily adapted and altered for custom use. Leaves in the tree of the OC calculation are implemented as semantic checks, while all preceding elements are aggregations with appropriate weights. The algorithm for calculating the ontology completeness (OC) price is depicted in Fig. 5, where $X$ is a condition and $w(X, Y)$ is the weight between condition $X$ and condition $Y$. Each leaf condition implements a semantic check against the ontology and returns a value $\mathit{leafCondition} \in [0, 1]$.
[Fig. 4 diagram: tree rooted at "OC components" with TBox, RBox and ABox branches, subdivided into description, partition, redundancy, consistency and anomaly groups. Check labels shown in the figure include: concept description and property description (English, other); formal description; class existence, property existence and instance existence; hierarchy (class, subclass, equivalent class, disjoint class, property, subproperty, equivalent property); inverse property existence; path between concepts; common classes; common instances; external instances; subclass redundancy; subproperty redundancy; instance redundancy; identical definition; circulatory error; chain of inheritance; property clumps; lazy concept; lazy property.]
Fig. 4: Ontology completeness (OC) tree of conditions, semantic checks and corresponding weights
' Evaluation is executed on the top condition "OC components" with weight 1
Evaluate (X, w)
    price_OC = 0
    mark condition X as visited
    if not exists sub-condition of X
        ' Execute semantic check on leaf element
        return w * exec (X)
    else for all conditions Y that are sub-conditions of X such that Y is not visited
        ' Aggregate ontology evaluation prices
        if w(X,Y) ≠ 0
            price_OC += Evaluate (Y, w(X,Y))
    return w * price_OC
End
Fig. 5: Ontology completeness evaluation algorithm
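The recursive evaluation of Fig. 5 can be sketched in Python as follows. This is an illustrative reading of the pseudocode rather than the authors' implementation: the `Condition` class, the lambda checks and all weights in the sample tree are hypothetical, and the multiplication by `weight` and the skip of zero weights are assumptions about operators that are not legible in the figure.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Condition:
    """A node in the tree of conditions; leaves carry a semantic check."""
    name: str
    check: Optional[Callable[[], float]] = None  # returns a value in [0, 1]
    children: List[Tuple[float, "Condition"]] = field(default_factory=list)


def evaluate(condition, weight=1.0, visited=None):
    """Return the weighted OC price of `condition` and its sub-conditions."""
    if visited is None:
        visited = set()
    visited.add(id(condition))  # mark condition as visited
    if not condition.children:
        if condition.check is None:
            raise ValueError(f"leaf '{condition.name}' has no semantic check")
        # Execute the semantic check on the leaf element
        return weight * condition.check()
    price = 0.0
    for child_weight, child in condition.children:
        # Skip already visited sub-conditions and zero-weight branches
        if id(child) in visited or child_weight == 0:
            continue
        # Aggregate ontology evaluation prices
        price += evaluate(child, child_weight, visited)
    return weight * price


# Hypothetical tree: weights and check results are made up for illustration.
tbox = Condition("TBox", children=[
    (0.5, Condition("Description", check=lambda: 1.0)),
    (0.5, Condition("Consistency", check=lambda: 0.5)),
])
root = Condition("OC components", children=[
    (0.5, tbox),
    (0.25, Condition("RBox", check=lambda: 1.0)),
    (0.25, Condition("ABox", check=lambda: 0.5)),
])
oc_price = evaluate(root)  # evaluation starts at the top condition with weight 1
```

Multiplying the local weights along each path from a leaf to the root yields the relative weight of that leaf, which matches the statement that a relative weight depends on all weights from the leaf to the root.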
Fig. 6: Impact of weights on OC sublevels in ROD process
Fig. 6 depicts the distribution of OC components (description, partition, redundancy, consistency and anomaly) with regard to the individual phases of ROD. In the first two phases, 2.1 and 2.2, the developer deals with business vocabulary identification and enumeration of concepts' and properties' examples. In these steps the emphasis is on the description of the ontology, while partition is also taken into consideration. The importance of the description and partition components decreases in later steps but still remains above average. In step 2.3 all other components (redundancy, consistency and anomaly) are introduced, because the developer is requested to define the taxonomy of the schematic part of the ontology. In the later steps of the ROD process the emphasis is on detailed description of classes and properties, and complex restrictions and rules are also added. At this stage redundancy becomes more important. This distribution of weights remains similarly balanced throughout the last steps, 2.5 and 2.6, of the development phase. In the post-development phase, when functional component composition is performed, the ontology completeness calculation is mainly concerned with redundancy, description and anomaly checking.
4 CASE STUDY
4.1 FITS ontology
The problem domain presented in this paper is financial trading and analysis of financial instruments. As already discussed in the related work section, several financial instruments ontologies already exist. The purpose of our work was to extend these approaches to the information system level, couple the ontology with reasoning capabilities, define inputs, outputs and dynamic imports, and build a fully executable Semantic Web solution for financial instruments analysis and trading strategies. For this purpose a basic Financial Instruments (FI) ontology was developed following the ROD approach (see Fig. 7). The FI ontology introduces basic concepts, including financial instrument, stock exchange market, trading day and analysis. Further details in the form of a taxonomy are provided for financial instruments, trading day and analysis.
While the FI ontology defines elementary entities from the financial trading domain, the ontologies that capture trading strategies are more complex and include advanced axioms and rules. In our case we have defined four different trading strategies: (1) simple trading strategy (STs), (2) strategy of simple moving averages (SMAs), (3) Japanese candlestick trading strategy (JCTs) and (4) strategy based on fundamental analysis (FAs).
Every user has the possibility to define their own trading strategy, either from scratch or by reusing existing ones. The main purpose of trading strategies is to examine the instances of the FI:TradingDay concept and decide whether an instance can be classified as FI:SellTradingDay or FI:BuyTradingDay. An example of this process can be found in Fig. 8, where an excerpt from JCTs is presented.
The JCTs is based on price movements, which make it possible to identify patterns from daily trading formations. In this strategy the price of a financial instrument is presented in the form of a candlestick (low, open, close, high) and several patterns are identified (e.g. doji, hammer, three white soldiers, shooting star etc.). This strategy is rather complex, but by following the ROD approach (presented in section 3.2) domain experts can define it without being familiar with the technical details of knowledge declaration and encoding.
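The excerpt in Fig. 8 contains two rules that classify a black-body trading day as a doji, using the thresholds 0.0025 (relative to the closing price) and 0.1 (relative to the high-low range). A minimal Python sketch of these two conditions is given below. The `TradingDay` class and the function names are hypothetical, the definition of a black body as a close below the open is an assumption based on the usual candlestick convention and is not stated in the paper, and the mapping of `$1`, `$2`, ... to the rule arguments in order is likewise assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TradingDay:
    """One candlestick: the four prices named in the text."""
    open: float
    high: float
    low: float
    close: float


def is_black_body(day):
    # Assumption: a black body means the close is below the open.
    return day.close < day.open


def is_doji(day):
    """Doji test following the two black-body rules of the Fig. 8 excerpt."""
    if not is_black_body(day):
        return False  # the excerpt only shows rules for black-body days
    body = day.open - day.close
    small_vs_close = body <= day.close * 0.0025
    small_vs_range = body <= (day.high - day.low) * 0.1
    # Either rule is sufficient to conclude Doji
    return small_vs_close or small_vs_range


narrow = TradingDay(open=100.2, high=101.0, low=99.0, close=100.0)
wide = TradingDay(open=105.0, high=106.0, low=99.0, close=100.0)
print(is_doji(narrow), is_doji(wide))
```

Expressing each rule as its own boolean keeps the sketch close to the rule form, where every rule independently implies the Doji class.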
Fig. 7: Excerpt from FITS ontology
[Fig. 7 diagram: graph of FI ontology terms connected by rdfs:subClassOf, rdfs:domain, rdfs:range, owl:inverseOf and owl:disjointWith edges. Terms shown: FI:FinancialInstrument, FI:InterestRateInstrument, FI:EquityRelatedInstrument, FI:Stock, FI:Loan, FI:Deposit, FI:ETF, FI:KOCertificate, FI:EquityLinkedDerivate, FI:ForwardTypeDerivate, FI:OptionTypeDerivate, FI:StockExchangeMarket, FI:TradingDay, FI:BuyTradingDay, FI:SellTradingDay, FI:Analysis, FI:Fundamentals, FI:MeanAnalystsRecommendation, FI:StockScouterRating, FI:hasTraded, FI:tradedAt, FI:isTradedOn, FI:validFor, FI:isBasedOn, FI:isPartOf, FI:hasAnalysis, FI:hasValuation, FI:nextTradingDay, FI:previousTradingDay, FI:ISIN, FI:symbol, FI:name, FI:location, FI:KOBarrier, FI:date, FI:open, FI:close, FI:low, FI:high, FI:volume, FI:tradeReason, FI:value, FI:PS, FI:PE, FI:rating, FI:summary, FI:gradeOwnership, FI:gradeValuation, FI:gradeTechnical, FI:gradeFundamental.]
Fig. 8: Excerpt from Japanese candlestick trading strategy
Fig. 9: Composition of final ontology for employment in Semantic Web application
After selecting the desired trading ontologies or composing existing ones, the user can define the final ontology (see Fig. 9), which is then coupled with a reasoning engine to allow execution of trading analysis on real data available from several sources. At this point the schematic part of the ontology (TBox component) is defined, but it still needs to be associated with instances (ABox component) through semantic integration of several data sources, which we address in section 4.2.
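To make the role of the reasoning engine more concrete, the sketch below applies two of the rules from the Fig. 8 excerpt to a sequence of trading days in plain Python. It is only a stand-in for a rule engine: pattern memberships are supplied as ready-made sets of names (a hypothetical input format), and `RULES` and `classify_days` are illustrative names that do not come from the paper.

```python
# Each rule: (patterns required on the previous day,
#             patterns required on the current day,
#             class concluded for the current day).
# Both rules are transcribed from the Fig. 8 excerpt.
RULES = [
    ({"BlackBody"}, {"Engulfing", "LargeBody", "WhiteBody", "MLT"}, "BuyTradingDay"),
    ({"WhiteBody"}, {"Engulfing", "LargeBody", "BlackBody", "MHT"}, "SellTradingDay"),
]


def classify_days(days):
    """Classify each day from its own patterns and those of the previous day.

    `days` is a chronological list of sets of pattern names; the result is a
    list of sets holding the trading classes concluded for each day.
    """
    conclusions = [set() for _ in days]
    for i in range(1, len(days)):
        previous, current = days[i - 1], days[i]
        for previous_required, current_required, conclusion in RULES:
            # A rule fires when all required patterns are present on both days
            if previous_required <= previous and current_required <= current:
                conclusions[i].add(conclusion)
    return conclusions


days = [
    {"BlackBody"},
    {"Engulfing", "LargeBody", "WhiteBody", "MLT"},
    {"Engulfing", "LargeBody", "BlackBody", "MHT"},
]
result = classify_days(days)
```

The first day can never be classified because the rules relate a day to its predecessor, which mirrors the role of the next-trading-day atom in the rule bodies.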
4.2 Semantic integration of data sources
In the ROD approach several imports are available: (1) existing ontologies, (2) relational or analytical databases, (3) CSV files and (4) semi-structured data sources (e.g. HTML). In the process of creating the FITS ontology the most prominent approach was reusing data from semi-structured sources, mainly from HTML pages. When building the executable ontology we relied on publicly available data about trading financial instruments, which is available on web pages and for the most part in unstructured form. Therefore the linking wizard from the ROD approach was used, which incorporates regular expressions and XQuery formulation for extracting data from semi-structured data sources.

[Fig. 8 content: classes shown, linked by rdfs:subClassOf edges: FI:TradingDay, FI:BuyTradingDay, FI:SellTradingDay, FI-Strategy-Japanese:JapanesePatterns, FI-Strategy-Japanese:Doji. Rules shown (swrl:Imp):
FI:nexTradingDay(?day1, ?day2) ∧ FI-Strategy-Japanese:BlackBody(?day1) ∧ FI-Strategy-Japanese:Engulfing(?day2) ∧ FI-Strategy-Japanese:LargeBody(?day2) ∧ FI-Strategy-Japanese:WhiteBody(?day2) ∧ FI-Strategy-Japanese:MLT(?day2) → FI:BuyTradingDay(?day2)
FI:nexTradingDay(?day1, ?day2) ∧ FI-Strategy-Japanese:Doji(?day1) ∧ FI-Strategy-Japanese:Engulfing(?day2) ∧ FI-Strategy-Japanese:LargeBody(?day2) ∧ FI-Strategy-Japanese:WhiteBody(?day2) ∧ FI-Strategy-Japanese:MLT(?day2) → FI:BuyTradingDay(?day2)
FI:nexTradingDay(?day1, ?day2) ∧ FI-Strategy-Japanese:WhiteBody(?day1) ∧ FI-Strategy-Japanese:Engulfing(?day2) ∧ FI-Strategy-Japanese:LargeBody(?day2) ∧ FI-Strategy-Japanese:BlackBody(?day2) ∧ FI-Strategy-Japanese:MHT(?day2) → FI:SellTradingDay(?day2)
FI:nexTradingDay(?day1, ?day2) ∧ FI-Strategy-Japanese:WhiteBody(?day1) ∧ FI-Strategy-Japanese:Engulfing(?day2) ∧ FI-Strategy-Japanese:LargeBody(?day2) ∧ FI-Strategy-Japanese:BlackBody(?day2) ∧ FI-Strategy-Japanese:MHY(?day2) → FI:SellTradingDay(?day2)
FI:nexTradingDay(?day1, ?day2) ∧ FI-Strategy-Japanese:Doji(?day1) ∧ FI-Strategy-Japanese:Engulfing(?day2) ∧ FI-Strategy-Japanese:LargeBody(?day2) ∧ FI-Strategy-Japanese:BlackBody(?day2) ∧ FI-Strategy-Japanese:MHT(?day2) → FI:SellTradingDay(?day2)
FI:nexTradingDay(?day1, ?day2) ∧ FI-Strategy-Japanese:Doji(?day1) ∧ FI-Strategy-Japanese:Engulfing(?day2) ∧ FI-Strategy-Japanese:LargeBody(?day2) ∧ FI-Strategy-Japanese:BlackBody(?day2) ∧ FI-Strategy-Japanese:MHY(?day2) → FI:SellTradingDay(?day2)
FI-Strategy-Japanese:BlackBody(?day) ∧ FI:close(?day, ?close) ∧ FI:open(?day, ?open) ∧ KAON2:ifTrue("($2 - $1) <= $1*0.0025", ?close, ?open) → FI-Strategy-Japanese:Doji(?day)
FI-Strategy-Japanese:BlackBody(?day) ∧ FI:close(?day, ?close) ∧ FI:open(?day, ?open) ∧ FI:high(?day, ?high) ∧ FI:low(?day, ?low) ∧ KAON2:ifTrue("($2 - $1) <= ($3 - $4)*0.1", ?close, ?open, ?high, ?low) → FI-Strategy-Japanese:Doji(?day)]

[Fig. 9 content: ontologies connected by owl:imports edges: FI:, FI-Strategy-Simple:, FI-Strategy-SMA:, FI-Strategy-Japanese:, FI-Strategy-Foundamental:, FI-Yahoo-Finance-CSV:, FI-Yahoo-Finance-CSV-GOOG:, FI-Yahoo-Finance-CSV-AAPL:, FI-Yahoo-Finance-HTML:, FI-Yahoo-Finance-HTML-PCX:, FI-AmiBroker-AQH:, FI-AmiBroker-KRKG:, FI-User-1:, FI-User-2:.]
Fig. 10: Dynamic import of data property values related to financial instrument concept from Google Finance web data source
[Fig. 10 content:
Web site location: http://finance.google.com/finance?q=AAPL
Template for information extraction: <h3>([^&]+?) </h3>\(\w+, (\w+?):AAPL\)
HTML source of web site:
<div class="g-tpl-67-33 g-split hdg top" id="companyheader">
<div class="g-unit g-first"><h3>Apple Inc. </h3>(Public, NASDAQ:AAPL)
<a href="/finance/portfolio?action=add&addticker=NASDAQ%3AAAPL" class="norm">Watch this stock</a>
Dynamic link with ontology: FI:symbol and FI:name (rdfs:domain FI:Stock); FI:isTradedOn (rdfs:domain FI:Stock, rdfs:range FI:StockExchangeMarket)
Ontology instances: FI:AAPL (instance of FI:Stock, FI:symbol = "AAPL", FI:name = "Apple Inc."), FI:NASDAQ (instance of FI:StockExchangeMarket), linked by FI:isTradedOn]
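The extraction step of Fig. 10 can be reproduced with a short Python sketch that applies the template from the figure to the HTML fragment from the figure. No network access is involved. The function name and the dictionary layout of the result are assumptions made for illustration; only the regular expression, the HTML fragment and the FI: property names come from the figure.

```python
import re

# HTML fragment copied from Fig. 10
SAMPLE_HTML = (
    '<div class="g-tpl-67-33 g-split hdg top" id="companyheader">\n'
    '<div class="g-unit g-first"><h3>Apple Inc. </h3>(Public, NASDAQ:AAPL)\n'
    '<a href="/finance/portfolio?action=add&addticker=NASDAQ%3AAAPL" '
    'class="norm">Watch this stock</a>'
)


def extract_stock(html, symbol):
    """Apply the Fig. 10 template for `symbol`; return None when nothing matches."""
    pattern = re.compile(
        r"<h3>([^&]+?) </h3>\(\w+, (\w+?):" + re.escape(symbol) + r"\)"
    )
    match = pattern.search(html)
    if match is None:
        return None
    name, market = match.groups()
    # Map the captured groups to the ontology properties named in the figure
    return {
        "FI:symbol": symbol,
        "FI:name": name,
        "FI:isTradedOn": "FI:" + market,
    }


stock = extract_stock(SAMPLE_HTML, "AAPL")
print(stock)
```

Building the pattern from the symbol, instead of hard-coding AAPL, lets the same template be reused for other instruments, in line with the idea of a dynamic import.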