time for certain queries to execute, particularly those that entail calculation
of large cross products. By using the standard distribution
of ARQ, we have ensured it will be possible to rapidly integrate
future releases and optimisations of the SPARQL engine. Even so,
it may be possible to decrease the amount of work the engine has
to do. With improved understanding of the query planning and ex-
ecution within ARQ, it should be possible for the QueryBuilder to
produce queries which will execute more efficiently. Streamlining
and optimising the RDF model output by the DOM2RDF compo-
nent of SparqPlug with this query execution and planning in mind
may also have a positive impact on performance. An alternative
approach to optimisation would be to investigate attaching differ-
ent SPARQL query processors and triple stores to the SparqPlug
service.
Another aspect that could be improved is the user interface. The
ability for a user to click-and-drag to select the area of HTML they
wish to RDFize could, if coupled with query builder enhancements,
allow SparqPlug to offer the ease of use of systems such as
Thresher without reducing its query expressivity.
A final area in which it may prove fruitful to expand research into the
SparqPlug approach is that of less formally structured data. SparqPlug
can be very efficient and accurate over highly structured DOM
documents (e.g. lists of entries, rows of tables, etc.). It can, however, be
challenging to construct a successful SparqPlug job to, for example,
parse the contents of a wiki page into useful RDF. An expansion of
the property functions available to the SparqPlug engine, perhaps
to normalise the data before processing, may prove valuable here.
Another relevant approach here may be that of embedding custom
property functions [35] as JavaScript inside the query.
9. ACKNOWLEDGEMENTS
This research was partially supported by the OpenKnowledge (OK)
project. OK is sponsored by the European Commission as part
of the Information Society Technologies (IST) programme under
grant number IST-2001-34038.
10. REFERENCES
[1] Auer, S; Bizer, C; Kobilarov, G; Lehmann, J; Cyganiak, R;
Ives, Z: “DBpedia: A Nucleus for a Web of Open Data”. In
Proc. of the 6th Intl. Semantic Web Conference
(ISWC2007), Busan, Korea.
[2] http://www.wikipedia.org/
[3] Prud'hommeaux, E; Seaborne, A: “SPARQL Query
Language for RDF” http://www.w3.org/TR/rdf-sparql-query/
[4] Gennari, J; Musen, M; Fergerson, R; Grosso, W; Crubézy,
M; Eriksson, H; Noy, N; Tu, S: “The evolution of Protégé:
an environment for knowledge-based systems development”.
In International Journal of Human-Computer Studies
(January 2003)
[5] Huynh, D; Miller, R; Karger, D: “Enabling Web Browsers to
Augment Web Sites' Filtering and Sorting Functionalities”.
In Proc. of the 19th annual ACM symposium on User
interface software and technology, Montreux, Switzerland.
[6] http://www.w3.org/TR/grddl
[7] Droop, M; Flarer, M; Groppe, J; Groppe, S; Linnemann, V;
Pinggera, J; Santner, F; Schier, M; Schoepf, F; Staffler, H;
Zugal, S: “Translating XPath Queries into SPARQL
Queries”. In Proc. of On the Move to Meaningful Internet
Systems 2007: OTM 2007 Workshops
[8] http://jena.sourceforge.net/ARQ/
[9] Seaborne, A: ARQTick 13 October 2006 - “Assignment
Property Function”
http://seaborne.blogspot.com/2006/10/assignment-property-function.html
[10] Prud'hommeaux, E: “SPAT - SPARQL Annotations”
http://www.w3.org/2007/01/SPAT/
[11] Welty, C; Murdock, JW: “Towards Knowledge Acquisition
from Information Extraction”. In Proc. of the 5th Intl.
Semantic Web Conference (ISWC2006), Athens, Georgia,
USA.
[12] Hogue, A; Karger, D: “Thresher: automating the unwrapping
of semantic content from the World Wide Web”. In Proc. of
the 14th Intl. World Wide Web conference (WWW2005),
Chiba, Japan.
[13] Kalyanpur, A: “RDF Web Scraper”
http://www.mindswap.org/~aditkal/rdf.shtml
[14] http://simile.mit.edu/wiki/Piggy_Bank
[15] http://openlinksw.com/virtuoso
[16] http://triplr.org
[17] http://microformats.org
[18] Birbeck, M; Pemberton, S; Adida, B: “RDFa Syntax - A
collection of attributes for layering RDF on XML languages”
http://www.w3.org/2006/07/SWD/RDFa/syntax/
[19] “Rdf In Html”
http://research.talis.com/2005/erdf/wiki/Main/RdfInHtml
[20] http://base.google.com
[21] http://flickr.com
[22] http://del.icio.us
[23] Bizer, C; Cyganiak, R; Heath, T: “How to Publish Linked
Data on the Web”
http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/LinkedDataTutorial/
[24] Sauermann, L; Cyganiak, R; Ayers, D; Voelkel, M: “Cool
URIs for the Semantic Web”
http://www.w3.org/TR/2007/WD-cooluris-20071217/
[25] Oldakowski, R; Bizer, C; Westphal, D:
“RAP: RDF API for PHP”. In Proc. of the 1st Workshop on
Scripting for the Semantic Web (SFSW2005), 2nd European
Semantic Web Conference (ESWC2005), Heraklion, Greece
[26] http://jena.sourceforge.net/
[27] Bizer, C; Cyganiak, R: “NG4J - Named Graphs API for
Jena” http://sites.wiwiss.fu-berlin.de/suhl/bizer/ng4j/
[28] http://jtidy.sourceforge.net/
[29] http://tomcat.apache.org/
[30] http://www.mysql.org/
[31] http://www.timeout.com/film/index/film/a/1.html
[32] http://purl.org/dc/elements/1.1/
[33] Suchanek, F; Kasneci, G; Weikum, G: “Yago - A Core of
Semantic Knowledge”. In Proc. of the 16th Intl. World Wide
Web conference (WWW2007), Banff, Alberta, Canada.
[34] http://dbpedia.org/
[35] Williams, G: “Extensible SPARQL Functions With
Embedded Javascript”. From 3rd Workshop on Scripting for
the Semantic Web (SFSW2007), 4th European Semantic
Web Conference (ESWC2007), Innsbruck, Austria.