Integrating PDF interface into Java application
Abstract
Purpose – The purpose of this paper is to propose a novel approach to integrating a PDF
interface into a Java-based digital library application. It bridges the gap between
operating on the content of a PDF document and viewing it, two tasks that are otherwise
carried out asynchronously.
Design/methodology/approach – In this paper, we first review related research and
discuss PDF and its drawbacks. Next, we propose the design steps and the
implementation of three modes of displaying a PDF document: PDF display, image
display and XML display. A comparison of these three modes has been carried out.
Findings – We find that the PDF display is able to present the original PDF document
contents completely and is thus clearly superior to the other two displays. In
addition, the format specification of PDF-based e-books does not perform well; a lack
of standardization and a complex structure are exposed in the publications.
Practical implications – The proposed approach makes viewing PDF documents more
convenient and effective, and can be used to retrieve and visualize PDF documents and
to support personalized function customization of PDF in digital library
applications.
Originality/value – This paper proposes a novel approach to solve the problem of
synchronizing content operation with the view of a PDF, providing users with a new
tool to retrieve and reuse PDF documents. It contributes to improving the service
specification and policy of viewing PDF in digital libraries. In addition, the
personalized interface and public index make further development and application
more feasible.
Keywords Portable document format (PDF), Java, User experience, Integrated
interface, Index mechanism, Digital library
Article Classification Technical paper
Introduction
Nowadays, there is a large number of PDF documents in digital libraries and
full-text databases, such as e-books, e-journals and other related files (Adobe Systems
Inc., 2012). They are increasingly popular and significant to libraries (Nelson, 2008).
PDF documents carry a combination of information in various media formats,
such as text, images, fonts, colors, symbols and shapes, which gives readers an
unprecedented reading experience. Compared with formats such as TXT,
HTML and XML, PDF has the advantage of making it easier to describe and print a
document (Adobe Systems Inc., 2012).
However, PDF has inherent defects. Most PDF documents do not carry
basic high-level logical structure information, which makes retrieving
and reusing the documents difficult. Moreover, Wang pointed out further drawbacks of
PDF (Shaofeng, 2004), including:
• The lack of a powerful search engine to deal with the structure and content of
the document.
• The difficulty of giving users different access privileges on different parts
of the document.
• The difficulty of providing a personalized interface for users.
All of these defects greatly limit access to PDF and its circulation in the
digital library. Though there are many PDF readers and analysis tools,
they either display the PDF document alone or analyze the PDF text
content alone. For example, Adobe Reader, Foxit Reader, etc., can display the PDF
document completely but cannot analyze its text content.
Tools such as Concordance (Watt, 2012), TextArc (Paley, 2002), FeatureLens (Don et
al., 2012), ProfileSkim (Harper et al., 2006) and iSee (Sun et al., 2008) have
strong text analysis abilities, but they are not good document readers. At present,
if we want to view and analyze the content of PDF documents, we have to
extract the PDF text content in a document reader and put it into an analysis tool. This
not only decreases the effectiveness of reading and learning, but also affects the
efficiency of search and retrieval. Veronica et al. compared searching in realistic
books (Liesaputra and Witten, 2012) with searching in PDF files; the result
demonstrated that searching in PDF files could not satisfy the users (Liesaputra et
al., 2009). The users found it hard to understand the structure of the PDF document,
did not know where they were because they were easily disoriented, and found it difficult
to return to a specific location (Liesaputra et al., 2009).
Therefore, we propose a novel approach to solve this disjunction
between content operation and the view of PDF, namely integrating
a PDF interface into a Java application. First, we analyze the current relevant research
and applications concerning PDF. Based on current research results, we design three
modes of integrating the PDF document content into a Java panel. After comparing the
three experimental results, we demonstrate that integrating the PDF interface into a Java
application can display the PDF content completely and improve the viewing and searching
experience on PDF documents.
Related research
The portable document format (PDF) is widely accepted as a digital archiving
format and PDF documents are supported in virtually every repository (Seadle, 2009).
Its platform independence and open accessibility make it the ideal file format
to release and disseminate electronic documents in the digital library (Adobe Systems
Inc., 2012). Though there is some discussion on whether PDF formats are appropriate for
long-term digital archiving (Seadle, 2009; Zhao, 2011; Uneson, 2005), there is no
doubt that the digital library has become the main aggregation of PDF (Vasileiou
et al., 2009). Several major publishers, like Elsevier, Emerald, Wiley and Springer,
provide large numbers of academic e-books and e-journals based on the PDF format.
Their target market is mainly libraries: academic, public and special
(Vasileiou et al., 2009). Therefore, electronic books have a significant
potential to affect users' reading experience (Liesaputra et al., 2009), with important
consequences for the future role and existence of libraries (Vasileiou et al., 2009).
At present there are many types of e-book format on the market. Vasileiou et al.
analyzed the main websites of nine e-book publishers and eleven e-book aggregators
and found that the most common format of e-books appears to be PDF. While the
majority of vendors use PDF, a couple of companies, such as Knovel and Gale, provide
their content in HTML. Ebrary, however, uses its own EDF format and Questia uses
XML (Vasileiou et al., 2009). In addition, Thomas et al. built the JSTOR system to
scan documents and convert them to images. Although this method can create
content identical to the original document, the generated images lose the textual
context of the original document and do not support full-text search
(Thomas et al., 1999). Patrick van et al. developed i*Doc, which is based on the
Extensible Markup Language (XML), can serve as an integration format and delivers
personalized content (Patrick van et al., 2000).
In order to explore the differences among the commonly used electronic formats,
many researchers have tried to distinguish these formats from different perspectives. Shaofeng
indicated that HTML, although designed to describe the structure of a document,
is concerned with appearance more than content, and is
weak at retrieving and printing electronic documents, among other problems
(Shaofeng, 2004). He also compared PDF with XML; the result
shows that although PDF has the advantage of describing and printing what a
document looks like, its lack of structure and its access privilege control make it hard to
search and reuse (Shaofeng, 2004). Furthermore, users criticized that all clients
have the same interface. By contrast, XML can achieve efficient retrieval and
personalized customization through third-party software (Shaofeng, 2004). Zhao and
Uneson then compared PDF with XML from the perspective of long-term
preservation (Zhao, 2011; Uneson, 2005). The results demonstrate that PDF is not
the best choice for retrieval and reuse, and that XML does not have an absolute
advantage in fully preserving the appearance of an electronic document.
All in all, researchers found that the main drawbacks of PDF concern its support for
search and extraction, and began to look for new ways of extracting PDF's
content and structure in order to retrieve and reuse PDF. Numerous
studies have shown that structured content is more effective for obtaining information
accurately and quickly. There is a need to extract objects in a structured form and
save the document in XML format in order to allow indexing, more accurate
searching, and handling of versioning and metadata. Initially, Hadjar et al. proposed
a new tool to extract text, images and graphics from a PDF document, but did not
consider the hidden layout and logical structures of documents (Hadjar et al., 2004).
Subsequently, Chao and Fan made a further improvement, developing techniques
that identified logical components on a PDF document page. The outlines, style
attributes and contents of the logical components were extracted and expressed in
an XML format (Chao and Fan, 2004). Others have tried to convert PDF to other
formats. For example, Rahman and Alam converted PDF into HTML (Rahman
and Alam, 2003), Déjean and Meunier converted PDF documents into a structured XML
format (Déjean and Meunier, 2006), and Zhang converted PDF files to XML files
(Zhang, 2008). Unfortunately, these conversions and extractions lose the
original appearance of PDF files, such as font, image and paragraph information.
Nevertheless, these techniques do facilitate the retrieval and reuse of the layout and
content of a PDF document page.
Research shows that the complete display of original document information
contributes to faster learning and retention, and thus leads to more effective
searching and higher satisfaction (Ahmed et al., 2006). However, one of the
significant challenges with PDF accessibility is the lack of effective ways to unify
content operation with the view of PDF synchronously. As a result, it is difficult to
design a user interface that embeds the PDF document into applications. Though
several applications are now available, they are online services only, such as Scribd for
viewing and storing, Pdfvue for online editing, and Zamzar for PDF conversion.
The preparation for integration
To enable the user to interact with the PDF document in a friendly way, we should consider
the user's demands before designing the interface:
• The contents of the PDF document should be displayed completely, including
pictures, graphs, fonts and so on;
• Operation and view should be synchronized in the same interface, avoiding
switching among multiple interfaces;
• Users should be able to integrate suitable applications into the interface to help them
analyze the contents of the PDF document;
• The application should run on every platform and be suitable for e-books based on
the PDF format.
Based on these requirements, we develop an application that provides users with
the following functions:
• Allowing the user to view multiple PDF documents via the interface;
• Achieving synchrony between content operation and the view of PDF;
• Providing the index file to developers to enable further development, so that
more applications can be integrated into the interface.
Therefore, we propose a new tool to integrate the PDF interface into a Java
application. The framework of its design is presented in Figure 1. We design our tool
in the following steps.
Figure 1. The framework of design
1). The PDF converting. We find it difficult to display the complete original
information of a PDF document by directly calling the APIs provided by
Adobe, which may lose the paragraph information. Therefore, we first convert
the PDF format to another format. XML (W3C, 2012), a very
flexible file format, is widely used in information storage, information interchange
and other areas (Moghrabi et al., 2004), and can do for data what Java has done for
programs. There are also a number of research achievements on XML applications
(Luk et al., 2002; Chu et al., 2000; Zhang et al., 2008; Geroimenko and Geroimenko,
2001; Lu et al., 2008). These advantages prompt us to choose it as the conversion
target. The specific process can be found in a paper by Zhang (Zhang, 2008).
2). The index construction. The kind of index that is constructed
affects retrieval performance. Moskovitch et al. made a comparative evaluation
of full-text, concept-based and context-sensitive search; they demonstrated the usefulness
of concept-based and context-sensitive queries for enhancing the precision of retrieval
from a digital library of semi-structured clinical guideline documents (Moskovitch et
al., 2007). Lin's results suggest that the highest overall effectiveness may be
achieved by combining evidence from spans and full articles (Lin, 2009). We therefore
constructed an XML index mechanism that ranges from sentence to page. The index is mainly
used for the synchronous mechanism, and further applications can be
developed with its aid. The index is built from the converted XML file.
The specific construction process will be revealed in a follow-up paper; the index
mechanism itself is illustrated in the next section.
3). Interface design. In this step, we develop a specific Java application to contain
the PDF document. We choose Java as our development language because it is
platform-independent and can run on any computer that has the Java runtime
environment, or virtual machine. Combining XML with Java keeps its
cross-platform features effectively. Another reason is that Java provides the needed
technologies for XML. For example, JAXP allows integration of any XML parser
with a Java application in order to read, manipulate and generate XML documents.
The specific interface layout is as follows:
Figure 2. User interface (area A contains the PDF document content; area B
can hold an integrated application)
4). The PDF integration. In the related research above, we mention several
different ways to view an electronic document. In order to identify clearly which kind
of interface display is more suitable for users, we design three modes. The first one
contains the PDF format, the second one contains the image format and the third one
contains the XML format. Through mutual comparison, we can easily find which
interface display satisfies users better.
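Step 3 above names JAXP as the way to read, manipulate and generate XML from Java. The following is a minimal sketch of that round trip using only the JDK's built-in JAXP classes. The class name JaxpRoundTrip, the method setChapterEnd and the choice of editing the endp attribute are illustrative assumptions, not part of the tool described in the paper; the element and attribute names follow the paper's index listing.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: reads an index, changes one attribute, writes it back out.
final class JaxpRoundTrip {

    /** Sets the endp attribute of the first chapter element and returns the new XML. */
    static String setChapterEnd(String xml, int newEndPage) {
        try {
            // Read: parse the XML text into a DOM tree.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Manipulate: update an attribute on the first chapter element.
            Element chapter = (Element) doc.getElementsByTagName("chapter").item(0);
            if (chapter == null) {
                throw new IllegalArgumentException("index has no chapter element");
            }
            chapter.setAttribute("endp", Integer.toString(newEndPage));

            // Generate: serialize the DOM tree back to text.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException
                | IOException | TransformerException e) {
            throw new IllegalArgumentException("could not process index XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<catalog><chapter chaptername=\"chapter0\" startp=\"1\" endp=\"12\"/></catalog>";
        System.out.println(setChapterEnd(xml, 15));
    }
}
```

Because JAXP locates the parser through factories, the same code works with whichever compliant parser is on the classpath, which is the integration property the paragraph refers to.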
The implementation procedure
We choose a general e-book based on the PDF format, containing a table of contents,
body and pagination, as a sample, and then elaborate the implementation of the three
modes from the following aspects: how to embed and how to synchronize.
How to embed the PDF document content into the interface
After conversion and extraction of the e-book's information, the designed tool is
able to integrate the e-book into the Java panel. The designed tool provides a
friendly and reliable way to browse the e-book's information. Users can browse the
e-book in the following three ways.
1) Embedding the XML into the interface.
• First, we need the XML file derived from the e-book through the
conversion method (Zhang, 2008).
• Second, we show the XML information in a Textfield component added
to area A through a Java API, namely dom4j.jar. This library can
read the XML information and print it on the screen.
• Third, in order to visit the XML information conveniently, we design a
content tree that can jump to different positions. Other functions for
visiting the XML information can also be designed.
Figure 3. XML display
2) Embedding the image into the interface.
• First, we need to convert the e-book to images through the pdfview.jar
provided by Sun. We mainly use the classes PDFFile.java and
PDFPage.java to draw the e-book page.
• Second, we add a Label component to area A and use it
to show the converted image. We can view every image
according to the selected page number or content information.
Figure 4. Image display
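3) Embedding the PDF into the interface.
• This approach mainly uses ICEpdf.jar to show the e-book in area A.
We add a panel component to area A and put the e-book into it. In
this way, we can easily view the original PDF format. Besides, this Java package
also provides many useful tools. The main embedding code creates a
SwingController (pdfcontroller), passes it to a SwingViewBuilder (factory),
builds the viewer panel (viewConponentPanel) with buildViewerPanel() and opens
the e-book with openDocument(pdfUrl). Here pdfcontroller, factory and
viewConponentPanel are instances of Java classes, and pdfUrl is the path where
the e-book is saved. The interface is as follows:
Figure 5. PDF display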
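The content tree mentioned for the XML display can be sketched as follows. This is only an illustration under stated assumptions: it uses the JDK's own DOM parser rather than dom4j, the class name ContentTree and the label format are hypothetical, and the element and attribute names (chapter, section, chaptername, sectionname, startp) are taken from the paper's index listing.

```java
import java.io.StringReader;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: turns the chapter/section levels of the index into tree nodes.
final class ContentTree {

    /** Builds a tree with one node per chapter and section; sentences are skipped. */
    static DefaultMutableTreeNode build(String indexXml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(indexXml)))
                    .getDocumentElement();
            DefaultMutableTreeNode top = new DefaultMutableTreeNode("catalog");
            addChildren(root, top);
            return top;
        } catch (Exception e) {
            throw new IllegalArgumentException("invalid index XML", e);
        }
    }

    private static void addChildren(Element parent, DefaultMutableTreeNode treeParent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element)) {
                continue; // ignore whitespace and text nodes
            }
            Element e = (Element) n;
            String tag = e.getTagName();
            if (tag.equals("chapter") || tag.equals("section")) {
                // Attribute is "chaptername" or "sectionname", as in the index listing.
                String label = e.getAttribute(tag + "name")
                        + " (p. " + e.getAttribute("startp") + ")";
                DefaultMutableTreeNode child = new DefaultMutableTreeNode(label);
                treeParent.add(child);
                addChildren(e, child);
            }
        }
    }
}
```

The returned root can be handed to `new javax.swing.JTree(root)`; a selection listener on that tree would then scroll the display to the page stored in the selected node.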
How to accomplish the synchronous mechanism
Our research goal is more than browsing the e-book content. More
importantly, we provide a synchronous mechanism for operating on the e-book. We can
not only show the e-book information, but also support a user API for integrating
more applications into this interface. The user operates on the XML index instead of the PDF.
All of these benefits derive from the establishment of the index.
The index mechanism: we construct a three-level index in the XML file. First, the
structure of the index is as follows: a root node serves as the catalog and its sub-nodes are the chapters;
similarly, the sub-nodes of a chapter are the sections, and the lowest level is the sentence. We
obtain the node information from the converted e-book, namely the XML file.
According to the contents included in the XML file, we can establish a tree-based
XML index from chapters and sections to logical page numbers. However, further
processing is needed to obtain the sentence nodes. Second, by scanning the
XML file, we segment the e-book into sentences and add the sentence nodes
as children of the section node. Third, in order to locate exactly where a sentence
occurs, we add some attributes to the node. The main goal is to achieve the
correspondence between the physical page and the logical page. The final index is an
XML catalog tree whose chapter, section and sentence elements carry these attributes.
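Embedding code for the PDF display (ICEpdf):
import javax.swing.JPanel;
import org.icepdf.ri.common.SwingController;
import org.icepdf.ri.common.SwingViewBuilder;
SwingController pdfcontroller = new SwingController();
SwingViewBuilder factory = new SwingViewBuilder(pdfcontroller);
JPanel viewConponentPanel = factory.buildViewerPanel();
pdfcontroller.openDocument(pdfUrl);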
In the index, we label the start and end logical pages of each chapter
and section, record their chapter and section names, and count the number of
chapters and sections in the entire e-book. In addition, we use a node to store each
sentence in every section. Once these settings are complete, we can display and
find the information synchronously.
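The sentence-level step of the index mechanism can be sketched as below. This is a minimal illustration, not the authors' implementation: the class SentenceIndexer and its methods are hypothetical, sentence boundaries come from the JDK's java.text.BreakIterator (the paper does not say how sentences are segmented), and it is assumed that the startp attribute of a sentence holds its logical page number.

```java
import java.text.BreakIterator;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: adds sentence nodes under a section element of the index.
final class SentenceIndexer {

    /** Creates an empty index document whose root element is catalog. */
    static Document newCatalog() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("catalog"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no XML parser available", e);
        }
    }

    /**
     * Splits text into sentences and appends one sentence element per sentence
     * to the section, tagging each with the logical page. Returns the count added.
     */
    static int addSentences(Element section, String text, int logicalPage) {
        Document doc = section.getOwnerDocument();
        BreakIterator boundaries = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        boundaries.setText(text);
        int added = 0;
        int start = boundaries.first();
        for (int end = boundaries.next(); end != BreakIterator.DONE;
                start = end, end = boundaries.next()) {
            String sentence = text.substring(start, end).trim();
            if (sentence.isEmpty()) {
                continue;
            }
            Element node = doc.createElement("sentence");
            node.setAttribute("startp", Integer.toString(logicalPage)); // assumed meaning
            node.setTextContent(sentence);
            section.appendChild(node);
            added++;
        }
        return added;
    }
}
```

In practice the text of a section may span several pages, so a fuller version would track the page on which each sentence begins instead of taking a single page number per call.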
1) The synchronous mechanism of XML.
Because the index is derived from the XML file, we only need to find the node that
the user selects and print its information on the screen.
2) The synchronous mechanism of image.
Because the content of many e-books does not start at physical page one, there is
an inconsistency between the logical page and the physical page. To solve this
problem, we record the number of physical pages that precede the article body,
where the article body starts at logical page one. We call this number of pages the "distance".
The same applies to the synchronous mechanism of PDF.
When finding a sentence, we first find the logical page number from the index.
This is not the final page number, however; we add the "distance" to this
value, and the result is the physical page number where the sentence
occurs. Next, we find the image by its sequence number and return it to the Label.
3) The synchronous mechanism of PDF.
Because of the "distance", we cannot go directly to the page according to the logical
page number derived from the index. We add the "distance" to the logical page
number to obtain the final page number. Next, we call the function
that ICEpdf.jar provides to go to that page.
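The lookup shared by the image and PDF mechanisms (find the sentence in the index, read its logical page, add the "distance") can be sketched as follows. The class PageLocator and its methods are hypothetical names, the JDK DOM parser stands in for whatever XML library the tool uses, and it is assumed that startp on a sentence element is its logical page number.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: maps a phrase in the index to the physical page to display.
final class PageLocator {

    /** Parses the XML index; checked parser exceptions are wrapped for brevity. */
    static Document parseIndex(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("invalid index XML", e);
        }
    }

    /** Logical page of the first sentence containing the phrase, or -1 if none does. */
    static int findLogicalPage(Document index, String phrase) {
        NodeList sentences = index.getElementsByTagName("sentence");
        for (int i = 0; i < sentences.getLength(); i++) {
            Element sentence = (Element) sentences.item(i);
            if (sentence.getTextContent().contains(phrase)) {
                return Integer.parseInt(sentence.getAttribute("startp"));
            }
        }
        return -1;
    }

    /** Physical page = logical page + distance (pages that precede logical page one). */
    static int toPhysicalPage(int logicalPage, int distance) {
        if (logicalPage < 1 || distance < 0) {
            throw new IllegalArgumentException("logicalPage must be >= 1 and distance >= 0");
        }
        return logicalPage + distance;
    }

    public static void main(String[] args) {
        String xml = "<catalog><chapter><section>"
                + "<sentence startp=\"2\">this is an information age.</sentence>"
                + "</section></chapter></catalog>";
        int logical = findLogicalPage(parseIndex(xml), "information age");
        System.out.println(toPhysicalPage(logical, 12));
    }
}
```

The resulting number is one-based. If the viewer or image sequence counts pages from zero, which should be checked against its documentation, one must be subtracted before the page is requested.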
Results analysis and discussion
Through the implementation of the three modes, we can present the content of a PDF
document in three different formats. However, their performance differs. The
differences among them are shown in Table 1:
XML index listing (the final index of the index mechanism):
<?xml version="1.0" encoding="utf-8"?>
<catalog>
  <chapter chaptername="chapter0" startp="1" chaptercount="1" endp="12">
    <section sectionname="section1" startp="2" fullseccount="2" endp="4">
      <sentence startp="2" fullcount="60">this is an information age.</sentence>
    </section>
  </chapter>
</catalog>
From Table 1, we can clearly see that the PDF display is superior to
the other two in displaying the PDF document content completely. It can show almost
all of the PDF document content. For example, a graph is presented
completely in the PDF display, but is absent in the other two, as shown in Figure 3,
Figure 4 and Figure 5. In terms of efficiency, the XML display runs the fastest
despite its poor display performance. For PDF documents that consist
almost entirely of simple text, such as documents of literature, history and
philosophy, the XML display may be a good choice. But for an art album,
the image display may be a better choice. We can choose different modes to
view a PDF document according to the actual problem. Of course, we can also assess
their performance in terms of aesthetic feeling, usability and legibility. In
short, embedding the PDF is an effective and suitable way to display the original PDF
document content, and can improve the user experience.
The significant contribution of this study lies in integrating a PDF interface into the
Java application to achieve the goal of viewing and operating on the PDF document
synchronously. Based on this study, practical implications are discussed. It can
improve the digital library's usability in four respects.
• First, it can be used to retrieve and visualize the PDF document. A digital library
provides a comprehensive collection of digital resources and services that are
accessible through the Web. Every day there is a large amount of search
activity. The search engine always provides a set of keywords for users, but the
major search engines typically do not index the content of PDF documents at all
(Lawrence et al., 1999). Therefore, users can search the XML index in order to
retrieve the PDF document. If possible, it would benefit the digital
library to construct an index library of PDF document content based on XML. In
addition, we can visualize the PDF document information by exploiting XML's strength
in storing content, as shown in Figure 6. This will greatly enhance users' retrieval
and reading efficiency. It could also be used in other areas such as
anti-plagiarism services (Patel et al., 2011), e-commerce (Seng and Lai, 2010)
and teaching (Carlock and Perry, 2008).
• Second, it can be used to support the personalized function customization of PDF.
Due to the security and privacy limitations of PDF (Castiglione et al., 2010),
few tools or software packages are available and their functions are limited. Users
cannot customize the functionality they need according to their own preferences.
Kani-Zabihi et al. surveyed what users want from digital libraries and
found that, based on users' previous experiences with digital libraries, their
requirements with respect to specific features may change (Kani-Zabihi et al.,
2006). Many studies have emphasized the importance of user-centered design. Our
study provides the interface and index for researchers to carry out further
development. For example, users can integrate Lucene into the interface to
help them view and analyze the PDF full-text content synchronously. Besides,
users can also put visual information about the PDF full-text content into the
interface to help them browse and view. As shown in Figure 6, in a related
study, we use a series of concentric circles to visualize the full text, a certain