CHAPTER 1
Generate PDF Documents with Artefact
Olivier Auverlot and Guillaume Larchevêque with Johan Fabry
The Adobe PDF format is probably one of the most widespread electronic
document formats. Used daily, it is the basis for the production of
exchangeable documents that contain both text and graphics. If you receive a
bill, follow a purchase on a website, download a report, a book or an
administrative form, these files will most likely be PDF documents. For
programmers that need to provide any such reporting functionality, supporting
this format has become a must and the generation of PDF documents is part of
their toolkit.
In Pharo, Artefact is an innovative framework that supports the design and
generation of PDF documents and is developed by Olivier Auverlot and
Guillaume Larchevêque.
1.1 Overview of Artefact
Artefact is a PDF framework whose design was guided by the goals of
efficiency, productivity and scalability. To achieve this, each document is
described by a tree of objects. A document is an object containing a
collection of other objects, each corresponding to a page. On each page both
visible and non-visible items are also objects. These objects can then be
reused in the same document but also across documents. Objects are elements
that can be simple, e.g. a piece of text or an image, but also complex
elements with advanced behavior and a special appearance, e.g. elements that
display data in a table or generate a barcode.
Artefact contains default elements such as paragraphs or tables that allow
you to quickly generate reports. The strength of these elements is that they are
independent of each other. The order in which you position them in the
document does not affect their appearance. Many PDF frameworks exploit the
notion of a stream in the definition of styles (a piece of blue text will be
followed by another piece of blue text in the absence of a directive to use a
different style). In contrast, Artefact considers that every element includes
its own style. If an attribute is not defined in the element, Artefact then
uses a style sheet that is set at the document level by default. This
autonomy of elements and style management is a strong feature of Artefact. It
makes it easy to generate a document and quickly customize it for a
particular operation.
Concepts, Key Aspects and Limits
After more than a year of development, the concepts used in Artefact are
considered stable and it is already used in industry. In this section we list
its current features and known limitations.
• Artefact has a simple architecture that facilitates scalability and new
  features.
• It supports the definition of a PDF document and its contents.
• It can specify meta information such as title or author.
• It manages display options when opening a document in a reader that is
  compatible with this feature.
• It supports compressed PDF document generation.
Each page of a PDF document can have its own particular format and
orientation. By default, Artefact supports a set of common formats, e.g. A3,
A4, or ebook. It can easily be extended to fit specific needs. Page location
is determined not when the page is created but when it is added to a
document. Hence each page is independent, which allows one to generate
documents with variable architecture.
On each page, Artefact places simple or complex elements. A complex element
is generally defined using simple elements or other complex elements. Each
element is independent and is positioned relative to the upper left corner of
a page.
Artefact provides greyscale management and colors defined by the RGB model
(where each color component is represented by one byte). Character fonts are
those imposed by the PDF specification; Artefact does not support the
TrueType font (TTF) specification. You can insert images into a PDF document
but only the JPEG format is currently supported. Artefact does not support
the definition of interactive input fields, the integration of JavaScript, or
security aspects of PDF such as certificates. Of course, these specifications
are subject to change as and when the framework changes.
1.2 Getting Started in 10 Minutes
Say you already program in Pharo and you want to generate PDF documents.
This section will show you how to do so in less than 10 minutes.
First you should load the framework. The good news is that there is no need
for native libraries as Artefact is written entirely in Pharo. Whatever your
execution platform (Microsoft Windows, Mac OS X, Linux, Android, iOS, etc.),
Artefact will be available and usable.
Installing Artefact
Artefact is hosted on SmalltalkHub [1]. To install Artefact, execute the
following expressions:
Gofer new
    smalltalkhubUser: 'RMoD' project: 'Artefact';
    package: 'ConfigurationOfArtefact';
    load.
ConfigurationOfArtefact load
Loading the configuration automatically loads projects such as the Unit
framework (which supports the definition of different measurement units). By
default the configuration loads the stable version, which is production
ready.
Once loaded, you can browse the main packages and classes.
• The Artefact-Examples package contains many usage examples.
• The Artefact-Core package contains the main elements such as documents,
  pages or style sheets but also electronic documentation that is accessible
  via the Help Browser.
• The PDF objects (text, geometric shapes, images, etc.) offered by the basic
  framework are in Artefact-Core-Elements-Basic and
  Artefact-Core-Elements-Composites.
• The fonts are defined in the package Artefact-Core-Fonts and document
  formats in the package Artefact-Core-Formats.
Executing the First Demos
The best way to start with Artefact is to have a look at the
Artefact-Examples-Demos package and to run each of the PDFDemos class
methods.
If you want to run all demos, just execute:
PDFDemos runAllDemos
[1] http://smalltalkhub.com/#!/~RMoD/Artefact
By default each generation result is written in the default Pharo directory
but you can define your own by modifying the demoPath class method, e.g. as
follows:
PDFDemos class>>demoPath
    ^ '/Users/pharo/pdf/'
Finally “Hello World!”
You will now create your first and simplest PDF document, which is a text on
a page. To do this you must define an instance of a PDF document that
contains a page where you will position a text component.
PDFDocument new
    exportTo: 'helloworld.pdf' asFileReference writeStream
Once the instance of PDFDocument is created, it is exported using a stream to
a file named helloworld.pdf. By default the produced PDF document is placed
in the directory of Pharo. If you open the file, it is empty. This is normal
since you have not yet defined and added any content to the document.
Let us enrich the previous example and add a page to the document:
PDFDocument new
    add: PDFPage new;
    exportTo: 'helloworld.pdf' asFileReference writeStream
Now if you open the file, the result is different since the document contains
an empty page. Let us add a first text component to our page:
PDFDocument new
    add: (PDFPage new
        add: (PDFTextElement new text: 'Hello World!'; from: 10 mm @ 10 mm));
    exportTo: 'helloworld.pdf' asFileReference writeStream
To place the text on the page, we create a component of type PDFTextElement.
We add it to the page and define its position using the message from:. Note
that we can specify dimensions using several units such as millimeters (mm),
centimeters (cm) or inches (inch). These coordinates are defined from the
upper left corner of the page.
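As a small sketch of the unit support just described, the following script places three text elements on one page, each positioned with a different unit. It only uses messages that appear in this chapter. The unit selectors cm and inch are assumed to match the abbreviations given in the text, and the file name units.pdf is an arbitrary choice.

```smalltalk
| page |
page := PDFPage new.
"Each position is measured from the upper left corner of the page."
page add: (PDFTextElement new text: 'At 10 mm @ 10 mm'; from: 10 mm @ 10 mm).
page add: (PDFTextElement new text: 'At 2 cm @ 3 cm'; from: 2 cm @ 3 cm).
page add: (PDFTextElement new text: 'At 1 inch @ 2 inch'; from: 1 inch @ 2 inch).
PDFDocument new
    add: page;
    exportTo: 'units.pdf' asFileReference writeStream
```

Because every element carries its own position, the order of the three add: messages should not change where each text appears.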
Artefact uses a set of defaults to get compact code when creating elements
that are part of a document. More specifically, style parameters are set to
what are considered the most common values. In this example the page format
is set to A4 and its orientation to portrait. The text is written in black
using the Helvetica font.
This first example introduced some basic concepts and shows how simple it is
to produce a PDF document with Pharo. The following sections go deeper in
Artefact and show how to define more complex documents.
1.3 Document Definition
Artefact represents PDF documents as objects that are instances of the class
PDFDocument. They play the role of containers for receiving pages. A
PDFDocument also supports advanced options such as the document size,
management of compression, the opening in the PDF reader and the definition
of meta information.
The order in which pages are added to the PDFDocument object defines the
organization of data within the document, not the order in which the pages
are created. This mode of operation allows you to produce documents whose
contents can be dynamically generated and organized at a later time.
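The following sketch illustrates this point with two pages that are created in one order and added in the other. It relies only on messages used elsewhere in this chapter; the variable names, page texts and file name are illustrative.

```smalltalk
| pdfdoc details summary |
pdfdoc := PDFDocument new.

"The details page is created first..."
details := PDFPage new.
details add: (PDFTextElement new text: 'Details'; from: 10 mm @ 10 mm).

"...and the summary page second."
summary := PDFPage new.
summary add: (PDFTextElement new text: 'Summary'; from: 10 mm @ 10 mm).

"The order of add: decides the page order, so the summary comes first."
pdfdoc add: summary; add: details.
pdfdoc exportTo: 'PageOrder.pdf' asFileReference writeStream
```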
Page Addition
To add pages to a document, the message add: is used. It appends a page after
those already present in the document. When generating the PDF file, Artefact
traverses the list of pages starting from the earliest added to the last. The
following script defines a document with a single blank page.
PDFDocument new
    add: PDFPage new;
    exportTo: 'EmptyPage.pdf' asFileReference writeStream
Document Properties
A PDFDocument can be configured with a specific format, orientation,
compression and display mode, as we show next.
Document Format and Orientation
By default, a document is generated in the A4 format but other formats are
available. The package Artefact-Core-Formats contains a list of predefined
formats covering many needs. Examples are: A3 (PDFA3Format), letter size
(PDFLetterFormat) and a format suitable for e-readers (PDFEbookFormat).
If you need a particular format, you can define it. A format is simply
defined by the value returned by the message defaultSize.
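A minimal sketch of such a custom format is shown below. It makes several assumptions that the text does not confirm: that defaultSize is an instance-side method, that it answers a point built from unit values (width @ height), and that the class name PDFSquareFormat and the package name MyApp-PDF are free to use in your image. Looking at an existing format such as PDFA3Format before relying on this is advisable.

```smalltalk
| squareFormat |
"Hypothetical format class: a 200 mm square page."
squareFormat := PDFFormat subclass: #PDFSquareFormat
    instanceVariableNames: ''
    classVariableNames: ''
    package: 'MyApp-PDF'.

"Assumption: defaultSize answers width @ height expressed with units."
squareFormat compile: 'defaultSize
    ^ 200 mm @ 200 mm'.

PDFDocument new
    format: squareFormat new;
    add: PDFPage new;
    exportTo: 'square.pdf' asFileReference writeStream
```

The method is installed with compile: only so that the whole sketch can be run from a Playground; in practice you would define it in the class browser.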
A PDFDocument accepts the message format: to specify the format of all pages
of the document. For each page, this value will be the default if not
redefined otherwise. Each page can specify a different format. The following
example creates a document using the A3 format:
PDFDocument new
    format: PDFA3Format new;
    add: PDFPage new;
    exportTo: 'A3.pdf' asFileReference writeStream
The abstract superclass of all formats (PDFFormat) is responsible for
defining the page orientation. There are two alternatives: portrait or
landscape. Page orientation is set by sending one of the two messages
setPortrait and setLandscape to the format object.
The following example generates a document whose pages are in A3 format and
landscape orientation.
PDFDocument new
    format: PDFA3Format new setLandscape;
    add: PDFPage new;
    exportTo: 'A3landscape.pdf' asFileReference writeStream
Note that setting the default landscape mode for a document does not exclude
the possibility for a particular page to be oriented in portrait mode.
Artefact fully supports pages of different sizes and different orientations
within a single document.
Compression
The PDF format allows you to compress the data, which is a good thing as a
PDF document can contain large amounts of data. To minimize the size of
generated documents, Artefact compresses the data by default. If you need to
disable this option, you should send the uncompressed message to the
document.
The following example generates an uncompressed PDF document:
PDFDocument new
    uncompressed;
    add: PDFPage new;
    exportTo: 'uncompressed.pdf' asFileReference writeStream
Another message, named compressed, turns compression back on.
Controlling Document Opening
Adobe Acrobat Reader supports various display modes when opening a PDF
document. The selected mode is defined directly in the PDF document. Note
that if the PDF reader that is used to look at the document is not compatible
with these options, they will be ignored.
Display mode properties are divided in two categories: those determining the
size of the pages and those related to the page organization on the screen.
The former are set using the messages fullPage, fullWidth, real and zoom:,
and the latter using singlePage, twoPages and continuousPages. These messages
should be sent to a PDFDocument instance.
With fullPage, each page of the document occupies the entire display space.
With fullWidth, the display is optimized to the page width. With real, the
display meets the dimensions specified in the PDF document.
The following example creates a document that will occupy all available
display space:
pdfdoc := PDFDocument new fullPage.
With the message zoom:, you can define a zoom ratio, expressed as a
percentage. The following example defines that the document should be opened
with a zoom of 400 percent.
pdfdoc := PDFDocument new zoom: 400.
You can also choose to display a single page (singlePage), pages two by two
(twoPages) or one after the other (continuousPages) as in the following
example:
pdfdoc := PDFDocument new continuousPages.
These messages can be combined as shown in the following example:
pdfdoc := PDFDocument new zoom: 200; continuousPages.
Setting Meta Information
Each PDF document contains a set of information about its origins. These data
are not to be overlooked, especially if your document is intended to
contribute to an EDM (Electronic Document Management) system or is part of an
editorial workflow. With this information it is possible to search among a
set of PDF documents and select, for example, those written by a particular
author or those for which certain keywords have been specified.
Artefact implements this information by using an instance of PDFMetaData. To
each instance of PDFDocument, an instance of PDFMetaData is associated and is
accessible using the message metaData. By default, the producer is set to
'Artefact'. You can specify the document title, subject or a short summary,
the name of the author, a list of keywords and the document creator.
The following example generates a new document and its meta data information:
pdfdoc := PDFDocument new.
pdfdoc metaData
    title: 'Document title';
    subject: 'subject of the document';
    author: 'The Pasta Team';
    keywords: 'cool rock best';
    creator: 'Pharo'.
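To see how the document properties of this section fit together, here is a sketch that configures format, compression, display mode and meta information on one document before exporting it. Every message comes from the examples of this section; the title, keywords and file name are illustrative values. Each setting is sent as a separate statement so that the sketch does not depend on what the individual messages return.

```smalltalk
| pdfdoc |
pdfdoc := PDFDocument new.

"Format and compression apply to the whole document."
pdfdoc format: PDFA3Format new setLandscape.
pdfdoc uncompressed.

"Display hints; readers that do not support them ignore them."
pdfdoc zoom: 200.
pdfdoc continuousPages.

"Meta information stored in the document."
pdfdoc metaData
    title: 'Quarterly report';
    author: 'The Pasta Team';
    keywords: 'report sales'.

pdfdoc add: PDFPage new.
pdfdoc exportTo: 'report.pdf' asFileReference writeStream
```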
1.4 Pages, Formats and Models
Pages are the support for writing and drawing in your PDF documents. A page
defines a page size, orientation and position within a PDF document. A page
can be built from a model that provides an overlay on which the page contents
are deposited.
Page Creation
A page is represented by an instance of the class PDFPage. Creating a page is
simply done by sending the message new to the class.
page := PDFPage new.
Sending the message add: to a document with a page as argument will append
the page to the document.
pdfdoc := PDFDocument new.
page := PDFPage new.
pdfdoc add: page.
By default, a page takes the dimensions and orientation of its document. If
your PDF document is A4 landscape, all added pages will use these settings.
However, Artefact can assign specific dimensions and orientation to each
page, allowing one document to have a mix of pages with different
characteristics. To allow this, each instance of PDFPage understands the
message format:, which takes an instance of PDFFormat as argument.
The following example creates a two-page document. The first page uses the
default format of the document, the second is in A3 landscape.
pdfdoc := PDFDocument new.
page1 := PDFPage new.
page2 := PDFPage new format: (PDFA3Format new setLandscape).
pdfdoc add: page1; add: page2.
Templates
A template is an instance of class PDFTemplate, which inherits from the class
PDFPage. It is a page with predefined contents that will act as the
background page on which you will draw or add your components. For example,
it can be composed of a header for a letter, a header and a footer for a
report, or a delimited surface.
The package Artefact-Examples-Demos-Templates offers two example templates to
create CD or DVD sleeve pages. The following code snippet produces an A4 page
on which the outlines of a CD sleeve are drawn.
pdfdoc := PDFDocument new.
cover := PDFCompactDiscTemplate new.
pdfdoc add: cover.
A template is defined using the message drawTemplate, which adds the Artefact
elements to the page. This builds the page background. For example, the code
of the CD template is below. (As it is relatively straightforward we do not
explain the code here.)
PDFCompactDiscTemplate>>drawTemplate
    self add: ((PDFRectElement
        from: 10 mm @ 10 mm
        dimension: 240 mm @ 120 mm)
        dotted: self dotted).
    self add: ((PDFLineElement
        from: 130 mm @ 10 mm
        to: 130 mm @ 130 mm)
        dotted: self dotted).
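Following the same pattern, a template of your own could look like the sketch below: a simple letterhead made of a text and a horizontal line. The class name PDFLetterheadTemplate, the package name MyApp-PDF, the coordinates and the file name are all hypothetical. The sketch also assumes that the framework invokes drawTemplate itself, as the CD template suggests, so no explicit call is made.

```smalltalk
| letterhead pdfdoc |
"Hypothetical template class; name and package are illustrative."
letterhead := PDFTemplate subclass: #PDFLetterheadTemplate
    instanceVariableNames: ''
    classVariableNames: ''
    package: 'MyApp-PDF'.

"drawTemplate adds the background elements, as in the CD template."
letterhead compile: 'drawTemplate
    self add: (PDFTextElement new text: ''The Pasta Team''; from: 10 mm @ 10 mm).
    self add: (PDFLineElement from: 10 mm @ 20 mm to: 200 mm @ 20 mm)'.

"A template is a page, so it is added to a document like any other page."
pdfdoc := PDFDocument new.
pdfdoc add: letterhead new.
pdfdoc exportTo: 'letterhead.pdf' asFileReference writeStream
```

The method is installed with compile: only to keep the sketch runnable from a Playground; defining it in the class browser is the usual route.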
1.5 Elements
The contents of pages are defined using reusable components called elements.
Artefact has basic elements that perform simple operations such as drawing a
line, but also complex elements that can, for example, display data in a
table or generate a barcode. Of course, it is possible to identify and define
new components.
More specifically, a PDFElement is a reusable component that represents a
text, an image, a geometric shape or even a complex graph or table. There are
two kinds of PDFElement:
• Simple elements inherit from PDFBasic (a primitive operation in the PDF
  specification).
• Composite elements inherit from PDFComposite (a wrapper around multiple
  PDFElements whether they are basic or composite).
Simple elements are as follows, and their hierarchy is shown in Figure 1.1:
PDFBezierCurveElement
PDFCircleElement
PDFLineElement
PDFPolygonElement
PDFRectElement
PDFJpegElement
PDFTextElement
Composite elements are as follows, and their hierarchy is shown in Figure
1.2:
Figure 1.1 Page and Document Elements
Figure 1.2 Composite Elements