Inspect a PDF with iText
How to Get PDF page width and height?
I have a PDF, and I want to get the width and height of each page using iTextSharp. This is what I have so far:
string source=@"D:\pdf\test.pdf";
PdfReader reader = new PdfReader(source);
Posted on StackOverflow on Aug 13, 2013⁸³ by Mohamed Kamal⁸⁴
Do you want the MediaBox?
Rectangle mediabox = reader.GetPageSize(page);
Do you want the rotation?
int rotation = reader.GetPageRotation(page);
Do you want the combination of both?
Rectangle pagesize = reader.GetPageSizeWithRotation(page);
Do you want the CropBox?
Rectangle cropbox = reader.GetCropBox(page);
These are some methods that will give you information about the dimensions of a page. Most of them return an object of type Rectangle that has methods such as getWidth() and getHeight() to get the width and the height of the page. Other useful methods are getLeft() and getRight() as well as getTop() and getBottom(). These four methods return the x and y coordinates that define the boundaries of your page. In iTextSharp, the same values are exposed as the properties Width, Height, Left, Right, Top and Bottom.
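To make the relation between the four boundary coordinates and the page dimensions concrete, here is a minimal sketch in plain Java. The PageBox class and its methods are hypothetical names used only for illustration; this is not the iText Rectangle class. The sketch assumes that the rotation is a multiple of 90 degrees.

```java
/** Hypothetical stand-in for a page boundary such as the MediaBox (not an iText class). */
final class PageBox {
    final float left, bottom, right, top;

    PageBox(float left, float bottom, float right, float top) {
        this.left = left;
        this.bottom = bottom;
        this.right = right;
        this.top = top;
    }

    /** Width is the distance between the left and right x coordinates. */
    float width() {
        return right - left;
    }

    /** Height is the distance between the bottom and top y coordinates. */
    float height() {
        return top - bottom;
    }

    /**
     * Returns {width, height} as displayed after applying the page rotation.
     * For 90 or 270 degrees the two values are swapped; for 0 or 180 degrees
     * they are unchanged. Assumes the rotation is a multiple of 90.
     */
    float[] sizeWithRotation(int rotation) {
        int normalized = ((rotation % 360) + 360) % 360;
        if (normalized == 90 || normalized == 270) {
            return new float[] { height(), width() };
        }
        return new float[] { width(), height() };
    }
}
```

With a box of [0 0 612 792], width() is 612 and height() is 792; with a rotation of 90 degrees, sizeWithRotation(90) gives 792 by 612.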
⁸³ http://stackoverflow.com/questions/18202660/how-to-get-pdf-page-width-and-height
⁸⁴ http://stackoverflow.com/users/2677532/mohamed-kamal
Manipulating existing PDFs
In this chapter, we're going to solve some problems when working with existing PDFs that need to be split into different files, merged or stamped. Usually, we are going to use a combination of PdfReader to read the document and PdfStamper, PdfCopy or PdfSmartCopy to create a new PDF. Note that we'll skip filling out interactive forms for now. We'll deal with AcroForm and XFA technology in the next chapter.
How to update a PDF without creating a new PDF?
I need to change the value of a field in an existing PDF file. I am using PdfReader, PdfStamper and AcroFields and that's working fine. But, in doing so, it is required to create a new PDF and I would like the change to be reflected in the existing PDF itself. If I am setting the destination filename to be the same as the original filename, then my application fails.
Posted on StackOverflow on Apr 18, 2013⁸⁵ by tk2013⁸⁶
You can't read a file and write to it simultaneously. Think of how Microsoft Word works: you can't open a Word document and write directly to it. Word always creates a temporary file, writes the changes to it, then replaces the original file with it, and then throws away the temporary file.
You can do that too:
• read the original file with PdfReader;
• create a temporary file for PdfStamper, and when you're done,
• replace the original file with the temporary file.
Or:
• read the original file into a byte[],
• create PdfReader with this byte[], and
• use the path to the original file for PdfStamper.
The latter option is more dangerous, as you'll lose the original file if you do something that causes an exception in PdfStamper. If I were you, I'd create a temporary file.
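The three steps of the temporary-file approach can be sketched in plain Java using only the standard library. The class name InPlaceUpdate, the method update and the transform parameter are hypothetical; the transform stands in for whatever PdfReader and PdfStamper do to the bytes, and the sketch assumes that the file fits in memory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.UnaryOperator;

final class InPlaceUpdate {
    /**
     * Rewrites a file "in place" by going through a temporary file.
     * The transform is a placeholder for the actual PDF manipulation.
     */
    static void update(Path original, UnaryOperator<byte[]> transform) throws IOException {
        // Create the temporary file next to the original so the final move
        // stays within one directory.
        Path dir = original.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "update-", ".tmp");
        try {
            // Step 1: read the original file.
            byte[] input = Files.readAllBytes(original);
            // Step 2: write the changed content to the temporary file.
            Files.write(temp, transform.apply(input));
            // Step 3: replace the original file with the temporary file.
            Files.move(temp, original, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Does nothing after a successful move; removes the temporary
            // file if an exception occurred before the move.
            Files.deleteIfExists(temp);
        }
    }
}
```

If the transform throws an exception, the original file is left untouched and only the temporary file is discarded, which is the reason this option is safer than writing straight to the original path.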
⁸⁵ http://stackoverflow.com/questions/16081831/using-itextsharp-stamper-required-to-update-in-the-same-pdf
⁸⁶ http://stackoverflow.com/users/2239456/tk2013
How to add an image watermark to a PDF file?
I'm using C# and iTextSharp to add a watermark to my PDF files:
Document document = new Document();
PdfReader pdfReader = new PdfReader(strFileLocation);
PdfStamper pdfStamper = new PdfStamper(pdfReader, new FileStream(strFileLocationOut, FileMode.Create, FileAccess.Write, FileShare.None));
iTextSharp.text.Image img = iTextSharp.text.Image.GetInstance(WatermarkLocation);
img.SetAbsolutePosition(100, 300);
PdfContentByte waterMark;
for (int pageIndex = 1; pageIndex <= pdfReader.NumberOfPages; pageIndex++) {
    waterMark = pdfStamper.GetOverContent(pageIndex);
    waterMark.AddImage(img);
}
pdfStamper.FormFlattening = true;
pdfStamper.Close();
It works fine, but my problem is that in some PDF files no watermark is added although the file size increased, any idea?
Posted on StackOverflow on Jul 8, 2013⁸⁷ by Abady⁸⁸
The fact that the file size increases is a good indication that the watermark is added. The main problem is that you're adding the watermark outside the visible area of the page. See my answer to the question How to position text relative to page using iText?
You need something like this:
Rectangle pagesize = pdfReader.GetCropBox(pageIndex);
if (pagesize == null)
    pagesize = pdfReader.GetPageSize(pageIndex);
img.SetAbsolutePosition(
    pagesize.Left,
    pagesize.Bottom);
That is: if you want to add the image in the lower-left corner of the page. You can add an offset, but make sure the offset in the x direction doesn't exceed the width of the page, and the offset in the y direction doesn't exceed the height of the page.
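The offset rule can be sketched as a small calculation. The example below is plain Java with no iText dependency; the class WatermarkPosition and the method place are hypothetical helpers, and the box coordinates are assumed to come from the crop box (or the media box) of the page.

```java
final class WatermarkPosition {
    /**
     * Returns {x, y} for the lower-left corner of an image. The offsets are
     * measured from the lower-left corner of the visible box and clamped so
     * that they never exceed the width or the height of that box.
     */
    static float[] place(float left, float bottom, float right, float top,
                         float offsetX, float offsetY) {
        float width = right - left;
        float height = top - bottom;
        // Keep each offset between 0 and the corresponding page dimension.
        float dx = Math.max(0f, Math.min(offsetX, width));
        float dy = Math.max(0f, Math.min(offsetY, height));
        return new float[] { left + dx, bottom + dy };
    }
}
```

For a page whose visible box is [50 50 250 150], the requested offset (100, 300) is reduced to (100, 100), so the image origin ends up at (150, 150) instead of far above the page. Clamping only guarantees that the origin of the image is inside the box; to keep the whole image visible, the image width and height would have to be subtracted as well.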
⁸⁷ http://stackoverflow.com/questions/17522965/how-to-add-a-watermark-to-a-pdf-file
⁸⁸ http://stackoverflow.com/users/1450667/abady
Why does the function to concatenate / merge PDFs cause issues in some cases?
I'm using the following code to merge PDFs together using iText:
public static void concatenatePdfs(List<File> listOfPdfFiles, File outputFile)
        throws DocumentException, IOException {
    Document document = new Document();
    FileOutputStream outputStream = new FileOutputStream(outputFile);
    PdfWriter writer = PdfWriter.getInstance(document, outputStream);
    document.open();
    PdfContentByte cb = writer.getDirectContent();
    for (File inFile : listOfPdfFiles) {
        PdfReader reader = new PdfReader(inFile.getAbsolutePath());
        for (int i = 1; i <= reader.getNumberOfPages(); i++) {
            document.newPage();
            PdfImportedPage page = writer.getImportedPage(reader, i);
            cb.addTemplate(page, 0, 0);
        }
    }
    document.close();
}
This usually works great! But once in a while, it's rotating some of the pages by 90 degrees. Anyone ever have this happen?
Posted on StackOverflow on Apr 14, 2014⁸⁹ by Nicholas DiPiazza⁹⁰
There are errors once in a while because you are using the wrong method to concatenate documents. You should not use PdfWriter to concatenate (or merge) PDF documents. That is wrong because:
• You completely ignore the page size of the pages in the original document (you assume they are all of size A4),
• You ignore page boundaries such as the crop box (if present),
• You ignore the rotation value stored in the page dictionary,
• You throw away all interactivity that is present in the original document, and so on.
Concatenating PDFs is done using PdfCopy, see for instance:
⁸⁹ http://stackoverflow.com/questions/23062345/function-that-can-use-itext-to-concatenate-merge-pdfs-together-causing-some
⁹⁰ http://stackoverflow.com/users/1174024/nicholas-dipiazza
Document document = new Document();
PdfCopy copy = new PdfSmartCopy(document, new FileOutputStream(dest));
document.open();
PdfReader reader;
String line = br.readLine();
// loop over readers
    // add the PDF to PdfCopy
    reader = new PdfReader(baos.toByteArray());
    copy.addDocument(reader);
    reader.close();
// end loop
document.close();
If you are merging documents that contain fields, you need to add the following line:
copy.setMergeFields();
How to merge documents correctly?
I would like to add a link to an existing pdf that jumps to a coordinate on another page.
I have the following problem: when printing the PDF file after the merge, the PDF documents get cut off. Sometimes this happens because the documents aren't 8.5 x 11, whereas the page size might be 11 x 17.
Is there some way to detect the page size and then use that same page size for those documents? Or, if not, is it possible to have it fit to page? This is my code:
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
document.open();
BaseFont bf = BaseFont.createFont(BaseFont.HELVETICA,
    BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
PdfContentByte cb = writer.getDirectContent();
PdfImportedPage page;
int currentPageNumber = 0;
int pageOfCurrentReaderPDF = 0;
Iterator<PdfReader> iteratorPDFReader = readers.iterator();
while (iteratorPDFReader.hasNext()) {
    PdfReader pdfReader = iteratorPDFReader.next();
    while (pageOfCurrentReaderPDF < pdfReader.getNumberOfPages()) {
        Rectangle r = pdfReader.getPageSize(
            pdfReader.getPageN(pageOfCurrentReaderPDF + 1));
        if (r.getWidth() == 792.0 && r.getHeight() == 612.0)
            document.setPageSize(PageSize.A4.rotate());
        else
            document.setPageSize(PageSize.A4);
        document.newPage();
        pageOfCurrentReaderPDF++;
        currentPageNumber++;
        page = writer.getImportedPage(pdfReader, pageOfCurrentReaderPDF);
        cb.addTemplate(page, 0, 0);
        cb.beginText();
        cb.setFontAndSize(bf, 9);
        cb.showTextAligned(PdfContentByte.ALIGN_CENTER, ""
            + currentPageNumber + " of " + totalPages, 520, 5, 0);
        cb.endText();
    }
    pageOfCurrentReaderPDF = 0;
}
document.close();
Screenshot
Posted on StackOverflow on Feb 12, 2014⁹¹ by Sumit Vaidya⁹²
Using PdfWriter to merge documents is a bad idea. This has been explained on StackOverflow many times!
Merging documents is done using PdfCopy (or PdfSmartCopy).
If you need an example, see for instance the FillFlattenMerge2⁹³ example:
Document document = new Document();
PdfCopy copy = new PdfSmartCopy(document, new FileOutputStream(dest));
document.open();
PdfReader reader;
String line = br.readLine();
// loop over readers
    // add the PDF to PdfCopy
    reader = new PdfReader(baos.toByteArray());
    copy.addDocument(reader);
    reader.close();
// end loop
document.close();
⁹¹ http://stackoverflow.com/questions/21731439/pdf-page-cutting-through-itext-api
⁹² http://stackoverflow.com/users/2853641/sumit-vaidya
⁹³ http://itextpdf.com/sandbox/acroforms/reporting/FillFlattenMerge2
In your case, you also need to add page numbers. You can do this in a second go, as is done in the StampPageXofY⁹⁴ example:
PdfReader reader = new PdfReader(src);
int n = reader.getNumberOfPages();
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
PdfContentByte pagecontent;
for (int i = 0; i < n; ) {
    pagecontent = stamper.getOverContent(++i);
    ColumnText.showTextAligned(pagecontent, Element.ALIGN_RIGHT,
        new Phrase(String.format("page %s of %s", i, n)), 559, 806, 0);
}
stamper.close();
reader.close();
Or you can add them while merging, as is done in the MergeWithToc⁹⁵ example.
Document document = new Document();
PdfCopy copy = new PdfCopy(document, new FileOutputStream(filename));
PageStamp stamp;
document.open();
int n;
int pageNo = 0;
PdfImportedPage page;
Chunk chunk;
for (Map.Entry<String, PdfReader> entry : filesToMerge.entrySet()) {
    n = entry.getValue().getNumberOfPages();
    for (int i = 0; i < n; ) {
        pageNo++;
        page = copy.getImportedPage(entry.getValue(), ++i);
        stamp = copy.createPageStamp(page);
        chunk = new Chunk(String.format("Page %d", pageNo));
        if (i == 1)
            chunk.setLocalDestination("p" + pageNo);
        ColumnText.showTextAligned(stamp.getUnderContent(),
            Element.ALIGN_RIGHT, new Phrase(chunk),
            559, 810, 0);
        stamp.alterContents();
        copy.addPage(page);
    }
}
document.close();
for (PdfReader r : filesToMerge.values()) {
    r.close();
}
reader.close();
⁹⁴ http://itextpdf.com/sandbox/stamper/StampPageXofY
⁹⁵ http://itextpdf.com/sandbox/merge/MergeWithToc
I strongly advise against using PdfWriter to merge documents! As documented, you are throwing away all annotations by adding PdfImportedPage instances to a document using addTemplate(). This is typically not what you want. You're only making it harder on yourself if you insist on using that class. I don't understand why so many people use the wrong approach to merge documents. I blame the unofficial documentation for the popularity of this wrong approach.
Interactive forms
Is your form based on AcroForm technology or is it based on the XML Forms Architecture? That's a common counter-question you'll be confronted with when asking a question about forms. In any case, these answers should help you solve the most common problems with respect to forms.
How to fill out a pdf file programmatically? (AcroForm technology)
What techniques are available to fill out a PDF form automatically using external data and save the result? I have to use data from a database to fill a template PDF and save a copy of it on disk with that data. Language and platform are not an issue, but it would be good if it can run on Windows and Linux.
Posted on StackOverflow on Jun 24, 2010⁹⁶ by affan⁹⁷
If your form is based on AcroForm technology, you can use iText to fill it out like this:
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
AcroFields form = stamper.getAcroFields();
form.setField(key, value);
stamper.setFormFlattening(true);
stamper.close();
reader.close();
In this snippet, src is the source of a PDF file (could be a path to a file, could be a byte[]) and dest is the path to the resulting PDF. The key corresponds with the name of a field in your template. The value corresponds with the value you want to fill in. If you want the form to keep its interactivity, you need to remove the line stamper.setFormFlattening(true); otherwise all form fields will be removed, resulting in a flat PDF.
⁹⁶ http://stackoverflow.com/questions/3108704/how-to-fill-out-a-pdf-file-programatically
⁹⁷ http://stackoverflow.com/users/109769/affan