This can be substituted into Eq. (6) to determine the sequence loss (up to a constant that depends only on the quantisation of the data and does not influence network training):

$$
\mathcal{L}(\mathbf{x}) = \sum_{t=1}^{T} -\log\left(\sum_{j} \pi_t^j\, \mathcal{N}\!\left(x_{t+1} \mid \mu_t^j, \sigma_t^j, \rho_t^j\right)\right) - \begin{cases} \log e_t & \text{if } (x_{t+1})_3 = 1\\ \log(1 - e_t) & \text{otherwise} \end{cases} \tag{26}
$$
The derivative of the loss with respect to the end-of-stroke outputs is straightforward:

$$
\frac{\partial \mathcal{L}(\mathbf{x})}{\partial \hat{e}_t} = (x_{t+1})_3 - e_t \tag{27}
$$
The derivatives with respect to the mixture density outputs can be found by first defining the component responsibilities $\gamma_t^j$:

$$
\hat{\gamma}_t^j = \pi_t^j\, \mathcal{N}\!\left(x_{t+1} \mid \mu_t^j, \sigma_t^j, \rho_t^j\right) \tag{28}
$$

$$
\gamma_t^j = \frac{\hat{\gamma}_t^j}{\sum_{j'=1}^{M} \hat{\gamma}_t^{j'}} \tag{29}
$$
Then observing that

$$
\frac{\partial \mathcal{L}(\mathbf{x})}{\partial \hat{\pi}_t^j} = \pi_t^j - \gamma_t^j \tag{30}
$$

$$
\frac{\partial \mathcal{L}(\mathbf{x})}{\partial \left(\hat{\mu}_t^j, \hat{\sigma}_t^j, \hat{\rho}_t^j\right)} = -\gamma_t^j\, \frac{\partial \log \mathcal{N}\!\left(x_{t+1} \mid \mu_t^j, \sigma_t^j, \rho_t^j\right)}{\partial \left(\hat{\mu}_t^j, \hat{\sigma}_t^j, \hat{\rho}_t^j\right)} \tag{31}
$$
where

$$
\frac{\partial \log \mathcal{N}(x \mid \mu, \sigma, \rho)}{\partial \hat{\mu}_1} = \frac{C}{\sigma_1}\left(\frac{x_1 - \mu_1}{\sigma_1} - \frac{\rho\,(x_2 - \mu_2)}{\sigma_2}\right) \tag{32}
$$

$$
\frac{\partial \log \mathcal{N}(x \mid \mu, \sigma, \rho)}{\partial \hat{\mu}_2} = \frac{C}{\sigma_2}\left(\frac{x_2 - \mu_2}{\sigma_2} - \frac{\rho\,(x_1 - \mu_1)}{\sigma_1}\right) \tag{33}
$$

$$
\frac{\partial \log \mathcal{N}(x \mid \mu, \sigma, \rho)}{\partial \hat{\sigma}_1} = \frac{C\,(x_1 - \mu_1)}{\sigma_1}\left(\frac{x_1 - \mu_1}{\sigma_1} - \frac{\rho\,(x_2 - \mu_2)}{\sigma_2}\right) - 1 \tag{34}
$$

$$
\frac{\partial \log \mathcal{N}(x \mid \mu, \sigma, \rho)}{\partial \hat{\sigma}_2} = \frac{C\,(x_2 - \mu_2)}{\sigma_2}\left(\frac{x_2 - \mu_2}{\sigma_2} - \frac{\rho\,(x_1 - \mu_1)}{\sigma_1}\right) - 1 \tag{35}
$$

$$
\frac{\partial \log \mathcal{N}(x \mid \mu, \sigma, \rho)}{\partial \hat{\rho}} = \frac{(x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1 \sigma_2} + \rho\,(1 - C Z) \tag{36}
$$

with $Z$ defined as in Eq. (25) and

$$
C = \frac{1}{1 - \rho^2} \tag{37}
$$
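To make the computation concrete, here is a minimal numpy sketch (my own, not code from the paper) of the per-timestep loss of Eq. (26) and the responsibilities and weight gradient of Eqs. (28)-(30); the function name, argument layout and vectorisation are assumptions.

```python
import numpy as np

def mdn_loss_step(pi, mu, sigma, rho, e, x, eos):
    # pi: (M,) mixture weights; mu, sigma: (M, 2); rho: (M,) correlations;
    # e: scalar end-of-stroke probability; x: (2,) next-point offset target;
    # eos: binary end-of-stroke target (x_{t+1})_3.
    d = (x - mu) / sigma                                        # (M, 2) standardised offsets
    Z = d[:, 0]**2 + d[:, 1]**2 - 2 * rho * d[:, 0] * d[:, 1]   # as in Eq. (25)
    norm = 2 * np.pi * sigma[:, 0] * sigma[:, 1] * np.sqrt(1 - rho**2)
    N = np.exp(-Z / (2 * (1 - rho**2))) / norm                  # bivariate Gaussian densities
    gamma_hat = pi * N                                          # Eq. (28)
    gamma = gamma_hat / gamma_hat.sum()                         # Eq. (29): responsibilities
    loss = -np.log(gamma_hat.sum())                             # location term of Eq. (26)
    loss += -np.log(e) if eos == 1 else -np.log(1 - e)          # end-of-stroke term of Eq. (26)
    dL_dpi_hat = pi - gamma                                     # Eq. (30)
    return loss, gamma, dL_dpi_hat
```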
Fig. 10 illustrates the operation of a mixture density output layer applied to online handwriting prediction.
Figure 10: Mixture density outputs for handwriting prediction. The top heatmap shows the sequence of probability distributions for the predicted pen locations as the word 'under' is written. The densities for successive predictions are added together, giving high values where the distributions overlap.

Two types of prediction are visible from the density map: the small blobs that spell out the letters are the predictions as the strokes are being written, while the three large blobs are the predictions at the ends of the strokes for the first point in the next stroke. The end-of-stroke predictions have much higher variance because the pen position was not recorded when it was off the whiteboard, and hence there may be a large distance between the end of one stroke and the start of the next.

The bottom heatmap shows the mixture component weights during the same sequence. The stroke ends are also visible here, with the most active components switching off in three places, and other components switching on: evidently end-of-stroke predictions use a different set of mixture components from in-stroke predictions.
4.2 Experiments

Each point in the data sequences consisted of three numbers: the x and y offset from the previous point, and the binary end-of-stroke feature. The network input layer was therefore size 3. The co-ordinate offsets were normalised to mean 0, std. dev. 1 over the training set. 20 mixture components were used to model the offsets, giving a total of 120 mixture parameters per timestep (20 weights, 40 means, 40 standard deviations and 20 correlations). A further parameter was used to model the end-of-stroke probability, giving an output layer of size 121. Two network architectures were compared for the hidden layers: one with three hidden layers, each consisting of 400 LSTM cells, and one with a single hidden layer of 900 LSTM cells. Both networks had around 3.4M weights. The three-layer network was retrained with adaptive weight noise [8], with all std. devs. initialised to 0.075. Training with fixed variance weight noise proved ineffective, probably because it prevented the mixture density layer from using precisely specified weights.
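As a quick sanity check (a sketch of mine, not from the paper), the 121 outputs follow directly from the mixture parameterisation:

```python
# Output layer size for M bivariate Gaussian mixture components plus a
# single end-of-stroke probability; with M = 20 this gives the 121 above.
M = 20
output_size = M + 2*M + 2*M + M + 1   # weights + means + std. devs + correlations + e
assert output_size == 121
```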
The networks were trained with rmsprop, a form of stochastic gradient descent where the gradients are divided by a running average of their recent magnitude [32]. Define $\epsilon_i = \frac{\partial \mathcal{L}(\mathbf{x})}{\partial w_i}$, where $w_i$ is network weight $i$. The weight update equations were:
$$
n_i = \aleph n_i + (1 - \aleph)\,\epsilon_i^2 \tag{38}
$$

$$
g_i = \aleph g_i + (1 - \aleph)\,\epsilon_i \tag{39}
$$

$$
\Delta_i = \beth \Delta_i - \gimel\, \frac{\epsilon_i}{\sqrt{n_i - g_i^2 + \daleth}} \tag{40}
$$

$$
w_i = w_i + \Delta_i \tag{41}
$$

with the following parameters:

$$
\aleph = 0.95 \tag{42}
$$

$$
\beth = 0.9 \tag{43}
$$

$$
\gimel = 0.0001 \tag{44}
$$

$$
\daleth = 0.0001 \tag{45}
$$
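The following is a minimal numpy sketch of this rmsprop variant (my own translation of Eqs. (38)-(41), with the Hebrew-letter hyperparameters renamed; not code from the paper):

```python
import numpy as np

def rmsprop_update(w, grad, n, g, delta,
                   decay=0.95, momentum=0.9, lr=1e-4, eps=1e-4):
    # decay, momentum, lr and eps correspond to the four hyperparameters
    # in Eqs. (42)-(45); the state arrays n, g, delta persist across steps
    # and have the same shape as w.
    n = decay * n + (1 - decay) * grad**2                           # Eq. (38)
    g = decay * g + (1 - decay) * grad                              # Eq. (39)
    delta = momentum * delta - lr * grad / np.sqrt(n - g**2 + eps)  # Eq. (40)
    w = w + delta                                                   # Eq. (41)
    return w, n, g, delta
```

Since $n_i - g_i^2$ is a running estimate of the gradient's variance, the step is normalised by the gradient's recent spread rather than its raw magnitude.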
The output derivatives $\frac{\partial \mathcal{L}(\mathbf{x})}{\partial \hat{y}_t}$ were clipped in the range $[-100, 100]$, and the LSTM derivatives were clipped in the range $[-10, 10]$. Clipping the output gradients proved vital for numerical stability; even so, the networks sometimes had numerical problems late on in training, after they had started overfitting on the training data.
Table 3 shows that the three-layer network had an average per-sequence loss 15.3 nats lower than the one-layer net. However, the sum-squared-error was slightly lower for the single-layer network. The use of adaptive weight noise reduced the loss by another 16.7 nats relative to the unregularised three-layer network, but did not significantly change the sum-squared error. The adaptive weight noise network appeared to generate the best samples.
Table 3: Handwriting Prediction Results. All results recorded on the validation set. 'Log-Loss' is the mean value of $\mathcal{L}(\mathbf{x})$ (in nats). 'SSE' is the mean sum-squared-error per data point.

Network   Regularisation          Log-Loss   SSE
1 layer   none                     -1025.7   0.40
3 layer   none                     -1041.0   0.41
3 layer   adaptive weight noise    -1057.7   0.41
4.3 Samples

Fig. 11 shows handwriting samples generated by the prediction network. The network has clearly learned to model strokes, letters and even short words (especially common ones such as 'of' and 'the'). It also appears to have learned a basic character-level language model, since the words it invents ('eald', 'bryoes', 'lenrest') look somewhat plausible in English. Given that the average character occupies more than 25 timesteps, this again demonstrates the network's ability to generate coherent long-range structures.

Figure 11: Online handwriting samples generated by the prediction network. All samples are 700 timesteps long.
5 Handwriting Synthesis

Handwriting synthesis is the generation of handwriting for a given text. Clearly the prediction networks we have described so far are unable to do this, since there is no way to constrain which letters the network writes. This section describes an augmentation that allows a prediction network to generate data sequences conditioned on some high-level annotation sequence (a character string, in the case of handwriting synthesis). The resulting sequences are sufficiently convincing that they often cannot be distinguished from real handwriting. Furthermore, this realism is achieved without sacrificing the diversity in writing style demonstrated in the previous section.

The main challenge in conditioning the predictions on the text is that the two sequences are of very different lengths (the pen trace being on average twenty-five times as long as the text), and the alignment between them is unknown until the data is generated. This is because the number of co-ordinates used to write each character varies greatly according to style, size, pen speed etc. One neural network model able to make sequential predictions based on two sequences of different length and unknown alignment is the RNN transducer [9]. However, preliminary experiments on handwriting synthesis with RNN transducers were not encouraging. A possible explanation is that the transducer uses two separate RNNs to process the two sequences, then combines their outputs to make decisions, when it is usually more desirable to make all the information available to a single network. This work proposes an alternative model, where a 'soft window' is convolved with the text string and fed in as an extra input to the prediction network. The parameters of the window are output by the network at the same time as it makes the predictions, so that it dynamically determines an alignment between the text and the pen locations. Put simply, it learns to decide which character to write next.
5.1 Synthesis Network

Fig. 12 illustrates the network architecture used for handwriting synthesis. As with the prediction network, the hidden layers are stacked on top of each other, each feeding up to the layer above, and there are skip connections from the inputs to all hidden layers and from all hidden layers to the outputs. The difference is the added input from the character sequence, mediated by the window layer.

Given a length $U$ character sequence $\mathbf{c}$ and a length $T$ data sequence $\mathbf{x}$, the soft window $w_t$ into $\mathbf{c}$ at timestep $t$ ($1 \le t \le T$) is defined by the following discrete convolution with a mixture of $K$ Gaussian functions
$$
\phi(t, u) = \sum_{k=1}^{K} \alpha_t^k \exp\left(-\beta_t^k \left(\kappa_t^k - u\right)^2\right) \tag{46}
$$

$$
w_t = \sum_{u=1}^{U} \phi(t, u)\, c_u \tag{47}
$$
where(t;u)isthewindowweightofc
u
attimestept.Intuitively,the
t
param-
eterscontrolthelocationofthewindow,the
t
parameterscontrolthewidthof
thewindowandthe
t
parameterscontroltheimportanceofthewindowwithin
themixture. Thesizeofthesoftwindowvectorsisthesameasthesizeofthe
charactervectorsc
u
(assumingaone-hotencoding,thiswillbethenumberof
charactersinthealphabet). Notethatthewindowmixtureisnotnormalised
andhencedoesnotdetermineaprobabilitydistribution;however thewindow
weight(t;u)canbelooselyinterpretedasthenetwork’sbeliefthatitiswrit-
ingcharacterc
u
attimet. Fig.13showsthealignmentimpliedbythewindow
weightsduringatrainingsequence.
The size $3K$ vector $p$ of window parameters is determined as follows by the outputs of the first hidden layer of the network:

$$
(\hat{\alpha}_t, \hat{\beta}_t, \hat{\kappa}_t) = W_{h^1 p}\, h_t^1 + b_p \tag{48}
$$

$$
\alpha_t = \exp\left(\hat{\alpha}_t\right) \tag{49}
$$

$$
\beta_t = \exp\left(\hat{\beta}_t\right) \tag{50}
$$

$$
\kappa_t = \kappa_{t-1} + \exp\left(\hat{\kappa}_t\right) \tag{51}
$$
Note that the location parameters $\kappa_t$ are defined as offsets from the previous locations $\kappa_{t-1}$, and that the size of the offset is constrained to be greater than zero. Intuitively, this means that the network learns how far to slide each window at each step, rather than an absolute location. Using offsets was essential to getting the network to align the text with the pen trace.
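The following numpy sketch (mine; the names and the one-hot layout are assumptions) implements Eqs. (46)-(51) for a single timestep:

```python
import numpy as np

def soft_window(h1_t, kappa_prev, c, W_h1p, b_p, K):
    # c: (U, V) matrix of one-hot character vectors; h1_t: first hidden
    # layer output; kappa_prev: previous window location, shape (K,).
    p = W_h1p @ h1_t + b_p                     # Eq. (48): size-3K parameter vector
    alpha = np.exp(p[:K])                      # Eq. (49): component importances
    beta = np.exp(p[K:2*K])                    # Eq. (50): component (inverse) widths
    kappa = kappa_prev + np.exp(p[2*K:])       # Eq. (51): location slides monotonically forward
    u = np.arange(1, c.shape[0] + 1)           # character indices 1..U
    # Eq. (46): mixture-of-Gaussians window weights phi(t, u)
    phi = (alpha[:, None] * np.exp(-beta[:, None] * (kappa[:, None] - u)**2)).sum(axis=0)
    w_t = phi @ c                              # Eq. (47): soft window vector, same size as c_u
    return w_t, phi, kappa
```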
Figure 12: Synthesis Network Architecture. Circles represent layers, solid lines represent connections and dashed lines represent predictions. The topology is similar to the prediction network in Fig. 1, except that extra input from the character sequence c is presented to the hidden layers via the window layer (with a delay in the connection to the first hidden layer to avoid a cycle in the graph).
Figure 13: Window weights during a handwriting synthesis sequence. Each point on the map shows the value of $\phi(t, u)$, where $t$ indexes the pen trace along the horizontal axis and $u$ indexes the text character along the vertical axis. The bright line is the alignment chosen by the network between the characters and the writing. Notice that the line spreads out at the boundaries between characters; this means the network receives information about next and previous letters as it makes transitions, which helps guide its predictions.
The $w_t$ vectors are passed to the second and third hidden layers at time $t$, and to the first hidden layer at time $t+1$ (to avoid creating a cycle in the processing graph). The update equations for the hidden layers are

$$
h_t^1 = \mathcal{H}\left(W_{i h^1} x_t + W_{h^1 h^1} h_{t-1}^1 + W_{w h^1} w_{t-1} + b_h^1\right) \tag{52}
$$

$$
h_t^n = \mathcal{H}\left(W_{i h^n} x_t + W_{h^{n-1} h^n} h_t^{n-1} + W_{h^n h^n} h_{t-1}^n + W_{w h^n} w_t + b_h^n\right) \tag{53}
$$
The equations for the output layer remain unchanged from Eqs. (17) to (22). The sequence loss is

$$
\mathcal{L}(\mathbf{x}) = -\log \Pr(\mathbf{x} \mid \mathbf{c}) \tag{54}
$$

where

$$
\Pr(\mathbf{x} \mid \mathbf{c}) = \prod_{t=1}^{T} \Pr\left(x_{t+1} \mid y_t\right) \tag{55}
$$

Note that $y_t$ is now a function of $\mathbf{c}$ as well as $x_{1:t}$.
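To illustrate the staggered window connections of Eqs. (52)-(53), here is a runnable toy sketch (mine, with fixed random tanh cells standing in for LSTM, and the soft_window sketch above assumed in scope); it demonstrates only the order of computation within a timestep, not a trained network:

```python
import numpy as np

rng = np.random.default_rng(0)
H, V, K, U = 8, 5, 3, 12                   # hidden size, alphabet size, window mixtures, text length
c = np.eye(V)[rng.integers(0, V, size=U)]  # toy one-hot character sequence, shape (U, V)
W_h1p, b_p = 0.1 * rng.normal(size=(3*K, H)), np.zeros(3*K)

def make_cell(in_size):
    # Stand-in for an LSTM cell: a fixed random tanh layer over concatenated inputs.
    W = 0.1 * rng.normal(size=(H, in_size))
    return lambda *inputs: np.tanh(W @ np.concatenate(inputs))

cell1 = make_cell(3 + H + V)               # inputs: x_t, h1_{t-1}, w_{t-1}
cell2 = make_cell(3 + H + H + V)           # inputs: x_t, h1_t, h2_{t-1}, w_t

x_t = rng.normal(size=3)                   # (dx, dy, end-of-stroke)
h1_prev, h2_prev = np.zeros(H), np.zeros(H)
w_prev, kappa_prev = np.zeros(V), np.zeros(K)

h1 = cell1(x_t, h1_prev, w_prev)           # Eq. (52): layer 1 sees the delayed window w_{t-1}
w_t, phi, kappa = soft_window(h1, kappa_prev, c, W_h1p, b_p, K)
h2 = cell2(x_t, h1, h2_prev, w_t)          # Eq. (53): higher layers see the current window w_t
```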
The loss derivatives with respect to the outputs $\hat{e}_t, \hat{\pi}_t, \hat{\mu}_t, \hat{\sigma}_t, \hat{\rho}_t$ remain unchanged from Eqs. (27), (30) and (31). Given the loss derivative $\frac{\partial \mathcal{L}(\mathbf{x})}{\partial w_t}$ with respect to the size $W$ window vector $w_t$, obtained by backpropagating the output derivatives through the computation graph in Fig. 12, the derivatives with respect to the window parameters are as follows:
$$
\epsilon(k, t, u) \stackrel{\text{def}}{=} \alpha_t^k \exp\left(-\beta_t^k \left(\kappa_t^k - u\right)^2\right) \sum_{j=1}^{W} \frac{\partial \mathcal{L}(\mathbf{x})}{\partial w_t^j}\, c_u^j \tag{56}
$$

$$
\frac{\partial \mathcal{L}(\mathbf{x})}{\partial \hat{\alpha}_t^k} = \sum_{u=1}^{U} \epsilon(k, t, u) \tag{57}
$$

$$
\frac{\partial \mathcal{L}(\mathbf{x})}{\partial \hat{\beta}_t^k} = -\beta_t^k \sum_{u=1}^{U} \epsilon(k, t, u) \left(\kappa_t^k - u\right)^2 \tag{58}
$$

$$
\frac{\partial \mathcal{L}(\mathbf{x})}{\partial \kappa_t^k} = \frac{\partial \mathcal{L}(\mathbf{x})}{\partial \kappa_{t+1}^k} + 2\beta_t^k \sum_{u=1}^{U} \epsilon(k, t, u) \left(u - \kappa_t^k\right) \tag{59}
$$

$$
\frac{\partial \mathcal{L}(\mathbf{x})}{\partial \hat{\kappa}_t^k} = \exp\left(\hat{\kappa}_t^k\right) \frac{\partial \mathcal{L}(\mathbf{x})}{\partial \kappa_t^k} \tag{60}
$$
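A corresponding numpy sketch (mine, not the paper's code) of this backward pass for one timestep; dL_dw is the incoming derivative with respect to $w_t$ and dL_dkappa_next carries the recurrence of Eq. (59) back from timestep $t+1$:

```python
import numpy as np

def window_grads(alpha, beta, kappa, kappa_prev, c, dL_dw, dL_dkappa_next):
    u = np.arange(1, c.shape[0] + 1)                 # character indices 1..U
    # Eq. (56): eps[k, u] reuses the Gaussian window terms from the forward pass
    eps = alpha[:, None] * np.exp(-beta[:, None] * (kappa[:, None] - u)**2) * (c @ dL_dw)
    dL_dalpha_hat = eps.sum(axis=1)                                        # Eq. (57)
    dL_dbeta_hat = -beta * (eps * (kappa[:, None] - u)**2).sum(axis=1)     # Eq. (58)
    dL_dkappa = dL_dkappa_next + 2*beta * (eps * (u - kappa[:, None])).sum(axis=1)  # Eq. (59)
    dL_dkappa_hat = (kappa - kappa_prev) * dL_dkappa  # Eq. (60): exp(kappa_hat) = kappa - kappa_{t-1}
    return dL_dalpha_hat, dL_dbeta_hat, dL_dkappa, dL_dkappa_hat
```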
Fig. 14 illustrates the operation of a mixture density output layer applied to handwriting synthesis.
5.2 Experiments

The synthesis network was applied to the same input data as the handwriting prediction network in the previous section. The character-level transcriptions from the IAM-OnDB were now used to define the character sequences c. The full transcriptions contain 80 distinct characters (capital letters, lower case letters, digits, and punctuation). However, we used only a subset of 57, with all the
Figure 14: Mixture density outputs for handwriting synthesis. The top heatmap shows the predictive distributions for the pen locations, the bottom heatmap shows the mixture component weights. Comparison with Fig. 10 indicates that the synthesis network makes more precise predictions (with smaller density blobs) than the prediction-only network, especially at the ends of strokes, where the synthesis network has the advantage of knowing which letter comes next.