Deep Learning Tutorial, Release 0.1
# build symbolic expression that computes the convolution of input with
# filters in w
conv_out = conv.conv2d(input, W)

# build symbolic expression to add bias and apply activation function,
# i.e. produce neural net layer output
# A few words on ``dimshuffle``:
#    ``dimshuffle`` is a powerful tool in reshaping a tensor;
#    what it allows you to do is to shuffle dimensions around, but
#    also to insert new ones along which the tensor will be
#    broadcastable;
#    dimshuffle('x', 2, 'x', 0, 1)
#    This will work on 3d tensors with no broadcastable
#    dimensions. The first dimension will be broadcastable,
#    then we will have the third dimension of the input tensor as
#    the second of the resulting tensor, etc. If the tensor has
#    shape (20, 30, 40), the resulting tensor will have dimensions
#    (1, 40, 1, 20, 30). (AxBxC tensor is mapped to 1xCx1xAxB tensor)
#    More examples:
#     dimshuffle('x') -> make a 0d (scalar) into a 1d vector
#     dimshuffle(0, 1) -> identity
#     dimshuffle(1, 0) -> inverts the first and second dimensions
#     dimshuffle('x', 0) -> make a row out of a 1d vector (N to 1xN)
#     dimshuffle(0, 'x') -> make a column out of a 1d vector (N to Nx1)
#     dimshuffle(2, 0, 1) -> AxBxC to CxAxB
#     dimshuffle(0, 'x', 1) -> AxB to Ax1xB
#     dimshuffle(1, 'x', 0) -> AxB to Bx1xA
output = T.nnet.sigmoid(conv_out + b.dimshuffle('x', 0, 'x', 'x'))

# create theano function to compute filtered images
f = theano.function([input], output)
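The dimshuffle patterns above can be sanity-checked with plain NumPy, where transpose plus np.newaxis plays the role of dimshuffle (a sketch, not Theano code):

```python
import numpy as np

# a 3d tensor of shape (20, 30, 40), standing in for AxBxC
a = np.zeros((20, 30, 40))

# dimshuffle('x', 2, 'x', 0, 1): dim 2 comes second, dims 0 and 1 last,
# with broadcastable axes inserted at positions 0 and 2
b = a.transpose(2, 0, 1)[np.newaxis, :, np.newaxis]
print(b.shape)  # (1, 40, 1, 20, 30)

# dimshuffle('x', 0): a length-N vector becomes a 1xN row
v = np.zeros(7)
print(v[np.newaxis, :].shape)  # (1, 7)

# dimshuffle(0, 'x'): a length-N vector becomes an Nx1 column
print(v[:, np.newaxis].shape)  # (7, 1)
```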
Let's have a little bit of fun with this...
import numpy
import pylab
from PIL import Image

# open random image of dimensions 639x516
img = Image.open(open('doc/images/3wolfmoon.jpg'))
# dimensions are (height, width, channel)
img = numpy.asarray(img, dtype='float64') / 256.

# put image in 4D tensor of shape (1, 3, height, width)
img_ = img.transpose(2, 0, 1).reshape(1, 3, 639, 516)
filtered_img = f(img_)

# plot original image and first and second components of output
pylab.subplot(1, 3, 1); pylab.axis('off'); pylab.imshow(img)
pylab.gray();
# recall that the convOp output (filtered image) is actually a "minibatch",
# of size 1 here, so we take index 0 in the first dimension:
pylab.subplot(1, 3, 2); pylab.axis('off'); pylab.imshow(filtered_img[0, 0, :, :])
pylab.subplot(1, 3, 3); pylab.axis('off'); pylab.imshow(filtered_img[0, 1, :, :])
pylab.show()
This should generate the following output.

Notice that a randomly initialized filter acts very much like an edge detector!

Note that we use the same weight initialization formula as with the MLP. Weights are sampled randomly from a uniform distribution in the range [-1/fan-in, 1/fan-in], where fan-in is the number of inputs to a hidden unit. For MLPs, this was the number of units in the layer below. For CNNs however, we have to take into account the number of input feature maps and the size of the receptive fields.
6.6 Max Pooling

Another important concept of CNNs is max-pooling, which is a form of non-linear down-sampling. Max-pooling partitions the input image into a set of non-overlapping rectangles and, for each such sub-region, outputs the maximum value.

Max-pooling is useful in vision for two reasons:

1. By eliminating non-maximal values, it reduces computation for upper layers.

2. It provides a form of translation invariance. Imagine cascading a max-pooling layer with a convolutional layer. There are 8 directions in which one can translate the input image by a single pixel. If max-pooling is done over a 2x2 region, 3 out of these 8 possible configurations will produce exactly the same output at the convolutional layer. For max-pooling over a 3x3 window, this jumps to 5/8.

   Since it provides additional robustness to position, max-pooling is a "smart" way of reducing the dimensionality of intermediate representations.

Max-pooling is done in Theano by way of theano.tensor.signal.downsample.max_pool_2d. This function takes as input an N dimensional tensor (where N >= 2) and a downscaling factor and performs max-pooling over the 2 trailing dimensions of the tensor.

An example is worth a thousand words:
from theano.tensor.signal import downsample

input = T.dtensor4('input')
maxpool_shape = (2, 2)
pool_out = downsample.max_pool_2d(input, maxpool_shape, ignore_border=True)
f = theano.function([input], pool_out)

invals = numpy.random.RandomState(1).rand(3, 2, 5, 5)
print 'With ignore_border set to True:'
print 'invals[0, 0, :, :] =\n', invals[0, 0, :, :]
print 'output[0, 0, :, :] =\n', f(invals)[0, 0, :, :]

pool_out = downsample.max_pool_2d(input, maxpool_shape, ignore_border=False)
f = theano.function([input], pool_out)
print 'With ignore_border set to False:'
print 'invals[1, 0, :, :] =\n', invals[1, 0, :, :]
print 'output[1, 0, :, :] =\n', f(invals)[1, 0, :, :]
This should generate the following output:

With ignore_border set to True:
invals[0, 0, :, :] =
[[  4.17022005e-01   7.20324493e-01   1.14374817e-04   3.02332573e-01   1.46755891e-01]
 [  9.23385948e-02   1.86260211e-01   3.45560727e-01   3.96767474e-01   5.38816734e-01]
 [  4.19194514e-01   6.85219500e-01   2.04452250e-01   8.78117436e-01   2.73875932e-02]
 [  6.70467510e-01   4.17304802e-01   5.58689828e-01   1.40386939e-01   1.98101489e-01]
 [  8.00744569e-01   9.68261576e-01   3.13424178e-01   6.92322616e-01   8.76389152e-01]]
output[0, 0, :, :] =
[[ 0.72032449  0.39676747]
 [ 0.6852195   0.87811744]]
With ignore_border set to False:
invals[1, 0, :, :] =
[[ 0.01936696  0.67883553  0.21162812  0.26554666  0.49157316]
 [ 0.05336255  0.57411761  0.14672857  0.58930554  0.69975836]
 [ 0.10233443  0.41405599  0.69440016  0.41417927  0.04995346]
 [ 0.53589641  0.66379465  0.51488911  0.94459476  0.58655504]
 [ 0.90340192  0.1374747   0.13927635  0.80739129  0.39767684]]
output[1, 0, :, :] =
[[ 0.67883553  0.58930554  0.69975836]
 [ 0.66379465  0.94459476  0.58655504]
 [ 0.90340192  0.80739129  0.39767684]]
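The operation itself is simple enough to sketch in pure Python, independent of Theano (max_pool_2x2 is a hypothetical helper mirroring the ignore_border=True behaviour above):

```python
def max_pool_2x2(image):
    """Non-overlapping 2x2 max-pooling of a 2d list of numbers; borders
    that do not fill a complete window are ignored."""
    rows, cols = len(image), len(image[0])
    return [[max(image[i][j], image[i][j + 1],
                 image[i + 1][j], image[i + 1][j + 1])
             for j in range(0, cols - 1, 2)]
            for i in range(0, rows - 1, 2)]

pooled = max_pool_2x2([[1, 2, 3, 4],
                       [5, 6, 7, 8],
                       [9, 10, 11, 12],
                       [13, 14, 15, 16]])
print(pooled)  # [[6, 8], [14, 16]]
```

Note how each output value is the maximum of one 2x2 window, and a 4x4 input shrinks to 2x2.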
Note that compared to most Theano code, the max_pool_2d operation is a little special. It requires the downscaling factor ds (tuple of length 2 containing downscaling factors for image width and height) to be known at graph build time. This may change in the near future.
6.7 The Full Model: LeNet

Sparse, convolutional layers and max-pooling are at the heart of the LeNet family of models. While the exact details of the model will vary greatly, the figure below shows a graphical depiction of a LeNet model.
The lower layers are composed of alternating convolution and max-pooling layers. The upper layers however are fully-connected and correspond to a traditional MLP (hidden layer + logistic regression). The input to the first fully-connected layer is the set of all feature maps at the layer below.

From an implementation point of view, this means lower layers operate on 4D tensors. These are then flattened to a 2D matrix of rasterized feature maps, to be compatible with our previous MLP implementation.
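A NumPy sketch of that flattening step (hypothetical shapes, matching the default values used later in this chapter): the batch dimension is kept and each stack of feature maps is rasterized into one long vector.

```python
import numpy as np

batch_size, n_maps, height, width = 500, 50, 4, 4
feature_maps = np.zeros((batch_size, n_maps, height, width))

# keep the batch dimension, rasterize the feature maps into one vector
# per example, as Theano's flatten(2) does
flat = feature_maps.reshape(batch_size, -1)
print(flat.shape)  # (500, 800)
```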
Note: Note that the term "convolution" could refer to different mathematical operations:

1. theano.tensor.nnet.conv2d, which is the most common one in almost all of the recent published convolutional models. In this operation, each output feature map is connected to each input feature map by a different 2D filter, and its value is the sum of the individual convolutions of all inputs through the corresponding filters.

2. The convolution used in the original LeNet model: in this work, each output feature map is only connected to a subset of input feature maps.

3. The convolution used in signal processing: theano.tensor.signal.conv.conv2d, which works only on single channel inputs.

Here, we use the first operation, so this model differs slightly from the original LeNet paper. One reason to use 2. would be to reduce the amount of computation needed, but modern hardware makes it as fast to have the full connection pattern. Another reason would be to slightly reduce the number of free parameters, but we have other regularization techniques at our disposal.
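For intuition about the core of operation 1, here is a minimal single-channel "valid" 2D convolution in plain Python (a sketch only; valid_conv2d is a hypothetical helper, and the full operation additionally sums such convolutions over all input feature maps):

```python
def valid_conv2d(image, kernel):
    """'Valid' 2d convolution of a 2d list with a kernel; the kernel is
    flipped, as in a true mathematical convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[kh - 1 - a][kw - 1 - b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# a 3x3 image convolved with a 2x2 kernel yields a 2x2 output
out = valid_conv2d([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]],
                   [[1, 0],
                    [0, 1]])
print(out)
```

With this (flipped) kernel each output entry sums the top-left and bottom-right pixels of its 2x2 window, so the output is [[6, 8], [12, 14]].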
6.8 Putting it All Together

We now have all we need to implement a LeNet model in Theano. We start with the LeNetConvPoolLayer class, which implements a {convolution + max-pooling} layer.
class LeNetConvPoolLayer(object):
    """Pool Layer of a convolutional network """

    def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
        """
        Allocate a LeNetConvPoolLayer with shared variable internal parameters.

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.dtensor4
        :param input: symbolic image tensor, of shape image_shape

        :type filter_shape: tuple or list of length 4
        :param filter_shape: (number of filters, num input feature maps,
                              filter height, filter width)

        :type image_shape: tuple or list of length 4
        :param image_shape: (batch size, num input feature maps,
                             image height, image width)

        :type poolsize: tuple or list of length 2
        :param poolsize: the downsampling (pooling) factor (#rows, #cols)
        """

        assert image_shape[1] == filter_shape[1]
        self.input = input

        # there are "num input feature maps * filter height * filter width"
        # inputs to each hidden unit
        fan_in = numpy.prod(filter_shape[1:])
        # each unit in the lower layer receives a gradient from:
        # "num output feature maps * filter height * filter width" /
        # pooling size
        fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) /
                   numpy.prod(poolsize))
        # initialize weights with random weights
        W_bound = numpy.sqrt(6. / (fan_in + fan_out))
        self.W = theano.shared(
            numpy.asarray(
                rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                dtype=theano.config.floatX
            ),
            borrow=True
        )

        # the bias is a 1D tensor -- one bias per output feature map
        b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
        self.b = theano.shared(value=b_values, borrow=True)

        # convolve input feature maps with filters
        conv_out = conv.conv2d(
            input=input,
            filters=self.W,
            filter_shape=filter_shape,
            image_shape=image_shape
        )

        # downsample each feature map individually, using maxpooling
        pooled_out = downsample.max_pool_2d(
            input=conv_out,
            ds=poolsize,
            ignore_border=True
        )
        # add the bias term. Since the bias is a vector (1D array), we first
        # reshape it to a tensor of shape (1, n_filters, 1, 1). Each bias will
        # thus be broadcasted across mini-batches and feature map
        # width & height
        self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))

        # store parameters of this layer
        self.params = [self.W, self.b]

        # keep track of model input
        self.input = input
Notice that when initializing the weight values, the fan-in is determined by the size of the receptive fields and the number of input feature maps.
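As a plain-Python sketch of that initialization bound (with a hypothetical 20 filters over a single-channel input and 2x2 pooling, mirroring the first layer instantiated below):

```python
import math

n_filters, n_input_maps, filter_h, filter_w = 20, 1, 5, 5
pool_h, pool_w = 2, 2

# inputs to each hidden unit: input maps x filter height x filter width
fan_in = n_input_maps * filter_h * filter_w

# gradients received per lower-layer unit, reduced by the pooling area
fan_out = n_filters * filter_h * filter_w / (pool_h * pool_w)

W_bound = math.sqrt(6.0 / (fan_in + fan_out))
print(fan_in)   # 25
print(fan_out)  # 125.0
print(W_bound)  # 0.2
```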
Finally, using the LogisticRegression class defined in Classifying MNIST digits using Logistic Regression and the HiddenLayer class defined in Multilayer Perceptron, we can instantiate the network as follows.
x = T.matrix('x')   # the data is presented as rasterized images
y = T.ivector('y')  # the labels are presented as 1D vector of
                    # [int] labels

######################
# BUILD ACTUAL MODEL #
######################
print '... building the model'

# Reshape matrix of rasterized images of shape (batch_size, 28 * 28)
# to a 4D tensor, compatible with our LeNetConvPoolLayer
# (28, 28) is the size of MNIST images.
layer0_input = x.reshape((batch_size, 1, 28, 28))

# Construct the first convolutional pooling layer:
# filtering reduces the image size to (28-5+1 , 28-5+1) = (24, 24)
# maxpooling reduces this further to (24/2, 24/2) = (12, 12)
# 4D output tensor is thus of shape (batch_size, nkerns[0], 12, 12)
layer0 = LeNetConvPoolLayer(
    rng,
    input=layer0_input,
    image_shape=(batch_size, 1, 28, 28),
    filter_shape=(nkerns[0], 1, 5, 5),
    poolsize=(2, 2)
)

# Construct the second convolutional pooling layer
# filtering reduces the image size to (12-5+1, 12-5+1) = (8, 8)
# maxpooling reduces this further to (8/2, 8/2) = (4, 4)
# 4D output tensor is thus of shape (batch_size, nkerns[1], 4, 4)
layer1 = LeNetConvPoolLayer(
    rng,
    input=layer0.output,
    image_shape=(batch_size, nkerns[0], 12, 12),
    filter_shape=(nkerns[1], nkerns[0], 5, 5),
    poolsize=(2, 2)
)
# the HiddenLayer being fully-connected, it operates on 2D matrices of
# shape (batch_size, num_pixels) (i.e matrix of rasterized images).
# This will generate a matrix of shape (batch_size, nkerns[1] * 4 * 4),
# or (500, 50 * 4 * 4) = (500, 800) with the default values.
layer2_input = layer1.output.flatten(2)

# construct a fully-connected sigmoidal layer
layer2 = HiddenLayer(
    rng,
    input=layer2_input,
    n_in=nkerns[1] * 4 * 4,
    n_out=500,
    activation=T.tanh
)

# classify the values of the fully-connected sigmoidal layer
layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)

# the cost we minimize during training is the NLL of the model
cost = layer3.negative_log_likelihood(y)

# create a function to compute the mistakes that are made by the model
test_model = theano.function(
    [index],
    layer3.errors(y),
    givens={
        x: test_set_x[index * batch_size: (index + 1) * batch_size],
        y: test_set_y[index * batch_size: (index + 1) * batch_size]
    }
)

validate_model = theano.function(
    [index],
    layer3.errors(y),
    givens={
        x: valid_set_x[index * batch_size: (index + 1) * batch_size],
        y: valid_set_y[index * batch_size: (index + 1) * batch_size]
    }
)

# create a list of all model parameters to be fit by gradient descent
params = layer3.params + layer2.params + layer1.params + layer0.params

# create a list of gradients for all model parameters
grads = T.grad(cost, params)

# train_model is a function that updates the model parameters by
# SGD. Since this model has many parameters, it would be tedious to
# manually create an update rule for each model parameter. We thus
# create the updates list by automatically looping over all
# (params[i], grads[i]) pairs.
updates = [
    (param_i, param_i - learning_rate * grad_i)
    for param_i, grad_i in zip(params, grads)
]

train_model = theano.function(
    [index],
    cost,
    updates=updates,
    givens={
        x: train_set_x[index * batch_size: (index + 1) * batch_size],
        y: train_set_y[index * batch_size: (index + 1) * batch_size]
    }
)
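The updates list built above is plain stochastic gradient descent: each parameter moves a small step against its gradient. In plain Python terms (a sketch with made-up numbers):

```python
learning_rate = 0.1
params = [0.5, -0.3]  # current parameter values
grads = [0.2, -0.4]   # gradients of the cost w.r.t. each parameter

# one SGD step: param_i <- param_i - learning_rate * grad_i
updated = [p - learning_rate * g for p, g in zip(params, grads)]
print(updated)  # approximately [0.48, -0.26]
```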
We leave out the code that performs the actual training and early-stopping, since it is exactly the same as with an MLP. The interested reader can nevertheless access the code in the 'code' folder of DeepLearningTutorials.
6.9 Running the Code

The user can then run the code by calling:

python code/convolutional_mlp.py

The following output was obtained with the default parameters on a Core i7-2600K CPU clocked at 3.40GHz and using flags 'floatX=float32':
Optimization complete.
Best validation score of 0.910000 % obtained at iteration 17800, with test
performance 0.920000 %
The code for file convolutional_mlp.py ran for 380.28m

Using a GeForce GTX 285, we obtained the following:

Optimization complete.
Best validation score of 0.910000 % obtained at iteration 15500, with test
performance 0.930000 %
The code for file convolutional_mlp.py ran for 46.76m

And similarly on a GeForce GTX 480:

Optimization complete.
Best validation score of 0.910000 % obtained at iteration 16400, with test
performance 0.930000 %
The code for file convolutional_mlp.py ran for 32.52m
Note that the discrepancies in validation and test error (as well as iteration count) are due to different implementations of the rounding mechanism in hardware. They can be safely ignored.
6.10 Tips and Tricks

6.10.1 Choosing Hyperparameters

CNNs are especially tricky to train, as they add even more hyper-parameters than a standard MLP. While the usual rules of thumb for learning rates and regularization constants still apply, the following should be kept in mind when optimizing CNNs.
Number of filters

When choosing the number of filters per layer, keep in mind that computing the activations of a single convolutional filter is much more expensive than with traditional MLPs!

Assume layer (l-1) contains K^{l-1} feature maps and M x N pixel positions (i.e., number of positions times number of feature maps), and there are K^l filters at layer l of shape m x n. Then computing a feature map (applying an m x n filter at all (M-m) x (N-n) pixel positions where the filter can be applied) costs (M-m) x (N-n) x m x n x K^{l-1}. The total cost is K^l times that. Things may be more complicated if not all features at one level are connected to all features at the previous one.
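Plugging in concrete (hypothetical) numbers gives a feel for the count: say M = N = 32 pixel positions, K^{l-1} = 20 input feature maps, and K^l = 50 filters of shape m = n = 5.

```python
M, N = 32, 32       # pixel positions in each input feature map
m, n = 5, 5         # filter shape
K_prev, K = 20, 50  # feature maps at layer l-1, filters at layer l

# cost of one output feature map: (M-m) x (N-n) x m x n x K^{l-1}
cost_per_map = (M - m) * (N - n) * m * n * K_prev
# total cost: K^l times that
total_cost = K * cost_per_map
print(cost_per_map)  # 364500
print(total_cost)    # 18225000
```

So a single layer of this modest size already costs on the order of 18 million multiply-accumulates per input, which is why filter counts in CNNs stay small compared to MLP hidden-unit counts.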
For a standard MLP, the cost would only be K^l x K^{l-1} where there are K^l different neurons at level l. As such, the number of filters used in CNNs is typically much smaller than the number of hidden units in MLPs and depends on the size of the feature maps (itself a function of input image size and filter shapes).
Since feature map size decreases with depth, layers near the input layer will tend to have fewer filters while layers higher up can have many more. In fact, to equalize computation at each layer, the product of the number of features and the number of pixel positions is typically picked to be roughly constant across layers. To preserve the information about the input would require keeping the total number of activations (number of feature maps times number of pixel positions) non-decreasing from one layer to the next (of course we could hope to get away with less when we are doing supervised learning). The number of feature maps directly controls capacity, and so it depends on the number of available examples and the complexity of the task.
Filter Shape

Common filter shapes found in the literature vary greatly, usually based on the dataset. Best results on MNIST-sized images (28x28) are usually in the 5x5 range on the first layer, while natural image datasets (often with hundreds of pixels in each dimension) tend to use larger first-layer filters of shape 12x12 or 15x15.

The trick is thus to find the right level of "granularity" (i.e. filter shapes) in order to create abstractions at the proper scale, given a particular dataset.
Max Pooling Shape

Typical values are 2x2 or no max-pooling. Very large input images may warrant 4x4 pooling in the lower layers. Keep in mind however, that this will reduce the dimension of the signal by a factor of 16, and may result in throwing away too much information.
Tips

If you want to try this model on a new dataset, here are a few tips that can help you get better results:

• Whitening the data (e.g. with PCA)
• Decay the learning rate in each epoch