calibre User Manual, Release 2.55.0
    'author_sort',
    'book_producer',
    'timestamp',        # Dates and times must be timezone aware
    'pubdate',
    'last_modified',
    'rights',
    # So far only known publication type is periodical:calibre
    # If None, means book
    'publication_type',
    'uuid',             # A UUID usually of type 4
    'languages',        # ordered list of languages in this publication
    'publisher',        # Simple string, no special semantics
    # Absolute path to image file encoded in filesystem_encoding
    'cover',
    # Of the form (format, data) where format is, for e.g. 'jpeg', 'png', 'gif'...
    'cover_data',
    # Either thumbnail data, or an object with the attribute
    # image_path which is the path to an image file, encoded
    # in filesystem_encoding
    'thumbnail',
    ])

BOOK_STRUCTURE_FIELDS = frozenset([
    # These are used by code, Null values are None.
    'toc', 'spine', 'guide', 'manifest',
    ])

USER_METADATA_FIELDS = frozenset([
    # A dict of dicts similar to field_metadata. Each field description dict
    # also contains a value field with the key #value#.
    'user_metadata',
    ])

DEVICE_METADATA_FIELDS = frozenset([
    'device_collections',   # Ordered list of strings
    'lpath',                # Unicode, / separated
    'size',                 # In bytes
    'mime',                 # Mimetype of the book file being represented
    ])

CALIBRE_METADATA_FIELDS = frozenset([
    'application_id',   # An application id, currently set to the db_id.
    'db_id',            # the calibre primary key of the item.
    'formats',          # list of formats (extensions) for this book
    # a dict of user category names, where the value is a list of item names
    # from the book that are in that category
    'user_categories',
    # a dict of author to an associated hyperlink
    'author_link_map',
    ]
)

ALL_METADATA_FIELDS = SOCIAL_METADATA_FIELDS.union(
    PUBLICATION_METADATA_FIELDS).union(
    BOOK_STRUCTURE_FIELDS).union(
    USER_METADATA_FIELDS).union(
    DEVICE_METADATA_FIELDS).union(
    CALIBRE_METADATA_FIELDS)

# All fields except custom fields
STANDARD_METADATA_FIELDS = SOCIAL_METADATA_FIELDS.union(
    PUBLICATION_METADATA_FIELDS).union(
    BOOK_STRUCTURE_FIELDS).union(
    DEVICE_METADATA_FIELDS).union(
    CALIBRE_METADATA_FIELDS)

# Metadata fields that smart update must do special processing to copy.
SC_FIELDS_NOT_COPIED = frozenset(['title', 'title_sort', 'authors',
                                  'author_sort', 'author_sort_map',
                                  'cover_data', 'tags', 'languages',
                                  'identifiers'])

# Metadata fields that smart update should copy only if the source is not None
SC_FIELDS_COPY_NOT_NULL = frozenset(['lpath', 'size', 'comments', 'thumbnail'])

# Metadata fields that smart update should copy without special handling
SC_COPYABLE_FIELDS = SOCIAL_METADATA_FIELDS.union(
    PUBLICATION_METADATA_FIELDS).union(
    BOOK_STRUCTURE_FIELDS).union(
    DEVICE_METADATA_FIELDS).union(
    CALIBRE_METADATA_FIELDS) - \
    SC_FIELDS_NOT_COPIED.union(
    SC_FIELDS_COPY_NOT_NULL)

SERIALIZABLE_FIELDS = SOCIAL_METADATA_FIELDS.union(
    USER_METADATA_FIELDS).union(
    PUBLICATION_METADATA_FIELDS).union(
    CALIBRE_METADATA_FIELDS).union(
    DEVICE_METADATA_FIELDS) - \
    frozenset(['device_collections', 'formats',
               'cover_data'])
    # these are rebuilt when needed

Using general program mode

For more complicated template programs, it is sometimes easier to avoid template syntax (all the { and } characters),
instead writing a more classical-looking program. You can do this in calibre by beginning the template with program:.
In this case, no template processing is done. The special variable $ is not set. It is up to your program to produce the
correct results.

One advantage of program: mode is that the brackets are no longer special. For example, it is not necessary to use [[
and ]] when using the template() function. Another advantage is that program mode templates are compiled to Python
and can run much faster than templates in the other two modes. Speed improvement depends on the complexity of
the templates; the more complicated the template the more the improvement. Compilation is turned off or on using
the tweak compile_gpm_templates (Compile General Program Mode templates to Python). The main reason to
turn off compilation is if a compiled template does not work, in which case please file a bug report.

The following example is a program: mode implementation of a recipe on the MobileRead forum: “Put series into the
title, using either initials or a shortened form. Strip leading articles from the series name (any).” For example, for the
book The Two Towers in the Lord of the Rings series, the recipe gives LotR [02] The Two Towers. Using standard
templates, the recipe requires three custom columns and a plugboard, as explained in the following:

The solution requires creating three composite columns. The first column is used to remove the leading articles. The
second is used to compute the ‘shorten’ form. The third is to compute the ‘initials’ form. Once you have these
columns, the plugboard selects between them. You can hide any or all of the three columns on the library view:

First column:
Name: #stripped_series.
Template: {series:re(^(A|The|An)\s+,)||}

Second column (the shortened form):
Name: #shortened.
Template: {#stripped_series:shorten(4,-,4)}

Third column (the initials form):
Name: #initials.
Template: {#stripped_series:re(([^\s])[^\s]+(\s|$),\1)}

Plugboard expression:
Template: {#stripped_series:lookup(.\s,#initials,.,#shortened,series)}{series_index:0>2.0f| [|] }{title}
Destination field: title

This set of fields and plugboard produces:

Series: The Lord of the Rings
Series index: 2
Title: The Two Towers
Output: LotR [02] The Two Towers

Series: Dahak
Series index: 1
Title: Mutineers Moon
Output: Dahak [01] Mutineers Moon

Series: Berserkers
Series Index: 4
Title: Berserker Throne
Output: Bers-kers [04] Berserker Throne

Series: Meg Langslow Mysteries
Series Index: 3
Title: Revenge of the Wrought-Iron Flamingos
Output: MLM [03] Revenge of the Wrought-Iron Flamingos

The following program produces the same results as the original recipe, using only one custom column to hold the
results of a program that computes the special title value:

Custom column:
Name: #special_title
Template: (the following with all leading spaces removed)
    program:
    #   compute the equivalent of the composite fields and store them in local variables
        stripped = re(field('series'), '^(A|The|An)\s+', '');
        shortened = shorten(stripped, 4, '-', 4);
        initials = re(stripped, '[^\w]*(\w?)[^\s]+(\s|$)', '\1');

    #   Format the series index. Ends up as empty if there is no series index.
    #   Note that leading and trailing spaces will be removed by the formatter,
    #   so we cannot add them here. We will do that in the strcat below.
    #   Also note that because we are in 'program' mode, we can freely use
    #   curly brackets in strings, something we cannot do in template mode.
        s_index = template('{series_index:0>2.0f}');

    #   print(stripped, shortened, initials, s_index);

    #   Now concatenate all the bits together. The switch picks between
    #   initials and shortened, depending on whether there is a space
    #   in stripped. We then add the brackets around s_index if it is
    #   not empty. Finally, add the title. As this is the last function in
    #   the program, its value will be returned.
        strcat(
            switch( stripped,
                    '.\s', initials,
                    '.', shortened,
                    field('series')),
            test(s_index, strcat(' [', s_index, '] '), ''),
            field('title'));

Plugboard expression:
Template: {#special_title}
Destination field: title

It would be possible to do the above with no custom columns by putting the program into the template box of the
plugboard. However, to do so, all comments must be removed because the plugboard text box does not support
multi-line editing. It is debatable whether the gain of not having the custom column is worth the vast increase in
difficulty caused by the program being one giant line.
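
To make the logic of the program: mode recipe easier to follow, here is a rough Python sketch of the same steps. It is not calibre code: special_title and shorten are hypothetical helpers written for this illustration, and the rule that short values are left intact by shorten is an assumption about how the template function behaves.

```python
import re


def shorten(value, left, middle, right):
    """Keep `left` leading and `right` trailing characters around `middle`.

    Assumption of this sketch: values too short to benefit are returned
    unchanged.
    """
    if len(value) < left + right + len(middle):
        return value
    return value[:left] + middle + value[-right:]


def special_title(series, series_index, title):
    # Strip a leading article, as the first composite column does.
    stripped = re.sub(r'^(A|The|An)\s+', '', series)
    shortened = shorten(stripped, 4, '-', 4)
    # Keep the first letter of every word, as the initials column does.
    initials = re.sub(r'[^\w]*(\w?)[^\s]+(\s|$)', r'\1', stripped)

    # Empty when there is no series, otherwise zero-padded to two digits.
    s_index = format(series_index, '0>2.0f') if series else ''

    # Mirror of switch(): the first pattern that matches wins.
    if re.search(r'.\s', stripped):
        prefix = initials
    elif re.search(r'.', stripped):
        prefix = shortened
    else:
        prefix = series

    bracket = ' [' + s_index + '] ' if s_index else ''
    return prefix + bracket + title


print(special_title('The Lord of the Rings', 2, 'The Two Towers'))
print(special_title('Berserkers', 4, 'Berserker Throne'))
```

As in the template program, the choice between initials and the shortened form depends only on whether the stripped series name contains whitespace.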

User-defined Template Functions

You can add your own functions to the template processor. Such functions are written in python, and can be used
in any of the three template programming modes. The functions are added by going to Preferences->Advanced->
Template Functions. Instructions are shown in that dialog.

Special notes for save/send templates

Special processing is applied when a template is used in a save to disk or send to device template. The values of the
fields are cleaned, replacing characters that are special to file systems with underscores, including slashes. This means
that field text cannot be used to create folders. However, slashes are not changed in prefix or suffix strings, so slashes
in these strings will cause folders to be created. Because of this, you can create variable-depth folder structure.

For example, assume we want the folder structure series/series_index - title, with the caveat that if series does not
exist, then the title should be in the top folder. The template to do this is:

{series:||/}{series_index:|| - }{title}

The slash and the hyphen appear only if series is not empty.

The lookup function lets us do even fancier processing. For example, assume that if a book has a series, then we want
the folder structure series/series_index - title.fmt. If the book does not have a series, then we want the folder structure
genre/author_sort/title.fmt. If the book has no genre, we want to use ‘Unknown’. We want two completely different
paths, depending on the value of series.

To accomplish this, we:

1. Create a composite field (call it AA) containing {series}/{series_index} - {title}. If the
   series is not empty, then this template will produce series/series_index - title.
2. Create a composite field (call it BB) containing {#genre:ifempty(Unknown)}/{author_sort}/{title}.
   This template produces genre/author_sort/title, where an empty genre is replaced with Unknown.
3. Set the save template to {series:lookup(.,AA,BB)}. This template chooses composite field AA if
   series is not empty, and composite field BB if series is empty. We therefore have two completely different
   save paths, depending on whether or not series is empty.
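
The effect of the two composite fields and the lookup can be sketched in Python as follows. This is only an illustration: save_path and clean are hypothetical names, series_index is passed in already formatted, and clean replaces nothing but slashes, whereas calibre's real cleaning step covers more characters.

```python
def clean(value):
    # Simplified stand-in for the cleaning step: only slashes are replaced,
    # so text coming from a field can never create a folder.
    return value.replace('/', '_').replace('\\', '_')


def save_path(title, author_sort, series='', series_index='', genre=''):
    title, author_sort = clean(title), clean(author_sort)
    series, genre = clean(series), clean(genre)
    if series:
        # Composite field AA: series/series_index - title
        return series + '/' + series_index + ' - ' + title
    # Composite field BB: genre/author_sort/title, with a fallback genre
    return (genre or 'Unknown') + '/' + author_sort + '/' + title


print(save_path('Dune Messiah', 'Herbert, Frank', series='Dune', series_index='2'))
print(save_path('Solaris', 'Lem, Stanislaw'))
```

The slashes that build the folder structure come from the literal parts of the templates, never from the field values themselves.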

Templates and Plugboards

Plugboards are used for changing the metadata written into books during send-to-device and save-to-disk operations.
A plugboard permits you to specify a template to provide the data to write into the book’s metadata. You can use
plugboards to modify the following fields: authors, author_sort, language, publisher, tags, title, title_sort. This feature
helps people who want to use different metadata in books on devices to solve sorting or display issues.

When you create a plugboard, you specify the format and device for which the plugboard is to be used. A special
device is provided, save_to_disk, that is used when saving formats (as opposed to sending them to a device). Once
you have chosen the format and device, you choose the metadata fields to change, providing templates to supply the
new values. These templates are connected to their destination fields, hence the name plugboards. You can, of course,
use composite columns in these templates.

When a plugboard might apply (content server, save to disk, or send to device), calibre searches the defined plugboards
to choose the correct one for the given format and device. For example, to find the appropriate plugboard for an EPUB
book being sent to an ANDROID device, calibre searches the plugboards using the following search order:

• a plugboard with an exact match on format and device, e.g., EPUB and ANDROID
• a plugboard with an exact match on format and the special any device choice, e.g., EPUB and any device
• a plugboard with the special any format choice and an exact match on device, e.g., any format and
  ANDROID
• a plugboard with any format and any device
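
The search order above amounts to trying four keys from most to least specific. A minimal Python sketch, assuming plugboards are kept in a dictionary keyed by (format, device); that storage layout, the find_plugboard name and the example entries are all made up for this illustration:

```python
ANY_FORMAT = 'any format'
ANY_DEVICE = 'any device'


def find_plugboard(plugboards, fmt, device):
    """Return the most specific plugboard for (fmt, device), or None."""
    search_order = [
        (fmt, device),             # exact match on both
        (fmt, ANY_DEVICE),         # exact format, any device
        (ANY_FORMAT, device),      # any format, exact device
        (ANY_FORMAT, ANY_DEVICE),  # catch-all
    ]
    for key in search_order:
        if key in plugboards:
            return plugboards[key]
    return None


# Hypothetical plugboards mapping a destination field to a template.
example = {
    ('EPUB', ANY_DEVICE): {'title': '{series} - {title}'},
    (ANY_FORMAT, 'ANDROID'): {'title': '{title}'},
}

print(find_plugboard(example, 'EPUB', 'ANDROID'))
print(find_plugboard(example, 'MOBI', 'ANDROID'))
```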

The tags and authors fields have special treatment, because both of these fields can hold more than one item. A book
can have many tags and many authors. When you specify that one of these two fields is to be changed, the template’s
result is examined to see if more than one item is there. For tags, the result is cut apart wherever calibre finds a comma.
For example, if the template produces the value Thriller, Horror, then the result will be two tags, Thriller
and Horror. There is no way to put a comma in the middle of a tag.

The same thing happens for authors, but using a different character for the cut, a & (ampersand) instead of a comma.
For example, if the template produces the value Blogs, Joe&Posts, Susan, then the book will end up with two
authors, Blogs, Joe and Posts, Susan. If the template produces the value Blogs, Joe;Posts, Susan,
then the book will have one author with a rather strange name.
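
The two splitting rules can be illustrated with a few lines of Python. The function names are hypothetical, and trimming the whitespace around each item is an assumption of this sketch rather than something stated above.

```python
def split_tags(result):
    # Tags are cut apart at every comma.
    return [t.strip() for t in result.split(',') if t.strip()]


def split_authors(result):
    # Authors are cut apart at every ampersand, so commas survive.
    return [a.strip() for a in result.split('&') if a.strip()]


print(split_tags('Thriller, Horror'))
print(split_authors('Blogs, Joe&Posts, Susan'))
print(split_authors('Blogs, Joe;Posts, Susan'))
```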

Plugboards affect the metadata written into the book when it is saved to disk or written to the device. Plugboards
do not affect the metadata used by save to disk and send to device to create the file names. Instead, file
names are constructed using the templates entered on the appropriate preferences window.

Helpful Tips

You might find the following tips useful.

• Create a custom composite column to test templates. Once you have the column, you can change its template
  simply by double-clicking on the column. Hide the column when you are not testing.
• Templates can use other templates by referencing a composite custom column.
• In a plugboard, you can set a field to empty (or whatever is equivalent to empty) by using the special template
  {}. This template will always evaluate to an empty string.
• The technique described above to show numbers even if they have a zero value works with the standard field
  series_index.

1.9.4 All about using regular expressions in calibre

Regular expressions are features used in many places in calibre to perform sophisticated manipulation of ebook content
and metadata. This tutorial is a gentle introduction to getting you started with using regular expressions in calibre.

Contents
• First, a word of warning and a word of courage
• Where in calibre can you use regular expressions?
• What on earth is a regular expression?
• Care to explain?
• That doesn’t sound too bad. What’s next?
• Hey, neat! This is starting to make sense!
• Well, these special characters are very neat and all, but what if I wanted to match a dot or a question
  mark?
• So, what are the most useful sets?
• But if I had a few varying strings I wanted to match, things get complicated?
• You missed...
• In the beginning, you said there was a way to make a regular expression case insensitive?
• I think I’m beginning to understand these regular expressions now... how do I use them in calibre?
  – Conversions
  – Adding books
  – Bulk editing metadata
• Credits

First, a word of warning and a word of courage

This is, inevitably, going to be somewhat technical - after all, regular expressions are a technical tool for doing technical
stuff. I’m going to have to use some jargon and concepts that may seem complicated or convoluted. I’m going to try
to explain those concepts as clearly as I can, but really can’t do without using them at all. That being said, don’t be
discouraged by any jargon, as I’ve tried to explain everything new. And while regular expressions themselves may
seem like an arcane, black magic (or, to be more prosaic, a random string of mumbo-jumbo letters and signs), I promise
that they are not all that complicated. Even those who understand regular expressions really well have trouble reading
the more complex ones, but writing them isn’t as difficult - you construct the expression step by step. So, take a step
and follow me into the rabbit hole.

Where in calibre can you use regular expressions?

There are a few places calibre uses regular expressions. There’s the Search & Replace in conversion options, metadata
detection from filenames in the import settings and Search & Replace when editing the metadata of books in bulk. The
calibre book editor can also use regular expressions in its search and replace feature.

What on earth is a regular expression?

A regular expression is a way to describe sets of strings. A single regular expression can match a number of different
strings. This is what makes regular expressions so powerful – they are a concise way of describing a potentially large
number of variations.

Note: I’m using string here in the sense it is used in programming languages: a string of one or more characters,
characters including actual characters, numbers, punctuation and so-called whitespace (linebreaks, tabulators etc.).
Please note that generally, uppercase and lowercase characters are not considered the same, thus “a” being a different
character from “A” and so forth. In calibre, regular expressions are case insensitive in the search bar, but not in the
conversion options. There’s a way to make every regular expression case insensitive, but we’ll discuss that later. It gets
complicated because regular expressions allow for variations in the strings they match, so one expression can match
multiple strings, which is why people bother using them at all. More on that in a bit.

Care to explain?

Well, that’s why we’re here. First, this is the most important concept in regular expressions: A string by itself is a
regular expression that matches itself. That is to say, if I wanted to match the string "Hello, World!" using
a regular expression, the regular expression to use would be Hello, World!. And yes, it really is that simple.
You’ll notice, though, that this only matches the exact string "Hello, World!", not e.g. "Hello, wOrld!" or
"hello, world!" or any other such variation.

That doesn’t sound too bad. What’s next?

Next is the beginning of the really good stuff. Remember where I said that regular expressions can match multiple
strings? This is where it gets a little more complicated. Say, as a somewhat more practical exercise, the ebook you
wanted to convert had a nasty footer counting the pages, like “Page 5 of 423”. Obviously the page number would rise
from 1 to 423, thus you’d have to match 423 different strings, right? Wrong, actually: regular expressions allow you
to define sets of characters that are matched: To define a set, you put all the characters you want to be in the set into
square brackets. So, for example, the set [abc] would match either the character “a”, “b” or “c”. Sets will always
only match one of the characters in the set. They “understand” character ranges, that is, if you wanted to match all the
lowercase characters, you’d use the set [a-z], for lower- and uppercase characters you’d use [a-zA-Z] and so on.
Got the idea? So, obviously, using the expression Page [0-9] of 423 you’d be able to match the first 9 pages,
thus reducing the expressions needed to three: The second expression Page [0-9][0-9] of 423 would match
all two-digit page numbers, and I’m sure you can guess what the third expression would look like. Yes, go ahead.
Write it down.
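
If you want to try sets outside calibre, Python's re module (the documentation the tutorial points to at the end) behaves the same way for these expressions. The sketch below is only an illustration; is_footer is a hypothetical helper, and it includes one possible answer to the three-digit exercise.

```python
import re

one_digit = re.compile(r'Page [0-9] of 423')
two_digits = re.compile(r'Page [0-9][0-9] of 423')
three_digits = re.compile(r'Page [0-9][0-9][0-9] of 423')


def is_footer(text):
    # fullmatch() requires the whole string to match the expression.
    patterns = (one_digit, two_digits, three_digits)
    return any(p.fullmatch(text) is not None for p in patterns)


print(is_footer('Page 5 of 423'))
print(is_footer('Page 42 of 423'))
print(is_footer('Page 1234 of 423'))
# A set matches exactly one character out of the set.
print(re.fullmatch(r'[abc]', 'b') is not None)
print(re.fullmatch(r'[abc]', 'ab') is not None)
```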

Hey, neat! This is starting to make sense!

I was hoping you’d say that. But brace yourself, now it gets even better! We just saw that using sets, we could match
one of several characters at once. But you can even repeat a character or set, reducing the number of expressions
needed to handle the above page number example to one. Yes, ONE! Excited? You should be! It works like this:
Some so-called special characters, “+”, “?” and “*”, repeat the single element preceding them. (Element means either
a single character, a character set, an escape sequence or a group (we’ll learn about those last two later) - in short, any
single entity in a regular expression.) These characters are called wildcards or quantifiers. To be more precise, “?”
matches 0 or 1 of the preceding element, “*” matches 0 or more of the preceding element and “+” matches 1 or more
of the preceding element. A few examples: The expression a? would match either “” (which is the empty string, not
strictly useful in this case) or “a”, the expression a* would match “”, “a”, “aa” or any number of a’s in a row, and,
finally, the expression a+ would match “a”, “aa” or any number of a’s in a row (Note: it wouldn’t match the empty
string!). Same deal for sets: The expression [0-9]+ would match every integer number there is! I know what you’re
thinking, and you’re right: If you use that in the above case of matching page numbers, wouldn’t that be the single one
expression to match all the page numbers? Yes, the expression Page [0-9]+ of 423 would match every page
number in that book!
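
The three quantifiers can be checked quickly with Python's re module. This is a small sketch for experimentation, not part of calibre; matches is a hypothetical helper that tests whether an entire string fits an expression.

```python
import re


def matches(pattern, text):
    """True when the whole of `text` is matched by `pattern`."""
    return re.fullmatch(pattern, text) is not None


print(matches(r'a?', ''))     # "?" allows zero occurrences
print(matches(r'a?', 'aa'))   # but not two
print(matches(r'a*', 'aaa'))  # "*" allows any number, including none
print(matches(r'a+', ''))     # "+" needs at least one "a"
print(matches(r'Page [0-9]+ of 423', 'Page 5 of 423'))
print(matches(r'Page [0-9]+ of 423', 'Page 423 of 423'))
```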

Note: A note on these quantifiers: They generally try to match as much text as possible, so be careful when using
them. This is called “greedy behaviour” - I’m sure you get why. It gets problematic when you, say, try to match a tag.
Consider, for example, the string "<p class="calibre2">Title here</p>" and let’s say you’d want to
match the opening tag (the part between the first pair of angle brackets, a little more on tags later). You’d think that the
expression <p.*> would match that tag, but actually, it matches the whole string! (The character “.” is another special
character. It matches anything except linebreaks, so, basically, the expression .* would match any single line you can
think of.) Instead, try using <p.*?> which makes the quantifier "*" non-greedy. That expression would only match
the first opening tag, as intended. There’s actually another way to accomplish this: The expression <p[^>]*> will
match that same opening tag - you’ll see why after the next section. Just note that there quite frequently is more than
one way to write a regular expression.
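
The difference between the greedy and non-greedy forms is easy to see by running the three expressions from this note against the example string, here using Python's re module as a stand-in for calibre's regular expression fields:

```python
import re

html = '<p class="calibre2">Title here</p>'

greedy = re.search(r'<p.*>', html).group(0)      # runs to the last ">"
lazy = re.search(r'<p.*?>', html).group(0)       # stops at the first ">"
negated = re.search(r'<p[^>]*>', html).group(0)  # cannot cross a ">"

print(greedy)
print(lazy)
print(negated)
```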

Well, these special characters are very neat and all, but what if I wanted to match a dot or a question mark?

You can of course do that: Just put a backslash in front of any special character and it is interpreted as the literal
character, without any special meaning. This pair of a backslash followed by a single character is called an escape
sequence, and the act of putting a backslash in front of a special character is called escaping that character. An escape
sequence is interpreted as a single element. There are of course escape sequences that do more than just escaping
special characters, for example "\t" means a tabulator. We’ll get to some of the escape sequences later. Oh, and by
the way, concerning those special characters: Consider any character we discuss in this introduction as having some
function to be special and thus needing to be escaped if you want the literal character.
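
A short Python sketch of what escaping changes. The sample text is made up, and re.escape is a convenience of Python's re module for escaping a whole string at once; it is mentioned here only as an aid for experimenting, not as a calibre feature.

```python
import re

text = 'Is it 3.5 or 345?'

unescaped = re.findall(r'3.5', text)  # "." matches any character
escaped = re.findall(r'3\.5', text)   # "\." matches only a literal dot
question = re.findall(r'\?', text)    # a literal question mark
auto = re.escape('3.5?')              # let Python add the backslashes

print(unescaped)
print(escaped)
print(question)
print(auto)
```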

So, what are the most useful sets?

Knew you’d ask. Some useful sets are [0-9] matching a single number, [a-z] matching a single lowercase letter,
[A-Z] matching a single uppercase letter, [a-zA-Z] matching a single letter and [a-zA-Z0-9] matching a single
letter or number. You can also use an escape sequence as shorthand:

\d is equivalent to [0-9]
\w is equivalent to [a-zA-Z0-9_]
\s is equivalent to any whitespace

Note: “Whitespace” is a term for anything that won’t be printed. These characters include space, tabulator, line feed,
form feed and carriage return.

As a last note on sets, you can also define a set as any character but those in the set. You do that by including the
character "^" as the very first character in the set. Thus, [^a] would match any character excluding “a”. That’s
called complementing the set. Those escape sequence shorthands we saw earlier can also be complemented: "\D"
means any non-number character, thus being equivalent to [^0-9]. The other shorthands can be complemented by,
you guessed it, using the respective uppercase letter instead of the lowercase one. So, going back to the example
<p[^>]*> from the previous section, now you can see that the character set it’s using tries to match any character
except for a closing angle bracket.
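
The shorthands and their complements can be tried out as below. One caveat when experimenting in Python 3: \d and \w also match non-ASCII digits and letters by default, so this sketch passes re.ASCII to get exactly the sets listed above. The sample strings are made up.

```python
import re

sample = 'Page 12\tof 423'

digits = re.findall(r'\d+', sample, flags=re.ASCII)      # runs of [0-9]
words = re.findall(r'\w+', sample, flags=re.ASCII)       # runs of [a-zA-Z0-9_]
non_digits = re.findall(r'\D+', sample, flags=re.ASCII)  # runs of [^0-9]
not_a = re.findall(r'[^a]', 'banana')                    # complemented set

print(digits)
print(words)
print(non_digits)
print(not_a)
```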

But if I had a few varying strings I wanted to match, things get complicated?

Fear not, life still is good and easy. Consider this example: The book you’re converting has “Title” written on every
odd page and “Author” written on every even page. Looks great in print, right? But in ebooks, it’s annoying. You
can group whole expressions in normal parentheses, and the character "|" will let you match either the expression
to its right or the one to its left. Combine those and you’re done. Too fast for you? Okay, first off, we group the
expressions for odd and even pages, thus getting (Title) and (Author) as our two needed expressions. Now we
make things simpler by using the vertical bar ("|" is called the vertical bar character): If you use the expression
(Title|Author) you’ll either get a match for “Title” (on the odd pages) or you’d match “Author” (on the even
pages). Well, wasn’t that easy?

You can, of course, use the vertical bar without using grouping parentheses, as well. Remember when I said that
quantifiers repeat the element preceding them? Well, the vertical bar works a little differently: The expression
“Title|Author” will also match either the string “Title” or the string “Author”, just as the above example using grouping.
The vertical bar selects between the entire expression preceding and following it. So, if you wanted to match the
strings “Calibre” and “calibre” and wanted to select only between the upper- and lowercase “c”, you’d have to use
the expression (c|C)alibre, where the grouping ensures that only the “c” will be selected. If you were to use
c|Calibre, you’d get a match on the string “c” or on the string “Calibre”, which isn’t what we wanted. In short: If
in doubt, use grouping together with the vertical bar.
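
The effect of grouping on the vertical bar can be verified with Python's re module; this sketch simply runs the two expressions discussed above against the same inputs.

```python
import re

grouped = re.compile(r'(c|C)alibre')  # choice applies to the first letter only
ungrouped = re.compile(r'c|Calibre')  # choice is between "c" and "Calibre"

for text in ('calibre', 'Calibre', 'c'):
    print(text,
          grouped.fullmatch(text) is not None,
          ungrouped.fullmatch(text) is not None)
```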

You missed...

... wait just a minute, there’s one last, really neat thing you can do with groups. If you have a group that you previously
matched, you can use references to that group later in the expression: Groups are numbered starting with 1, and you
reference them by escaping the number of the group you want to reference, thus, the fifth group would be referenced
as \5. So, if you searched for ([^ ]+) \1 in the string “Test Test”, you’d match the whole string!
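
A quick check of the back-reference example in Python. The second input is a made-up counterexample showing that \1 must repeat exactly what the first group matched.

```python
import re

repeated = re.compile(r'([^ ]+) \1')

match = repeated.fullmatch('Test Test')
print(match is not None)
print(match.group(1))
# "Text" is not what group 1 captured, so the reference fails.
print(repeated.fullmatch('Test Text') is not None)
```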

In the beginning, you said there was a way to make a regular expression case insensitive?

Yes, I did, thanks for paying attention and reminding me. You can tell calibre how you want certain things handled
by using something called flags. You include flags in your expression by using the special construct (?flags go
here) where, obviously, you’d replace “flags go here” with the specific flags you want. For ignoring case, the flag
is i, thus you include (?i) in your expression. Thus, test(?i) would match “Test”, “tEst”, “TEst” and any case
variation you could think of.

Another useful flag lets the dot match any character at all, including the newline, the flag s. If you want to use multiple
flags in an expression, just put them in the same statement: (?is) would ignore case and make the dot match all. It
doesn’t matter which flag you state first, (?si) would be equivalent to the above. By the way, good places for putting
flags in your expression would be either the very beginning or the very end. That way, they don’t get mixed up with
anything else.
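
Inline flags can be tried in Python as sketched below. The flags are placed at the very beginning here because Python 3.11 and later reject global flags anywhere else in the expression; how strictly a given calibre release treats flag placement depends on the Python version it is built on. The sample strings are made up.

```python
import re

ignore_case = re.compile(r'(?i)test')
dot_all = re.compile(r'(?is)<p>.*</p>')
case_only = re.compile(r'(?i)<p>.*</p>')

two_lines = '<P>one\ntwo</p>'

print(ignore_case.fullmatch('TEst') is not None)
print(dot_all.fullmatch(two_lines) is not None)
# Without the s flag the dot stops at the line break.
print(case_only.fullmatch(two_lines) is not None)
```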

I think I’m beginning to understand these regular expressions now... how do I use them in calibre?

Conversions

Let’s begin with the conversion settings, which is really neat. In the Search and Replace part, you can input a regexp
(short for regular expression) that describes the string that will be replaced during the conversion. The neat part is the
wizard. Click on the wizard staff and you get a preview of what calibre “sees” during the conversion process. Scroll
down to the string you want to remove, select and copy it, paste it into the regexp field on top of the window. If there
are variable parts, like page numbers or so, use sets and quantifiers to cover those, and while you’re at it, remember to
escape special characters, if there are some. Hit the button labeled Test and calibre highlights the parts it would replace
were you to use the regexp. Once you’re satisfied, hit OK and convert. Be careful if your conversion source has tags
like this example:

Maybe, but the cops feel like you do, Anita. What's one more dead vampire?
New laws don't change that. </p>
<p class="calibre4"> <b class="calibre2">Generated by ABC Amber LIT Conv
<a href="http://www.processtext.com/abclit.html" class="calibre3">erter,
http://www.processtext.com/abclit.html</a></b></p>
<p class="calibre4"> It had only been two years since Addison v. Clark.
The court case gave us a revised version of what life was

(shamelessly ripped out of this thread [78]). You’d have to remove some of the tags as well. In this example, I’d
recommend beginning with the tag <b class="calibre2">, now you have to end with the corresponding closing
tag (opening tags are <tag>, closing tags are </tag>), which is simply the next </b> in this case. (Refer to a
good HTML manual or ask in the forum if you are unclear on this point.) The opening tag can be described using
<b.*?>, the closing tag using </b>, thus we could remove everything between those tags using <b.*?>.*?</b>.
But using this expression would be a bad idea, because it removes everything enclosed by <b>-tags (which, by the
way, render the enclosed text in bold print), and it’s a fair bet that we’ll remove portions of the book in this way.
Instead, include the beginning of the enclosed string as well, making the regular expression
<b.*?>\s*Generated\s+by\s+ABC\s+Amber\s+LIT.*?</b> The \s with quantifiers are included here
instead of explicitly using the spaces as seen in the string to catch any variations of the string that might occur.
Remember to check what calibre will remove to make sure you don’t remove any portions you want to keep if you test
a new expression. If you only check one occurrence, you might miss a mismatch somewhere else in the text. Also
note that should you accidentally remove more or fewer tags than you actually wanted to, calibre tries to repair the
damaged code after doing the removal.
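
To see why the broader expression is risky, the two candidates can be compared in Python. This is only a sketch of the idea, not what calibre runs internally: the markup is joined onto a single line for simplicity, and the <b>two</b> in the second paragraph is a made-up piece of legitimate bold text added for the comparison.

```python
import re

html = (
    '<p class="calibre4"> <b class="calibre2">Generated by ABC Amber LIT Conv'
    '<a href="http://www.processtext.com/abclit.html" class="calibre3">erter, '
    'http://www.processtext.com/abclit.html</a></b></p>'
    '<p class="calibre4"> It had only been <b>two</b> years.</p>'
)

too_broad = r'<b.*?>.*?</b>'
targeted = r'<b.*?>\s*Generated\s+by\s+ABC\s+Amber\s+LIT.*?</b>'

broad_result = re.sub(too_broad, '', html)
targeted_result = re.sub(targeted, '', html)

print(broad_result)     # every bold element is gone, wanted or not
print(targeted_result)  # only the converter notice is gone
```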

Adding books

Another thing you can use regular expressions for is to extract metadata from filenames. You can find this feature in
the “Adding books” part of the settings. There’s a special feature here: You can use field names for metadata fields, for
example (?P<title>) would indicate that calibre uses this part of the string as book title. The allowed field names
are listed in the window, together with another nice test field. An example: Say you want to import a whole bunch
of files named like Classical Texts: The Divine Comedy by Dante Alighieri.mobi. (Obviously, this is already in
your library, since we all love classical Italian poetry) or Science Fiction epics: The Foundation Trilogy by
Isaac Asimov.epub. This is obviously a naming scheme that calibre won’t extract any meaningful data out of - its
standard expression for extracting metadata is (?P<title>.+) - (?P<author>[^_]+). A regular expression that
works here would be [a-zA-Z]+: (?P<title>.+) by (?P<author>.+). Please note that, inside the group for the
metadata field, you need to use expressions to describe what the field actually matches. And also note that, when
using the test field calibre provides, you need to add the file extension to your testing filename, otherwise you won’t
get any matches at all, despite using a working expression.
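
Named groups work the same way in Python, which makes it easy to try an expression before entering it in calibre. In the sketch below, parse_filename is a hypothetical helper, and dropping the file extension before matching is an assumption of this sketch, not a description of how calibre processes filenames.

```python
import os
import re

pattern = re.compile(r'[a-zA-Z]+: (?P<title>.+) by (?P<author>.+)')


def parse_filename(filename):
    # Assumption of this sketch: the extension is dropped before matching.
    stem = os.path.splitext(filename)[0]
    match = pattern.search(stem)
    if match is None:
        return None
    # groupdict() maps each named group to the text it matched.
    return match.groupdict()


print(parse_filename('Classical Texts: The Divine Comedy by Dante Alighieri.mobi'))
print(parse_filename('Science Fiction epics: The Foundation Trilogy by Isaac Asimov.epub'))
```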

Bulk editing metadata

The last part is regular expression search and replace in metadata fields. You can access this by selecting multiple
books in the library and using bulk metadata edit. Be very careful when using this last feature, as it can do Very
Bad Things to your library! Doublecheck that your expressions do what you want them to using the test fields, and
only mark the books you really want to change! In the regular expression search mode, you can search in one field,
replace the text with something and even write the result into another field. A practical example: Say your library
contained the books of Frank Herbert’s Dune series, named after the fashion Dune 1 - Dune, Dune 2 - Dune
Messiah and so on. Now you want to get Dune into the series field. You can do that by searching for
(.*?) \d+ - .* in the title field and replacing it with \1 in the series field. See what I did there? That’s a reference
to the first group you’re replacing the series field with. Now that you have the series all set, you only need to do
another search for .*? - in the title field and replace it with "" (an empty string), again in the title field, and your
metadata is all neat and tidy. Isn’t that great? By the way, instead of replacing the entire field, you can also append or
prepend to the field, so, if you wanted the book title to be prepended with series info, you could do that as well. As you
by now have undoubtedly noticed, there’s a checkbox labeled Case sensitive, so you won’t have to use flags to select
behaviour here.
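
The two search-and-replace passes can be rehearsed in Python before touching a library. This is a sketch of the idea only; split_series is a hypothetical helper, and the second expression is written with its trailing space (.*? - followed by a space) so that nothing is left in front of the remaining title.

```python
import re


def split_series(title):
    # First pass: everything before the number becomes the series.
    series = re.sub(r'(.*?) \d+ - .*', r'\1', title)
    # Second pass: strip the "series number - " prefix from the title.
    new_title = re.sub(r'.*? - ', '', title)
    return series, new_title


print(split_series('Dune 1 - Dune'))
print(split_series('Dune 2 - Dune Messiah'))
```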

Well, that just about concludes the very short introduction to regular expressions. Hopefully I’ll have shown you
enough to at least get you started and to enable you to continue learning by yourself - a good starting point would be
the Python documentation for regexps [79].

[78] http://www.mobileread.com/forums/showthread.php?t=75594
[79] https://docs.python.org/2/library/re.html