Test-Driven Synthesis
DanielPerelman
UniversityofWashington
perelman@cs.washington.edu
SumitGulwani
MicrosoftResearch
sumitg@microsoft.com
DanGrossman
UniversityofWashington
djg@cs.washington.edu
PeterProvost
MicrosoftCorporation
peterpr@microsoft.com
Abstract
Programming-by-exampletechnologiesempowerend-userstocre-
atesimpleprogramsmerelyby providinginput/output examples.
Existing systems are designed around solvers specialized for a
specificset ofdatatypesordomain-specific language (DSL).We
presentaprogramsynthesizerwhichcanbeparameterizedbyanar-
bitraryDSLthatmaycontainconditionalsandloopsandtherefore
isabletosynthesizeprogramsinanydomain.Inordertouseour
synthesizer, the userprovidesa sequence ofincreasinglysophis-
ticated input/output examples alongwith an expert-written DSL
definition.Thesetwoinputscorrespondtothe twokeyideasthat
allowoursynthesizertoworkinarbitrarydomains.First,wedevel-
opeda noveliterativesynthesistechniqueinspiredbytest-driven
development—which alsogivesour technique the name of test-
drivensynthesis—wheretheinput/outputexamplesareconsumed
oneata timeastheprogramisrefined.Second,theDSL allows
oursystemtotakeanefficientcomponent-basedapproachtoenu-
meratingpossibleprograms. We present applicationsofoursyn-
thesismethodologytoend-userprogrammingfortransformations
overstrings,XML,andtablelayouts.Wecompareoursynthesizer
ontheseapplicationstostate-of-the-art DSL-specificsynthesizers
aswelltothegeneralpurposesynthesizerSketch.
CategoriesandSubject Descriptors D.1.2[ProgrammingTech-
niques]:AutomaticProgramming—Programsynthesis
GeneralTerms Languages,Experimentation
Keywords programsynthesis,end-userprogramming,testdriven
development
1. Introduction
Programming-by-example(PBE[4,20])empowersend-userswith-
out programming experience to automate tasks like normaliz-
WorkdoneduringtwointershipsatMicrosoftResearch
Permissiontomakedigitalorhardcopiesofpartorallofthisworkforpersonalor
classroomuseisgrantedwithoutfeeprovidedthatcopiesarenotmadeordistributed
forprofitorcommercialadvantageandthatcopiesbearthisnoticeandthefullcitation
onthefirstpage.Copyrightsforthird-partycomponentsofthisworkmustbehonored.
Forallotheruses,contacttheowner/author(s).
PLDI’14, June09–11,2014,Edinburgh,UnitedKingdom.
Copyrightisheldbytheowner/author(s).
ACM978-1-4503-2784-8/14/06.
http://dx.doi.org/10.1145/2594291.2594297
ingtable layoutsorreformattingstringsmerelyby providingin-
put/outputexamples.PresentPBEtechnologiesaredesignedfora
specific setof datatypesusuallyeither byusing anSMTsolver
orsimilartechnology[30,33]andbeinglimitedtoefficientlyhan-
dlingonlythosedatatypeswhoseoperationsmapwelltotheSMT
solver’stheoriesorbycreatingadomain-specificlanguage(DSL)
forthetaskinconcertwithsaprogram synthesisalgorithm capa-
bleofproducingprogramsinthatDSL[8,11,18].LaSy
1
instead
isable to synthesize programsin arbitrarydomainsspecified by
anexpert-writtenDSL.Inordertoefficientlysupportarbitrarydo-
mains, we developeda synthesizer that combinestwo keyideas.
First,weuseanoveliterativesynthesistechniqueinspiredbytest-
drivendevelopment(TDD)[12,22],giving ourmethodologythe
name of t
est-d
rivens
ynthesis(TDS). Second, we have advanced
thestate-of-the-artinDSL-basedsynthesisasouriterativesynthe-
sisonlyshineswhen pairedwith aalgorithm thatcanefficiently
improvetheprogramfromapreviousstep.
Iterativesynthesis Takinginspirationfromthetest-drivendevel-
opment(TDD)[12,22]conceptofiteratingonaprogramthatal-
waysworksforthetestswrittensofar,thebehavioroffunctionsin
LaSyisconstrainedbyasequenceofincreasinglysophisticatedex-
ampleslikethosethatmightbewrittenbyaprogrammerauthoring
afunctionusingtheTDDprogrammingmethodologyordemon-
stratingthefunctionalityviaexamplestoanotherhuman.
PriorPBEsystemstakeasinputasetofrepresentativeexamples
fromwhichaprogramissynthesized(theymaybeconsumedoneat
atime,buttheirorderisnotprioritized).Itisdesirablethattheuser
providesrepresentativeexamples;otherwisethesystemmayendup
over-fittingonasimpleexample,andwouldfailtofindacommon
programthatworksforallexamples[17].Ontheotherhand,when
describingataskpeoplecaneasilyprovideasequenceofexamples
ofincreasingcomplexity—forexample,ina human-readableex-
planation[26]orinamachine-readablesequenceoftestsforTDD.
Thepresentationasasequenceallowsfortheproblemoflearning
thetasktobebrokendownintoaseriesofsub-problemsoflearning
slightlymoredetailaboutthetaskafterseeingeachnewexample—
thisisoneofthekeyinsightsofthispaper.Whilethisbearssome
similaritytogeneticprogramming[23],oursystem doesnotrely
onanyconceptofafunctionalmostorapproximatelymatchingan
example.Byworkinginaseriesofstepswhereaftereachstepthe
resultisaprogramthatiscorrectforasubsetoftheinputspace,our
systemisabletomore effectivelynarrowdownthesearchspace
thanpriorwork.
1
La
nguageforSy
nthesis,pronounced“lazy”
Add image to pdf preview - insert images into PDF in C#.net, ASP.NET, MVC, Ajax, WinForms, WPF
Sample C# code to add image, picture, logo or digital photo into PDF document page using PDF page editor control
add photo to pdf; add picture to pdf preview
Add image to pdf preview - VB.NET PDF insert image library: insert images into PDF in vb.net, ASP.NET, MVC, Ajax, WinForms, WPF
Guide VB.NET Programmers How to Add Images in PDF Document
adding an image to a pdf in acrobat; add image to pdf acrobat reader
DSL-basedsynthesis Theunderlyingsynthesistechnique,which
werefertoasDBSforD
SL-b
aseds
ynthesis,isparameterizedbya
DSL allowingit tobespecializedtodifferentdomainswithmin-
imaleffort.Similartohowaparsergeneratorisgivenalanguage
definitiontomakeaparserforthatlanguage,ourgeneral-purpose
synthesizerisgivenaDSLdefinitiontomakeasynthesizerforthat
DSL.A DSL primarilyconsistsof a context-free grammar over
DSL-definedfunctions. Additionally,theDSLmayuse synthesis
strategiesweprovide forlearningconstructsamenableto special
reasoninglikeconditionalsandcertainformsofloops.Expertsmay
providefurtherextensions, but that isoutofscope forthispaper
aswe focusonthestrengthsofDBSandourtest-drivensynthesis
methodologywithoutthecomplicationofadditionaluser-provided
synthesisstrategies.
Asit ispart of aniterative synthesisprocedure,DBSisgiven
apreviousprogramthatsatisfiessomeprefixoftheexamplesand
searchesforamodificationthatmakesit satisfyalloftheexam-
ples.Thesemodificationsreplacesomesubexpressionoftheprevi-
ousprogramwithanewexpression.Thegenerationofreplacement
subexpressionsis done bycomponent-basedsynthesis[15]: DBS
considersthesemanticallydistinctexpressionsoftheDSL,build-
ingthemupstartingwiththeatomsintheDSLandtheprevious
program’ssubexpressions.InourDSL-basedapproach,DBSuses
the grammar ofthe DSL asopposedto justthe typesof the ex-
pressionstodecidewhatexpressionstobuild,whichsignificantly
narrowsthesearchspace.
Contributions
1. The problem statement ofsolvinga PBEproblemgivena se-
quenceofincreasinglysophisticatedinput/outputexamplesfor
eachfunctiontobe synthesized, reifiedinourLaSylanguage
(§3).
2. Ournovel test-drivensynthesismethodologyforsolvingsuch
synthesisproblemscombinesiterativesynthesis(§4)andDSL-
basedsynthesis(§5)tobeabletosynthesizeprogramswithnon-
trivialcontrolflowincludingconditionals,loops,andrecursion
inanarbitraryDSL.
3. Beingdomain-agnostic,ourtechniquehasmyriadapplications,
including significant implications for end-user programming;
wesummarizeafewpossiblitieswithconcreteexamplesin§2.
4. Werunexperimentsdemonstratingtheeffectivenessofeachof
ourkeyideasandshowingtheoveralleffectivenessofouralgo-
rithm on important applications including comparing against
state-of-the-art synthesizers in each domain and against the
general purpose synthesizer Sketch[30](§6).Asasource for
moredifficultproblemsforoursystem,particularlyforlongse-
quencesofexamples,wehaditplaythePex4Funprogramming
game[32](§6.1.4).
2. Motivatingexamples
WepresentexamplesdemonstratingthebreadthofprogramsLaSy
isable tosynthesizetohighlight itsdomain-agnosticcapabilities.
§6goesintomoredetailonevaluatingtheperformanceanddesign
decisionsofoursynthesizer.
2.1 AutomatingTDD
Todemonstrateiterativelybuildingupaprogram,weusetheexam-
pleofgreedywordwrap,whoseuseasaTDDpracticeproblemin
[22]wastheoriginalinspirationforourmethodology.Wordwrap
insertsnewlinesintoastringsothatnolineexceedsagivenmax-
imum line length; the greedy versioninsertsnewlinesaslate as
possibleinsteadoftryingtoevenoutthelinelengths.
Fig.1showspartofaLaSyprogramforimplementinggreedy
word wrap, showingonly1 or2 of the upto6 examples under
strings;
string WordWrap(string text, int length);
// Single word doesn’t wrap.
WordWrap("Word", 4) == "Word";
// Two words wrap...
WordWrap("Extremely longWords", 14) ==
"Extremely\nlongWords";
// ... when longer than line.
WordWrap("How are", 76) == "How are";
// Wrap as late as possible...
WordWrap("How are you?", 9) == "How are\nyou?";
// ... but no later.
WordWrap("Hello, how are you today?", 14) ==
"Hello, how are\nyou today?";
// Wrap in middle of word.
WordWrap("Abcdef", 5) == "Abcde\nf";
WordWrap("ThisIsAVeryLongWord a", 15) ==
"ThisIsAVeryLong\nWord a";
// Wrap multiple times (using recursion).
WordWrap("How are you?", 4) == "How\nare\nyou?";
// Complicated test to ensure program is correct.
WordWrap("This is a longer test sentence. a bc",
7) == "This is\na\nlonger\ntest\nsentenc\ne. a bc";
Figure1. AbbreviatedLaSyprogramforgreedywordwrapshow-
ingarepresentativesubsetofthe24testcases
eachcommentforspace reasons.TheLaSyprogrambeginswith
adeclaration sayingtousethe“strings” DSL followed bythe
functionsignatureforwordwrap.Mostoftheprogramisthelistof
examplesthatdictatehowthefunctionshouldbehave.
Theexamplesareinterspersedwithcommentsthatexplainhow
this sequence iteratively builds up a solution with intermediate
steps where it iseasy to describe what subset of the inputs the
programiscorrect for.Exactlywheretobreakthelineisrefined
overseveralexamples:firstitisjustatthespace,thenatthespace
ina stringlongerthanthe maximum line length,thenat the last
space in a long string, then at the last space before (or at) the
maximum line length, only to be refined furthertoward the end
toinclude inserting breaksinthemiddleof wordsthat donot fit
ona singleline. Also,our synthesizer isabletogeneralizefrom
insertingasinglelinebreaktoinsertinganynumberoflinebreaks
viarecursion.
Thehighnumberofexamplesindicatesthisisafairlysophis-
ticatedprogramforPBE;itshouldbenotedthatfulltestcoverage
forTDDtakesasimilarnumberofexamples[22].Weincludeword
wraptoshowLaSyiscapableofscalinguptothatmanyexamples.
2.2 End-userdatatransformations
Enabling (hundreds of millions of) end-usersto construct small
scriptsfordatamanipulationisaveryimportanttopic[1,6].LaSy
can be used to synthesize useful scripts in thisdomain that are
beyondthecapabilitiesofleadingpriorworkinthisarea[6].
Stringtransformations Considerthetaskofconvertingbibliog-
raphyentries(representedasstrings)fromoneformattoanother.
Fig.2 showsanexample of a LaSyprogram for convertingbe-
tweentwosuch formats.Thisprogramisstructured using helper
functionsforrewritingtheauthorslistandformappingvenueab-
breviationstothefullvenuenamesusedinthetargetformat.Un-
likeConvertNameandConvertList,VenueFullNamedoesnot
doanycomputation(thefullnamefor"POPL 2013"cannotbein-
ferredfromthefullnamefor"PLDI 2012"withoutawebsearch),
soitisdeclaredasalookup,meaningthefunctionwilljuststore
thelistofinput/outputexamplesandlookupanyinputsinthatlist
tofindthecorrespondingoutput.
C# WinForms Viewer: Load, View, Convert, Annotate and Edit PDF
Convert PDF to Tiff image (.tif, .tiff). • Convert PDF to HTML (.htm, .html). PDF Annotation. • Add sticky notes to PDF document in preview.
how to add image to pdf in acrobat; add image to pdf form
C# WPF Viewer: Load, View, Convert, Annotate and Edit PDF
PDF to Tiff image (.tif, .tiff). • Convert PDF to HTML (.htm, .html). PDF Annotation. • Add sticky notes to PDF document. • Highlight PDF text in preview.
acrobat add image to pdf; how to add image to pdf file
strings;
string ConvertBib(string oldFormat);
string ConvertName(string fullName);
string ConvertList(string oldFormat);
string VenueFullName(string abbr);
ConvertName("John Smith") == "Smith, J.";
ConvertName("Ann Miller") == "Miller, A.";
ConvertList("John Smith") == "Smith, J.";
ConvertList("Donna Jones, John Smith, Ann
Miller") == "Jones, D.; Smith, J.; Miller, A.";
VenueFullName("PLDI 2012") == "The 33rd ACM
SIGPLAN conference on Programming Language Design and
Implementation, Beijing, June, 2012";
ConvertBib("Reagents: Expressing and Composing
Fine-grained Concurrency, PLDI 2012, Aaron Turon")
== "Reagents: Expressing and Composing Fine-grained
Concurrency\nTuron, A.\nThe 33rd ACM SIGPLAN conference
on Programming Language Design and Implementation,
Beijing, June, 2012";
VenueFullName("CACM 2012") == "Communications of
the ACM, 2012";
ConvertBib("Spreadsheet Data Manipulation using
Examples, CACM 2012, Sumit Gulwani, William Harris,
Rishabh Singh") == "Spreadsheet Data Manipulation
using Examples\nGulwani, S.; Harris, W.; Singh, R.\n
Communications of the ACM, 2012";
Figure2. LaSyprogramforconvertingbibliographyentries
oldXml = "<doc><div id="ch1"> <name="a1">1st
Alinea.</p> <name="a1.1">Zomaar ertussen.</p> <p
name="a2">2nd Alinea.</p> <p name="a3">3rd Alinea.</p>
</div> <div id="ch2"> <name="a1">First Para.</p> <p
name="a2">Second Para.</p> <p name="a2.1">Something
added here.</p> <name="a3">Third Para.</p>
</div></doc>";
xml;
XDocument ToTable(XDocument oldXml);
XElement BuildRow(XDocument oldXml,
string rowName);
BuildRow(oldXml, "a1.1") == "<tr><td>Zomaar
ertussen.</td><td/></tr>";
ToTable(oldXml) == "<table>
<tr><td>1st Alinea.</td><td>First Para.</td></tr>
<tr><td>Zomaar ertussen.</td><td/></tr>
<tr><td>2nd Alinea.</td><td>Second Para.</td></tr>
<tr><td/><td>Something added here.</td></tr> <tr><td>3rd
Alinea.</td><td>Third Para.</td></tr> </table>";
Figure3. LaSyprogramforconvertingsetofXMLliststoatable
XML transformations We showtwo LaSy programs forXML
transformation tasksfound onhelpforums.Fig. 3takesa set of
<div>s containingnamed paragraphsand arranges the data as a
table with a column for each <div> and a row for each name,
sodataislinedupinarowifthesamenameappearsinmultiple
<div>s.Thistransformationrequiresahelperwhichdescribeshow
atable row is built from a paragraph name. Both functionsare
simple enough that they require only a single example. Fig. 4
assignsclassattributestoparagraphs without them according to
the class of the nearest previous paragraph (if present), and is
implementedbythesynthesizerusingaloop.
3. Language
InadditiontotheLaSyprogrammingbyexamplelanguagedemon-
stratedintheprevioussection,whichisexplainedinmoredetailin
XML;
XDocument AddClasses(XDocument oldXml);
AddClasses("<doc> <p>1</p> </doc>") == "<doc>
<p>1</p> </doc>";
AddClasses("<doc> <p>1</p> <class=
a
>2</p>
<p>3</p> <p>4</p> <class=
b
>5</p> <p>6</p> <p
class=
c
>7</p> </doc>")
== "<doc> <p>1</p> <class=
a
>2</p> <class=
a
>3</p>
<class=
a
>4</p> <class=
b
>5</p> <class=
b
>6</p>
<class=
c
>7</p> </doc>";
Figure4. LaSyprogramforaddingclassattributes
P ::=language I; F E
(P
rogram)
F ::=function t f((t x;));
(F
unctiondeclaration)
jlookup t f((t x;));
(Lookupdeclaration)
E::=require f((V,))==V;(E
xample)
V::=anyconstantexpression
(V
alue)
t ::=IjI<(t,)>
(t
ypename)
f ::=I
(f
unctionname)
x ::=I
(variablename)
I ::=anyvalididentifier
(I
dentifer)
Figure5. TheLaSylanguage.
§3.1,wealsodiscussthelanguagethatexpertsusetodefineDSLs
foruseinLaSyin§3.2.
3.1 LaSyprogrammingbyexamplelanguage
Fig.5givesthesyntaxofLaSy.NotethatLaSyreliesonabaselan-
guage,C#inourimplementation,toprovidebasictypes,functions,
andvalues,andthereforetheprecisesyntaxforvaluesandidenti-
fiersisomitted:typereferencesareC#typereferencesandvalues
areC#expressions.
ProgramsinLaSyconsistofasetoffunctiondeclarationsanda
sequenceofexamples.
Language AllLaSyprogramsbeginwithareferencetotheDSL
beingsynthesizedover.Thelanguageisdefinedbeforehandbyan
expertasdescribedbelowin§3.2.
Functions Thefunctiondeclarationslistthefunctionstobesyn-
thesized.Eachfunctionhasanameandtypesignature.
Examplesand semantics Eachexample is afunction call with
literalsgiven for its arguments and its required return value. A
functionfisconsideredtosatisfyanexamplef(V
1
;:::)==V
R
if
wheneverfiscalledwithargumentsthatarestructurallyequivalent
(.Equals()inC#)toV
1
;:::,itreturnsavaluethatisstructurally
equivalenttoV
r
.
Specifically, the examplesare an ordered list dictating a se-
quence ofprogramsynthesisoperationswhereinthe functionthe
examplereferencesismodifiedtosatisfytheexamplewithoutvi-
olatinganypreviousexample.Atthebeginningofthesynthesisof
aLaSyprogram,allfunctionsareempty(andthereforesatisfyno
examples).Thesynthesisprocessisdescribedindetailin§4.
ThesemanticsofLaSyarenaturallyquiteweak:theonlyguar-
anteeisthatifthesynthesissucceeds,thentheexampleswillbesat-
isfied.Thesynthesizerisheavilybiasedtowardsmallerprograms
withfewerconditionalswhichtendtobemoregeneralizable,but
doesnot guaranteeitwillfindthesmallestprogramsatisfyingall
oftheexamples.Whilethesynthesizerimplementationshouldact
inamannerpredictableenoughthattheusercantrustitsprograms
tobereasonable,ultimatelyonlytheusercandetermineifthesyn-
thesizedprogramactuallyfulfillstheuser’sintendedpurpose.
TheresultofthesynthesizerisC#codewhichcanbecompiled
andusedinany.NETprogram,includinganotherLaSyprogram.
How to C#: Preview Document Content Using XDoc.Word
C# DLLs for Word File Preview. Add references: Get Preview From File. You may get document preview image from an existing Word file in C#.net.
add jpg to pdf document; how to add a jpeg to a pdf
How to C#: Preview Document Content Using XDoc.PowerPoint
Add necessary XDoc.PowerPoint DLL libraries into your created C# application as You may get document preview image from an existing PowerPoint file in C#.net.
add jpg to pdf online; how to add a jpg to a pdf
strings;
flashfill.dll
lasy.FlashFill;
P;
P ::=
CONDITIONAL(b, e);
b ::= ||(d, :::, d);
d ::= &&(, :::, );
 ::= m | !(m);
m ::= Match(v, r, k)
;
e ::= Concatenate(f, :::, f)
;
f ::= ConstStr(s) | SubStr(v,p,p) | Loop(w:e)
;
s ::=
CONSTANT;
p ::= Pos(r,r,c) | CPos(k)
;
c ::= k | k*w+k;
k ::=
CONSTANT;
r ::= TokenSeq(T,:::,T) | ;
v ::=
PARAM;
&&(
0, 
1) ==> &&(
1, 
0);
||(d
0, d
0) ==> d
0;
0*w
0+k
0 ==> k
0;
Figure6. TheextendedFlashFillDSL;grammarrulesnotpresent
intheoriginalDSLareinbold.
3.2 DSLdefinitionlanguage
Anexample DSL definitionisgiven in Fig.6.TheDSLgives a
grammar which specifies what programs are possible as well as
whatthesemanticsofthoseprogramsare.Additionally,theDSL
optionallyprovidesafewdifferentkindsofhintstothesynthesizer
thatallowthesynthesizertotakeadvantageofexpertknowledgeof
thesemantics.
Grammar TheDSLisgivenasacontext-freegrammar.Eachline
definesanoptionfora non-terminalgivenonthe left ofthe::=.
For most rules, the right side givesa DSL-defined function and
alist of non-terminalsforthe argumentsto that function. Func-
tionsare.NETfunctionsdefinedintheclassspecifiedatthetopof
thefile.ThesemanticsoftheDSL must befunctional;thatis,all
functionscalledmustbepure.Notethatloopscanbehandledina
purewaybyusinglambdas.Ingeneral awhile loopcanbe writ-
tenusingthe functionWhileLoop(condition, body, final)
= state => condition(state) ? WhileLoop(condition,
body, final)(body(state)) : final(state).
RulesotherthanDSL-definedfunctionsarewritteninall-caps
starting with an underscore to distinguish them. These are used
for a fewdifferent purposes. First, some are items that are not
functionslikeconstants,lambdavariables,andlambdaabstraction.
SomereferenceitemsdependontheLaSyprogram:
PARAMcorre-
spondstoanyparameterofthecorrecttypeand
LASY
FNallows
foracall toanotherLaSyfunction.Those starting withadouble
underscoreareforfunctionswithspecializedsynthesisstrategies.
CONDITIONAL(b, e) means that some number ofif...else
if...elsebranchesareallowedwheretheconditionsmatchthe
non-terminal band the branches match the non-terminal e; this
couldbeaDSL-definedfunctionexceptforthefactthatthesynthe-
sizerisawareofthesemanticsofconditionalsandhasspecialized
logicforlearningthemdescribedin§5.2.Similarly,whilenotused
Algorithm1:TDS(S,L)
input :sequenceofexamplesS,DSLspecificationL
output:aprogramPthatsatisfiesSorFAILURE
1
e allgrammarrulesinL;
2
P
0
?;
3
failuresInARow 0;
4
foreachi 0;:::;jSj 1do
5
ifP
i
(input(S
i
))=output(S
i
)then
6
P
i+1
P
i
;
7
failuresInARow 0;
8
else
9
contexts ;,exprs e[parametersofP
i
;
10
foreachsubexpressionsofP
i
do
11
Addexpr:P
i
[s=expr]tocontexts;
12
Addstoexprs;
13
foreachbranchBofP
i
do
14
foreachsubexpressionsofBdo
15
Addexpr:B[s=expr]tocontexts;
16
P
i+1
DBS(contexts;(S
0
;:::;S
i
);exprs;L;
17
num
branch(P
i
)+failuresInARow);
18
ifP
i+1
isTIMEOUT then
19
P
i+1
P
i
;
20
failuresInARow failuresInARow+1;
21
else
22
failuresInARow 0;
23
iffailuresInARow=0then
24
returnP
jSj
;
25
else
26
returnFAILURE;
in the example,
FOR(i) and
FOREACH(arr) have associated
synthesisstrategiesforcertainformsofloopswhicharedescribed
in§5.3.
Constantvaluegeneration TheDSLmayincludecodetodecide
whatconstantvaluesmayappearintheprogramwhichmaydepend
ontheexamples(not showninthe figure). The simplest logic a
DSL coulduse isthat any values in the examplesmaybe used
as constants in the program. Other DSLs may have cases like
includingonlyregularexpressionsfromapresetlistthatmatchone
oftheinputsor,whensynthesizingXML,extractingthenamesof
thetagsandattributesintheoutputs.
Rewrite rules The
rules allow the DSL designer to
express algebraic identities in their language to help prune the
searchspaceofprogramswithidenticalsemantics.
4. Test-DrivenSynthesis
TheTest-DrivenSynthesismethodologysynthesizesaprogramby
consideringasequenceofexamplesinorderandbuildingupitera-
tivelymorecomplicatedprograms.Wewilldescribethemethodol-
ogyforaLaSyprogramwithonefunction,butiteasilygeneralizes
tomultiplefunctions.§4.1describesthealgorithm indetail.§4.2
detailshowtheiterativenatureof thealgorithmworks.§4.3 dis-
cussestheimportanceoftheorderofexamples.
4.1 Algorithm
TheT
est-D
rivenS
ynthesisalgorithm(TDSinAlgorithm1)synthe-
sizesa program P givena sequence ofexamplesS anda set of
VB.NET PDF File Compress Library: Compress reduce PDF size in vb.
enables compressing and decompressing in preview in ASP.NET to reduce or minimize original PDF document size Reduce image resources: Since images are usually or
adding image to pdf; add image to pdf file
How to C#: Preview Document Content Using XDoc.excel
Add necessary references: RasterEdge.Imaging.Basic.dll. Get Preview From File. You may get document preview image from an existing Excel file in C#.net.
add image to pdf; add photo to pdf file
basecomponents.By“program”wemeanasinglefunctionwitha
specifiedsetofinputparametersandreturntype.By“example”we
meanasetofinputvaluesforthoseparametersandthecorrectout-
putvalue.By“component”wemeananyofthesetofexpressions
knowntothesynthesizerwhichareusedasthebuildingblocksfor
the synthesized program; the base components are the functions
referencedbytheDSLintheLaSyprogrambutcomponentsmay
alsobepartiallyfilled-infunctioncallsorlargerDSLexpressions.
InthespiritofTDD,webuildP up,alittleatatime,toallow
synthesisoflargerprograms. P
i
satisfiesthe first iexamples; its
successorprogramP
i+1
isbuiltbytheD
SL-b
aseds
ynthesis(DBS)
algorithm usingthe first i+1 examplesalong withinformation
fromP
i
.ThepreviousprogramP
i
isusedinthreeways:
1. Itssubexpressionsareaddedtothecomponentset.
2. Contextstosynthesizeinareformedbyremovingeachsubex-
pressionofP
i
oneatatime.
3. The number of branches may only exceed the number of
branches in P
i
if failuresInARow > 0. Newconditionals
areallowedonlyafterfailuresinordertoavoidoverfittingtothe
examplesbycreatingaseparatebranchforeachone.
E
XAMPLE
1 (WalkthroughofTDS). We will use the DSLC ::=
CharAt(S;N)jToUpper(C), S ::= Word(S;N)j
PARAM, N ::=
0j1 where Word(s, n) selects the n
th
word from the string s
and
PARAM is any function parameter and e is the set of all
grammarrulesinthatDSLtodemonstratesynthesizingthefunction
f(a))ToUpper(CharAt(Word(a, 1), 0)):
i=0: S
0
=(a = "Sam Smith";RET =
S
).exprs =e[fag
(the whole languageplustheparametera)andcontexts =
fg (the set containing just the trivial context) because the
previous program P
0
= ?, so there are no subexpressions
to remove to build contexts out of. The smallest program to
compute
S
istoselectthefirstcharacterofa,andtherefore
P
1
=f(a))CharAt(a;0).
i=1: S
1
= (a = "Amy Smith";RET =
S
). contexts =
f;CharAt(;0);CharAt(a;)g because P
1
has two subex-
pressionsthat can be removed. exprs nowalso containsthe
expressionCharAt(a;0).Thesimplestprogramconsistentwith
bothexamplesselectsthe secondwordofainsteadof aitself,
soDBSwillgenerateWord(a;1)toselectthesecondwordand
plug it into the second context to generate P
2
= f(a) )
CharAt(Word(a;1);0).
i=2: S
2
= (a = "jane doe";RET =
D
). contexts=
f;CharAt(;0);CharAt(Word(;1);0);CharAt(Word(a;);
0)g.exprs = e[fa;Word(a;1);CharAt(Word(a;1))g.No-
tably,CharAt(a;0)doesnotappearinexprsdespiteitappear-
inginexprsforthei =1stepbecauseitisnotasubexpres-
sionof P
2
.Itisimportant that suchtemporarydiversionsare
forgottensotimeisnotwastedontheminlatersteps.DBSwill
output P
3
= f(a))ToUpper(CharAt(Word(a, 1), 0))
whichtakesonly asingle stepbecauseitistheapplication of
ToUppertoP
2
whichappearsinexprs.
We nowdiscuss a fewdetailsof the algorithm toclarify the
descriptionandjustifysomedesignchoices.
RelationbetweenTDSandDBS DBSisdescribedlaterin§5.We
separateTDSfrom DBSboth toshowexplicitly howthe previous
programP
i
isusedwhenconstructingthenextprogramP
i+1
and
tohighlightthetwokeynovelideasinourapproach:
1. TDS encapsulatesthe newidea of treating the examples asa
sequenceand usingthat fact toiterativelybuildupP byway
of a seriesof programswhich are correct fora subset ofthe
inputspacedefinedbyaprefixoftheexamples.
2. DBSencapsulatestheparameterizationbyaDSLwhichallows
fortheflexibilityofthealgorithm.
TDS runs DBS repeatedly, each time giving it the next exam-
ple from S along with expressions and contexts from the pro-
gram synthesizedinthe previousiteration. Inother words, it it-
eratively synthesizesP
k
for each k  jSj where P
k
issynthe-
sized using S
0
;S
1
;:::;S
k 1
and the previous program P
k 1
.
Inthisformulation,P
0
isthe empty program ?,or throw new
NotImplementedException();inC#.
State Asdescribed,the onlystate keptbetweeniterationsisthe
program P
i
and the failure count. DBS doesnot maintain state,
andcontexts andexprs depend onlyonP
i
.Additionally, DBS
is passed all of the examples up to S
i
,not just S
i
.One could
imaginea moregeneral problemdefinitionwhere arbitrary(orat
least more) state couldbe kept between invocationsof DBS,but
in our experience this tended to be more harmful than helpful:
preservingstateessentiallycorrespondstonotforgettingabout
failedattempts.
Nolookahead Althoughwehaveformulatedtheproblemasgiv-
ingTDSthesequenceoftests,noticeitdoesnotlookbeyondtest
S
i
togenerateP
i+1
.Henceinaninteractivesettingtheusercould
lookatP
i+1
oritsoutputwhenchoosingS
i+1
.
4.2 Contextsandsubexpressions
Theintuitionforthestrategyofreplacingsubexpressionsisthat
theprogramgeneratedsofarisdoingthecorrectcomputation
forsomesubsetoftheinputspaceandisoverspecializedtothat
subset.In the example above, after the i = 0 step, we had the
programthatreturnsthefirstcharacterofthestringinsteadofthe
firstcharacterofthesecondwordofthestring.Thatprogramwas
overspecializedtoinputswherethefirstandsecondwordstartwith
sameletter.Selectingthefirstletterwastherightcomputationbut
onthewronginput,sofillinginthecontextCharAt(;0)withthe
rightinputgavethedesiredprogram.
Eachcontext representsahypothesisaboutwhichpart of
theprogramiscorrectandcorrespondinglythattheexpression
removedisoverspecialized.Notethattheexpressionappearsin
thesetofcomponents,soifasmallchangeissufficient,theeffort
tobuilditinpreviousiterationswillnotbewasted.Also,onesuch
hypothesisisalwaysthatthe entireprogramiswrongandshould
bereplacedentirely.
Contextsaremadeoutofeachbranchaswellastheentirepro-
graminordertobettersupportbuildingnewconditionalstructures
(§5.2)usingpartsofoneormoreoftheexistingbranches.
Thistheorydoesnotlimitcontextstoasinglehole,but,empiri-
cally,doingsokeepsthenumberofcontextsmanageableandseems
tobesufficientinpractice.Also,itallowsthe algorithmtoprune
awaylocationsbasedonwhethertheyarereachedwhenexecutinga
failingexample:modificationselsewherecouldnotpossiblyaffect
whethersuchexamplesarehandledcorrectly.Ifweallowedmulti-
plemodifications,thechoiceofmodificationpointswouldhaveto
bechangedafteranymodificationaffectingcontrolflow.
4.3 Exampleorder
[22]observesthatinTDDthetestcaseordercanaffecttheabilityof
theprogrammertoproduceaprogramthroughsmallcodechanges.
Similarly, ouralgorithm may fail to synthesize a program ifnot
givenexamplesinagoodorder—afterall,oneofourkeyinsights
isthattheorderingofexamplesisausefulinputtothesynthesizer.
Asa“good”orderisdefinedasbeingonethatresultsinsynthe-
sizingaprogramsatisfyingthespecificationtheuserhasinmind,
itisunclearhowtodefinea“good”orderwithoutreferencingthe
finalsynthesizedprogram.Needlesstosay,suchadefinitioncannot
bedirectlyusedtoguidethegenerationofasequenceofexamples.
Thisisunsatisfying,butweprovide someintuitiononwhat such
orderslooklikeand§6.2givesevidencethatoursynthesizerisro-
How to C#: Set Image Thumbnail in C#.NET
VB.NET How-to, VB.NET PDF, VB.NET Word following steps below, you can create an image viewer WinForm Open or create a new WinForms application, add necessary dll
add image in pdf using java; how to add image to pdf in preview
C# PDF remove image library: remove, delete images from PDF in C#.
Generally speaking, using well-designed APIs, C# developers can do following things. Remove Image from PDF Page Using C#. Add necessary references:
adding images to pdf forms; add multiple jpg to pdf
busttosmallvariationsintheorderofexamples.Wethusremark
thata humancouldlearntoproduce suchanorderingjustlike a
humancanlearntoproduceTDDtestcasesinanorderthateasily
resultsinacorrectprogram.Fundamentally,thisisanalogoustothe
issueofhowahumanshouldgenerateaconcisebutsufficientsetof
black-boxtestsforaprogram.Manyguidelinesexist,butthereisno
precisemethodology.Nonetheless,black-boxtestingissuccessful.
Once the user has covered the entire specification they had
in mind, they have produced a suite ofsimple test cases forthe
algorithmtheyaresynthesizing.AsinTDD,toconfirm thatthey
haveinfactsynthesizedthecorrectprocedure,theusershouldwrite
afewlarger,morecomprehensivetestsoftheprocedure.
5. DSL-BasedSynthesis
The D
SL-B
asedS
ynthesisalgorithm(DBSinAlgorithm 2)isthe
part ofTDS that actually generates new programs. DBS takes as
input a set of examplesS,a set of contextsC whichgenerated
DSL expressionsare pluggedintoto formsynthesized programs
matching L, a set of expressions e, a DSL definition L, and a
maximum numberof branchesm. It outputsa new program P
0
thatsatisfiesallexamplesinSorTIMEOUTifitisunabletodoso.
Thealgorithmisbuiltonfivekeyconcepts:
1. Newprogramsarenot generated directly; insteadexpressions
aregeneratedandpluggedintocontextsprovidedashintstonar-
rowthesearchspace.ThisisusedbyTDStoindirectlyprovide
thepreviousprogramasahint(§4.2).
2. Newexpressionsare formedfrom all compositionsofexpres-
sionsaccordingtotheDSL L.Toproduceall smallerexpres-
sionsbeforegeneratinglargerones,DBSrunsasaseriesofiter-
ations,where,ineachiteration,onlyexpressionsfromprevious
iterationsare composed into newexpressions. §5.1 discusses
generationofnewexpressions(andimportantoptimizations).
3. Anewbranchingstructurewillbesynthesizedifnogenerated
programsatisfiesallexamplesinSandm>1.Onlyprograms
containingatmostmbrancheswillbe synthesizedinorderto
avoidover-specializingtotheexamples.Synthesisofcondition-
alsisdiscussedindetailin§5.2.
4. Ifthealgorithmtimesoutbeforeasolutionisfound,itwillre-
turn a specialfailure value TIMEOUT.In TDS,thiscase incre-
mentsmallowingformorebranchesinthenextrunofDBS.
5. Thesearchspacecanbereducedevenfurtherusingspecialized
strategiesforsomefunctions.Wedemonstratethisbydescribing
strategieswedefinedforacouplecommonloopformsin§5.3.
5.1 Choosingnewexpressions
Newexpressionstouseinthecontextsaregeneratedbycomponent-
basedsynthesis[14].Incomponent-basedsynthesis,asetofcom-
ponents(expressionsand methods) are provided asinput andit-
eratively combinedtoproduce expressionsinorder ofincreasing
sizeuntilanexpressionisgeneratedthatmatchesthespecification.
In our case, the “specification” is the examples. As opposed to
previouscomponent-basedsynthesiswork,thegenerationofnew
expressionsisguidedbya DSL L andinsteadof testingthe ex-
pressionsagainstthespecification,theyareusedtofillincontexts
producinglargerprogramswhicharethentested.
In our system, all components are expressions marked with
which non-terminal in the grammar defined them. Methods are
represented as curried functions. The synthesizer generates new
expressionsbytaking onecurried function and applying it to an
expressionmarkedwiththecorrectnon-terminal.Eachiterationof
thesynthesizerdoessoforeveryvalidcombinationofpreviously
generatedexpressionsinordertogenerateprogramsofincreasing
size.Representingmethodsasanonymousfunctionsalsosimplifies
Algorithm2:DBS(C,S,e,L,m)
input :setofcontextsC,setofexamplesS,setof
expressionsetobuildnewexpressionsfrom,DSL
specificationL,maximumnumberofbranchesm
output:aprogramP
0
thatsatisfiesSorTIMEOUT
/*
generates one or more programs and if
one satisfies S, DBS returns it.
*/
1
Tryloopstrategiesinaseparatethread(§5.3);
2
allExprs e;
3
whilenottimedoutdo
4
foreachc Cdo
5
foreachexpr allExprsdo
6
Tryc(expr);
7
Tryconditionalsolutionsuptombranches(§5.2);
8
allExprs generatenewexpressions(§5.1);
9
returnTIMEOUT;
handling methods that themselves take functions as arguments,
whicharecommoninhigher-orderfunctionslikemapandfold.
As the numberof components generated after k iterations is
exponentialwiththebasebeingthenumberofgrammarrules(i.e.,
functionsandconstantsintheDSL)intheworstcase,aDSLthatis
toolargewillcauseDBStorunoutoftimeormemorybeforefinding
asolution.Inpractice,around40–50grammarrulesseemstobethe
limitforDBS,butitdependsgreatlyonthestructureoftheDSL.An
earlierversion ofDBS withouttheoptimizationsdescribed below
couldnothandlemorethanaround20–30grammarrules.Further
optimizationstobetterprunethesearchspacecouldpossiblyallow
forevenlargerDSLs.
Optimizations
Minimizingthenumberofgeneratedexpressionsisimportant for
performance.Redundantexpressionsareeliminatedintwoways:
the first is syntactic andhence it is fastandalwaysvalid, while
thesecondissemanticandvalidonlywhenanexpressiondoesnot
takeonmultiplevaluesinasingleexecution(e.g.,iftheprogramis
recursive).
Syntactic Allexpressionsconstructedarerewrittenintocanonical
formsaccordingtothe
rulesintheDSLandduplicates
are discarded.For example, x+y and y+x are writtendifferently
but canberewrittenintothesameformsoonewillbediscarded.
DBS will only accept sets of
rules which are acyclic
(oncecommutativityandothereasilybrokencyclesareremoved)
toensurethereisacanonicalform.Relatedtothis,constantfolding
isappliedwherepossible,so,forexample,2*5and5+5wouldboth
beconstantfoldedto10,furtherreducingthesearchspace.
Semantic Thevastmajorityofthetime,anexpressiontakeson
onlyasinglevalueforeachexampleinput.Inotherwords,theex-
pressionis equivalent toa lookuptable from the example being
executedtoitsvalueonthatexample.Onlyexpressionswithdis-
tinctvaluesareinteresting,so,forexamplex*xand2+xwouldbe
considered identicalifthe onlyexample inputswere x = 2and
x=-1.Thisissimilartotheredundantexpressioneliminationin
versionspace algebras[18]. Theexceptionsareifthe expression
ispartofarecursiveprogramorlambdaexpression,inwhichcase
thisoptimizationisnotused.
5.2 Conditionals
So farwehave notconsideredsynthesizing programscontaining
conditionals,whichareofcoursenecessaryformostprograms.We
considerfirstsynthesizingprogramswherea singlecascadingse-
quenceof if...else if...elseexpressionsoccuratthe top-
levelofthefunctionbody,witheachbranchnotcontainingcondi-
tionals.Thenthegoalistohaveasfewbranchesintheonetop-level
conditional aspossible.Theproblemistopartitionthe examples
intowhich-branch-handles-themtoachievethisgoal.
Foreveryprogramp DBS tries,thesetofexamplesithandles
correctlyisrecordedandcalledT(p).IfT(p)=S (allexamples
handled),p isacorrect solution andcanbe returned. Otherwise,
eachsetofprogramsQ(wherejQjm)whoseunionofhandled
examples
S
p2Q
T(p)equalsS isacandidate forasolutionwith
appropriateconditionals.Tobeasolution,Qalsoneedsguardsthat
leadexamplestoabranchthatisvalidforthem; tosimplifythis,
wheneverabooleanexpressiongisgenerated,thesetofexamples
itreturnstruefor,B(g),isrecorded.ThesetsQareconsideredin
orderofincreasingsize,soiftherearemultiplesolutions,theone
withthefewestbrancheswillbechosen.
Iftheconditionaldoesnotappearatthetop-level,thenitmust
appearastheargumenttosomefunction.Tohandlethiscase,we
notethatifeverybranchoftheconditionalgeneratedasdescribed
already happensto contain a call to a function f with different
arguments,thenitcouldberewrittensuchthatthecalltof occurs
outsideoftheconditionalifthatisallowedbytheDSL.Inthatcase,
wecansaythatallofthebranchesmatchthecontextf().
In the algorithm, for each non-terminal the DSL allows for
conditionalsat, eachprogram pisput into zero ormore buckets
labeledwiththecontextthatnon-terminalappearsin.Forexample,
iftheargumentoffmaybeaconditionalandp=f(f(x))thenp
wouldbeputinthebucketsforf()andf(f()).Thenthesame
algorithm asabove isrun for eachbucket with the conditionals
being rewritten to appear inside the context. Inserting multiple
conditionalsjustinvolvesfollowingthislogicmultipletimes.
5.3 Loops
TheprimarywayDBS handlesloopsistosimplynotdoanything
specialatall:recursionandcallinghigher-orderfunctionslikemap
and fold are handled by the algorithm asdescribed so far. As
describedin§3.2,ageneralWhileLoophigher-orderfunctioncan
beusedtoexpressarbitraryloopsthatDBSmaysynthesizelikeany
otherDSL-definedfunction.Onthe otherhand,theuse ofloops
incodeoftencorrespondstopatternsintheinput/outputexamples.
Thissectiondiscussestwosuchcommonpatternswehavewritten
strategiesfor;expertsdesigningDSLsmayadditionallydefinetheir
ownstrategiesforotherformsofloops.
ThesecanbeusedinaDSLviathe
FOREACH(
)or
FOR(
)
ruleswhereEisthenon-terminalforthebodyoftheloop.
Foreach The“foreach”loopstrategy’shypothesisisthatthereis
a1-to-1correspondencebetweenaninputarrayandanoutputarray.
Assumingthathypothesis,theexamplescanbesplitintooneexam-
pleforeachelementwhereiistheindex,currentistheelement
atthatindex,andaccisthearrayofoutputsforpreviousindexes:
(in =f3;5;4g;RET =f9;25;16g)wouldbecometheexamples
(in = f3;5;4g;i = 0;current = 3; acc = fg;RET = 9),
(in = f3;5;4g;i = 1;current = 5;acc = f3g;RET = 25),
and(in=f3;5;4g;i=2;current=4;acc=f9;25g;RET=
16). Those examplescouldbeusedtosynthesize theloopbody
current*current using TDS. The strategy includes the boiler-
plate code to take the loop bodycurrent*current andoutput
aforeachloopovertheinputarray.
Thatexampleisoverlysimpleassuchacomputationcouldeas-
ilybecapturedbyamap.However,loopstrategiesalsoallowfor
loopsthatarenotaseasilyexpressedwithhigher-orderfunctions.
Forexample,theloopbodiesexamplescouldalsoincludetheval-
uescomputedsofar:(in=f5;2;3g;RET=f5;7;10g)wouldbe-
come(in=f5;2;3g;i=0;current=5;acc=fg;RET=5),
(in=f5;2;3g;i=1;current=2;acc=f5g;RET=7),and
(in=f5;2;3g;i=2;current=3;acc=f5;7g;RET=10).
Then the synthesized loop body would be acc.Length > 0 ?
current + acc.Last() : current,whichcouldbe rewritten
intoaloopcomputingthecumulativesumofin.
For Patternsmay also showup across examples.Forinstance,
giventheexamples(in=0;RET=0);(in=1;RET=1);(in=
2;RET = 3);(in = 3;RET = 6)we cansee,lookingacrossex-
amples,that for eachinput value the result shouldbe the result
forthe previousinput value plus the newinput (which isanin-
direct way of saying “sum the numbersup to in”).In terms of
loopstrategies,thehypothesisisthatpairsofexampleswherethe
input in differby one correspond to adjacent loop iterations so
bycombiningthosepairswecangetexamplesfortheloopbody
where i is current value of the loop iterator and acc is the re-
turnvalueof the i 1iteration: (i = 1;acc = 0;RET = 1),
(i = 2;acc = 1;RET = 3),and(i = 3;acc = 3;RET = 6).
ThenTDSwillgivei + accfortheloopbody.Theloopstrategy
willidentifythat(in=0;RET=0)indicatesthattheloopiterator
shouldstartat0andtheaccumulatorshouldstartat0andproducea
forloopfor(int i = 1; i <= in; i++) acc = i + acc;.
Otherstrategies Different loop strategiescangive different in-
formationlikeincludingtheindexinaforeachorgivingacccor-
respondingtogoinginreverseorder.Furthermore,theconcept of
splitinguparraysbyelementtofindpatternscanalsobeappliedto
splittingstrings(bylengthordelimiters),XMLnodes,orwhatever
otherstructureddatamaybeinthetargetdomain.
5.4 Conditionalsandloops,ageneraltheory
DSLdefinitionscontainruleswithandwithoutspecializedstrate-
gies.Mostrules,includingallDSL-definedfunctions,donothave
specializedstrategiessoexpressionsusingthoserulesarebuiltus-
ingthedefaultstrategyofsearchingthroughthesemanticallydis-
tinctexpressions(usingtheexampleinputstodecidewhichexpres-
sionsare distinguishable).Onotherotherhand, we havedefined
strategiesforconditionalsandloopsthatusetheexampleoutputs
aswellastheinputstopowermoredirectedapproachestolearning
thoseconstructs.Whilewereferencedtheoutputvaluesdirectlyin
theexplanationofthestrategiesforloops,thediscussionofcondi-
tionalsonlyreferencedthemindirectlybykeepingtrackofwhich
examplesaprogramwascorrectfor.
Thestrategiesforconditionalsandloopsshouldbeconsidered
asjust different instancesofthesameconcept. Whilecondition-
alsmerelyselectasubsetofthe examplesforeachbranch,loops
dolarger rewritesof the examplesusedinthe recursive callsto
thesynthesizer.Theoretically,aDSLdesignercouldincludeother
strategieslikeinversesofDSL-definedfunctionsorapolynomial
solverfor synthesizing arithmetic. We presentlyomit suchfunc-
tionalitybecausewe believeit placesundue burdenon the DSL
designerbutintendtoinvestigateitinfuturework.
6. Evaluation
OurevaluationdemonstratesthatTDSissufficientlypowerfuland
generaltosynthesizenon-trivial—albeitsmall—programsinmul-
tiple domainsfrom small sequences of real world examples ob-
tainedfromhelpforums.WealsocompareTDStotoSketch[30],
thepresentstate-of-the-artindomain-agnosticprogramsynthesis,
andtostate-of-the-artspecializedsynthesizerswhereapplicable.
Furthermore,weexploredhowsensitiveouriterativesynthesis
techniqueactuallyistothepreciseorderingofexamples,showing
that exampleorder issignificant tospeedor abilitytosynthesize
ona non-trivial proportion oftheexamples, especiallyon larger
programs.Wealsovalidatedsomeofourdesigndecisionsbyse-
lectivelydisablingpartsofouralgorithm.
AllexperimentswererunonamachinewithaquadcoreIntel
XeonW35202.66GHzprocessorand6GBofRAM.
§6.1describesthebenchmarksusedinourexperimentandour
successinsynthesizingthemandcomparesTDStopriorspecialized
synthesizerswhereapplicable.§6.2investigatestheeffectofexam-
ple ordering to our synthesizer’sperformance. §6.3breaksdown
theusefulnessofthedifferentpartsofouralgorithm.§6.4evaluates
performanceforoursynthesizerandvalidatesourtimeoutchoice.
6.1 Benchmarks
We evaluateourtechnique infourdomains:§6.1.1comparesour
techniquetoastate-of-the-artspecializedprogrammingbyexample
systemforstringtransformations,§6.1.2comparesourtechnique
to astart-of-the-art specializedprogramming by example system
fortabletransformations,§6.1.3discussesusingoursystemforthe
noveldomainofXMLtransformations,whilefinally§6.1.4shows
ourtechniqueisabletoautomatecodinginTDDforintroductory
programmingproblems.
Forthefirstthree,theexamplesequencesusedandoutputofour
synthesizer can be found at https://homes.cs.washington.
edu/
~
perelman/publications/pldi14-tds.zip;thelastsec-
tion’sprogramsarenotpublic.
6.1.1 Stringtransformations
Stringtransformationprogramstake asinputoneormorestrings
andoutputastringconstructedfromtheinput usingmainlysub-
stringandconcatenationoperations.
Stringsareanaturalformatforinput/outputexamplesthatoften
appearin real-world end-userprogrammingtasksasrecentwork
byGulwanietal.[6]hasshown:theirworkbecametheFlashFill
featureinExcel2013[1].UnlikeFlashFill,TDSdoesnotusecareful
reasoningaboutthedomainofstrings,butitisstillabletoquickly
synthesize many of the same examples as well assome similar
programsthatthepriorwork(i.e.,FlashFill)cannotsynthesize.
Benchmarks Tocompareagainst FlashFill, we firstdefinedex-
actly the FlashFill DSL in our DSL definition language and ran
oursynthesizerontheexampleswhichappearin[6]toconfirmwe
couldsynthesizethem.Thenwemadeafewmodificationstothe
DSL whichare showninFig. 6tomakeit more general;specif-
ically,we allowed nested substringoperations,substring indexes
dependentontheloopvariable,andcallstootherLaSyfunctions.
InadditiontoWordWrapandtheexamplesfrom[6]wewrote
testcasesequencesfor7simplerealworldstringmanipulationex-
amplesoutside ofthe scope of FlashFill but handled byour ex-
tendedDSLincludingselectingthetwodigityearfromadate(re-
quiresnestedsubstrings),reversingastring(requiressubstringin-
dexesdependentontheloopvariable),andbibliographyexamples
liketheoneinFig.2(requiresauser-definedlookup).
Results Eachofthe15examplesequencescontains1–8examples
exceptforwordwrapwhichuses24examples.5oftheexamples
canbe synthesizedinunderasecond.6take morethan1second
butunder5seconds,whiletheother4finishinunder25seconds.
FlashFillsynthesizesalloftheexamplesitcanhandleinwellunder
asecond.SimplybyspecifyingtheDSL,ourdomain-agnosticsyn-
thesistechniquenearstheperformanceofastate-of-the-artspecial-
izedsynthesistechniquewhilemaintainingtheabilitytogenerate
morecomplicatedcontrolflowstructures.
WecodedallexamplesusingthecorrespondingDSLsinSketch
andnoneofthemcompletedwithin10minutes.
6.1.2 Tabletransformations
Tabletransformationsconvertspreadsheettablesbetweendifferent
formatsbyrearrangingandcopyingatable’scells.
Benchmarks [11]givesaDSLandsynthesisalgorithmforthese
transformationsalongwithacollectionofbenchmarkstheauthors
foundononlinehelpforums.WedefinedtheirDSLforoursynthe-
sizerandranitontheirbenchmarks.
Foradditionalbenchmarksnothandledby[11]weaddedmore
predicates to the grammar to allow it to handle a wider range
ofrealworldnormalizationscenarios.Forexample,ourextended
grammarcansupportconvertingvariousnon-standardspreadsheets
withsubheadersintonormalizedrelationaltables.
Results Eachofthe8benchmarksuses1–6examples.TDSsyn-
thesizesmostoftheminunder10seconds;2take30secondsand
onetakesafullminute.
[11]saysSketchwasunabletosynthesizetheirbenchmarksso
wedidnotattempttorunthebenchmarksusingSketch.
6.1.3 XMLtransformations
XMLtransformationsinvolvesomecombinationofmodifyingan
XMLtreestructureandtagattributesandstringtransformationson
thecontentoftheXMLnodes.
Benchmarks Weselected10differentrealworldexamplesfrom
online help forums and constructed a DSL able to express the
operationsnecessarytosynthesizeprogramsforallofthem.Two
oftheexamplesappearin§2.2.
Onetransformationinvolvedputtinga tagaroundeveryword,
evenwhenwordshadtagswithinthem,whichwaseasiesttoex-
presstreating the words as strings instead of as XML. Making
thestringandXMLDSLsworktogetherrequiredsimplyputting
the functionsto convert betweenthe two in the DSL.Thiskind
ofcross-domaincomputationshows the strength ofour domain-
agnosticapproach.
Results Most of the benchmarks used a single example while
the restusedno more than 3.TDS synthesizesall buttwo ofthe
benchmarksinunder10secondsandthe remainingtwoinunder
20seconds.
We also implemented the DSL and benchmarks in Sketch,
whichwasunabletosynthesizeanyofthemwithin10minutes.
6.1.4 Pex4Funprogramminggame
Motivation Althoughwebelieveourend-userprogrammingsce-
nariosarecompelling,wewantedtotestoursynthesizerinasce-
narioclosertotheTDDprogrammingstyleitwasinspiredbyand
getasourceforsomemorechallengingfunctionstosynthesize.To
thatend,wehadoursynthesizerplaythePex4Fun[32]program-
minggamewhereaplayerischallengedtoimplementanunknown
functiongivenonlya fewexamples.Eachtimetheplayerthinks
theyhaveasolution,thePex[31]testgenerationtoolusesdynamic
symbolicexecutiontocomparetheplayer’scodetoasecretrefer-
ence solution andgeneratesadistinguishinginputif the player’s
code does not match the specification.Pex provides the test for
the test stepoftheTDDwhile the playerisperforming the pro-
grammingstepwithoutfullknowledgeofthespecificationtheyare
codingfor.Thisisamore“pure”formofTDDastheplayerisnot
biasedintheircodingbyknowingthespecification,givingacloser
paralleltooursynthesizerwhichdoesnothaveknowledgeofthe
specificationofthefunctionitissynthesizing.
Experiment description Weuse a single DSL with aset of 40
simplestringandintfunctionswhichmaybecombinedinany
type-safewaytoshowthatwhileacarefullyconstructedDSLcan
begiventooursynthesizertomakeitperformespeciallywellina
givendomain,it can still successfullysynthesizeprogramsusing
aless specialized DSL capable of describing a wider range of
programs.Note that ourDSL waswrittenwithoutlookingat the
1
10
100
0
0.2
0.4
0.6
0.8
1
Normalized execution time
Normalized inversions count
Figure 7. Norm.time(1=opt)formanyreorderingsofexamples
(dist.measuredininversions,norma.so1=reverse);thelineshows
thegeo.meanofthedatapointsforeachnorm.inversioncount.
0
0.2
0.4
0.6
0.8
1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Proportion failed
Normalized inversions count
0
4
2
12
9
14
13
4
2
2
Figure 8. Prop. ofreorderingsTDS failedonbynorm.inversion
count.Numbersabovebarsareabsolute#offailedreorderings.
Pex4Funpuzzlesandthereforeendedupmissingsomefunctions
necessaryforsomepuzzleslikebitwiseoperations.
ForeachpuzzleinthePex4Fundata,wehadouralgorithmplay
Pex4Funforamaximumof7iterations,afterwhichthesynthesizer
wasconsideredtohavefailedifitstillhadnotproducedasolution.
Thismayseemliketoofewiterations,butit ismorecallstoPex
thanmostusersusedforanyofthepuzzles.
Forsomeofthepuzzles,usingPextogeneratetestcasesfailed
togenerate asolutiondespitemanualinspectiondeterminingthat
thepuzzleswerewellwithinthecapabilitiesofthesynthesizerand
theDSLused.Insuchcases,asequenceoftestcaseswasgenerated
manuallytosynthesizesolutionstothosepuzzles.
Benchmarks The puzzles our synthesizer could synthesize in-
cluded,amongmanyothers,factorial,swappingelementsofanar-
ray,andsummingnumbersinastringdelimitedbydelimiterspec-
ifiedonthefirstlineofthestringandsomemoretrivialexamples
likeconcatenatingthefirstandlastelementofastringarray.
The remainingunsynthesizedpuzzleseither involvedlooping
structuresnot covered by our strategies (e.g., count the number
ofstepsofthe3n+1 problem
2
neededtoreach1),components
notinourcomponentset (e.g.computebitwise or),orarithmetic
expressionstoolargetoconstructusingcomponent-basedsynthesis
(e.g.computethevalueofaspecificcubicpolynomial).
Results We ran our experiment across 172 randomly selected
puzzlesfromPex4Fun.The synthesizerfoundsolutionsfor72of
those.For60ofthose,thetestcasesgeneratedbyPexweresuffi-
cient,butforanother12thetestcaseshadtobewrittenmanually.
6.2 Exampleordering
WehypothesizedthatexampleorderingisusefulandTDSisrobust
tosmallvariationsinexampleorder.Forthe vast majorityofour
benchmarks,the number ofexamples issmall and their order is
unimportant;theuseofcontextsandsubexpressionsfromthepre-
viousprogramisimportant(whennotsynthesizingfromasingle
2
https://en.wikipedia.org/wiki/Collatz_conjecture
0
10
20
30
40
50
F
la
s h
F
ill
T
a b
le
s
X
M
L
P
e
x 4 F
u
n
 M
a n
u a
l
P
e x 4
F
u
n  P
e x
# synthesized
Benchmark set
full
no contexts from prev.
no subexprs from prev.
nothing from prev.
no DSL
Figure9. #synthesizedbybenchmarksetandfeaturesenabledin
algorithm.“full”isthefullalgorithm,contextsandsubexpressions
frompreviousprogramtogetherformtheinformationTDSuses.
0.4
0.5
0.6
0.7
0.8
0.9
1
0
60
120
180
0.4
0.5
0.6
0.7
0.8
0.9
1
Proportion
Time (seconds)
Figure10. ExecutiontimesofallDBSruns.
example)asisshownin§6.3by disablingthem,butthe specific
orderofthe examplesisnot.Onthe otherhand,the12Pex4Fun
puzzleswhichrequiredmanuallywrittenexamplesequenceswere
difficultenoughforTDSthattheorderingofexampleswasinfact
valuabletothealgorithm.Togiveanideaofhowimportanttheor-
deringactuallywas,weranTDSonrandomlyreorderedcopiesof
thoseexamplesequences.
Fig. 7 shows the timing results for the example sequences
that were successfully synthesized. Each circle is one example
sequence,andthelineshowsgeometricmeanofcircles.
Thex-coordinatemeasureshowfarfromtheoptimalexample
sequenceitwaswhere0istheoptimalsequenceand1isthereverse
ofthe optimal sequence. Specifically, the value isthe number of
inversions
3
betweenthetwosequencesdividedbythe maximum
possiblenumberofinversions(
n(n 1)
2
forasequenceoflengthn).
They-coordinateisthetimeittooktosynthesizethesolutionusing
the random sequence measured in the time it took to synthesize
usingtheoptimalsequence.Notethatthey-axisisalogscale.
Fig. 8 shows for how many of the reorderings the program
wasnot successfully synthesized. The x-axis is the same; each
bar corresponds to a range of normalized inversion counts. The
barheightsare theproportionofsequencesforwhicha program
couldnotbesynthesized,andthenumbersabovethebarsarethe
absolutecounts.Therearemoreexamplestowardthemiddleasthe
sequenceswereselecteduniformlyatrandom,andtherearemore
possiblesequencesinthemiddle.
ThechartsshowtwoimportantpropertiesofTDS.First,itdoes
infactdependonexampleordering:largechangestotheexample
orderingmakeittakemuchlongertofindasolutionorfailtofind
asolution at all. Second, it is robust to small changes in the
exampleordering:fornormalizedinversioncountslessthan0.3,
fewerthanhalfthereorderingsfailedandthosethatsucceededtook
onaveragelessthanthreetimesaslongastheoptimalordering.
Those12examplesshowtheworstcaseforoursynthesizer.For
theother60Pex4FunpuzzleswithtestcasessequencesfromPex,
3
#ofexamplepairsthathavedifferentorderinthetwolists
51ofthemwerealsosuccessfullysynthesizedwiththosetestcases
inreverseorder.Forthe restoftheexamples,nearlyallcouldbe
synthesizedwiththeexamplessequenceinreverseorder,atworst
sloweddown bya factorof 3. Thisindicatesthat the sensitivity
totestcaseorderingisaffectedbyhowcomplicatedtheprogram
beingsynthesizedis.
6.3 Significanceofvariouspartsofalgorithm
In order to evaluate the usefulness of the different parts of our
algorithm,weranourbenchmarkswithpartsofitdisabled.Fig.9
showshowmanyofthebenchmarksweresynthesizedundereach
limitedversionofthealgorithm.
Iterativesynthesis Theiterative synthesisstrategyemployedby
TDSisimplementedbypassingcontextsandsubexpressionsfrom
thepreviousprogramtoDBS.Wedisabledthesetwopiecesofinfor-
mationindividuallyandtogether.ThePex4Funandtabletransfor-
mationbenchmarksweremostaffectedbytheremovaloffeatures
from TDS duetoworkingwith larger programs.Notably,we see
thateither thesubexpressionsorcontextsaloneishelpful,but
whencombinedtheyaresignificantlymorepowerful.
DSL-basedsynthesis WealsodisabledtheuseoftheDSLwhen
generatingcomponentsinDBS,soitinsteadwouldbelimitedonly
bythetypesofexpressions.There are no“no DSL” barsforthe
Pex4FunbenchmarksbecausethePex4FunDSLalreadyonlyused
thetypes,sothatconfigurationisidenticaltothe“full”configura-
tion.Manyoftheotherprogramsweresynthesizedfromjustasin-
gleexample,sotheweakeningofTDSdidnothavealargeeffect,
butthissuccesswasachieveddue tothe powerDBS gained from
theDSLascanbeseenfromthefactthatveryfewoftheend-user
benchmarkscouldbesynthesizedwithoutaccesstotheDSL.
6.4 Performance
Fig.10showsaCDFofallexecutiontimesofalloftheDBSruns
usedinourexperiments.Thischartshowsthat
isquiteeffi-
cientwithamedianrunningtimeofapproximately2seconds
andrunninginunder10secondsaround75%ofthetime.
Timeout Throughouttheexperiments,weuseda3minutetime-
out.OnlyveryrarelyinourexperimentsdidDBSeverrunforany-
wherenear3minuteswithouttimingout.Thereisavisiblebump
around60–70secondsafterwhichthelineisalmostflat,indicating
thatitisveryunlikelythatgivingDBSasmallamountofadditional
timewouldhavemadeanynoticeabledifferenceinourresults.This
isfurtherverifiedbyad-hocexperiencethatwithoutatimeout,DBS
runsforover30minuteswithoutaresultorrunsoutofmemory.
7. RelatedWork
Programmingbyexample Programmingbyexample(PBE)[7],
orinductive programming[16], isa subfieldof program synthe-
siscoveringmanydifferenttechniqueswhereaprogramisincom-
pletelyspecifiedbyexamples(inputspairedwithexplicitoutputsor
aqualityfunctionforresults).Inallofthepriorwork,alloftheex-
amplesaregivenatonce,althoughmuchofitusessomevariantof
geneticprogramming[23]whereprogramsarebuiltanditeratively
mutatedtowardasolutionorCEGIS[30]wherenewprogramsare
constructeduntilasolverfailstoprovideacounterexample,similar
tohowoursynthesizerwasusedwithPexinthePex4Funexperi-
ment.Akeyideainthe recentprior workisreducingthesearch
spaceusingversionspacealgebras[6,8,11,18,28,29],whichwe
avoidinordertomaintaingeneralityastheymustbe constructed
foragivendomain.
TDSismostsimilartopriorworkoncomponent-basedsynthesis
andgeneticprogramming.
Component-based synthesis Component-based synthesis is a
family oftechniquesthat perform program synthesisby starting
withasetof“components”—thatis,expressionsandfunctions—
andconsideringactualprogramsconstructedfromcombiningthese
components (as opposed to the version space algebra approach
wheretheactualprogramismerelyanartifactthatcanberecov-
eredafterthemainsynthesisalgorithmiscomplete).Component-
based synthesis has been successfully applied to a wide variety
ofdomainsincludingbit-vectoralgorithms[9],stringtransforma-
tions [24], peephole optimizations [5], type convertors [21, 27],
deobfuscation[13],andgeometryconstructions[10].
The actual search isperformed in different ways dictated by
the information required be known about the components. For
example, Gulwani et al. [9] used an SMT solver because their
componentsarestandardbitwisearithmeticoperators,sotheyhave
easilyexpressiblelogicalspecifications.Also,thisisthepriorwork
we found with the largest program synthesized by component-
basedsynthesisat16linesofcode,whichtookaroundanhourto
synthesize.Incomparison,ouralgorithmcansynthesizeprograms
ofupto20linesofcodewithin400secondsconsistingofarbitrary
user-definedfunctions.
ThepriorworkmostsimilartoourDBSisbyKatayama [15].
Likeouralgorithm,anyfunctioncanbeacomponentanditmain-
tainsanexplicitlistofthe componentsgeneratedsofaranduses
them whengeneratingcomponentson the nextstep.It findsand
prunesredundantcomponentsbyevaluatingeverycomponentona
subsetoftheexamplesandonlykeepingcomponentsthatevaluate
todifferentvaluesonsomeexample.Thisissimilartothepruning
donebyDBSexceptthatDBSusesallexamplesseensofar.
Genetic programming Genetic programming [23] synthesizes
programs by performing a genetic search which involves using
previouscandidateprogramstoproducenewcandidateprograms.
The primary difference between our work and genetic program-
mingisthatgeneticprogrammingisdirectedbyaqualityfunction
thattellshowwellagivenprogramperformsatthedesiredtaskand
reliesonthatqualityfunctionbeingwell-behavedwhileoursearch
isonlygivenabooleanfailureorsuccessoneachtestcase.
ADATE [25] takes as input a set of test inputs and quality
functionsfor theiroutputsand performs a genetic searchwhere
programsaremutatedandonlyprogramsthatimproveormaintain
thesumofthequalityfunctionvaluesarekept.Thisformulation
meansthatADATEhastochoosewhichtestcasetoimprovenext,
unlikeinoursystemwherethefirstktest casesmustpassbefore
thesynthesizerconsiderstestcasek+1.
Template-based synthesis Program sketching [30] is a form of
program synthesiswhere the programmer writesa sketch, i.e.,a
partialprogramwithholes,andprovidesaspecificationthesolution
mustsatisfy.LaSycanbeseenasalessprecisesketchwithaDSL
insteadofpartialbodies:infact,weportedourLaSyprogramsto
Sketchinordertodoacomparisonforourevaluation.However,the
searchstrategyisverydifferent:oursynthesizerfillsintheholes
usingcomponent-basedsynthesis(asopposedtousingSAT/SMT
solvers) and checks against input/output examples (instead of a
morecompletespecification).
Syntax-guidedsynthesis OurDSL-basedapproachissimilarto
thesyntax-guidedsynthesisideaproposedin[2],butismoreflexi-
blebyallowingforaDSLthatusesarbitrary.NETfunctions.That
work[2] onlypresentssimple prototypesynthesizerswhich lack
thepowerofTDS.SyGuS’suseofconstraintsinsteadofexamples
makesadirectcomparisonofsynthesizertechnologiesdifficult.
Rosette [33] isanextensiontothe Racket programming lan-
guagewhichallowseasyaccesstoasolverforapplicationsinclud-
ingsynthesisoverDSLsembeddedinRacket.Whiletheprogram-
merexperienceismuchsleekerthanwithSyGuSorLaSy,thesyn-
Documents you may be interested
Documents you may be interested