Re: CAPTCHAs – Understanding CAPTCHA-Solving Services in an
Economic Context
Marti Motoyama, Kirill Levchenko, Chris Kanich, Damon McCoy, Geoffrey M. Voelker and Stefan Savage
University of California, San Diego
{mmotoyam, klevchen, ckanich, dlmccoy, voelker, savage}@cs.ucsd.edu
Abstract
Reverse Turing tests, or CAPTCHAs, have become a ubiquitous defense used to protect open Web resources from being exploited at scale. An effective CAPTCHA resists existing mechanistic software solving, yet can be solved with high probability by a human being. In response, a robust solving ecosystem has emerged, reselling both automated solving technology and real-time human labor to bypass these protections. Thus, CAPTCHAs can increasingly be understood and evaluated in purely economic terms: the market price of a solution vs. the monetizable value of the asset being protected. We examine the market side of this question in depth, analyzing the behavior and dynamics of CAPTCHA-solving service providers, their price performance, and the underlying labor markets driving this economy.
1 Introduction
Questions of Internet security frequently reflect underlying economic forces that create both opportunities and incentives for exploitation. For example, much of today’s Internet economy revolves around advertising revenue, and consequently, a vast array of services (including e-mail, social networking, and blogging) are now available to new users on a basis that is both free and largely anonymous. The implicit compact underlying this model is that the users of these services are individuals and thus are effectively “paying” for services indirectly through their unique exposure to ad content. Unsurprisingly, attackers have sought to exploit this same freedom and acquire large numbers of resources under singular control, which can in turn be monetized (e.g., via thousands of free Web mail accounts for sourcing spam e-mail messages).
CAPTCHAs were developed as a means to limit the ability of attackers to scale their activities using automated means. In its most common implementation, a CAPTCHA consists of a visual challenge in the form of alphanumeric characters that are distorted in such a way that available computer vision algorithms have difficulty segmenting and recognizing the text. At the same time, humans, with some effort, have the ability to decipher the text and thus respond to the challenge correctly. Today, CAPTCHAs of various kinds are ubiquitously deployed for guarding account registration, comment posting, and so on.
This innovation has, in turn, attached value to the problem of solving CAPTCHAs and created an industrial market. Such commercial CAPTCHA solving comes in two varieties: automated solving and human labor. The first approach defines a technical arms race between those developing solving algorithms and those who develop ever more obfuscated CAPTCHA challenges in response. However, unlike similar arms races that revolve around spam or malware, we will argue that the underlying cost structure favors the defender, and consequently, the conscientious defender has largely won the war.
The second approach has been transformative, since the use of human labor to solve CAPTCHAs effectively side-steps their design point. Moreover, the combination of cheap Internet access and the commodity nature of today’s CAPTCHAs has globalized the solving market; in fact, wholesale cost has dropped rapidly as providers have recruited workers from the lowest cost labor markets. Today, there are many service providers that can solve large numbers of CAPTCHAs via on-demand services with retail prices as low as $1 per thousand.
In either case, we argue that the security of CAPTCHAs can now be considered in an economic light. This property pits the underlying cost of CAPTCHA solving, either in amortized development time for software solvers or piece-meal in the global labor market, against the value of the asset it protects. While the very existence of CAPTCHA-solving services tells us that the value of the associated assets (e.g., an e-mail account) is worth more to some attackers than the cost of solving the CAPTCHA, the overall shape of the market is poorly understood. Absent this understanding, it is difficult to reason about the security value that CAPTCHAs offer us.
Figure 1: Examples of CAPTCHAs from various Internet properties: (a) Aol., (b) mail.ru, (c) phpBB 3.0, (d) Simple Machines Forum, (e) Yahoo!, (f) youku.
This paper investigates this issue in depth and, where possible, on an empirical basis. We document the commercial evolution of automated solving tools (particularly via the successful Xrumer forum spamming package) and how they have been largely eclipsed by the emergence of the human-based CAPTCHA-solving market. To characterize this latter development, our approach is to engage the retail CAPTCHA-solving market on both the supply side and the demand side, as both a client and as “workers for hire.” In addition to these empirical measurements, we also interviewed the owner and operator of a successful CAPTCHA-solving service (Mr. E), who has provided us both validation and insight into the less visible aspects of the underlying business processes.¹ In the course of our analysis, we attempt to address key questions such as which CAPTCHAs are most heavily targeted, the rough solving capacity of the market leaders, the relationship of service quality to price, the impact of market transparency and arbitrage, the demographics of the underlying workforce and the adaptability of service offerings to changes in CAPTCHA content. We believe our findings, or at least our methodology, provide a context for reasoning about the net value provided by CAPTCHAs under existing threats and offer some directions for future development.
¹ By agreement, we do not identify Mr. E or the particular service he runs. While we cannot validate all of his statements, when we tested his service empirically our results for measures such as response time, accuracy, capacity and labor makeup were consistent with his reports, supporting his veracity.
The remainder of this paper is organized as follows: Section 2 reviews CAPTCHA design and provides a qualitative history and overview of the CAPTCHA-solving ecosystem. Next, in Section 3 we empirically characterize two automated solver systems, the popular Xrumer package and a specialized reCaptcha solver. In Sections 4 and 5 we then characterize today’s human-powered CAPTCHA-solving services, first describing our data collection approach and then presenting our experiments to measure key qualities such as response time, accuracy, and capacity. Section 6 describes the demographics of the CAPTCHA-solving labor pool. Finally, we discuss the implications of our results in Section 7 along with potential directions for future research.
2 Background
The term “CAPTCHA” was first introduced in 2000 by von Ahn et al. [21], describing a test that can differentiate humans from computers. Under common definitions [4], the test must be:
 • Easily solved by humans,
 • Easily generated and evaluated, but
 • Not easily solved by computer.
Over the past decade, a number of different techniques for generating CAPTCHAs have been developed, each satisfying the properties described above to varying degrees. The most commonly found CAPTCHAs are visual challenges that require the user to identify alphanumeric characters present in an image obfuscated by some combination of noise and distortion.² Figure 1 shows examples of such visual CAPTCHAs. The basic challenge in designing these obfuscations is to make them easy enough that users are not dissuaded from attempting a solution, yet still too difficult to solve using available computer vision algorithms.
² There exists a range of non-textual and even non-visual CAPTCHAs that have been created but, excepting Microsoft’s Asirra [9], we do not consider them here as they play a small role in the current CAPTCHA-solving ecosystem.
The issue of usability has been studied on a functional level, focusing on differences in expected accuracy and response time [3, 19, 22, 26], but the ultimate effect of CAPTCHA difficulty on legitimate goal-oriented users is not well documented in the literature. That said, Elson et al. provide anecdotal evidence that “even relatively simple challenges can drive away a substantial number of potential customers” [9], suggesting CAPTCHA design reflects a real trade-off between protection and usability.
The second challenge, defeating automation, has received far more attention and has kicked off a competition of sorts between those building ever more sophisticated algorithms for breaking CAPTCHAs and those creating new, more obfuscated CAPTCHAs in response [7, 11, 16, 17, 18, 25]. In the next section we examine this issue in more depth and explain why, for economic reasons, automated solving has been relegated to a niche status in the open market.
Finally, an alternative regime for solving CAPTCHAs is to outsource the problem to human workers. Indeed, this labor-based approach has been commoditized and today a broad range of providers operate to buy and sell CAPTCHA-solving service in bulk. We are by no means the first to identify the growth of this activity. In particular, Danchev provides an excellent overview of several CAPTCHA-solving services in his 2008 blog post “Inside India’s CAPTCHA solving economy” [5]. We are, however, unaware of significant quantitative analysis of the solving ecosystem and its underlying economics. The closest work to our own is the complementary study of Bursztein et al. [3] which also uses active CAPTCHA-solving experiments, but is focused primarily on the issue of CAPTCHA difficulty rather than the underlying business models.
3 Automated Software Solvers
From the standpoint of an adversary, automated solving offers a number of clear advantages, including both near-zero marginal cost and near-infinite capacity. At a high level, automated CAPTCHA solving combines segmentation algorithms, designed to extract individual symbols from a distorted image, with basic optical character recognition (OCR) to identify the text present in CAPTCHAs. However, building such algorithms is complex (by definition, since CAPTCHAs are designed to evade existing vision techniques), and automated CAPTCHA solving often fails to replicate human accuracy. These constraints have in turn influenced the evolution of automated CAPTCHA solving as it transitioned from a mere academic contest to an issue of commercial viability.
3.1 Empirical Case Studies
We explore these issues empirically through two representative examples: Xrumer, a mature forum spamming tool with integrated support for solving a range of CAPTCHAs, and reCaptchaOCR, a modern specialized solver that targets the popular reCaptcha service.
Xrumer
Xrumer [24] is a well-known forum spamming tool, widely described on “blackhat” SEO forums as being one of the most advanced tools for bypassing many different anti-spam mechanisms, including CAPTCHAs. It has been commercially available since 2006 and currently retails for $540, and we purchased a copy from the author at this price for experimentation. While we would have liked to include several other well-known spamming tools (SEnuke, AutoPligg, ScrapeBox, etc.), the costs of these packages range from $97 to $297, which would render this study prohibitively expensive.
Xrumer’s market success in turn led to a surge of spam postings, causing most service providers targeted by Xrumer to update their CAPTCHAs. This development kicked off an “arms race” period in Xrumer’s evolution as the author updated solvers to overcome these obstacles. Version 5.0 of Xrumer was released in October of 2008 with significantly improved support for CAPTCHA solving. We empirically verified that 5.0 was capable of solving the default CAPTCHAs for then-current versions of a number of major message boards, including: Invision Power Board (IPB) version 2.3.0, phpBB version 3.0.2, Simple Machines Forum (SMF) version 1.1.6, and vBulletin version 3.6. These systems responded in kind, and when we installed versions of these packages released shortly after Xrumer 5.0 (in particular, phpBB and vBulletin) we verified that their CAPTCHAs had been modified to defeat Xrumer’s contemporaneous solver. Today, we have found that the only major message forum software whose default CAPTCHA Xrumer can solve is Simple Machines Forum (SMF).
With version 5.0.9 (released August 2009), Xrumer added integration for human-based CAPTCHA-solving services: Anti-Captcha (an alias for Antigate) and CaptchaBot. We take this as an indication that the author of Xrumer found the ongoing investment in CAPTCHA-solving software to be insufficient to support customer requirements.³ That said, Xrumer can be configured to use a hybrid software/human based approach where Xrumer detects instances of CAPTCHAs vulnerable to its automated solvers and uses human-based solvers otherwise. In the current version of Xrumer (5.0.12), the CAPTCHA-related development seems to focus on supporting automatic navigation and CAPTCHA “extraction” (detecting the CAPTCHA and identifying the image file to send to the human-based CAPTCHA-solving service) of more Web sites, as well as evading other anti-spam techniques.
³ The developers of Xrumer have recently been advertising enhanced CAPTCHA-solving functionality in their forthcoming “7.0 Elite” version (including support for reCaptcha), but the release date has been steadily postponed and, as of this writing (June 2010), version 5.0.12 is the latest.
When compared with developers targeting “high-value” CAPTCHAs (e.g., reCaptcha, Microsoft, Yahoo, Google, etc.), Xrumer has mostly targeted “weaker” CAPTCHAs and seems to have a policy of only including highly efficient and accurate software-based solvers. In our tests, all but one included solver required a second or less per CAPTCHA (on a netbook class computer with only a 1.6-GHz Intel Atom CPU) and had an accuracy of 100%. The one more difficult case was the solver for the phpBB version 3 forum software with the GD CAPTCHA generator and foreground noise. In this case, Xrumer had an accuracy of only 35% and required 6–7 seconds per CAPTCHA to execute.
reCaptchaOCR
At the other end of the spectrum, we obtained a specialized solver focused singularly on the popular reCaptcha service. Wilkins developed the solver as a proof of concept [23]. The existence of this OCR-based reCaptcha solver was reported in a blog posting on December 15, 2009 [6]. Although developed to defeat an earlier version of reCaptcha CAPTCHAs (Figure 2a), reCaptchaOCR was also able to defeat the CAPTCHA variant in use at the time of release (Figure 2b). Subsequently, reCaptcha changed their CAPTCHA-generation code again to the version as of this writing (Figure 2c). The tool has not been updated to solve this new variant.
We tested reCaptchaOCR on 100 randomly selected CAPTCHAs of the early 2008 variant and 100 randomly selected CAPTCHAs of the late 2009 variant. We scored the answers returned using the same algorithm that reCaptcha uses by default. reCaptcha images consist of two words, a control word for which the correct solution is known, and the other a word for which the solution is unknown (the service is used to opportunistically implement human-based OCR functionality for difficult words). By default reCaptcha will mark a solution as correct if it is within an edit distance of one of the control word. However, while we know the ground truth for both words in our tests, we do not know which was the control word. Thus, we credited the solver with half a correct solution for each word it solved correctly in the CAPTCHA, reasoning that there was a 50% chance of each word being the control word.
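The half-credit scoring rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not reCaptcha’s implementation or our measurement code: the function names `edit_distance` and `score_solution` are hypothetical, and we assume that “edit distance” means plain Levenshtein distance with case-sensitive comparison.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character of a
                            curr[j - 1] + 1,      # insert a character of b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def score_solution(truth, answer, max_distance=1):
    """Half credit per word whose answer is within max_distance edits.

    `truth` and `answer` are (word1, word2) pairs. Because we assume the
    control word is unknown, each word contributes 0.5 to the score.
    """
    return sum((0.5 for t, a in zip(truth, answer)
                if edit_distance(t, a) <= max_distance), 0.0)
```

For example, an answer that gets one word within a single edit and misses the other badly would receive a score of 0.5 for that CAPTCHA, and the accuracy for a test set is the mean of these per-CAPTCHA scores.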
We observed an accuracy of 30% for the 2008-era test set and 18% for the 2009-era test set using the default setting of 613 iterations,⁴ far lower than the average human accuracy for the same challenges (75–90% in our experiments).
⁴ The solver performs multiple iterations and uses the majority solution to improve its accuracy.
Finally, we measured the overhead of reCaptchaOCR. On a laptop using a 2.13-GHz Intel Core 2 Duo each solution required an average of 105 seconds. By reducing the number of iterations to 75 we could reduce the solving time to 12 seconds per CAPTCHA, which is in line with the response time for a human solver. At this number of iterations, reCaptchaOCR still achieved similar accuracies: 29% for the 2008-era CAPTCHAs and 17% for the 2009-era CAPTCHAs.
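The “majority solution” step mentioned in footnote 4 amounts to a vote over the candidate answers produced by the individual iterations. A minimal sketch follows; the function name `majority_solution` is hypothetical, and the tie-breaking rule (the candidate seen first wins) is our assumption, since we do not know how reCaptchaOCR itself resolves ties.

```python
from collections import Counter


def majority_solution(candidates):
    """Return the most frequent candidate answer.

    Ties are broken in favor of the candidate that appeared first,
    which is how Counter.most_common orders equal counts.
    """
    if not candidates:
        raise ValueError("need at least one candidate solution")
    return Counter(candidates).most_common(1)[0][0]
```

The reported timings are roughly consistent with a constant cost per iteration: 105 seconds over 613 iterations is about 0.17 seconds each, which would put 75 iterations near 13 seconds.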
3.2 Economics
Both of these examples illustrate the inherent challenges in fielding commercial CAPTCHA-solving software.
While the CAPTCHA problem is often portrayed in academia as a technical competition between CAPTCHA designers and computer vision experts, this perspective does not capture the business realities of the CAPTCHA-solving ecosystem. Arms races in computer security (e.g., anti-virus, anti-spam, etc.) traditionally favor the adversary, largely because the attacker’s role is to generate new instances while the defender must recognize them, and the recognition problem is almost always much harder. However, CAPTCHAs reverse these roles since Web sites can be agile in their use of new CAPTCHA types, while attackers own the more challenging recognition problem. Thus, the economics of automated solving are driven by several factors: the cost to develop new solvers, the accuracy of these solvers and the responsiveness of the sites whose CAPTCHAs are attacked.
While it is difficult to precisely quantify the development cost for new solvers, it is clear that highly skilled labor is required and such developers must charge commensurate fees to recoup their time investment. Anecdotally, we contacted one such developer who was offering an automated solving library for the current reCaptcha CAPTCHA. He was charging $6,500 on a non-exclusive basis, and we did not pay to test this solver.
At the same time, as we saw with reCaptchaOCR, it can be particularly difficult to produce automated solvers that can deliver human-comparable accuracy (especially for “high-value” CAPTCHAs). While it seems that accuracy should be a minor factor since the cost of attempting a CAPTCHA is all but “free”, in reality low success rates limit both the utility of a solver and its useful lifetime. In particular, over short time scales, many forums will blacklist an IP address after 5–7 failed attempts. More importantly, should a solver be put into wide use, changes in the gross CAPTCHA success rate over longer periods (e.g., days) are a strong indicator that a software solver is in use, a signature savvy sites use to revise their CAPTCHAs in turn.⁵
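To see why a low success rate limits the utility of a solver even when attempts are “free”, consider a simple model of the blacklisting behavior described above. The model is our own simplifying assumption, not a measured property of any forum: attempts succeed independently with probability equal to the solver’s accuracy, and an IP address is blacklisted once it accumulates a fixed number of failures, with no reset. Under those assumptions the number of successes before the final failure follows a negative binomial distribution, whose mean is computed by the hypothetical helper below.

```python
def expected_successes_before_blacklist(accuracy, max_failures):
    """Mean number of solved CAPTCHAs before an IP is blacklisted.

    Assumes independent attempts and a blacklist triggered by the
    max_failures-th cumulative failure (negative binomial mean).
    """
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    if max_failures < 1:
        raise ValueError("max_failures must be at least 1")
    return max_failures * accuracy / (1.0 - accuracy)


if __name__ == "__main__":
    for accuracy in (0.18, 0.30, 0.90):
        solved = expected_successes_before_blacklist(accuracy, max_failures=5)
        print(f"accuracy {accuracy:.0%}: about {solved:.1f} solves per IP address")
```

With a threshold of five failures, a 30% accurate solver would complete only about two registrations per IP address on average under this model, whereas a 90% accurate solver would complete about 45.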
⁵ We are aware that some well-managed sites already have alternative CAPTCHAs ready for swift deployment in just such a situation.
Figure 2: Examples of CAPTCHAs downloaded directly from reCaptcha at different time periods: (a) early 2008, (b) December 16th 2009, (c) January 24th 2010.
Thus, for a software solver to be profitable, its price must be less than the total value that can be extracted in the useful lifetime before the solver is detected and the CAPTCHA changed. Moreover, for this approach to be attractive, it must also cost less than the alternative: using a human CAPTCHA-solving service. To make this tradeoff concrete, consider the scenario in which a CAPTCHA-solving service provider must choose between commissioning a new software solver (e.g., for a variant of a popular CAPTCHA) or simply outsourcing recognition piecemeal to human laborers. If we suppose that it costs $10,000 to implement a solver for a new CAPTCHA type with a 30% accuracy (like reCaptchaOCR), then it would need to be used over 65 million times (20 million successful) before it was a better strategy than simply hiring labor at $0.5/1,000.⁶ However, the evidence from reCaptcha’s response to reCaptchaOCR suggests that CAPTCHA providers are well able to respond before such amortization is successful. Indeed, in our interview, Mr. E said that he had dabbled with automated solving but that new solvers stopped working too quickly. In his own words, “It is a big waste of time.”
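The break-even figures in this scenario follow from simple arithmetic, sketched below. The function name `break_even` is hypothetical, and the sketch makes two simplifying assumptions that the scenario leaves implicit: running the software solver has no marginal cost, and every solution bought from human labor is correct.

```python
def break_even(dev_cost, accuracy, human_price_per_1000):
    """Return (successful_solves, total_attempts) needed to amortize a solver.

    dev_cost: one-time cost of the software solver, in dollars.
    accuracy: fraction of attempts the solver answers correctly.
    human_price_per_1000: price of human solving per 1,000 CAPTCHAs.
    """
    price_per_solve = human_price_per_1000 / 1000.0
    successful_solves = dev_cost / price_per_solve
    total_attempts = successful_solves / accuracy
    return successful_solves, total_attempts


if __name__ == "__main__":
    solves, attempts = break_even(10_000, 0.30, 0.5)
    print(f"{solves:,.0f} successful solves, {attempts:,.0f} attempts")
```

For the values in the text, this gives 20 million successful solves and roughly 66.7 million attempts, in line with the “over 65 million” figure quoted above.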
For these reasons, software solvers appear to have been relegated to a niche status in the solving ecosystem, focusing on those CAPTCHAs that are static or change slowly in response to pressure. While a technological breakthrough could reverse this state of affairs, for now it appears that human-based solving has come to dominate the commercial market for service.
4 Human Solver Services
Since CAPTCHAs are only intended to obstruct automated solvers, their design point can be entirely sidestepped by outsourcing the task to human labor pools, either opportunistically or on a “for hire” basis. In this section, we review the evolution of this labor market, its basic economics and some of the underlying ethical issues that informed our subsequent measurement study.
4.1 Opportunistic Solving
Opportunistic human solving relies on convincing an individual to solve a CAPTCHA as part of some other unrelated task. For example, an adversary controlling access to a popular Web site might use its visitors to opportunistically solve third-party CAPTCHAs by offering these challenges as its own [18]. A modern variant of this approach has recently been employed by the Koobface botnet, which asks infected users to solve a CAPTCHA (under the guise of a Microsoft system management task) [13]. However, we believe that retention of these unwitting solvers will be difficult due to the high-profile nature and annoyance of such a strategy, and we do not believe that opportunistic solving plays a major role in the market today.
⁶ Moreover, human labor is highly flexible and can be used for the wide variety of CAPTCHAs demanded by customers, while a software solver inevitably is specialized to one particular CAPTCHA type.
4.2 Paid Solving
Our focus is instead on paid labor, which we believe now represents the core of the CAPTCHA-solving ecosystem, and the business model that has emerged around it. Figure 3 illustrates a typical workflow and the business relationships involved.
The premise underlying this approach is that there exists a pool of workers who are willing to interactively solve CAPTCHAs in exchange for less money than the solutions are worth to the client paying for their services. The earliest description we have found for such a relationship is in a Symantec Blog post from September 2006 that documents an advertisement for a full-time CAPTCHA solver [20]. The author estimates that the resulting bids were equivalent to roughly one cent per CAPTCHA solved, or $10/1,000 (solving prices are commonly expressed in units of 1,000 CAPTCHAs solved). Starting from this date, one can find increasing numbers of such advertisements on “work-for-hire” sites such as getafreelancer.com, freelancejobsearch.com, and mistersoft.com. Shortly thereafter, retail CAPTCHA-solving services began to surface to resell such capabilities to a broad range of customers.
Moreover, a fairly standard business model has emerged in which such retailers aggregate the demand for CAPTCHA-solving services via a public Web site and open API. The example in Figure 3 shows the DeCaptcher service performing this role in steps 2 and 6. In addition, these retailers aggregate the supply of CAPTCHA-solving labor by actively recruiting individuals to participate in both public and private Web-based “job sites” that provide online payments for CAPTCHAs solved. PixProfit, a worker aggregator for the DeCaptcher service, performs this role in steps 3–5 in the example.
[Figure 3 diagram: DeCaptcher (Customer Front End) and PixProfit (Worker Back End), with workflow steps numbered 1–7.]
Figure 3: CAPTCHA-solving market workflow: (1) GYC Automator attempts to register a Gmail account and is challenged with a Google CAPTCHA. (2) GYC uses the DeCaptcher plug-in to solve the CAPTCHA at $2/1,000. (3) DeCaptcher queues the CAPTCHA for a worker on the affiliated PixProfit back end. (4) PixProfit selects a worker and pays at $1/1,000. (5) Worker enters a solution to PixProfit, which (6) returns it to the plug-in. (7) GYC then enters the solution for the CAPTCHA to Gmail to register the account.
4.3 Economics
While the market for CAPTCHA-solving services has expanded, the wages of workers solving CAPTCHAs have been declining. A cursory examination of historical advertisements on getafreelancer.com shows that, in 2007, CAPTCHA solving routinely commanded wages as high as $10/1,000, but by mid-2008 a typical offer had sunk to $1.5/1,000, $1/1,000 by mid-2009, and today $0.75/1,000 is common, with some workers earning as little as $0.5/1,000.
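The scale of this decline is easier to see as a fraction of the 2007 figure. The short sketch below uses only the wage points quoted above; the labels and the helper name `decline_from_first` are ours, and because the 2007 value is a high-end wage rather than a typical one, the percentages should be read as an upper bound on the drop in typical wages.

```python
# Wage points quoted in the text, in US dollars per 1,000 CAPTCHAs solved.
WAGES_PER_1000 = [
    ("2007 (high end)", 10.00),
    ("mid-2008 (typical)", 1.50),
    ("mid-2009", 1.00),
    ("today (common)", 0.75),
    ("today (low end)", 0.50),
]


def decline_from_first(wages):
    """Fractional decline of each wage relative to the first entry."""
    base = wages[0][1]
    return [(label, 1.0 - wage / base) for label, wage in wages]


if __name__ == "__main__":
    for label, drop in decline_from_first(WAGES_PER_1000):
        print(f"{label:>20}: {drop:.1%} below the 2007 high")
```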
This downward price pressure reflects the commodity nature of CAPTCHA solving. Since solving is an unskilled activity, it can easily be sourced, via the Internet, from the most advantageous labor market, namely the one with the lowest labor cost. We see anecdotal evidence of precisely this pattern as advertisers switched from pursuing laborers in Eastern Europe to those in Bangladesh, China, India and Vietnam (observations further corroborated by our own experimental results later).
Moreover, competition on the retail side exerts pressure for all such employers to reduce their wages in turn. For example, here is an excerpt from a recent announcement at typethat.biz, the “worker side” of one such CAPTCHA-solving service:
2009-12-14 13:54 Admin post
Hello, as you could see, server was unstable
last days. We can’t get more captchas
because of too high prices in comparison
with other services. To solve this problem,
unfortunately we have to change the rate,
on Tuesday it will be reduced.
Shortly thereafter, typethat.biz reduced their offered rate from $1/1,000 to $0.75/1,000 to stay competitive.
These changes reflect similar decreases on the retail side: the customer cost to have 1,000 CAPTCHAs solved is now commonly $2/1,000 and can be as low as $1/1,000. To protect prices, a number of retailers have tried to tie their services to third-party products with varying degrees of success. For example, GYC Automator is a popular “black hat” bulk account creator for Gmail, Yahoo and Craigslist; Figure 3 shows GYC’s role in the CAPTCHA ecosystem, with the tool scraping a CAPTCHA in step 1 and supplying a CAPTCHA solution in step 7. GYC has a relationship with the CAPTCHA-solving service Image2Type (not to be confused with ImageToType). Similarly, SEnuke is a blog and forum spamming product that has integral support for two “up-market” providers, BypassCaptcha and BeatCaptchas. In both cases, this relationship allows the CAPTCHA-solving services to charge higher rates: roughly $7/1,000 for BypassCaptcha and BeatCaptchas, and over $20/1,000 for Image2Type. It also provides an ongoing revenue source for the software developer. For his service, Mr. E confirms that software partners bring in many customers (indeed, they are the majority revenue source) and that he offers a variety of revenue sharing options to attract such partners.
However, such large price differences encourage arbitrage, and in some cases third-party developers have created plug-ins to allow the use of cheaper services on such packages. Indeed, in the case of GYC Automator, an independent developer built a DeCaptcher plug-in which reduced the solving cost by over an order of magnitude. This development has created an ongoing conflict between the seller of GYC Automator and the distributor of the DeCaptcher plug-in. Other software developers have chosen to forgo large margin revenue sharing in favor of service diversity. For example, modern versions of the Xrumer package can use multiple price-leading services (Antigate and CaptchaBot).
Finally, while it is challenging to measure profitability directly, we have one anecdotal data point. In our discussions with Mr. E, whose service is in the middle of the price spectrum, he indicated that routinely 50% of his revenue is profit, roughly 10% is for servers and bandwidth, and the remainder is split between solving labor and incentives for partners.
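As a worked illustration of this split, the sketch below applies the reported shares to a revenue figure. The $1,000 amount and the function name `revenue_breakdown` are hypothetical and chosen only for the example. The source gives no breakdown of the remainder between solving labor and partner incentives, so the sketch reports it as a single line.

```python
def revenue_breakdown(revenue, profit_share=0.50, infra_share=0.10):
    """Split revenue using the anecdotal shares reported in the text."""
    remainder_share = 1.0 - profit_share - infra_share
    return {
        "profit": revenue * profit_share,
        "servers_and_bandwidth": revenue * infra_share,
        # Labor and partner incentives are not separated in the source.
        "labor_and_partner_incentives": revenue * remainder_share,
    }


if __name__ == "__main__":
    for item, amount in revenue_breakdown(1000.0).items():
        print(f"{item:>30}: ${amount:,.2f}")
```

Under these shares, roughly 40% of revenue is left to cover both the workers and the software partners.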
4.4 Active Measurement Issues
The remainder of our paper focuses on active measurement of such services, both by paying for solutions and by participating in the role of a CAPTCHA-solving laborer. The security community has become increasingly aware of the need to consider the legal and ethical context of its actions, particularly for such active involvement, and we briefly consider each in turn for this project.
In the United States (we restrict our brief discussion to U.S. law since that is where we operate), there are several bodies of law that may impinge on CAPTCHA solving. First, even though the services being protected are themselves “free”, it can be argued that CAPTCHAs are an access control mechanism and thus evading them exceeds the authorization granted by the site owner, in potential violation of the Computer Fraud and Abuse Act (and certainly of their terms of service). While this interpretation is debatable, it is a moot point for our study since we never make use of solved CAPTCHAs and thus never access any of the sites in question. A trickier issue is raised by the Digital Millennium Copyright Act’s anti-circumvention clause. While there are arguments that CAPTCHA solvers provide a real use outside circumvention of copyright controls (e.g., as aids for the visually impaired), it is not clear, especially in light of increasingly common audio CAPTCHA options, that such a defense is sufficient to protect infringers. Indeed, Ticketmaster recently won a default judgment against RMG Technologies (who sold automated software to bypass the Ticketmaster CAPTCHA) using just such an argument [2]. That said, while one could certainly apply the DMCA against those offering a service for CAPTCHA-solving purposes, it seems a stretch to include individual human workers as violators since any such “circumvention” would include innate human visual processes.
Aside from potential legal restrictions, there are also related ethical concerns; one can do harm without such actions being illegal. In considering these questions, we use a consequentialist approach, comparing the consequences of our intervention to an alternate world in which we took no action, and evaluate the outcome for its cost-benefit tradeoff.
On the purchasing side, we have no direct impact since we do not actually use the solutions on their respective sites. We do have an indirect impact, however, since, through purchasing services, we are providing support to both workers and service providers. In weighing this risk, we concluded that the indirect harm of our relatively small investment was outweighed by the benefits that come from better understanding the nature of the threat.
On the solving side, the ethical questions are murkier since we
understand that solutions to such CAPTCHAs will be used to circumvent
the sites they are associated with. To sidestep this concern, we chose
not to solve these CAPTCHAs ourselves. Instead, for each CAPTCHA one of
our worker agents was asked to solve, we proxied the image back into
the same service via the associated retail interface. Since each
CAPTCHA is then solved by the same set of solvers who would have solved
it anyway, we argue that our activities do not impact the gross
outcome. This approach does cause slightly more money to be injected
into the system, but this amount is small.
Finally, we consulted with our human subjects liaison on this work and
we were told that the study did not require approval.
5 Solver Service Quality
In this section we present our analysis of CAPTCHA-solving services
based on actively engaging with a range of services as a client. We
evaluate the customer interface, solution accuracy, response time,
availability, and capacity of the eight retail CAPTCHA-solving services
listed in Table 1.
We chose these services through a combination of Web searching and
reading Web forums focused on “blackhat” search-engine optimization
(SEO). In October of 2009, we selected the eight listed in Table 1
because they were well-advertised and reflected a spectrum of price
offerings at the time. Over the course of our study, two of the
services (CaptchaGateway and CaptchaBypass) ceased operation, we
suspect because of competition from lower-priced vendors.
5.1 Customer Account Creation
For most of these services, account registration is accomplished via a
combination of the Web and e-mail: contact information is provided via
a Web site and subsequent sign-up interactions are conducted largely
via e-mail. However, most services presented some obstacles
Service               $/1K Bulk   Dates (2009–2010)           Requests   Responses
Antigate (AG)         $1.00       Oct 06–Feb 01 (118 days)    28,210     27,726 (98.28%)
BeatCaptchas (BC)     $6.00       Sep 21–Feb 01 (133 days)    28,303     25,708 (90.83%)
BypassCaptcha (BY)    $6.50       Sep 23–Feb 01 (131 days)    28,117     27,729 (98.62%)
CaptchaBot (CB)       $1.00       Oct 06–Feb 01 (118 days)    28,187     22,677 (80.45%)
CaptchaBypass (CP)    $5.00       Sep 23–Dec 23 (91 days)     17,739     15,869 (89.46%)
CaptchaGateway (CG)   $6.60       Oct 21–Nov 03 (13 days)      1,803      1,715 (95.12%)
DeCaptcher (DC)       $2.00       Sep 21–Feb 01 (133 days)    28,284     24,411 (86.31%)
ImageToText (IT)      $20.00      Oct 06–Feb 01 (118 days)    14,321     13,246 (92.49%)

Table 1: Summary of the customer workload to the CAPTCHA-solving services.
to account creation, reflecting varying degrees of due
diligence.
For example, both CaptchaBot and Antigate required third-party
“invitation codes” to join their services, which we acquired from the
previously mentioned forums. Interestingly, Antigate guards against
Western users by requiring site visitors to enter the name of the
Russian prime minister in Cyrillic before granting access, an
innovation we refer to as a “culturally-restricted CAPTCHA”.⁷ Some
services require a live phone call for account creation, for which we
used an anonymous mobile phone to avoid any potential biases arising
from using a University phone number. In our experience, however, the
burden of proof demanded is quite low and our precautions were likely
unnecessary. For example, setting up an ImageToText account required a
validation call, but the only question asked was “Did you open an
account on ImageToText?” Upon answering in the affirmative (in a voice
clearly conflicting with the gender of the account holder’s name), our
account was promptly enabled. For one service, DeCaptcher, we created
multiple accounts to evaluate whether per-customer rate limiting is in
use (we found it was not).
Finally, each service typically requires prepayment by customers, in
units defined by their price schedule (1,000 CAPTCHAs is the smallest
“package” generally offered). To fund each account, we used prepaid
VISA gift cards issued by a national bank unaffiliated with our
university.
5.2 Customer Interface
Most services provide an API package for uploading CAPTCHAs and
receiving results, often in multiple programming languages; we
generally used the PHP-based APIs. BeatCaptchas and BypassCaptcha did
not offer pre-built API packages, so we implemented our own API in Ruby
to interface with their Web sites. The client APIs generally employ one
of two methods when interacting with their corresponding services. In
the first, the API client performs a single HTTP POST that uploads the
image to the service, waits for the CAPTCHA to be solved, and receives
the answer in the HTTP response; BeatCaptchas, BypassCaptcha,
CaptchaBypass and CaptchaBot utilize this method.

⁷ In principle, such an approach could be used to artificially restrict
labor markets to specific cultures (i.e., CAPTCHA labor protectionism).
However, it is an open problem whether such a general form of
culturally-restricted CAPTCHA can be devised that has both a large
number of examples and a low false reject rate from its target
population.
In the second, the client performs one HTTP POST to upload the image,
receives an image ID in the response, and subsequently polls the site
for the CAPTCHA solution using the image ID; Antigate, CaptchaGateway,
and ImageToText employ this approach. These APIs recommend poll
intervals of 1–5 seconds; we polled these services once per second.
DeCaptcher uses a custom protocol that is not based on HTTP, although
they also offer an HTTP interface. One interesting note about
ImageToText is that customers must verify that their API code works in
a test environment before gaining access to the actual service. The
test environment allows users to see the CAPTCHAs they submit and solve
them manually.
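The second (upload-then-poll) method can be illustrated with a minimal Python sketch. It is not the code of any of the services or of our client: the function name `solve_by_polling`, the `upload` and `fetch` callables, and the 60-second timeout are all hypothetical, and the transport is deliberately left abstract because the services' actual endpoints and parameters are not described here. Only the one-second poll interval reflects the setting reported above.

```python
import time
from typing import Callable, Optional


def solve_by_polling(
    upload: Callable[[bytes], str],
    fetch: Callable[[str], Optional[str]],
    image: bytes,
    poll_interval: float = 1.0,   # we polled once per second
    timeout: float = 60.0,        # assumed cutoff, not taken from the study
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Upload an image, then poll for its solution until it arrives or time runs out.

    `upload` stands in for the single HTTP POST that returns an image ID;
    `fetch` stands in for the poll request and returns None while unsolved.
    """
    image_id = upload(image)
    deadline = clock() + timeout
    while clock() < deadline:
        answer = fetch(image_id)
        if answer is not None:
            return answer
        sleep(poll_interval)      # wait before asking again
    return None                   # gave up: no solution within the timeout
```

Passing the transport in as callables keeps the loop independent of any particular service; the single-POST services of the first method would need no loop at all, since the answer arrives in the response to the upload.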
5.3 Service Pricing
Several of the services, notably Antigate and DeCaptcher, offer bidding
systems whereby a customer can offer payment over the market rate in
exchange for higher priority access to solvers when load is high. In
our experience, DeCaptcher charges customers their full bid price,
while Antigate typically charges at a lower rate depending on load (as
might happen in a second-price auction). To effectively use Antigate,
we set our bid price to $2/1,000 solutions since we experienced a large
volume of load shedding error codes at the minimum bid price of
$1/1,000 (Section 5.9 reports on our experiences with service load in
more detail). We have not seen price fluctuations on the worker side of
these services, and thus we believe that this overage represents pure
profit to the service provider.
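To make the contrast between the two charging behaviors concrete, the following Python sketch compares a full-bid charge with a textbook second-price charge. It is only an illustration under stated assumptions: the services' actual pricing rules are not known to us, the function names are hypothetical, and the competing bids in the usage note are invented for the example.

```python
from typing import Sequence


def first_price_charge(winning_bid: float) -> float:
    """Full-bid pricing: the customer pays exactly what they bid."""
    return winning_bid


def second_price_charge(bids: Sequence[float], floor: float) -> float:
    """Textbook second-price rule (an assumption, not a documented service rule):
    the highest bidder pays the best competing bid, but never less than the floor."""
    if not bids:
        raise ValueError("at least one bid is required")
    ranked = sorted(bids, reverse=True)
    runner_up = ranked[1] if len(ranked) > 1 else floor
    return max(runner_up, floor)
```

With hypothetical bids of $2.00, $1.50, and $1.00 per 1,000 solutions and a $1.00 minimum, `second_price_charge` yields $1.50 for the top bidder, whereas `first_price_charge` yields the full $2.00. When load is low and no one else bids, the second-price charge falls back to the minimum.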
5.4 Test Corpus
We evaluated the eight CAPTCHA-solving services in Table 1 as a
customer over the course of about five months using a representative
sample of CAPTCHAs employed by popular Web sites. To collect this
CAPTCHA workload, we assembled a list of 25 popular Web sites with
unique CAPTCHAs based on the Alexa rank of the site and our informal
assessment of its value as a target (see Figure 5 for the complete
list). We also used CAPTCHAs from reCaptcha, a popular CAPTCHA provider
used by many sites. We then collected about 7,500 instances of each
CAPTCHA directly from each site. For the capacity measurement
experiments (Section 5.8), we used 12,000 instances of the Yahoo
CAPTCHA graciously provided to us by Yahoo.
5.5 Verifying Solutions
To assess the accuracy of each service, we needed to determine the
correct solution for each CAPTCHA in our corpus. We used the services
themselves to do this for us. For each instance, we used the most
frequent solution returned by the solver services, after normalizing
capitalization and whitespace. If there was more than one most frequent
solution, we treated all answers as incorrect (taking this to mean that
the CAPTCHA had no correct solution). Table 1 shows the overall
accuracy of each service as given by our method.
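The plurality heuristic described above can be sketched in a few lines of Python. This is a minimal illustration rather than our analysis code: the function names are hypothetical, and the exact normalization (lowercasing and removing all whitespace) is an assumption, since the text only says that capitalization and whitespace are normalized.

```python
from collections import Counter
from typing import Iterable, Optional


def normalize(answer: str) -> str:
    """Assumed normalization: drop all whitespace and ignore case."""
    return "".join(answer.split()).lower()


def plurality_solution(answers: Iterable[str]) -> Optional[str]:
    """Return the single most frequent normalized answer, or None on a tie.

    None means the CAPTCHA is treated as having no correct solution,
    so every answer submitted for it counts as incorrect.
    """
    counts = Counter(normalize(a) for a in answers)
    if not counts:
        return None
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None               # more than one most frequent solution
    return ranked[0][0]


def is_correct(answer: str, reference: Optional[str]) -> bool:
    """An answer is correct only if a reference exists and matches it."""
    return reference is not None and normalize(answer) == reference
```

For example, the answers `"AbC d"`, `"abcd"`, and `"abcx"` normalize to two votes for `abcd` and one for `abcx`, so `abcd` becomes the reference; two services answering `"abc"` and `"abd"` produce a tie and no reference.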
To validate this heuristic, we randomly selected 1,025 CAPTCHAs having
at least one service-provided solution and manually examined the
images. Of these, we were able to solve 1,009, of which 940 had a
unique plurality that agreed with our solution, giving an error rate
for the heuristic of just over 8%. Of the 16 CAPTCHAs (1.6%) we could
not solve, seven were entirely unreadable, six had ambiguous characters
(e.g., ‘0’ vs. ‘o’, ‘6’ vs. ‘b’), and three were rendered ambiguous due
to overlapping characters. (We note that Bursztein et al. [3] removed
CAPTCHAs with no majority from their calculation, which resulted in a
higher estimated accuracy than we found in our study.)
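The validation figures can be checked with a short calculation. The text does not state which denominator underlies the "just over 8%" error rate; the sketch below computes it both ways, and the reading that uses the full sample of 1,025 (an inference on our part, not a statement from the text) is the one that reproduces the reported figure. Variable names are illustrative only.

```python
sampled = 1025          # CAPTCHAs drawn for manual validation
solved_by_hand = 1009   # those we could solve ourselves
agreeing = 940          # unique plurality matched our manual solution

unsolved = sampled - solved_by_hand                      # 16 CAPTCHAs
unsolved_share = unsolved / sampled                      # about 1.6%

# Reading 1: everything that did not agree, over the full sample.
error_over_sample = (sampled - agreeing) / sampled       # about 8.3%

# Reading 2: disagreements only among CAPTCHAs we could solve.
error_over_solved = (solved_by_hand - agreeing) / solved_by_hand  # about 6.8%

print(f"unsolved share:    {unsolved_share:.1%}")
print(f"error over sample: {error_over_sample:.1%}")
print(f"error over solved: {error_over_solved:.1%}")
```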
5.6 Quality of Service
To assess the accuracy, response time, and service availability of the
eight CAPTCHA-solving services, we continuously submitted CAPTCHAs from
our corpus to each service over the course of the study. We submitted a
single CAPTCHA every five minutes to all services simultaneously,
recording the time when we submitted the CAPTCHA and the time when we
received the response. Recall that ImageToText, Antigate and
CaptchaGateway require customers to poll the service for the response to
[Figure 4 plot: two horizontal bar charts with one bar per service, one showing median error rate (axis 0–20%) and one showing median response time (axis 0–20 seconds).]
Figure 4: Median error rate and response time (in seconds) for all
services. Services are ranked top-to-bottom in order of increasing
error rate.
[Figure 6 plot: two horizontal bar charts with one bar per CAPTCHA type, one showing median error rate and one showing median response time (axis 0–20 seconds).]
Figure 6: Median error rate and response time (in seconds) for all
CAPTCHAs. CAPTCHAs are ranked top-to-bottom in order of increasing
error rate.
a submitted CAPTCHA; we paused one second between each poll call.
Table 1 also summarizes the dates, durations, and number of CAPTCHA
requests we submitted to the services; Figure 5 presents the error rate
and median response time at a glance for each combination of solver
service and CAPTCHA type. We used each service for up to 133 days,
submitting up to 28,303 requests per service during that period. We
were not able to submit the same number of CAPTCHAs to all services for
a number of reasons. For example, services would go offline
temporarily, or we would rewrite parts of our client implementation,
thus requiring us to temporarily remove the service from the
experiment. Furthermore, CaptchaGateway and CaptchaBypass ceased
operation during our study.
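The response percentages in Table 1 follow directly from the request and response counts. The Python sketch below recomputes them from the counts given in the table; the dictionary layout and the function name `response_rate` are illustrative choices, not part of our measurement tooling.

```python
# (requests, responses) per service, copied from Table 1.
WORKLOAD = {
    "Antigate":       (28210, 27726),
    "BeatCaptchas":   (28303, 25708),
    "BypassCaptcha":  (28117, 27729),
    "CaptchaBot":     (28187, 22677),
    "CaptchaBypass":  (17739, 15869),
    "CaptchaGateway": (1803, 1715),
    "DeCaptcher":     (28284, 24411),
    "ImageToText":    (14321, 13246),
}


def response_rate(requests: int, responses: int) -> float:
    """Percentage of submitted requests that received any response."""
    return 100.0 * responses / requests


for service, (requests, responses) in WORKLOAD.items():
    print(f"{service:<15} {response_rate(requests, responses):6.2f}%")
```

Note that this rate counts any received response, whether or not the answer was correct, so it describes how often a service answered rather than how often it answered correctly.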
[Figure 5 plot: two grids of circles, titled "Error Rate" (upper) and "Median Response Time" (lower). Rows are the eight services (Antigate, BeatCaptchas, BypassCaptcha, CaptchaBot, CaptchaBypass, CaptchaGateway, Decaptcher, ImageToText). Columns are the CAPTCHA types: Youku, Slashdot, Taobao, reCaptcha, Bebo, Wikipedia, AOL, Yandex, Google, Conduit, Dailymotion, MSN, QQ, Yahoo, Maktoob, MySpace, Sina, digg, FC2, Baidu, Friendster, eBay, VKontakte, Skyrock, Rediff, PayPal.]
Figure 5: Error rate and median response time for each combination of
service and CAPTCHA type. The area of each circle in the upper table is
proportional to the error rate (among solved CAPTCHAs). In the lower
table, circle area is proportional to the response time minus ten
seconds (for increased contrast); negative values are denoted by
unshaded circles. Numeric values corresponding to the values in the
leftmost and rightmost columns are shown on the side. Thus, the error
rate of BypassCaptcha on Youku CAPTCHAs is 66%, and for BeatCaptchas on
PayPal 4%. The median response time of CaptchaGateway on Youku is 21
seconds, and 8 seconds for Antigate on PayPal.
Accuracy
A CAPTCHA solution is only useful if it is correct. The left bar plot
in Figure 4 shows the median error rate for each service. Overall the
services are reasonably accurate: with the exception of BypassCaptcha,
86–89% of responses⁸ were correct. This level of accuracy is in line
with results reported by Bursztein et al. [3] for human solvers and
substantially better than the accuracy of reCaptchaOCR (Section 3).
By design, CAPTCHAs vary in difficulty. Do the observed error rates
reflect such differences? The top half of Figure 5 shows service
accuracy (in terms of its error rate) on each CAPTCHA type. The area of
each circle is proportional to a service’s mean error rate on a
particular CAPTCHA type. Services are arranged along the y-axis in
order of increasing accuracy, with the most accurate (lowest error
rate) at the top and the least accurate (highest error rate) at the
bottom. CAPTCHA types are arranged in decreasing order of their median
error rate. The median error rate of each type is also shown in
Figure 6.
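Because Figure 5 encodes values as circle area rather than radius, a circle's radius grows with the square root of the value it represents. The Python sketch below shows that mapping, together with the lower table's convention (from the Figure 5 caption) of subtracting ten seconds from the response time and leaving negative values unshaded. The function names and the `scale` parameter are hypothetical; this is not the plotting code used to produce the figure.

```python
import math
from typing import Tuple


def circle_radius(value: float, scale: float = 1.0) -> float:
    """Radius of a circle whose area equals scale * |value|."""
    return math.sqrt(scale * abs(value) / math.pi)


def response_time_marker(seconds: float, offset: float = 10.0) -> Tuple[float, bool]:
    """Return (radius, shaded) for a response time in the lower table.

    The area encodes the response time minus `offset` seconds; a negative
    difference is drawn with the same area rule but left unshaded.
    """
    delta = seconds - offset
    return circle_radius(delta), delta >= 0
```

Under this mapping a fourfold increase in error rate doubles the radius, which is why comparing circle diameters by eye understates differences between services.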
Accuracy clearly depends on the type of CAPTCHA. The error rate for
ImageToText with Youku, for instance, is 5 times its PayPal error rate.
Furthermore, the ranking of CAPTCHA accuracies is generally consistent
across the services: all services have relatively poor accuracy on
Youku and good accuracy on PayPal.

⁸ The error rate is over received responses and does not include
rejected requests. We consider response rate to be a measure of
availability rather than accuracy.
Based on the data, one might conclude that a group of CAPTCHAs on the
left headed by Youku, reCaptcha, Slashdot, and Taobao is “harder” than
the rest. However, an important factor affecting solution accuracy (as
well as response time) in our measurements is worker familiarity with a
CAPTCHA type. In the case of Youku, for instance, workers may simply be
unfamiliar with these CAPTCHAs. On the other hand, workers are likely
familiar with reCaptcha CAPTCHAs (see Section 6.6), which may genuinely
be “harder” than the rest. As a point of comparison, Mr. E reported in
our interview that his service experiences a 5–10% error rate. Since
his CAPTCHA mix is likely different, and less diverse, than our full
set, his claim seems reasonable.
Response Time
In addition to accuracy, customers want services that solve CAPTCHAs
quickly. Figure 7 shows the cumulative distribution of response times
of each service. The curves of CaptchaBot, CaptchaBypass, ImageToText,
and Antigate exhibit the quantization effect of polling (either in the
client API or on the server) as a stair-step pattern. The shape of the
distributions is characteristically log-normal, with a median response
of 14 seconds (across all services) and a third-quartile response time
of 20 seconds, well within the session timeout of most Web