{"id":153,"date":"2009-12-22T00:00:32","date_gmt":"2009-12-22T08:00:32","guid":{"rendered":"http:\/\/4gi.wtf\/wp\/?p=153"},"modified":"2021-12-08T10:17:24","modified_gmt":"2021-12-08T18:17:24","slug":"recaptcha-as-a-service","status":"publish","type":"post","link":"https:\/\/looseassociations.com\/?p=153","title":{"rendered":"reCAPTCHA as a Service"},"content":{"rendered":"<p>All I wanted was an implementation of reCAPTCHA for my <a href=\"http:\/\/twiki.org\/\" target=\"_blank\" rel=\"noopener\">TWiki<\/a>. What I got is reCAPTCHA almost anywhere.<\/p>\n<p><!--more--><\/p>\n<p>If you host any kind of web site that allows your users to post content &#8212; blogs, wikis, calendars, comment pages, guest books, content management systems &#8212; then you&#8217;ve probably encountered <a href=\"http:\/\/en.wikipedia.org\/wiki\/Comment_spam\" target=\"_blank\" rel=\"noopener\">comment spam<\/a>. Automated processes clog your content with links to try to improve their search engine ranking. Even on a low-volume site, keeping up with deleting the dreck can be taxing.<\/p>\n<p>That&#8217;s why sites use <a href=\"http:\/\/en.wikipedia.org\/wiki\/CAPTCHA\" target=\"_blank\" rel=\"noopener\">CAPTCHAs<\/a> to prevent automated abuse. One popular source for CAPTCHAs is <a href=\"http:\/\/recaptcha.net\/\" target=\"_blank\" rel=\"noopener\">reCAPTCHA<\/a>. Read about it: it&#8217;s CAPTCHA with a conscience.<\/p>\n<p>I naively thought that I could use reCAPTCHA by just putting a snippet of HTML widget code on my sign-in pages. If you read the reCAPTCHA instructions, though, you&#8217;ll find it&#8217;s considerably more complex than that. I&#8217;m no stranger to Perl and CGI coding, but hacking TWiki code isn&#8217;t trivial, and hacks often break with new releases. (To do it properly would require writing a TWiki plugin or add-on. I&#8217;m a doctor, Jim, not a TWiki developer!) I&#8217;d also like to be able to trivially add reCAPTCHA to some other platforms I use. 
What about people who can&#8217;t write their own CGIs? Or are hosted on sites where they&#8217;re not permitted to run their own code? Or people who want to be able to add a CAPTCHA to a site but can&#8217;t afford the time and effort to integrate reCAPTCHA into existing code?<\/p>\n<p>Why can&#8217;t reCAPTCHA be easy?<\/p>\n<p>Here&#8217;s an experiment: reCAPTCHA as a web service. No coding or CGI scripts. No PHP or ASP or .NET. Just a small modification to the HTML on your sign-in\/posting page and a change in the access restrictions for your sign-in\/posting script file (on Apache, this just means creating a small text file named <code>.htaccess<\/code>). It&#8217;s a little more complex than just putting a widget in an <code>iframe<\/code>, but it&#8217;s still pretty darned simple.<\/p>\n<h2><a name=\"modify_your_HTML\"><\/a>modify your HTML<\/h2>\n<p>First, figure out where you want the CAPTCHA to appear in the work flow. Usually, it will be right after submitting the form for a comment or registration request. Find the HTML code for the form. Somewhere in there will be a form tag, which will look vaguely like:<\/p>\n<pre>&lt;form method=post action=<b><i>\"\/cgi-bin\/addcommentscript\"<\/i><\/b>&gt;\r\n<\/pre>\n<p>The form tag can have lots of variations, but the important piece is what&#8217;s in quotes after &#8220;action=&#8221;. 
Here&#8217;s what you do:<\/p>\n<ol>\n<li>replace the <code>action<\/code> parameter in the form tag (<code><i>\"\/cgi-bin\/addcommentscript\"<\/i><\/code> in this example) with <code>\"https:\/\/captcha.sacdoc.org:9080\/cgi\/captcha.pl\"<\/code><\/li>\n<li>below the form tag, add <code>&lt;input type=\"hidden\" name=\"ultimate_destination_url\" value=<b><i>\"\/cgi-bin\/addcommentscript\"<\/i><\/b>&gt;<\/code><\/li>\n<li>if the <code>value<\/code> parameter doesn&#8217;t start with <code>\"http:\/\/\"<\/code> or <code>\"https:\/\/\"<\/code>, then look at the address bar of your browser when you&#8217;re viewing the comment or registration page. The URL should start with something like <code>\"https:\/\/yourdomain.com\/\"<\/code>. Insert that text at the start of the <code>value<\/code> parameter. You should end up with something like this:<\/li>\n<\/ol>\n<pre>&lt;form method=post action=\"https:\/\/captcha.sacdoc.org:9080\/cgi\/captcha.pl\"&gt;\r\n&lt;input type=\"hidden\" name=\"ultimate_destination_url\" value=<b><i>\"https:\/\/yourdomain.com\/cgi-bin\/addcommentscript\"<\/i><\/b>&gt;\r\n<\/pre>\n<p>Give it a try! After you submit a comment or registration request, you should be presented with a reCAPTCHA page. Prove you&#8217;re sentient, and you should be sent to the next step in your posting\/registration process.<\/p>\n<h2><a name=\"setting_access_restrictions\"><\/a>setting access restrictions<\/h2>\n<p>If this is all you do, it might stop some &#8216;bots. It&#8217;s just <a href=\"http:\/\/www.schneier.com\/crypto-gram-0205.html\" target=\"_blank\" rel=\"noopener\">security-by-obscurity<\/a>, though. Anyone who knows the location of the actual posting\/registration script can just bypass the CAPTCHA screen. In fact, for popular platforms, &#8216;bots already bypass the entry form and go straight for the script. In any case, your modified entry form has the address of the posting\/registration script in plain text, so it would be trivial to bypass it. 
What to do?<\/p>\n<p>Fortunately, all legitimate posting\/registration requests will now be coming from my IP address (currently 69.62.162.196). If your hosting site uses an Apache server (most do), look in the directory that has your posting\/registration script. Check if there is a file there named <code>.htaccess<\/code>. If there isn&#8217;t one, create one. Then add this to the end of the file:<\/p>\n<pre>&lt;FilesMatch \"^<b><i>addcommentscript<\/i><\/b>.*\"&gt;\r\nSetHandler cgi-script\r\norder deny,allow\r\ndeny from all\r\nallow from 69.62.162.196\r\nSatisfy any\r\n&lt;\/FilesMatch&gt;\r\n<\/pre>\n<p>The <code>\"<b><i>addcommentscript<\/i><\/b>\"<\/code> should be replaced by just the file name from the <code>value<\/code> parameter above. In the example, the value parameter was <code><b><i>\"https:\/\/yourdomain.com\/cgi-bin\/addcommentscript\"<\/i><\/b><\/code>, so you&#8217;d use the <code>addcommentscript<\/code> part in the <code>.htaccess<\/code> file.<\/p>\n<p>I have no idea if Microsoft&#8217;s web server (IIS) supports <code>.htaccess<\/code> files or the equivalent, so you&#8217;re on your own if you&#8217;re hosted on an IIS server.<\/p>\n<h2><a name=\"important_caveats\"><\/a>important caveats<\/h2>\n<h3><a name=\"privacy\"><\/a>privacy<\/h3>\n<p>The astute reader has already recognized that this scheme will pass all data from the posting\/registration form through my servers. For some such, that might include username\/password pairs, hidden access tokens (but not cookies), or the contents of posts to private blogs. I could try to convince you that I am a person of such honesty and integrity that you should trust me. 
Instead, let me state that <strong><em>I just might sell all the data to the highest bidder.<\/em><\/strong> If that makes you uncomfortable, find a different solution.<\/p>\n<h3><a name=\"reliability\"><\/a>reliability<\/h3>\n<p>Over the past ten years, my servers have achieved a 99.98% scheduled availability level for mission-critical services. But reCAPTCHA as a service isn&#8217;t mission critical. I might decide not to host it any more. The <code>captcha.sacdoc.org<\/code> IP address (which has to be hard-coded into <code>.htaccess<\/code> files) might change without notice at the whim of my ISP (they&#8217;ve done that once in a decade, in spite of my paying for static addresses). I might die or sell out or just arbitrarily stop supporting reCAPTCHA as a service. No guarantee is expressed or implied. Use at your own risk. Don&#8217;t blame me if you get hurt.<\/p>\n<p>If you use this service in any serious way, I strongly suggest that you email me at r<a title=\"Reveal this e-mail address\" href=\"http:\/\/mailhide.recaptcha.net\/d?k=01x8JlQEzKumG3jvJcQ2ck7w==&amp;c=rVTmpxMTWHz26I_l1fYRL9oLxMQFl4zkrVDIJrnOVOM=\">&#8230;<\/a>@risley.net letting me know how to contact you if things change. I might not bother to do so, but at least this way you&#8217;ll have a chance.<\/p>\n<p>&#8212; <span class=\"twikiNewLink\">Ron<\/span>\u00a0&#8211; 27 Dec 2009<\/p>\n","protected":false},"excerpt":{"rendered":"<p>All I wanted was an implementation of reCAPTCHA for my TWiki. 
What I got is reCAPTCHA almost anywhere.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-153","post","type-post","status-publish","format-standard","hentry","category-coding"],"_links":{"self":[{"href":"https:\/\/looseassociations.com\/index.php?rest_route=\/wp\/v2\/posts\/153"}],"collection":[{"href":"https:\/\/looseassociations.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/looseassociations.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/looseassociations.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/looseassociations.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=153"}],"version-history":[{"count":4,"href":"https:\/\/looseassociations.com\/index.php?rest_route=\/wp\/v2\/posts\/153\/revisions"}],"predecessor-version":[{"id":367,"href":"https:\/\/looseassociations.com\/index.php?rest_route=\/wp\/v2\/posts\/153\/revisions\/367"}],"wp:attachment":[{"href":"https:\/\/looseassociations.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=153"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/looseassociations.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=153"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/looseassociations.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=153"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}