What is the project about?

TAPCHA is a research project that has been running since 2012. As the name implies, it is about developing a mobile-friendly CAPTCHA scheme that is easy for human users to solve whilst retaining the required level of security. Ultimately, through this project, we would like to inform the design of new CAPTCHA schemes that are ready for the upcoming ubiquitous computing era.

It is worth noting that the project mainly tackles the usability issues of mainstream CAPTCHAs. However, we will not compromise on the security requirements during the design process, as they are equally important.

What is wrong with the current mainstream CAPTCHAs?

(credits: Microsoft)

A simple Google search will show you how much people dislike using CAPTCHAs. Character-recognition-based CAPTCHA schemes (the mainstream, as shown in the figure above) have two main usability issues. First, recognising distorted characters correctly has become increasingly challenging for humans, as a result of the growing attention these schemes have received from researchers in the fields of Artificial Intelligence (AI) and Computer Vision (CV). Second, solving such CAPTCHAs on a mobile phone is a more painful experience than solving them on a traditional computer: the interaction method is one reason and the limited display size is the other. Just imagine having to zoom in to see the text clearly, launch the virtual keyboard to enter it, then close the keyboard and zoom out to return to what you were doing.

What is your design approach?

An ideal CAPTCHA scheme should represent a kind of AI-complete or AI-hard problem, so that it can only be solved by humans and not by an AI algorithm. This means that if the best way to compromise a CAPTCHA scheme is through brute force, we can say the design of the scheme is effective. Keeping this in mind, we acknowledge the ‘ugly beauty’ of character-recognition-based CAPTCHA schemes, in which the characters are post-processed ‘nicely’ to bring down the probability of a bot solving the challenge.
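To make the brute-force criterion concrete, the following is a minimal sketch of how the success probability of blind guessing could be estimated for a text CAPTCHA. The function name, the alphabet size and the challenge length are illustrative assumptions rather than parameters of any particular scheme, and the sketch assumes that every attempt is a uniform random guess against a fresh challenge.

```python
from fractions import Fraction


def blind_guess_probability(alphabet_size: int, length: int, attempts: int = 1) -> Fraction:
    """Chance that at least one of `attempts` independent, uniformly random
    guesses matches a challenge of `length` characters, each drawn from an
    alphabet of `alphabet_size` symbols (illustrative model only)."""
    if alphabet_size < 1 or length < 1 or attempts < 0:
        raise ValueError("alphabet_size and length must be >= 1, attempts >= 0")
    # Probability that a single random guess is exactly right.
    single = Fraction(1, alphabet_size ** length)
    # Complement of "every attempt fails".
    return 1 - (1 - single) ** attempts


if __name__ == "__main__":
    # Hypothetical example: 6 characters from 26 letters + 10 digits.
    p = blind_guess_probability(alphabet_size=36, length=6, attempts=1000)
    print(float(p))
```

Under this simple model, a design is effective in the sense above when no attack does noticeably better than this baseline.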

At the same time, we have to agree that when characters are badly distorted, they also become difficult for us to recognise correctly, as some letters and digits can look similar after post-processing (e.g., b and 6, B and 8, m and nn). We argue this is mainly because the context that helps us interpret the actual characters is missing. That is to say, even if recognising such characters in isolation is difficult, the context effect suggests that humans may be able to figure them out when certain cues (context or environmental factors) are present.

For example, recognising the following word may be difficult on its own (top), but it can still be done if context is provided (bottom).

 

(top, the word “from” without any context provided)

 (bottom, the word “from” with more context provided)

 

Certainly, as the context becomes more and more obvious, there is a greater chance that the whole CAPTCHA challenge can be compromised by a bot. On this point, we argue that the risk depends not only on the context itself but also on what the context is used for (i.e., as part of the challenge itself, as the instruction that helps users understand the challenge, or both).

After considering the main intuitive interaction method (e.g., touch) and the common interaction styles supported by smartphones and tablets (e.g., touch gestures), we proposed a CAPTCHA scheme design featuring an object drag-and-drop challenge with a distorted instruction. This design provides two levels of context, which must be used together to understand the instruction and solve the challenge (see the screenshot of the demo below). We believe this makes it harder for a bot to understand exactly what is needed. It also allows more security-related features to be added (e.g., the construction of the instruction, the number of objects presented, the number of tasks that need to be completed, etc.).

(two levels of context, instruction and actual canvas)
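The general idea of such a challenge can be sketched as follows. This is a simplified illustration and not the code behind the demos: the object names, drop zones, function names and instruction wording are all hypothetical, and the instruction is returned as plain text, whereas the proposed design would render it with distortion.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical vocabularies, chosen only for illustration.
SHAPES = ["circle", "square", "triangle", "star"]
TARGETS = ["left box", "right box"]


@dataclass(frozen=True)
class Task:
    obj: str     # object the user must drag
    target: str  # drop zone it must end up in


def make_challenge(num_objects: int, num_tasks: int, rng: random.Random):
    """Pick the objects shown on the canvas and the ordered tasks to perform."""
    if not 1 <= num_tasks <= num_objects <= len(SHAPES):
        raise ValueError("need 1 <= num_tasks <= num_objects <= len(SHAPES)")
    objects = rng.sample(SHAPES, num_objects)
    tasks = [Task(o, rng.choice(TARGETS)) for o in rng.sample(objects, num_tasks)]
    # In the proposed design this text would be distorted before display.
    instruction = "; then ".join(f"drag the {t.obj} to the {t.target}" for t in tasks)
    return objects, tasks, instruction


def verify(tasks: List[Task], drops: List[Tuple[str, str]]) -> bool:
    """Accept only if the (object, target) drops match the tasks, in order."""
    return drops == [(t.obj, t.target) for t in tasks]


def guess_space(num_objects: int, num_tasks: int) -> int:
    """Number of ordered drop sequences a blind guesser would have to consider,
    assuming it already knows how many tasks there are."""
    return math.perm(num_objects, num_tasks) * len(TARGETS) ** num_tasks
```

In this sketch, raising the number of objects or tasks enlarges the space a blind guesser has to cover, which is one way the parameters listed above could be tuned.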

Do you have any proof-of-concept demos?

Yes. We have two demos that were developed separately, and our plan is to create a single solution based on the two below.

TapCHA v2 can be accessed at: https://ithop.uk/tapchav2/ (mobile devices and the Chrome device simulator only). It features customisable instruction construction and a customisable distortion level.

TapCHA v3 can be accessed at: https://ithop.uk/tapchav3/ (all platforms). It features sub-tasks and a customisable number of objects.

What’s your next step?

Our next step is to produce an application at Technology Readiness Level (TRL) 7, following the European Commission (EC) definition. We are also preparing several journal papers discussing our progress.


Related publications:

  • Jiang, N., Dogan, H. and Tian, F. 2017. Designing Mobile Friendly CAPTCHAs: An Exploratory Study. In: Proceedings of the 31st British HCI Group Annual Conference on People and Computers: Digital Make Believe, 2017. (link at the conference website)
  • Jiang, N. and Dogan, H. 2015. A Gesture-based CAPTCHA Design Supporting Mobile Devices. In: Proceedings of the 2015 British HCI Conference (HCI 2015), pp. 202-207. (link at ACM Digital Library)
  • Jiang, N. and Tian, F. 2013. A Novel Gesture-based CAPTCHA Design for Smart Devices. In: 27th International British Computer Society Human Computer Interaction Conference (HCI 2013), 9-13 September 2013, Brunel University, UK. (link at eWic)