iink SDK 3.0: How our new Math Recognizer is changing the game

At the crossroads of technology and mathematics, MyScript has long pioneered handwriting recognition. The latest release, MyScript iink SDK 3.0, introduces a new Math Recognizer powered by a new type of neural network.

Harnessing the power of advanced AI and deep learning, this new engine promises unprecedented accuracy and efficiency. It stands as the culmination of many years of extensive research, showcasing our commitment to blending mathematics with innovative technology.

Charting new territories: The debut of our Math Recognizer

MyScript introduced its first Math Recognizer to the market in 2011. At the time, it was a significant leap forward in accuracy compared with the state-of-the-art research of the era. In the 2012 CROHME competition on handwritten mathematical expression recognition, the MyScript engine reached 60% recognition accuracy, whereas the best system a year earlier had reached only 20%. The overall recognition system was built around three distinct experts:

  1. A character recognizer based on feed-forward neural networks;
  2. A segmentation module using a Context-Free Grammar (CFG);
  3. A statistical language model designed to estimate likelihood probabilities of adjacent symbols.
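
To make the third expert more concrete, here is a minimal sketch of a statistical language model over adjacent symbols. It is a plain bigram model with add-one smoothing, trained on a made-up three-expression corpus; the function names and the data are illustrative assumptions and do not describe MyScript's actual model.

```python
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count how often each symbol follows another in the training expressions."""
    counts = defaultdict(Counter)
    vocab = set()
    for seq in sequences:
        vocab.update(seq)
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts, vocab

def prob(counts, vocab, prev, nxt):
    """Estimate P(nxt | prev) with add-one (Laplace) smoothing."""
    total = sum(counts[prev].values())
    return (counts[prev][nxt] + 1) / (total + len(vocab))

# Toy corpus (hypothetical): three short expressions split into symbols
corpus = [["x", "+", "y"], ["x", "+", "1"], ["x", "=", "1"]]
counts, vocab = train_bigram(corpus)
print(prob(counts, vocab, "x", "+"))  # 0.375: "+" often follows "x"
print(prob(counts, vocab, "x", "y"))  # 0.125: never seen, kept non-zero by smoothing
```

A model like this only looks one symbol back, which is one way to picture the long-range limitation discussed further down.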

Over the years, MyScript enhanced this system by adopting BLSTM (Bidirectional Long Short-Term Memory) recurrent neural networks for both symbol classification and the statistical language model. In less than a decade, the team cut the number of recognition errors to a third.

However, the localized analysis conducted by these three experts, each focused on a specific task, had inherent limitations and required manual tuning to make them work together. As a result, the architecture struggled to capture long-range dependencies among symbols and had reached a plateau in accuracy. To do better, we knew we had to design a whole new system.

Reaching new horizons: Our new revolutionary Math Recognizer unleashed

A revamped deep learning architecture

In the early 2020s, MyScript began developing a new system based on advanced deep learning techniques. The new architecture is an end-to-end model: an encoder-decoder transformer built on the multi-head attention mechanism. Input strokes are reordered to be time-independent, normalized to be agnostic to device capture resolution (both spatial and temporal), and passed to the encoder for analysis. The encoder's backbone consists of a stack of transformer blocks with relative positional embeddings within the self-attention module. Downsampling and upsampling layers are added before and after the transformer blocks, respectively; shortening the input sequence in this way speeds up encoding.
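
The preprocessing step can be pictured with a small sketch. The code below is not MyScript's implementation; it assumes one plausible reading of the description: strokes are reordered left to right (ignoring writing order), coordinates are divided by the ink height (spatial resolution), and each stroke is resampled at a fixed arc-length spacing (temporal resolution). The function names and the `points_per_unit` parameter are hypothetical.

```python
import math

def resample(stroke, spacing):
    """Resample a polyline at a fixed arc-length spacing, independent of sampling rate."""
    out = [stroke[0]]
    carried = 0.0  # distance travelled since the last emitted point
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0.0:
            continue
        d = spacing - carried  # offset of the next point along this segment
        while d <= seg + 1e-12:  # small tolerance against rounding at segment ends
            t = d / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += spacing
        carried = seg - (d - spacing)
    return out

def normalize(strokes, points_per_unit=8):
    """Reorder, rescale, and resample strokes (an assumed, simplified pipeline)."""
    # Time independence: order strokes by leftmost x, not by when they were written
    ordered = sorted(strokes, key=lambda s: min(p[0] for p in s))
    xs = [p[0] for s in ordered for p in s]
    ys = [p[1] for s in ordered for p in s]
    x_min, y_min = min(xs), min(ys)
    height = (max(ys) - y_min) or 1.0  # guard against perfectly flat ink
    # Spatial resolution: express coordinates in units of the ink height
    scaled = [[((x - x_min) / height, (y - y_min) / height) for x, y in s]
              for s in ordered]
    # Temporal resolution: same point density whatever the device sampling rate
    return [resample(s, 1.0 / points_per_unit) for s in scaled]

# The same two vertical bars, captured on two hypothetical devices
low_res = [[(50, 0), (50, 100)], [(0, 0), (0, 100)]]  # written right to left
high_res = [[(0, 0), (0, 500), (0, 1000)], [(500, 0), (500, 250), (500, 1000)]]
print([len(s) for s in normalize(low_res)])   # [9, 9]
print([len(s) for s in normalize(high_res)])  # [9, 9]
```

Both captures end up with the same stroke order and, up to rounding, the same points, which is the property the encoder relies on.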

Encoder-decoder transformer at work

The decoder is trained to predict a linear sequence that describes the spatial structure of a math expression, in a notation similar to LaTeX. It is built from transformer blocks that include cross-attention modules. Working autoregressively, the decoder predicts symbols one by one, each conditioned on the symbols already generated. It uses cross-attention probabilities to identify symbol segments and boundaries, which are then fed back into the decoder transformer blocks. This process minimizes the "hallucination effect" by ensuring that each part of the input ink is interpreted just once.
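
The decoding loop described above can be illustrated with a toy example. This is a deliberately simplified sketch, not the real decoder: `toy_step` is a hypothetical stand-in for the trained model, and the soft cross-attention feedback is replaced here by a hard mask over strokes that have already been explained.

```python
EOS = "<eos>"

def toy_step(prefix, covered):
    """Hypothetical stand-in for the decoder: picks the best-scoring symbol
    among candidates whose strokes have not been explained yet."""
    # (symbol, strokes it would attend to, score); values are made up
    candidates = [("x", (0, 1), 0.9), ("2", (2,), 0.8)]
    remaining = [c for c in candidates if not covered.intersection(c[1])]
    if not remaining:
        return EOS, ()
    token, strokes, _ = max(remaining, key=lambda c: c[2])
    return token, strokes

def greedy_decode(step, max_len=50, use_coverage=True):
    """Greedy autoregressive decoding: one symbol per step, conditioned on the prefix."""
    prefix, covered = [], set()
    for _ in range(max_len):
        seen = frozenset(covered) if use_coverage else frozenset()
        token, attended = step(tuple(prefix), seen)
        if token == EOS:
            break
        prefix.append(token)
        covered.update(attended)  # feed the explained strokes back into the next step
    return prefix

print(greedy_decode(toy_step))                                 # ['x', '2']
print(greedy_decode(toy_step, max_len=4, use_coverage=False))  # ['x', 'x', 'x', 'x']
```

Without the feedback, the toy model keeps re-reading the same strokes, a crude picture of the hallucination effect; with it, every stroke is consumed exactly once and decoding stops on its own.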

User-backed dataset training for superior accuracy

The encoder-decoder transformer was trained on a large and diverse dataset of 120,000 mathematical expressions gathered from real end-users worldwide, four times the size of the dataset used for the previous system. Data augmentation techniques based on ink transformation and generation were also used to make the overall recognition system more robust. Accuracy was validated in a series of user tests, which showed consistent performance across different writing styles and syntaxes.
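
As an illustration of what ink transformation can mean in practice, the sketch below generates distorted copies of an expression with a small random rotation, scale, and slant. The choice of transformations and the parameter ranges are assumptions made for the example; the article does not detail the augmentations actually used.

```python
import math
import random

def affine(strokes, angle=0.0, scale=1.0, shear=0.0):
    """Shear along x, then rotate by `angle` radians and scale, about the origin."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = []
    for stroke in strokes:
        new = []
        for x, y in stroke:
            xs = x + shear * y  # horizontal shear mimics slanted handwriting
            new.append((scale * (cos_a * xs - sin_a * y),
                        scale * (sin_a * xs + cos_a * y)))
        out.append(new)
    return out

def augment(strokes, rng, n=3):
    """Draw n distorted copies; the parameter ranges are illustrative guesses."""
    return [affine(strokes,
                   angle=rng.uniform(-0.1, 0.1),
                   scale=rng.uniform(0.8, 1.2),
                   shear=rng.uniform(-0.2, 0.2))
            for _ in range(n)]

# A "+" sign drawn with two strokes (made-up coordinates)
plus_sign = [[(0.0, 0.0), (1.0, 0.0)], [(0.5, -0.5), (0.5, 0.5)]]
variants = augment(plus_sign, random.Random(0))
print(len(variants))  # 3
```

Each variant keeps the stroke structure and label of the original, so one collected sample yields several plausible training samples.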

With this new architecture, the team halved the number of recognition errors compared to the previous system, setting a new standard in the market and reaffirming MyScript's commitment to leading AI research in handwriting recognition.

The backbone of math technology innovation

The unveiling of MyScript's new Math Recognizer, built on cutting-edge AI technology, marks a significant milestone in handwriting recognition. By halving recognition errors, it sets a new industry standard and redefines the integration of technology and mathematics. It is a leap towards a future where our engine will autonomously distinguish mathematical expressions from regular text or shapes, offering a seamless and intuitive user experience.

Getting started with MyScript iink SDK 3.0

Check out iink SDK 3.0's latest enhancements on the MyScript Developer website. With easy-to-follow documentation and examples, see for yourself how our SDK can elevate your existing solutions. Start exploring for free, and join us in pioneering the future of digital handwriting!

April 26, 2024