Adil Khan 9 months ago
AdiKhanOfficial #FYP Ideas

Handwritten to text transformation of math equation for better search experience

Project Title

Handwritten to text transformation of math equation for better search experience

Project Area of Specialization

Artificial Intelligence

Project Summary

Searching for a mathematical equation in its exact form in a browser is tedious, even for a frequent user. However intelligent the search conventions are, an equation still has to be entered as written to get good results. One alternative is a math editor, where you pick math symbols from a palette or type their LaTeX. That means hunting for each symbol and operator, clicking them one by one to build the equation, and then copying and pasting it into a search box, which is a lot of work for a simple equation search. To address this, we propose an interactive web-based application named “Handwritten to Text Transformation of mathematical equations for better search experience”. The user inputs a mathematical equation (by uploading an existing image, writing with a stylus, taking a screenshot of a handwritten equation, or typing it on a math-centered on-screen keyboard), gets it converted into the desired format (Text, LaTeX, XML, or MathML, i.e. CMML or PMML), and can have it searched on a Math Information Retrieval (MIR) system. In addition, the available MIR systems require a specific MathML format to search equations, whereas most math documentation is written in TeX/LaTeX. The project addresses these issues in a single integrated environment where you can write an equation, get the converted format, and get search results for that equation in a few clicks.

Project Objectives

Our main objective is to replace the traditional approach, “write math on paper”, with an automated one, “write it by hand and leave the rest to the computer”. The specific objectives are:

  • To give users multiple ways to enter math in the browser: uploading an existing image, using a stylus to write or draw an equation, taking a screenshot of a handwritten equation, or typing the equation on a math-centered on-screen keyboard.
  • To make the handwritten equation readable by a Math Information Retrieval (MIR) system, or simply by the browser.
  • To search for the equivalent of a handwritten equation in the browser and so enable math information retrieval on the web.
  • To provide the detected handwritten equation in LaTeX, Text, Extensible Markup Language (XML), and Math Markup Language (MathML), i.e. Content Math Markup Language (CMML) and Presentation Math Markup Language (PMML).

Project Implementation Method

The implementation details can be depicted via the following diagram:

System Diagram

Our system consists of three blocks: the Input Block, the Converter Block, and the Output Block. In the current scope, input can be given in four ways: typing the equation in LaTeX, uploading an image, writing on an e-writing pad, or taking a screenshot of the equation. For image input, the system first checks that the image meets the defined specification. The image is then passed to an already trained Convolutional Neural Network (CNN) model, which generates the equivalent LaTeX of the equation. All of this happens in the Input Block. The generated LaTeX is then passed to the Converter Block.

In the Converter Block, the LaTeX is converted to math grammar rules. These rules are used, via LaTeX macros, to generate an Abstract Syntax Tree (AST) that captures the structural information of the mathematical expression given in LaTeX. The AST is then used to convert the expression into Extensible Markup Language (XML).
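To make the LaTeX-to-AST step concrete, here is a minimal sketch in Python. It is not the project's grammar or macro set: it is a small hand-written recursive-descent parser for an assumed subset of LaTeX (numbers, single-letter variables, `+`, `-`, `*`, `^`, `\frac`, braces, and parentheses), and the names `Num`, `Var`, `BinOp`, and `parse_latex` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Num:
    value: str


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str  # one of '+', '-', '*', '/', '^'
    left: "Node"
    right: "Node"


Node = Union[Num, Var, BinOp]

# A command such as \frac, a number, a single letter, or one punctuation symbol.
TOKEN_RE = re.compile(r"\s*(\\[a-zA-Z]+|\d+(?:\.\d+)?|[a-zA-Z]|[-+*^{}()])")


def tokenize(latex: str) -> List[str]:
    latex = latex.strip()
    tokens, pos = [], 0
    while pos < len(latex):
        match = TOKEN_RE.match(latex, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.i = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def take(self, expected: Optional[str] = None) -> str:
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected!r}, got {tok!r}")
        self.i += 1
        return tok

    def expr(self) -> Node:  # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = BinOp(op, node, self.term())
        return node

    def term(self) -> Node:  # term := power ('*' power)*
        node = self.power()
        while self.peek() == "*":
            self.take()
            node = BinOp("*", node, self.power())
        return node

    def power(self) -> Node:  # power := atom ('^' atom)?
        base = self.atom()
        if self.peek() == "^":
            self.take()
            return BinOp("^", base, self.atom())
        return base

    def group(self) -> Node:  # group := '{' expr '}'
        self.take("{")
        node = self.expr()
        self.take("}")
        return node

    def atom(self) -> Node:
        tok = self.take()
        if tok == "\\frac":  # \frac{numerator}{denominator}
            numerator = self.group()
            return BinOp("/", numerator, self.group())
        if tok in ("{", "("):
            node = self.expr()
            self.take("}" if tok == "{" else ")")
            return node
        if tok[0].isdigit():
            return Num(tok)
        if tok.isalpha():
            return Var(tok)
        raise ValueError(f"unsupported token {tok!r}")


def parse_latex(latex: str) -> Node:
    parser = Parser(tokenize(latex))
    tree = parser.expr()
    if parser.peek() is not None:
        raise ValueError(f"trailing input starting at {parser.peek()!r}")
    return tree
```

For example, `parse_latex(r"\frac{a+1}{b}")` yields a `/` node whose left child is the `+` node for `a+1` and whose right child is the variable `b`. Anything outside the assumed subset, such as `\alpha` or `=`, raises `ValueError` rather than being guessed at.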

XML is the basis for making mathematical expressions machine-readable. An algorithm encodes this XML tree into Content Math Markup Language (CMML). CMML adds semantic enrichment to web documents, i.e. the algorithm searches for specific entities in the XML schema so that the meaning of the expression can be analyzed.
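As an illustration of the CMML encoding, the sketch below turns a small expression tree into Content MathML with Python's standard `xml.etree.ElementTree`. The tree shape (nested `(op, left, right)` tuples with string leaves) and the function names are assumptions made for this example, not the project's algorithm; the element names (`apply`, `plus`, `minus`, `times`, `divide`, `power`, `ci`, `cn`) are standard Content MathML.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Content MathML operator elements for the binary operators in the assumed tree.
CMML_OPS = {"+": "plus", "-": "minus", "*": "times", "/": "divide", "^": "power"}


def to_cmml(node):
    """Convert a leaf string or an (op, left, right) tuple to a CMML element."""
    if isinstance(node, str):
        # Numbers become <cn>, identifiers become <ci>.
        is_number = node.replace(".", "", 1).isdigit()
        leaf = ET.Element("cn" if is_number else "ci")
        leaf.text = node
        return leaf
    op, left, right = node
    apply = ET.Element("apply")
    ET.SubElement(apply, CMML_OPS[op])  # operator comes first inside <apply>
    apply.append(to_cmml(left))
    apply.append(to_cmml(right))
    return apply


def cmml_document(tree) -> str:
    """Wrap the converted tree in a <math> root and serialize it."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    math.append(to_cmml(tree))
    return ET.tostring(math, encoding="unicode")


if __name__ == "__main__":
    # x^2 + 1
    print(cmml_document(("+", ("^", "x", "2"), "1")))
```

The prefix form (operator first, then operands inside `apply`) is what lets a search engine compare expressions by meaning rather than by layout.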

In addition, Presentation Math Markup Language (PMML) is encoded from the XML tree alongside CMML. PMML describes how the expression is displayed, which provides a better presentation for searching on the MIR. This completes the work of the Converter Block, and the result is passed to the Output Block.
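For comparison with CMML, the following sketch produces Presentation MathML from a small expression tree. As before, the tuple-based tree and the function name `to_pmml` are hypothetical; the layout elements used (`mrow`, `mi`, `mn`, `mo`, `mfrac`, `msup`) are standard Presentation MathML.

```python
import xml.etree.ElementTree as ET

# Display symbols for operators that are rendered inline in an <mrow>.
INLINE_SYMBOLS = {"+": "+", "-": "\u2212", "*": "\u00d7"}


def to_pmml(node):
    """Convert a leaf string or an (op, left, right) tuple to a PMML element."""
    if isinstance(node, str):
        # Numbers become <mn>, identifiers become <mi>.
        is_number = node.replace(".", "", 1).isdigit()
        leaf = ET.Element("mn" if is_number else "mi")
        leaf.text = node
        return leaf
    op, left, right = node
    if op == "/":
        container = ET.Element("mfrac")  # numerator over denominator
    elif op == "^":
        container = ET.Element("msup")   # base with superscript
    else:
        # Inline operators: operand, <mo> symbol, operand inside one row.
        row = ET.Element("mrow")
        row.append(to_pmml(left))
        symbol = ET.SubElement(row, "mo")
        symbol.text = INLINE_SYMBOLS[op]
        row.append(to_pmml(right))
        return row
    container.append(to_pmml(left))
    container.append(to_pmml(right))
    return container


if __name__ == "__main__":
    # (a + 1) / b
    print(ET.tostring(to_pmml(("/", ("+", "a", "1"), "b")), encoding="unicode"))
```

Here the fraction is expressed as a two-dimensional layout (`mfrac`) instead of a `divide` operation, which is the difference between the presentation and content encodings.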

The Output Block receives the CMML and uses it to search for the mathematical expression via a mathematical search engine, or more formally a Mathematical Information Retrieval (MIR) system. The MIR used in this case is called Math Search. It analyzes the syntactic (e.g. textual) and semantic (e.g. structural) information of the expression received as CMML, retrieves the instances of the input equation, and lists all results in descending order of weight, where the weight depends on the number of times the equation was searched previously.
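The ordering rule described above can be sketched in a few lines of Python. This only illustrates "sort by weight, descending"; how Math Search actually computes its weights is not specified here, and the `Hit` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hit:
    document: str        # hypothetical identifier of a matching document
    times_searched: int  # assumed weight: how often this result was searched before


def rank_hits(hits: List[Hit]) -> List[Hit]:
    """Return hits ordered by weight, highest first.

    sorted() is stable, so hits with equal weight keep their original order.
    """
    return sorted(hits, key=lambda hit: hit.times_searched, reverse=True)


if __name__ == "__main__":
    results = [Hit("doc-a", 3), Hit("doc-b", 12), Hit("doc-c", 7)]
    for hit in rank_hits(results):
        print(hit.document, hit.times_searched)
```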

Benefits of the Project

The three most significant benefits of the proposed project are as follows:

Increased Revenue

Previously, each step was done in a different application, and the output of one application served as the input to the next. For instance, a handwritten image is given to one application, which converts it into LaTeX. That LaTeX is given to an MIR. If the MIR is intelligent, it converts the LaTeX to MathML and searches; if not, we convert the LaTeX to MathML in another app and then use it to search on the MIR. Bringing the whole process onto a single platform attracts the audiences of three different platforms, which increases revenue.

Avoid Costs

Currently, users have to pay for snips and conversions because free usage is limited. This application has no snip or conversion limit and merges all the steps into a single platform, so the cost of using multiple platforms for a single task is avoided and users do not need to resubscribe.

Improved Service

Since all the steps are done on a single platform, results are uniform, which reduces errors. Ranking and searching of math queries improve, and performing all calculations in one place is fast and robust.

What’s new?

  • Input by writing the equation on an E-writing Pad.
  • Upload an image of the handwritten equation.
  • Take a screenshot of an equation from some document or any other source.
  • Type an equation in LaTeX.
  • Visualize Abstract Syntax Tree.

Technical Details of Final Deliverable

The result will be a fully working website that lets the user obtain the handwritten equation in text, LaTeX, XML, CMML, and PMML formats, along with a search utility that searches a repository of PDFs on the internet or directly in a browser.

No technical knowledge is required; the user only needs to provide a legible and correct written equation.

The following assumptions are taken in the current scope of the project:

  • Assumption 01 - The handwritten equation will be written on an e-writing pad, a touch screen, or paper. The input will be a screenshot, picture, or image of it containing one or more handwritten equations, or an existing representation of an equation in one of the specified math formats (LaTeX, CMML, PMML), to be converted into another form.

  • Assumption 02 - The image will be clear and will not contain any unwanted (extra) annotations.

  • Assumption 03 - The image will be in Joint Photographic Experts Group (JPEG) or Portable Network Graphics (PNG) format.

  • Assumption 04 - The image should contain equations made up of digits and mathematical and algebraic notation. It should not contain any text, trigonometric shapes, or graphs.
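Assumption 03 can be checked before an image reaches the recognition model. The sketch below is one possible way to do that, assuming the upload is available as raw bytes: it inspects the standard file signatures of PNG and JPEG rather than trusting the file extension. The function names are hypothetical.

```python
from typing import Optional

# Standard file signatures ("magic bytes") at the start of each format.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def detect_image_format(data: bytes) -> Optional[str]:
    """Return 'png' or 'jpeg' based on the leading bytes, or None otherwise."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None


def is_accepted_image(data: bytes) -> bool:
    """True only for the two formats allowed by Assumption 03."""
    return detect_image_format(data) is not None


if __name__ == "__main__":
    print(is_accepted_image(PNG_SIGNATURE + b"rest of file"))  # PNG header
    print(is_accepted_image(b"GIF89a..."))                      # not accepted
```

A signature check only confirms the container format; clarity and the absence of extra annotations (Assumptions 02 and 04) would need separate checks.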

Final Deliverable of the Project

Software System

Core Industry

Education

Other Industries

Core Technology

Artificial Intelligence (AI)

Other Technologies

Others

Sustainable Development Goals

Quality Education

Required Resources

Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs)
GPU | Equipment | 1 | 25000 | 25000
5TB Hard Drive | Equipment | 1 | 18000 | 18000
32GB RAM | Equipment | 1 | 12000 | 12000
Stylus | Equipment | 1 | 8000 | 8000
Data Collection | Equipment | 5 | 1000 | 5000
Total (in Rs) | | | | 68000
If you need this project, please contact me on contact@adikhanofficial.com