Handwritten to text transformation of math equation for better search experience
Searching for mathematical equations in their exact form in a browser is tedious for frequent users. However intelligent the search conventions are, an equation still has to be entered as it is written to get good results. One alternative is a math editor, where you pick math symbols or write their LaTeX, but that means hunting for each symbol and operator, clicking them one by one to complete the equation, and then copying and pasting it into a search box, which is a drag for a simple equation search.
To remove this effort, we propose an interactive web-based application named “Handwritten to Text Transformation of mathematical equations for better search experience”. The user inputs a mathematical equation (by uploading an existing image, writing with a stylus, taking a screenshot of a handwritten equation, or typing it on a math-centered on-screen keyboard), gets it converted into the desired format (text, LaTeX, XML, or MathML, i.e. CMML or PMML), and can also have it searched on a Math Information Retrieval system.
In addition, the available Math Information Retrieval systems require a specific MathML format to search math equations, whereas most math documentation is written in TeX/LaTeX. The project addresses all of these issues in a single integrated environment where you can write your equations, get the converted format, and get the search results for that equation, all with a few clicks.
Our main objective is to replace the traditional user approach, “write math on paper”, with an automated one: “write it by hand and leave the rest to the computer”. We do this with the following objectives:
The implementation details can be depicted via the following diagram:
Our system consists of three blocks: the Input Block, the Converter Block, and the Output Block. In the current scope, input can be given in four ways: writing the equation in LaTeX, uploading an image, writing on an E-writing pad, or taking a screenshot of the equation. Once an image is received, it is checked against the defined image specifications. Valid images are passed to an already trained Convolutional Neural Network model, which generates the equivalent LaTeX of the equation. All of this is done in the Input Block. The generated LaTeX is then passed to the Converter Block.
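The routing inside the Input Block can be sketched as follows. This is a minimal illustration, not the project's implementation: the names `EquationInput`, `IMAGE_KINDS`, and `input_block` are hypothetical, and the trained CNN is represented only by a `recognize` callable supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, Union

# Hypothetical labels for the three image-based input routes.
IMAGE_KINDS = {"upload", "writing_pad", "screenshot"}


@dataclass
class EquationInput:
    kind: str                    # "latex" or one of IMAGE_KINDS
    payload: Union[bytes, str]   # LaTeX source text, or raw image bytes


def input_block(item: EquationInput, recognize: Callable[[bytes], str]) -> str:
    """Return LaTeX for any supported input kind.

    `recognize` stands in for the trained CNN model (an assumption of this
    sketch): it takes image bytes and returns a LaTeX string.
    """
    if item.kind == "latex":
        if not isinstance(item.payload, str):
            raise TypeError("LaTeX input must be text")
        # LaTeX input needs no recognition step.
        return item.payload.strip()
    if item.kind in IMAGE_KINDS:
        if not isinstance(item.payload, bytes):
            raise TypeError("image input must be bytes")
        return recognize(item.payload)
    raise ValueError(f"unsupported input kind: {item.kind!r}")
```

Passing the recognizer in as a parameter keeps the sketch runnable without a model and makes it easy to swap in the real network later.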
In the Converter Block, the LaTeX is converted to math grammar rules. These rules are used, via LaTeX macros, to generate an Abstract Syntax Tree (AST) that captures the structural information of mathematical expressions given in LaTeX format. The AST is then used to convert the mathematical expressions into Extensible Markup Language (XML).
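To make the LaTeX-to-AST step concrete, here is a small sketch for a deliberately tiny LaTeX subset (numbers, single-letter variables, `+`, `-`, `^`, `=`, `\frac`, braces, and parentheses). It uses a hand-written recursive-descent parser rather than the LaTeX-macro approach described above, and the tuple-based AST shape is an assumption made for illustration.

```python
import re

# One token: a LaTeX command, a number, a single letter, or a symbol.
TOKEN = re.compile(r"\s*(\\[A-Za-z]+|\d+(?:\.\d+)?|[A-Za-z]|[-+^=(){}])")


def tokenize(latex):
    latex = latex.rstrip()
    tokens, pos = [], 0
    while pos < len(latex):
        match = TOKEN.match(latex, pos)
        if match is None:
            raise ValueError(f"unexpected input at or after position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.i = 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None:
            raise ValueError("unexpected end of input")
        if expected is not None and tok != expected:
            raise ValueError(f"expected {expected!r}, got {tok!r}")
        self.i += 1
        return tok

    def equation(self):
        # equation := expr ('=' expr)?
        node = self.expr()
        if self.peek() == "=":
            self.take("=")
            node = ("eq", node, self.expr())
        if self.peek() is not None:
            raise ValueError(f"unexpected token {self.peek()!r}")
        return node

    def expr(self):
        # expr := power (('+' | '-') power)*   (left-associative)
        node = self.power()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = ("op", op, node, self.power())
        return node

    def power(self):
        # power := atom ('^' atom)?
        base = self.atom()
        if self.peek() == "^":
            self.take("^")
            return ("pow", base, self.atom())
        return base

    def atom(self):
        if self.peek() == "{":
            return self.group()
        tok = self.take()
        if tok == r"\frac":
            return ("frac", self.group(), self.group())
        if tok == "(":
            node = self.expr()
            self.take(")")
            return node
        if tok[0].isdigit():
            return ("num", tok)
        if tok.isalpha():
            return ("var", tok)
        raise ValueError(f"unsupported token {tok!r}")

    def group(self):
        self.take("{")
        node = self.expr()
        self.take("}")
        return node


def parse(latex):
    return Parser(tokenize(latex)).equation()


print(parse(r"\frac{a+b}{2}"))
# → ('frac', ('op', '+', ('var', 'a'), ('var', 'b')), ('num', '2'))
```

A full system would need a much larger grammar (implicit multiplication, roots, sums, Greek letters, and so on); the point here is only how nesting in the LaTeX source becomes nesting in the tree.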
XML is the basis for making mathematical expressions machine-readable. The XML tree is encoded to Content Mathematical Markup Language (CMML) by an algorithm. CMML adds semantic enrichment to web documents, i.e. text analysis via an algorithm that searches for specific entities in the XML schema.
In addition, Presentation Mathematical Markup Language (PMML) is encoded from the XML tree alongside CMML; it is used to provide a better presentation so that the expression can be searched on the MIR system. This completes the work of the Converter Block, and the result is passed to the Output Block.
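The difference between the two encodings can be illustrated with a short sketch that turns one expression tree into both Presentation and Content MathML using Python's standard `xml.etree.ElementTree`. The tuple-based tree shape (`("num", …)`, `("var", …)`, `("op", …)`, `("pow", …)`, `("frac", …)`) and the function names are assumptions of this example, not the project's actual data model.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Content MathML element names for the binary operators in this sketch.
CONTENT_OPS = {"+": "plus", "-": "minus"}


def leaf(tag, text):
    el = ET.Element(tag)
    el.text = text
    return el


def to_presentation(node):
    """Presentation MathML: describes how the expression is laid out."""
    kind = node[0]
    if kind == "num":
        return leaf("mn", node[1])
    if kind == "var":
        return leaf("mi", node[1])
    if kind == "op":
        row = ET.Element("mrow")
        row.append(to_presentation(node[2]))
        row.append(leaf("mo", node[1]))
        row.append(to_presentation(node[3]))
        return row
    if kind in ("pow", "frac"):
        el = ET.Element("msup" if kind == "pow" else "mfrac")
        el.append(to_presentation(node[1]))
        el.append(to_presentation(node[2]))
        return el
    raise ValueError(f"unknown node kind: {kind!r}")


def to_content(node):
    """Content MathML: describes what the expression means."""
    kind = node[0]
    if kind == "num":
        return leaf("cn", node[1])
    if kind == "var":
        return leaf("ci", node[1])
    if kind == "op":
        operator, args = CONTENT_OPS[node[1]], node[2:]
    elif kind == "pow":
        operator, args = "power", node[1:]
    elif kind == "frac":
        operator, args = "divide", node[1:]
    else:
        raise ValueError(f"unknown node kind: {kind!r}")
    apply_el = ET.Element("apply")
    ET.SubElement(apply_el, operator)
    for arg in args:
        apply_el.append(to_content(arg))
    return apply_el


def wrap_math(body):
    """Wrap an encoded expression in a <math> root and serialize it."""
    math = ET.Element("math", xmlns=MATHML_NS)
    math.append(body)
    return ET.tostring(math, encoding="unicode")


tree = ("frac", ("var", "a"), ("num", "2"))
print(wrap_math(to_presentation(tree)))
print(wrap_math(to_content(tree)))
```

For the fraction a/2, the presentation form uses `mfrac` with `mi` and `mn` children, while the content form uses `apply` with a `divide` operator followed by `ci` and `cn` operands.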
The Output Block receives the CMML and searches for the mathematical expression via a mathematical search engine, or more formally a Mathematical Information Retrieval (MIR) system. The MIR system used in this case is called Math Search. It analyzes the syntactic (e.g. textual) and semantic (e.g. structural) information of the mathematical expression it received as CMML, retrieves the instances of the equation originally given as input, and lists all results in descending order of weight, where the weight depends on the number of times an instance was previously searched.
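The ordering of results described above can be sketched in a few lines. The `Hit` record and its `search_count` field are hypothetical stand-ins for whatever the MIR system returns; only the descending-by-weight ordering comes from the description, and the tie-break on document name is an added assumption to make the output deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    document: str      # where the equation instance was found
    search_count: int  # assumed weight: how often it was searched before


def rank_hits(hits):
    """Order hits by weight, highest first; ties fall back to document name."""
    return sorted(hits, key=lambda h: (-h.search_count, h.document))


hits = [Hit("a.pdf", 3), Hit("b.pdf", 10), Hit("c.pdf", 3)]
print([h.document for h in rank_hits(hits)])
# → ['b.pdf', 'a.pdf', 'c.pdf']
```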
The three most significant benefits of the proposed project are as follows:
Previously, each step was done in a different application, and the output of each application served as input for the next. For instance, a handwritten image is input into one application, which converts it into LaTeX. That LaTeX is given to an MIR system. If the system is intelligent, it converts the LaTeX into MathML and searches; if not, we convert the LaTeX to MathML in another application and then use the result to search on the MIR system. Now that the whole process is brought to a single platform, the audiences of three different platforms are attracted and revenue is increased.
Currently, users have to pay for snips and conversions because there is a limit. This application comes without any snip or conversion limit and merges all the steps into a single platform, so the cost of using multiple platforms for a single task is avoided and the need to resubscribe is removed.
Since all the steps are done on a single platform, the results are uniform, which ultimately reduces errors; decision-making is better in terms of ranking and searching math queries; and performing the calculations in a single place is fast and robust.
Input by writing the equation on an E-writing Pad.
The result will be a fully working website that allows the user to get the handwritten equation in text, LaTeX, XML, CMML, and PMML formats, along with a search utility over a repository of PDFs on the internet or a direct search in a browser.
No technical knowledge is required; the user only needs to provide a readable and correct written equation.
The following assumptions are taken in the current scope of the project:
Assumption 01 - The handwritten equation will be written on an E-writing pad, a touch screen, or paper. A screenshot, picture, or image of it containing one or more handwritten equations, or an existing equation in one of the specified math formats (LaTeX, CMML, PMML), will be used as input to be converted into another form.
Assumption 02 - The image will be clear and will not have any unwanted (extra) annotations.
Assumption 03 - The image will be in Joint Photographic Experts Group (JPEG) or Portable Network Graphics (PNG) format.
Assumption 04 - The image should contain equations made up of digits and mathematical and algebraic notation. It should not contain any text, trigonometric shapes, or graphs.
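Assumption 03 is the easiest of these to check automatically. A minimal sketch, assuming the check is done on the leading bytes of the uploaded file rather than on its extension (the function name `detect_image_format` is hypothetical):

```python
# Standard file signatures: every PNG starts with these 8 bytes,
# and every JPEG starts with the SOI marker FF D8 followed by FF.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def detect_image_format(data: bytes):
    """Return "png", "jpeg", or None if the bytes match neither format."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None


def is_supported_image(data: bytes) -> bool:
    return detect_image_format(data) is not None
```

Checking the signature catches files that were merely renamed to `.png` or `.jpg`; it does not verify that the rest of the file is a valid image, and the other assumptions (clarity, no extra annotations, no text or graphs) would need separate checks.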
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| GPU | Equipment | 1 | 25000 | 25000 |
| 5TB Hard Drive | Equipment | 1 | 18000 | 18000 |
| 32GB RAM | Equipment | 1 | 12000 | 12000 |
| Stylus | Equipment | 1 | 8000 | 8000 |
| Data Collection | Equipment | 5 | 1000 | 5000 |
| Total | | | | 68000 |