Book Prism
Building machines that perform human functions, such as reading, is an old dream. Over the last five decades, machine learning has turned that dream into reality, and there are now several techniques and algorithms for training a machine to perform tasks the way humans do. Listening is the first language skill that we acquire, and listening to audiobooks can make learning easier and more entertaining.
Our application provides three modules: image processing, voice processing, and synchronization of text and speech. The image processing module converts an image into text, and the voice processing module converts that text into sound. The last module synchronizes text and speech and highlights the text. Optical character recognition (OCR) and speech synthesis are the two main components used in these modules. OCR is the process of converting scanned images of machine-printed or handwritten text into machine-readable text. Speech synthesis is the artificial production of human speech.
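The three modules can be read as a small pipeline. The sketch below is a minimal illustration, not the project's implementation: the OCR and speech stages are passed in as plain callables, and `run_pipeline`, `PipelineResult`, `fake_ocr`, and `fake_tts` are hypothetical names used only for this example. In a real build, the two callables would wrap an actual OCR engine and speech synthesizer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PipelineResult:
    text: str         # text recovered by the image processing stage
    audio: bytes      # audio produced by the voice processing stage
    words: List[str]  # words that the syncing stage would highlight


def run_pipeline(
    image: bytes,
    ocr: Callable[[bytes], str],
    synthesize: Callable[[str], bytes],
) -> PipelineResult:
    """Chain the stages: image -> text -> speech, then split the text into words."""
    text = ocr(image).strip()
    audio = synthesize(text)
    words = text.split()
    return PipelineResult(text=text, audio=audio, words=words)


# Stand-in stages so the sketch runs without any OCR or TTS library installed.
def fake_ocr(image: bytes) -> str:
    return image.decode("utf-8")  # pretend the image bytes already hold the text


def fake_tts(text: str) -> bytes:
    return b"\x00" * len(text)  # pretend one silent byte of audio per character


result = run_pipeline(b" Once upon a time ", fake_ocr, fake_tts)
```

Passing the stages in as arguments keeps each module replaceable and lets the flow be tested before any OCR or speech library is chosen.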
The main goals and objectives of our project are:
1. To apply image processing to convert the given image into text.
2. To apply text processing to analyze, normalize, and transcribe the text into a phonetic or other linguistic representation.
3. To generate speech from the text produced by OCR.
4. To synchronize text and speech in order to highlight the text.
5. To highlight the text so that it is clearly visible to the user.
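Goals 4 and 5 depend on knowing when each word is spoken. One simple way to approximate this, assuming the speech engine reports only the total audio duration, is to give each word a share of that duration proportional to its length in characters. This is a rough heuristic chosen for illustration; the names `word_timings` and `word_at` are hypothetical, and a speech engine that emits per-word timestamps would likely be more accurate.

```python
from bisect import bisect_right
from typing import List, Tuple

Timing = Tuple[str, float, float]  # (word, start_seconds, end_seconds)


def word_timings(text: str, duration: float) -> List[Timing]:
    """Split `duration` seconds across the words of `text`,
    in proportion to their character counts (an assumption)."""
    words = text.split()
    total_chars = sum(len(w) for w in words)
    timings: List[Timing] = []
    elapsed_chars = 0
    for w in words:
        start = duration * elapsed_chars / total_chars
        elapsed_chars += len(w)
        end = duration * elapsed_chars / total_chars
        timings.append((w, start, end))
    return timings


def word_at(timings: List[Timing], t: float) -> int:
    """Index of the word to highlight at playback time `t` (in seconds)."""
    if not timings:
        raise ValueError("no words to highlight")
    starts = [start for _, start, _ in timings]
    index = bisect_right(starts, t) - 1
    # Clamp so times before the start or after the end still map to a word.
    return max(0, min(index, len(timings) - 1))


timings = word_timings("The cat sat", 3.0)
current = word_at(timings, 1.5)  # index of the word being spoken at 1.5 s
```

In the web front end, the index returned by `word_at` would select which word to highlight as the audio plays.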
This project will create a web application for listening to audiobooks using speech synthesis and text highlighting. The input will be given either as text or as an image. The application works in four stages, adding a cleanup step to the three modules described earlier. The first is an image processing module that converts the image into text. The second is a voice processing module that generates speech from text. The third removes garbage text from the storybook, such as page numbers and the storybook title at the top of each page. The fourth synchronizes text and speech: once the speech is generated, the application highlights each word as it is spoken. These modules could be implemented using machine learning (ML).
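The garbage-text step can be illustrated with a small filter over the lines of each page after OCR. This is only a sketch under stated assumptions: page numbers appear on lines of their own (optionally prefixed with "Page"), and the book title is known in advance and repeated within the first two lines of a page. The function name `clean_page` and the sample text are hypothetical.

```python
import re

# A line that is only a page number, e.g. "12", "- 12 -", or "Page 12".
PAGE_NUMBER = re.compile(r"^\W*(page\s+)?\d+\W*$", re.IGNORECASE)


def clean_page(page_text: str, title: str) -> str:
    """Drop page-number lines and a title line found at the top of the page."""
    lines = [ln.strip() for ln in page_text.splitlines()]
    lines = [ln for ln in lines if ln]  # discard blank lines
    kept = []
    for i, ln in enumerate(lines):
        if PAGE_NUMBER.match(ln):
            continue
        # Treat the title as a header only when it is among the first two lines.
        if i < 2 and ln.lower() == title.lower():
            continue
        kept.append(ln)
    return "\n".join(kept)


page = "The Brave Little Fox\n12\nThe fox ran into the forest.\nIt was 3 o'clock."
cleaned = clean_page(page, "The Brave Little Fox")
```

Numbers inside a sentence, such as the "3" in the last line, are kept because the pattern only matches lines that contain nothing but a page number.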
The main goal of our project is to build a scalable web application that lets users listen to the audio of a book. We aim to keep things as simple as possible. Because the users of our app will be children, we will use simple graphical user interfaces. As a web application, it must serve a large amount of data quickly and handle requests from several users at once, so our objective is to keep image upload time and video loading time short enough that users do not find the application slow. We will use Python and the Django framework to build the web application.
Product functions are divided into two categories, depending on the type of user:
Administration:
User:
Performance
The system will be efficient and optimized for performance, with response times kept as low as possible. The number of users that can access the system at one time depends on the limit imposed by the server.
Reliability
The user will get an accurate result for the uploaded book. The system generates the video in the correct format.
Final Deliverable of the Project and Beneficiaries
Administration, students, and book enthusiasts at a university or college are the primary beneficiaries of this system.
| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Domain Registration (1 Year) | Equipment | 1 | 8000 | 8000 |
| Python And Django Framework For Beginners Complete Course | Miscellaneous | 1 | 3000 | 3000 |
| The Complete ReactJs Course - Basics to Advanced (2021) | Miscellaneous | 1 | 3000 | 3000 |
| Front End Web Development Ultimate Course 2021 | Miscellaneous | 1 | 4000 | 4000 |
| SSL Encryption (1 Year) | Equipment | 1 | 7000 | 7000 |
| Hosting For Website | Equipment | 1 | 8000 | 8000 |
| Total (in Rs) | | | | 33000 |