ABSTRACT: The advancement of Information and Communication Technology has affected all parts of human life. It has changed the way we work, travel, think, and communicate. People with speech impairments usually rely on gestures to express their feelings to others, which is a very difficult task. Technology has developed rapidly, and information is now routinely represented in digital form, whether as images or as audio. To improve their daily lives, an application is needed that gives them the opportunity to express their feelings and ideas and to become familiar with new technologies. A text-to-speech converter helps people communicate better with the rest of the world and among themselves by using an electronic device as a medium. This application is mainly useful for people who are visually impaired and people who have speech impairments. It brings them closer to the people around them and helps them learn new technologies by expressing their feelings effectively.

KEYWORDS: Optical character recognition (OCR), Text-to-Speech synthesis (TTS).

1. INTRODUCTION: People with speech impairments are often deprived of normal communication with other people in society. They find it difficult at times to interact with others through gestures, as only a few gestures are recognized by most people.
Text-to-speech conversion addresses this problem for visually impaired and speech-impaired people. The main objective of this application is to convert the given text into a corresponding spoken waveform. Character recognition, text processing, and speech generation are the main components of this system.
A text-image or text-to-speech converter converts a printed or handwritten text image, or plain text, into speech. It is the artificial production of human speech from text that is input to an electronic machine. The process of converting text to speech is called speech synthesis [2].
The converter application takes as input a printed or handwritten text image, or text [1].
· For image input, it follows certain steps for speech analysis: database creation, character recognition, and text-to-speech conversion.
· For text input, it simply performs text-to-speech conversion.

2. LITERATURE SURVEY: This application takes a text image through the camera available on the mobile phone, or text through a text box. The image is passed to the optical character recognition phase to be converted into text, and the output of the OCR is the input to the text-to-speech engine. The speech engine provides the output in the form of audio, and at the same time the audio is recorded automatically.

2.1. Input: This application takes an image, or text entered through the text box, as input. There is also a facility to retrieve previous speeches that are stored in internal memory.

2.2. Optical character recognition (OCR): This phase takes a printed or handwritten text image as input and, through steps such as scanning, pre-processing, segmentation, and feature extraction and selection, produces a text output. The output of optical character recognition is the input to the speech engine.

2.3. Text-to-speech synthesis: This phase is generally implemented by the text-to-speech engine, predefined software provided by Android, which consists of phases such as text analysis, linguistic analysis, and finally speech generation [9, 10].

2.4. Internal storage: This stores the previously generated speeches in memory. This is a great advantage because, if an input has already been converted to speech once, the user does not need to repeat the whole process.
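The reuse of stored speech described above can be sketched as a small cache keyed by a hash of the input text. This is only a minimal illustration, not the application's actual storage code: `SpeechCache`, `fake_synthesize`, and the file layout are hypothetical names, and the synthesizer is a stand-in that returns placeholder bytes rather than real audio.

```python
import hashlib
import tempfile
from pathlib import Path


class SpeechCache:
    """Stores synthesized audio on disk, keyed by a hash of the input text."""

    def __init__(self, directory, synthesize):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.synthesize = synthesize  # callable: text -> audio bytes
        self.synth_calls = 0          # how often the engine was really invoked

    def _path_for(self, text):
        # Normalize case and whitespace so trivially different inputs share a file.
        key = " ".join(text.lower().split())
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.audio"

    def get_audio(self, text):
        path = self._path_for(text)
        if path.exists():
            return path.read_bytes()   # reuse previously generated speech
        audio = self.synthesize(text)  # first request: run the engine
        self.synth_calls += 1
        path.write_bytes(audio)
        return audio


def fake_synthesize(text):
    # Hypothetical stand-in for a real text-to-speech engine.
    return ("AUDIO:" + text).encode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    cache = SpeechCache(tmp, fake_synthesize)
    first = cache.get_audio("Hello world")
    second = cache.get_audio("hello   WORLD")  # same text after normalization
    calls_after_two_requests = cache.synth_calls
```

In this sketch the second request is served from the stored file, so the engine runs only once. Whether inputs should be normalized before hashing is a design choice; the source does not say how stored speeches are matched to new inputs.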
By simply retrieving the speech file from memory, it can be reused [8].

3. PROPOSED SYSTEM:
· Optical Character Recognition System (paper to text)
· Speech Synthesis (text to speech)

3.4. OPTICAL CHARACTER RECOGNITION (OCR): It is the mechanical or electronic translation of images of typewritten or printed text into text [3].

BLOCK DIAGRAM: Optical character recognition

3.1. Input: It takes a printed or handwritten text image, or text, as input and sends it to the scanner.

3.2. Scanning: Text digitization is the process of converting the image into a proper digital image.

3.3. The scanned image typically has a resolution of 300 to 1000 dots per inch for better accuracy of text extraction, and it is preferably saved in TIF, JPG, or GIF format.

3.5. Pre-processing:
3.5.1. Pre-processing consists of a number of preliminary steps that make the raw data usable for the recognizer. First, the scanned image is converted to a grayscale image and then reduced to a black-and-white image by a binarization method.
3.5.2. Sometimes skew detection and correction must be applied to the digitized image to make the text lines horizontal.
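The grayscale conversion and binarization steps above can be illustrated with a short sketch on a tiny in-memory image. The source does not specify which binarization method is used, so the mean-value global threshold here is an assumption, as are the function names `to_grayscale` and `binarize`. The luminance weights are the common 0.299/0.587/0.114 coefficients. Skew correction is not covered.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels to rows of 0-255 gray levels."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def binarize(gray_image, threshold=None):
    """Map each gray level to 1 (ink) or 0 (background)."""
    if threshold is None:
        # Assumption: use the mean gray level as a simple global threshold.
        pixels = [p for row in gray_image for p in row]
        threshold = sum(pixels) / len(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in gray_image]


# A tiny 3x3 "scan": a dark vertical stroke on a light page.
WHITE, BLACK = (250, 250, 250), (10, 10, 10)
scan = [
    [WHITE, BLACK, WHITE],
    [WHITE, BLACK, WHITE],
    [WHITE, BLACK, WHITE],
]
gray = to_grayscale(scan)
binary = binarize(gray)
```

A fixed global threshold is the simplest choice; unevenly lit scans would likely need an adaptive method instead.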
The noise-free image is passed to the segmentation step [4].

3.6. Segmentation: Here the image is segmented into characters.

3.7. Feature extraction and Classification: Each character is divided into geometric elements such as lines, arcs, and circles, and the combination of these elements is compared with the stored combinations for known characters [5].

3.8. SPEECH SYNTHESIS: Speech synthesis is the automatic generation of speech waveforms from input text data [4].
3.8.1. Concatenating prerecorded speech that is stored in a database produces synthesized speech. The TTS synthesizer is composed of two phases, as shown in the figure: a front end and a back end.
3.8.2. The front end performs analysis, which converts the input text into phonemes.
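The front-end analysis and the concatenation of prerecorded speech described above can be sketched as follows. Everything in this example is a toy assumption made for illustration: the three-word `LEXICON`, the phoneme symbols, and the `UNIT_DATABASE` of short integer "samples" are hypothetical and stand in for a real pronunciation dictionary and recorded audio.

```python
import re

# Hypothetical pronunciation lexicon: word -> phoneme symbols.
LEXICON = {
    "text": ["T", "EH", "K", "S", "T"],
    "to": ["T", "UW"],
    "speech": ["S", "P", "IY", "CH"],
}

# Hypothetical database of prerecorded units: phoneme -> sample values.
UNIT_DATABASE = {
    "T": [1, 2], "EH": [3, 4], "K": [5], "S": [6, 7],
    "UW": [8], "P": [9], "IY": [10, 11], "CH": [12],
}


def front_end(text):
    """Text analysis: normalize the text and look up phonemes word by word."""
    words = re.findall(r"[a-z]+", text.lower())
    phonemes = []
    for word in words:
        if word not in LEXICON:
            raise KeyError(f"no pronunciation for {word!r}")
        phonemes.extend(LEXICON[word])
    return phonemes


def concatenate_units(phonemes):
    """Join the stored samples for each phoneme into one waveform."""
    waveform = []
    for phoneme in phonemes:
        waveform.extend(UNIT_DATABASE[phoneme])
    return waveform


phonemes = front_end("Text to Speech!")
waveform = concatenate_units(phonemes)
```

A practical system would also need rules for words missing from the lexicon and smoothing at the joins between units; this sketch raises an error for unknown words and joins units without smoothing.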
3.8.3. The back end converts the phonemes to waveforms that can be output as sound.

4. EXPERIMENTAL OUTPUTS:

5. CONCLUSION: In this way, we have completed the design part of the project in accordance with the requirement specification. The modules of the project have been designed and studied in order to fulfill the requirements of the project. This partial report has been completed with hard work and with the full support and guidance of our guide, and a project plan has been made to ensure proper planning of the project.

6. References
1) Jithendra Vepa and Simon King, Subjective Evaluation of Join Cost and Smoothing Methods for Unit Selection Speech Synthesis, IEEE Trans. Speech Audio Process., Vol. 14, (2006) pp. 1763–1771.
2) T. Dutoit, An Introduction to Text-to-Speech Synthesis, Kluwer Academic Publishers, Dordrecht, ISBN 0-7923-4498-7, (1997).
3) Marc Schröder, Expressing Degree of Activation in Synthetic Speech, IEEE Trans. Speech Audio Process., Vol. 14, (2006) pp. 1128–1136.
4) Archana Balyan, S. S. Agrwal and Amita Dev, Speech Synthesis: Review, IJERT, ISSN 2278-0181, Vol. 2, (2013) pp. 57–75.
5) Yee-Ling Lu, Man-Wai Mak and Wan-Chi Siu, "Application of a fast real time recurrent learning algorithm to text-to-phoneme conversion," Neural Networks, 1995. Proceedings., IEEE International Conference on, vol. 5, pp. 2853–2857, Nov/Dec 1995.
6) Alistair Conkie, Thomas Okken, Yeon-Jun Kim and Giuseppe Di Fabbrizio, Building Text-To-Speech Voices in the Cloud, in Proc. AT&T Labs Research, Park Avenue, Florham Park, NJ, USA.
7) Allen, John, Hunnicutt, Sharon, and Dennis Klatt, Text To Speech, The MITTALK System (Cambridge: Cambridge University Press, 1987).
8) Hunt A. J. and Black A. W., "Unit selection in a concatenative speech synthesis system for a large speech database," in Proceedings of IEEE Int. Conf. Acoust., Speech, and Signal Processing, 1996, pp. 373–376.
9) Anand Arokia et al., Text Processing for Text-to-Speech Systems in Indian Languages, Proc. in 36th ISCA Workshop on Speech Synthesis, (Bonn, Germany, August 2007) pp. 22–24.
10) A. Indumati and Dr. E. Chandra, Speech processing – An Overview, Int. J. of Engg. Sci. and Tech., Vol. 4, (2012) p.