Deep Learning and Applications

Deep learning is one of the most recent directions in machine learning. It is based on the theory of artificial neural networks, which have been in use for decades. However, in contrast to classic neural networks, deep neural networks contain many processing layers, which allow hierarchical processing of the input data.

While the concept of deep networks is not entirely new, training them efficiently required several technological advances. The first efficient deep training algorithm was published in 2006, so we consider this date to mark the first appearance of deep learning. Besides new algorithms, training deep networks also required fast hardware, a need that was met by graphics processing units (GPUs). Nowadays, deep learning algorithms are run almost exclusively on this specialized hardware. Lastly, achieving good training results also requires enormous amounts of data, which became available through the spread of the Internet and of electronic data processing in general. The fortunate coincidence of these three factors brought about the deep learning "revolution" in artificial intelligence. Deep learning achieved its biggest breakthroughs in image processing and speech recognition, but its range of applications keeps expanding, for example into language technology and medical applications.

Our team follows the newest developments in the field and contributes to theoretical research, but above all we focus on the possible applications of deep learning. Our main experience is in speech technology, but we are also working on medical applications, image recognition projects, and natural language technology problems.


Gábor Gosztolya (contact), László Vidács, György Kovács, Tamás Grósz, Márk Jelasity, András Kicsi, László Tóth


Classification of non-functional requirements
Requirements engineering is one of the very first tasks in the software development process, and it fundamentally influences the quality of the software under development. Requirements are mostly given in natural language and may be either functional or non-functional. Non-functional requirements underpin quality aspects of the software such as security, usability, and reliability. Classifying non-functional requirements is one of the most important tasks of software engineering. The objective of the project is to develop machine-learning (and deep-learning) based methods and tools that support system analysts in classifying non-functional requirements given in natural language. The resulting collection of classified non-functional requirements can be used in both the analysis and design phases.

Information retrieval from Hungarian radiology reports
In this project, information is retrieved from MR reports using manual annotations of anonymized reports. Machine learning is applied to the annotated reports to label body parts, changes, and their properties in free text. In the near future we plan to apply deep learning and ontology-based methods to make the analysis more reliable. Building on this phase, the project will seek answers to questions related to specific illnesses.


Silent speech interfaces
Reconstruction of the speech signal from ultrasound videos of tongue movement.

Medical applications
The group develops machine learning techniques for various medical application areas, such as the detection of mild cognitive impairment from spontaneous speech.

Multi-label classification for tagging user feedback given in natural language
When users or customers express their expectations about the software, they use natural language. Sentences of feedback or requirements often touch on more than one aspect of these expectations and can therefore be assigned to more than one class. The objective of the project is to develop machine-learning (and deep-learning) based methods for multi-label classification and to build a tagger tool based on these methods. The method is also to be extended to support multi-label tagging of individual sentences.
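
To make the idea of multi-label tagging concrete, the following is a minimal sketch and not the method developed in the project. It assumes a "binary relevance" setup, in which one simple perceptron is trained per label and a sentence receives every label whose classifier fires. The example sentences, the label names, and the function names (features, train_perceptron, tag) are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: binary relevance with one perceptron per label.
# The training sentences and label names below are made up for this example.

TRAIN = [
    ("the app crashes when I log in", {"reliability"}),
    ("login should require two factor authentication", {"security"}),
    ("the menu is confusing and the app crashes often", {"usability", "reliability"}),
    ("buttons are too small to tap", {"usability"}),
    ("passwords are stored without encryption", {"security"}),
]
LABELS = sorted({label for _, labels in TRAIN for label in labels})


def features(sentence):
    """Binary bag-of-words features plus a constant bias feature."""
    return set(sentence.lower().split()) | {"<bias>"}


def train_perceptron(examples, label, max_epochs=200):
    """Train one binary perceptron that decides whether `label` applies."""
    weights = {}
    for _ in range(max_epochs):
        mistakes = 0
        for sentence, gold in examples:
            feats = features(sentence)
            target = 1 if label in gold else -1
            score = sum(weights.get(f, 0) for f in feats)
            if target * score <= 0:  # wrong side of the boundary, or on it
                for f in feats:
                    weights[f] = weights.get(f, 0) + target
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights


# One independent binary classifier per label (binary relevance).
MODELS = {label: train_perceptron(TRAIN, label) for label in LABELS}


def tag(sentence):
    """Return the set of all labels whose classifier scores above zero."""
    feats = features(sentence)
    return {
        label
        for label, weights in MODELS.items()
        if sum(weights.get(f, 0) for f in feats) > 0
    }


print(sorted(tag("the menu is confusing and the app crashes often")))
# prints ['reliability', 'usability']
```

Because each label is decided independently, a single sentence can receive zero, one, or several tags, which is the property that distinguishes multi-label from ordinary single-label classification. A practical tagger would replace the toy perceptrons with stronger models and would be evaluated on held-out data.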