Yactraq's Core Technology



Yactraq indexes all relevant data, whether text, audio, or video, and consolidates it into a single dashboard. Accurate B2B speech systems need to understand industry- and company-specific terms. Custom vocabularies benefit domains ranging from business intelligence to video search.


Yactraq delivers these capabilities via CoreTraq, our patent-pending, speech-based semantic platform. CoreTraq combines Large Vocabulary Continuous Speech Recognition (LVCSR) technology with Yactraq’s proprietary approach to automatically building custom vocabularies.


Yactraq works with standard taxonomies like Open Directory, IAB, and Internet Search Terms, as well as customer-specific taxonomies. Automated web crawling collects linguistic data, which is used to train Yactraq’s NLU (Natural Language Understanding) module. The NLU module then triggers a machine learning process that builds a language model for Yactraq’s speech engine.
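
As a rough illustration of the vocabulary-building step, the sketch below counts terms in already-collected text and keeps those that are frequent but missing from a general-purpose word list. It is not CoreTraq's actual method: the function names, the `min_count` threshold, and the sample documents are all hypothetical, and the crawling and NLU training stages are left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_custom_vocabulary(documents, base_vocabulary, min_count=2):
    """Return domain terms that occur at least min_count times
    and are not already in the base vocabulary (hypothetical helper)."""
    counts = Counter()
    for document in documents:
        counts.update(tokenize(document))
    return sorted(
        word for word, count in counts.items()
        if count >= min_count and word not in base_vocabulary
    )


# Hypothetical crawled text for a real-estate taxonomy node.
documents = [
    "Escrow fees and escrow accounts",
    "The escrow agent holds the deed",
]
base_vocabulary = {"the", "and", "fees", "accounts", "agent", "holds"}
custom_terms = build_custom_vocabulary(documents, base_vocabulary)
```

In this toy setup only "escrow" qualifies, since "deed" appears once and "the" is already in the base list. A production system would feed such terms, together with pronunciations and n-gram statistics, into the recognizer's language model.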


Once deployed, CoreTraq filters out music, silence, and noise, which substantially increases speech recognition throughput. The speech recognizer's text output is then processed by Yactraq’s NLU module to determine the primary topics of each minute of audio or video.
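
To show why dropping non-speech audio helps throughput, here is a minimal sketch of an energy-based filter that discards near-silent frames before recognition. This is a deliberately simple stand-in, not CoreTraq's filter: it handles only silence (not music or noise), and the frame size, threshold, and function names are assumptions made for the example.

```python
import math


def frame_energies(samples, frame_size):
    """Return the RMS energy of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(math.sqrt(sum(s * s for s in frame) / frame_size))
    return energies


def keep_active_frames(samples, frame_size=160, threshold=0.01):
    """Keep only frames whose RMS energy reaches the threshold.
    Both default values are illustrative assumptions."""
    kept = []
    for index, energy in enumerate(frame_energies(samples, frame_size)):
        if energy >= threshold:
            start = index * frame_size
            kept.extend(samples[start:start + frame_size])
    return kept


# Toy signal: silence, then a constant-amplitude burst, then silence.
audio = [0.0] * 160 + [0.5] * 160 + [0.0] * 160
speech = keep_active_frames(audio)
```

Here two of the three frames are discarded, so the recognizer would only need to process a third of the input. Separating speech from music or background noise generally requires a trained classifier rather than a fixed energy threshold.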



Yactraq Technology Roadmap Highlights



Versatile Speech Recognition:

Yactraq’s machine-learning-based information compression capability allows large topic sets and lexicons to be compressed, with minimal loss of information, into smaller-footprint embedded speech recognizers, which can be part of a distributed system that includes cloud infrastructure.



Deep Neural Networks:

Yactraq is studying the application of Deep Neural Network (DNN) technology. Acoustic modeling is an area where neural-network-based systems have shown great promise in the last few years, and it is a key part of Yactraq’s roadmap.



Class-Based Language Models:

Complex business intelligence applications have deep requirements, but sometimes only a limited amount of linguistic data is available. In such cases, class-based language models are a possible solution.
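
A class-based bigram model estimates P(word | previous word) as P(class | previous class) × P(word | class), so every word in a class shares the context statistics gathered for the whole class. The sketch below illustrates the idea with plain counting and no smoothing. The class map, training sentences, and function names are hypothetical and do not describe Yactraq's implementation.

```python
from collections import Counter


def train_class_bigram(sentences, word_to_class):
    """Count class bigrams and word-in-class frequencies.
    Words without a class act as their own single-member class."""
    class_bigrams, class_context = Counter(), Counter()
    word_in_class, class_total = Counter(), Counter()
    for sentence in sentences:
        classes = ["<s>"] + [word_to_class.get(w, w) for w in sentence]
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            class_context[prev] += 1
        for word in sentence:
            cls = word_to_class.get(word, word)
            word_in_class[(word, cls)] += 1
            class_total[cls] += 1
    return class_bigrams, class_context, word_in_class, class_total


def probability(word, prev_word, word_to_class, model):
    """P(word | prev_word) = P(class | prev class) * P(word | class)."""
    class_bigrams, class_context, word_in_class, class_total = model
    cls = word_to_class.get(word, word)
    prev_cls = "<s>" if prev_word is None else word_to_class.get(prev_word, prev_word)
    if class_context[prev_cls] == 0 or class_total[cls] == 0:
        return 0.0
    p_class = class_bigrams[(prev_cls, cls)] / class_context[prev_cls]
    p_word = word_in_class[(word, cls)] / class_total[cls]
    return p_class * p_word


# Hypothetical class map and a tiny training set.
word_to_class = {"acme": "COMPANY", "globex": "COMPANY"}
sentences = [["call", "acme"], ["call", "globex"], ["email", "globex"]]
model = train_class_bigram(sentences, word_to_class)
p_unseen_pair = probability("acme", "email", word_to_class, model)
```

The pair "email acme" never occurs in the training sentences, so a plain word bigram model would assign it zero probability. The class-based model still gives it a nonzero estimate, because "email" was seen before another COMPANY word. This is the property that makes such models useful when data is scarce.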



Dynamic Configuration API:

Yactraq is building the components for an API that lets customers send Yactraq a continuous configuration feed of target topics and entities. The expected outcome is an API that allows CoreTraq’s vocabulary to be dynamically reconfigured on a fully automated basis.
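
Since this API is still on the roadmap, its interface is not described here. Purely as an illustration of what a configuration feed could look like, the sketch below applies a stream of JSON messages to an in-memory topic table. The message fields (`action`, `topic`, `entities`) and the `VocabularyConfig` class are invented for this example and should not be read as the actual API.

```python
import json


class VocabularyConfig:
    """In-memory map of topic name to its set of target entities (hypothetical)."""

    def __init__(self):
        self.topics = {}

    def apply(self, message):
        """Apply one JSON configuration message to the topic table."""
        update = json.loads(message)
        action, topic = update["action"], update["topic"]
        if action == "add":
            # Create the topic if needed, then merge in any new entities.
            self.topics.setdefault(topic, set()).update(update.get("entities", []))
        elif action == "remove":
            self.topics.pop(topic, None)
        else:
            raise ValueError(f"unknown action: {action}")


# Hypothetical feed: two additions to one topic, then a removal of another.
feed = [
    '{"action": "add", "topic": "mortgages", "entities": ["escrow", "refinance"]}',
    '{"action": "add", "topic": "mortgages", "entities": ["amortization"]}',
    '{"action": "add", "topic": "retail", "entities": ["checkout"]}',
    '{"action": "remove", "topic": "retail"}',
]
config = VocabularyConfig()
for message in feed:
    config.apply(message)
```

After the feed is processed, only the "mortgages" topic remains, holding all three of its entities. In a real deployment each change would also need to trigger a rebuild or update of the recognizer's language model.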



Taxonomies and Linguistic Data:

Yactraq is also building other standard and vertical-specific vocabularies. Examples of standard data sets include Wikipedia, Freebase, Yellow Pages, and IAB. Verticals of interest include Finance, Retail, Government, and Healthcare.


If you have any questions about Yactraq's core technology,
please contact us for more information.