Welcome to this week’s No Jitter Roll, our regular roundup of product news in the communication and collaboration spaces. Leading off this week, we highlight: Amazon Connect’s launch of an analytics data lake for contact centers; Tonic.ai’s data lakehouse for securely developing generative AI applications; and CallTower’s program to make Operator Connect an easier sell for service providers.
Amazon Connect Launches Analytics Data Lake
Amazon Connect announced the general availability of its analytics data lake, which provides a single reservoir for contact center data, including contact records, agent performance, and Contact Lens insights. Organizations can create custom reports from Amazon Connect data alone or combine it with data queried from third-party sources using zero-ETL (integrations that eliminate or minimize the need to build ‘extract, transform and load’ data pipelines).
Using the analytics data lake along with a business intelligence tool (e.g., Amazon QuickSight), a contact center manager could create a report that visualizes which agents have the highest customer satisfaction scores for calls about lost orders, and then adjust routing profiles to staff those queues with the best-suited agents. Organizations can also feed data from the data lake into machine learning or AI models to inform new contact center optimizations. For example, a trend of short agent interactions could highlight an opportunity for self-service.
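As a rough illustration of the report described above, here is a minimal Python sketch that ranks agents by average customer satisfaction for one contact topic. The records, field names (agent, topic, csat) and the function name are hypothetical and do not reflect the actual Amazon Connect data lake schema; in practice this aggregation would more likely run as a query in Athena or a BI tool.

```python
from collections import defaultdict

# Hypothetical contact records; field names are illustrative only and are
# NOT the real Amazon Connect analytics data lake schema.
contacts = [
    {"agent": "alice", "topic": "lost_order", "csat": 5},
    {"agent": "alice", "topic": "lost_order", "csat": 4},
    {"agent": "bob", "topic": "lost_order", "csat": 3},
    {"agent": "bob", "topic": "billing", "csat": 5},
    {"agent": "carol", "topic": "lost_order", "csat": 4},
]


def rank_agents_by_csat(records, topic):
    """Return (agent, average CSAT) pairs for one topic, best first."""
    scores = defaultdict(list)
    for record in records:
        if record["topic"] == topic:
            scores[record["agent"]].append(record["csat"])
    averages = {agent: sum(vals) / len(vals) for agent, vals in scores.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = rank_agents_by_csat(contacts, "lost_order")
print(ranking)
```

A manager could use a ranking like this to decide which agents to assign to the lost-orders queue.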
The Amazon Connect analytics data lake supports query engines such as Amazon Athena and data visualization applications such as Amazon QuickSight or third-party business intelligence (BI) applications. It is available in all the AWS Regions where Amazon Connect is available.
Tonic.ai Launches Secure Unstructured Data Lakehouse for LLMs
Tonic.ai, which provides developer-focused software solutions, launched Tonic Textual, a secure data lakehouse for large language models (LLMs) that allows AI developers to use unstructured data (e.g., text, social media posts) for retrieval-augmented generation (RAG) systems and LLM fine-tuning. (See below for the differences between data lakes, lakehouses and data warehouses.)
With Tonic Textual, developers can:
- Build, schedule, and automate unstructured data pipelines that extract and transform data into a standardized format for embedding, ingesting into a vector database, or pre-training and fine-tuning LLMs. (Vectors are fixed-length lists of numbers used to represent unstructured data.)
- Automatically detect, classify, and redact sensitive information in unstructured data, and optionally re-seed redactions with synthetic data to maintain the semantic meaning of the data. Textual uses proprietary named entity recognition (NER) models to identify sensitive data and protect it.
- Enrich a vector database with document metadata and contextual entity tags to improve retrieval speed and context relevance in RAG systems.
In short, Tonic Textual helps AI developers build generative AI systems on proprietary data while keeping that data, and any sensitive data within it, secure.
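To make the detect-and-redact step more concrete, here is a minimal Python sketch. It is not Tonic Textual's API: Textual uses proprietary NER models, whereas this stand-in uses two simple regular expressions. The pattern set, the synthetic replacement values and the function name are all assumptions made for illustration.

```python
import re

# Illustrative patterns only. This regex-based stand-in does not attempt to
# reproduce the NER models that Tonic Textual uses.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

# Hypothetical synthetic stand-ins, one per entity type.
SYNTHETIC = {"EMAIL": "user@example.com", "PHONE": "555-010-0000"}


def redact(text, synthesize=False):
    """Replace detected entities with a tag or with a synthetic value."""
    for label, pattern in PATTERNS.items():
        replacement = SYNTHETIC[label] if synthesize else f"[{label}]"
        text = pattern.sub(replacement, text)
    return text


sample = "Contact Jane at jane.doe@corp.com or 415-555-1234."
print(redact(sample))                   # entities replaced by tags
print(redact(sample, synthesize=True))  # entities replaced by synthetic data
```

Replacing an entity with a realistic synthetic value, rather than a bare tag, keeps the sentence readable for downstream embedding or fine-tuning while removing the original sensitive value.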
Want to know more?
A data lake is a good repository for unstructured data; a data warehouse is best for structured data. (See here and here.) Per Google, a “data lakehouse merges [lakes and warehouses] to create a single structure that allows you to access and leverage data for many different purposes, from BI to data science to machine learning.” Additionally, a key benefit of a data lakehouse is low-cost storage for structured, unstructured, and semi-structured data types.
Compared with a data lake, Databricks says that lakehouses are good for AI because they provide “data versioning, governance, security and ACID properties that are needed even for unstructured data.” (The acronym ACID refers to the four properties that define a transaction: atomicity, consistency, isolation and durability.) For still more, see this thorough Datacamp explainer article.
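As a small illustration of atomicity, the first of the ACID properties, the Python sketch below uses the standard library's sqlite3 module rather than a lakehouse engine. The accounts table and the transfer function are hypothetical; the point is only that a failed transaction leaves no partial changes behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates are committed, or neither is."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Overdrawing "a" violates the CHECK constraint, so the credit to "b"
# made earlier in the same transaction is rolled back as well.
failed = transfer(conn, "a", "b", 150)
after_failure = dict(conn.execute("SELECT name, balance FROM accounts"))

succeeded = transfer(conn, "a", "b", 40)
after_success = dict(conn.execute("SELECT name, balance FROM accounts"))
print(after_failure, after_success)
```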
RAG is the process of optimizing the output of a large language model by having it reference an authoritative knowledge base outside of its training data sources before generating a response. Fine-tuning is the process of taking a pre-trained model and further training it on a domain-specific dataset. Per the IBM link above, NER (also called entity chunking or entity extraction) is a component of natural language processing (NLP) that identifies predefined categories of objects in a body of text. These categories can include names of people, organizations, medical codes, etc.
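The retrieval half of RAG can be sketched in a few lines of Python. This is a toy under stated assumptions: the knowledge base, the question and all function names are hypothetical; the "embedding" is a simple word-count vector over a fixed vocabulary, not a learned embedding model; and the final call to an LLM is omitted, so the sketch stops at building the prompt.

```python
import math

# Hypothetical knowledge base standing in for a firm's proprietary documents.
knowledge_base = [
    "Refunds for lost orders are issued within five business days.",
    "Agents can reset a customer password from the account page.",
]


def tokenize(text):
    """Lowercase, split on whitespace and strip basic punctuation."""
    return [word.strip(".,?!") for word in text.lower().split()]


# A fixed vocabulary gives every text a vector of the same length.
VOCAB = sorted({word for doc in knowledge_base for word in tokenize(doc)})
INDEX = {word: i for i, word in enumerate(VOCAB)}


def embed(text):
    """Toy embedding: count occurrences of each vocabulary word."""
    vec = [0.0] * len(VOCAB)
    for word in tokenize(text):
        if word in INDEX:
            vec[INDEX[word]] += 1.0
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(question, docs):
    """Return the document whose vector is most similar to the question's."""
    query = embed(question)
    return max(docs, key=lambda doc: cosine(query, embed(doc)))


def build_prompt(question, docs):
    """Place the retrieved context ahead of the question for the model."""
    context = retrieve(question, docs)
    return f"Context: {context}\nQuestion: {question}\nAnswer:"


question = "How long do refunds for lost orders take?"
print(build_prompt(question, knowledge_base))
```

In a production system the vectors would come from an embedding model and be stored in a vector database, but the flow is the same: embed the question, find the closest documents, and pass them to the model along with the question.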
Per this post by Paul Baier of GAI Insights, which itself references a talk at a recent MIT AI Summit, 90 percent of the information in a firm is unstructured data. File formats that can hold unstructured data include TXT, PDF, CSV, TIFF, JPG, PNG, JSON, DOCX and XLSX.
CallTower Unveils GTx for Microsoft Teams Operator Connect
CallTower, a provider of enterprise-class UCaaS and CCaaS communications solutions, launched GTx, a rebiller program for managed service providers (MSPs), system integrators, resellers, distributors, and value-added resellers (VARs). GTx allows these companies to provide Operator Connect for Microsoft Teams to their customers via an all-in-one transaction tool. CallTower then handles telecom taxes, regulatory compliance, and additional responsibilities.