Working with CSV files
Storytell transforms CSV data into Story Tiles™, enabling powerful insights and analysis through our advanced LLM processing pipeline.
How It Works
- Upload: Upload your CSV file to Storytell. For detailed instructions on uploading content, see our Uploading Content guide.
- Processing: Storytell processes the first 1,000 rows of the CSV file.
- Story Tile™ Generation: Each row is converted into a Story Tile™.
- AI Analysis: Storytell's LLM analyzes the Story Tiles™.
- Insight Extraction: Query your data using SmartChat™.
Your uploaded CSV files are accessible in the “Stored Assets” section, and you can start querying your data immediately after upload.
Generating Story Tiles™
Story Tiles™ are clusters of related concepts from your data. For example, let’s say you have a CSV file with the following information:
| Name | Composition | Distance from the Sun (AU) | Orbital Period (years) | Diameter (km) |
| --- | --- | --- | --- | --- |
| Mercury | Rocky | 0.39 | 0.24 | 4879 |
| Venus | Rocky | 0.72 | 0.62 | 12104 |
| Earth | Rocky | 1 | 1 | 12742 |
| Mars | Rocky | 1.52 | 1.88 | 6779 |
| Jupiter | Gas Giant | 5.2 | 11.86 | 139820 |
| Saturn | Gas Giant | 9.58 | 29.46 | 116460 |
| Uranus | Ice Giant | 19.22 | 84.01 | 50724 |
| Neptune | Ice Giant | 30.05 | 164.79 | 49244 |
| Moon | Rocky | N/A | 0.074 | 3474 |
| Titan | Icy | N/A | 15.94 | 5149 |
Here’s a sketch of what a Story Tile™ for one of these rows might look like:
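```python
# Illustrative only — the actual Story Tile™ schema is internal to Storytell.
# This sketches how a single CSV row might be captured as a tile, with a
# hypothetical "source" identifier tying it back to the uploaded file.
mercury_tile = {
    "source": "planets.csv, row 1",  # placeholder identifier, not a real field name
    "content": {
        "Name": "Mercury",
        "Composition": "Rocky",
        "Distance from the Sun (AU)": 0.39,
        "Orbital Period (years)": 0.24,
        "Diameter (km)": 4879,
    },
}
```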
This transformation enables our AI to understand relationships and context within your data, making it possible to answer complex queries.
Querying Your Data
With Storytell, you can ask questions about your CSV data and receive clear, insightful answers through SmartChat™. For example, if you ask about Titan’s orbital period, Storytell returns a precise answer based on the processed data (15.94 years, per the sample table above).
Verifying Accuracy
Storytell’s responses are based on the data you provide, so you can always verify an answer by checking the original CSV file and confirming that it matches the source data.
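If you’d rather spot-check programmatically than open the file by hand, a few lines of Python can do the lookup. This is only a local convenience sketch; the file name is a placeholder, and the column names match the sample table above.

```python
import csv

# Look up Titan's orbital period in the original file to verify the answer.
# "planets.csv" is a placeholder name for the uploaded sample file.
with open("planets.csv", newline="") as f:
    for row in csv.DictReader(f):
        if row["Name"] == "Titan":
            print(row["Orbital Period (years)"])  # expected: 15.94
```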
Technical Considerations
- Processing limited to first 1,000 rows for speed and efficiency
- Secure, isolated environments for data privacy
- Scalable architecture for concurrent processing
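Because only the first 1,000 rows are processed, it can be worth checking that your most important rows fall within that window, or trimming the file before upload. Below is a minimal local sketch using Python's standard csv module; the 1,000-row limit comes from this guide, and the file names are placeholders.

```python
import csv

ROW_LIMIT = 1_000  # Storytell processes only the first 1,000 rows

def trim_csv(src_path: str, dst_path: str, limit: int = ROW_LIMIT) -> int:
    """Copy the header plus at most `limit` data rows into a new CSV."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)  # keep the header row, if any
        if header is None:
            return 0                 # empty file: nothing to write
        writer.writerow(header)
        kept = 0
        for row in reader:
            if kept >= limit:
                break                # anything past the limit is dropped
            writer.writerow(row)
            kept += 1
    return kept

# Example (placeholder file names):
# kept = trim_csv("planets_full.csv", "planets_first_1000.csv")
# print(f"Wrote {kept} data rows")
```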
Handling Structured vs. Semi-Structured CSV Files
Storytell classifies CSV content as either “structured” or “semi-structured” to determine the appropriate processing strategy. This classification is crucial for CSV files that do not conform to traditional tabular formats, ensuring their data is still processed accurately and efficiently.
Structured vs. Semi-Structured Data
Structured Data
Structured data in CSV files typically includes a clear header row followed by consistent data rows. Each column represents a specific data attribute, and each row contains data entries corresponding to these attributes. This format is straightforward to process using standard CSV parsing techniques.
Semi-Structured Data
Semi-structured data, on the other hand, may not have a consistent structure. These CSV files might lack headers, have inconsistent columns, or contain data that resembles reports rather than traditional tables. Such files require a different approach to ensure accurate data extraction and processing.
Process for Classifying CSV Content
- Initial Inspection: Automatically inspect the CSV file to determine if it contains a header row and consistent data rows.
- Classification:
  - Structured: If the file has a clear header and consistent data rows, classify it as structured.
  - Semi-Structured: If the file lacks a header or has inconsistent columns, classify it as semi-structured.
- Prompt Selection: Based on the classification, select the appropriate prompt for processing:
  - For structured data, use standard CSV processing prompts.
  - For semi-structured data:
    - Use prompts designed to handle variability and generate data “chunks.”
    - Ensure that the LLM does not drop data points by refining prompts.
    - Include a “source” identifier in each chunk to facilitate data searchability.
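The exact classifier Storytell uses isn’t documented here, but the sketch below shows one way the inspection and classification steps could work: treat a file as structured when its first row looks like a header and every sampled row has the same column count, and fall back to semi-structured otherwise. The function name and heuristics are illustrative assumptions, not Storytell’s implementation.

```python
import csv

def classify_csv(path: str, sample_rows: int = 50) -> str:
    """Roughly classify a CSV as 'structured' or 'semi-structured'.

    Heuristics (illustrative only):
      - structured: the first row looks like a header (no numeric cells)
        and every sampled data row has the same number of columns
      - semi-structured: anything else (missing header, ragged rows,
        report-style content)
    """
    with open(path, newline="") as f:
        rows = []
        for i, row in enumerate(csv.reader(f)):
            rows.append(row)
            if i >= sample_rows:
                break

    if not rows:
        return "semi-structured"

    header, data = rows[0], rows[1:]

    def is_numeric(value: str) -> bool:
        try:
            float(value)
            return True
        except ValueError:
            return False

    header_looks_like_header = bool(header) and not any(is_numeric(c) for c in header)
    consistent_columns = all(len(r) == len(header) for r in data)

    if header_looks_like_header and consistent_columns:
        return "structured"
    return "semi-structured"
```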
XLS to CSV Conversion
The process of converting an XLS file to CSV involves transforming each worksheet (or tab) within the XLS file into a separate CSV file. This conversion is particularly useful for handling data in a more accessible and standardized format, as CSV files are widely supported across various platforms and applications.
Conversion Process
- File Structure:
  - An XLS file, such as `MyFile.xls`, may contain multiple worksheets. For example, it could have tabs named `1Q24`, `2Q24`, `3Q24`, etc.
  - Each tab represents a separate dataset that needs to be converted into its own CSV file.
- Conversion Output:
  - Each tab in the XLS file is converted into a CSV file. The naming convention for these files follows the pattern `MyFile - tab <TabName>.csv`. For instance:
    - `MyFile - tab 1Q24.csv`
    - `MyFile - tab 2Q24.csv`
    - And so on for each tab.
- Validation:
  - The conversion process is validated locally using simple XLSX files to ensure accuracy and reliability.
  - Even with large Excel files containing numerous tabs and extensive data, the conversion should handle the data without truncation.
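Storytell performs this conversion internally, but if you want to reproduce the same per-tab split locally, pandas can read every worksheet and write one CSV per tab. The sketch below is an assumption-level example (it needs openpyxl for .xlsx, or xlrd for legacy .xls); the file names are placeholders that follow the naming convention described above.

```python
from pathlib import Path

import pandas as pd  # requires openpyxl for .xlsx, or xlrd for legacy .xls

def excel_to_csvs(excel_path: str, out_dir: str = ".") -> list[Path]:
    """Write each worksheet of an Excel workbook to its own CSV file,
    named '<FileStem> - tab <TabName>.csv'."""
    stem = Path(excel_path).stem                          # e.g. "MyFile"
    sheets = pd.read_excel(excel_path, sheet_name=None)   # dict: tab name -> DataFrame
    written = []
    for tab_name, frame in sheets.items():
        out_path = Path(out_dir) / f"{stem} - tab {tab_name}.csv"
        frame.to_csv(out_path, index=False)
        written.append(out_path)
    return written

# Example (placeholder file name):
# for p in excel_to_csvs("MyFile.xls"):
#     print(p)  # MyFile - tab 1Q24.csv, MyFile - tab 2Q24.csv, ...
```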
Handling Different Data Structures
- Tabular Data:
  - The conversion process is optimized for tabular data, where data is organized in a consistent row and column format.
  - This structure is straightforward to convert as each row in the tab becomes a row in the CSV file.
- Non-Tabular Data:
  - Some XLS files may contain non-tabular data, such as reports or files with inconsistent column counts.
  - For these files, the conversion process may require additional handling to ensure data integrity.
- Challenges and Solutions:
  - Varied Headers: Files with inconsistent headers can confuse the conversion process. Enhancements to the classifier can help identify and handle these cases.
  - Single Column Data: If a tab contains only a single column, it may be classified as unstructured data, requiring adjustments in the conversion approach.
  - Fallback Mechanism: For files that do not conform to standard tabular formats, a fallback mechanism to a semi-structured approach can be implemented.
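The fallback mechanism itself isn’t specified in detail here; the sketch below illustrates one way it could behave, keeping normal tabular handling when column counts are consistent and otherwise emitting semi-structured “chunks” that carry a “source” identifier (as described earlier) so no data points are dropped. All names and heuristics are illustrative.

```python
import csv

def rows_to_chunks(csv_path: str) -> list[dict]:
    """Fallback for non-tabular CSVs: wrap each row in a 'chunk' that keeps
    a source identifier, so nothing is dropped even when columns are ragged."""
    chunks = []
    with open(csv_path, newline="") as f:
        for line_number, row in enumerate(csv.reader(f), start=1):
            text = ", ".join(cell for cell in row if cell.strip())
            if not text:
                continue  # skip blank lines, keep everything else
            chunks.append({
                "source": f"{csv_path}:line {line_number}",  # hypothetical identifier
                "text": text,
            })
    return chunks

def convert_with_fallback(csv_path: str):
    """Use normal tabular handling when column counts are consistent,
    otherwise fall back to semi-structured chunks."""
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
    widths = {len(r) for r in rows if r}
    if len(widths) == 1:
        return rows                      # consistent table: keep it as rows
    return rows_to_chunks(csv_path)      # ragged or empty: fall back to chunks
```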
Future Enhancements
- Support for larger datasets (beyond 1,000 rows)
- Advanced data type detection and custom Story Tile™ generation