Introducing CSVAI: Automate Data Enrichment from Any CSV or Excel File with Generative AI
UPDATE: The latest version of CSVAI supports image analysis.
At Zyxware Technologies, we constantly look for ways to reduce manual effort, streamline processes, and unlock insights for our clients and the wider community. Today, we are delighted to announce the release of CSVAI, a free and open-source tool we built to automate data enrichment from both text and images in CSV and Excel files using Generative AI.
What is CSVAI?
CSVAI is a Python library and command-line tool that applies a powerful AI prompt to every row in your CSV or Excel file. It can analyze textual data, image URLs, or local image paths, enrich the data using multimodal OpenAI Vision APIs, and write the results back into a structured output file.
Think of it as a bridge between your raw business data - including product photos, user-generated content, and more - and actionable insights, without needing to build a custom app for every use case.
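To make the row-by-row idea concrete, here is a minimal sketch of the general pattern, not CSVAI's actual implementation. It assumes `{{Column}}` placeholders like the one in the address example in this post. The names `render_prompt`, `fake_model`, and `enrich` are hypothetical, and `fake_model` is only a stand-in for a real AI call.

```python
import csv
import io
import json
import re


def render_prompt(template: str, row: dict) -> str:
    """Replace each {{Column}} placeholder with that row's value (empty if missing)."""
    return re.sub(
        r"\{\{\s*([^{}]+?)\s*\}\}",
        lambda match: str(row.get(match.group(1), "")),
        template,
    )


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for the AI call; returns a small JSON string."""
    return json.dumps({"prompt_length": len(prompt)})


def enrich(csv_text: str, template: str) -> list[dict]:
    """Render the prompt for every row and merge the JSON result into the row."""
    enriched = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        result = json.loads(fake_model(render_prompt(template, row)))
        enriched.append({**row, **result})
    return enriched


if __name__ == "__main__":
    sample = 'Address\n"221B Baker Street, London, UK"\n'
    print(enrich(sample, "Address: {{Address}}"))
```

Each output row keeps its original columns and gains the fields returned for that row, which is the shape of the output tables shown in the examples in this post.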

Why did we build CSVAI?
In our own projects, we frequently deal with large datasets - leads, reviews, support tickets, product catalogs - where every row needs additional intelligence. Doing this manually is slow, error-prone, and expensive. We wanted a crash-safe and scalable way to enrich these datasets using AI.
Realizing its potential beyond our internal workflows, we decided to make it available to the wider community under the GNU GPL v2 license.
Use Cases
- Enriching lead databases with missing fields inferred from the available data
- Summarizing customer reviews into actionable insights
- Categorizing support tickets automatically
- Extracting structured values (e.g., city, state, country from a raw address) from unstructured text
- Generating product descriptions, SEO alt-text, and tags directly from product images
- Analyzing user-uploaded images in bulk to flag inappropriate content
- Extracting objects, themes, and sentiment from images in a social media dataset
- Creating descriptive listings by analyzing property photos to identify features (e.g., "hardwood floors," "natural light," "granite countertops")
- Performing an initial damage assessment by analyzing photos of vehicles or property submitted in a claim file
Doing this manually is time-consuming and error-prone, and writing a custom script for every case is overkill. CSVAI fills this gap by providing a reusable, prompt-driven engine.
Key Features
- Structured Outputs with JSON Schema: enforce consistent, validated results.
- Image Analysis: Process and understand images directly from URLs or local file paths specified in your spreadsheet columns.
- JSON mode: schema-less, but still ensures each row produces a valid JSON object.
- Async & concurrent: process thousands of rows in parallel.
- Resumable: safely restart without reprocessing completed rows.
- CSV & Excel support: works with .csv, .xlsx, and .xls files.
- Command-line or Streamlit UI: use it in automation scripts or with a simple UI.
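The "async & concurrent" and "resumable" features can be pictured with the sketch below. It is an illustration of the general technique under stated assumptions, not CSVAI's internals. `fake_enrich` is a hypothetical stand-in for one AI request, and the in-memory `done` set stands in for whatever progress record a real tool would persist to disk.

```python
import asyncio


async def fake_enrich(row_id: int) -> dict:
    """Hypothetical stand-in for one AI request."""
    await asyncio.sleep(0.01)
    return {"id": row_id, "status": "ok"}


async def process_rows(row_ids, done, max_concurrency=5):
    """Process rows not yet in `done`, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(row_id):
        async with semaphore:
            result = await fake_enrich(row_id)
        done.add(row_id)  # a real tool would record this durably
        return result

    # Resumability: rows already marked as done are skipped entirely.
    pending = [row_id for row_id in row_ids if row_id not in done]
    return await asyncio.gather(*(worker(row_id) for row_id in pending))


def run(row_ids, done):
    """Synchronous entry point for the sketch."""
    return asyncio.run(process_rows(row_ids, done))


if __name__ == "__main__":
    completed = {1, 2}  # pretend an earlier run finished these rows
    print(run([1, 2, 3, 4], completed))
```

The semaphore caps how many requests run at once, and the skip list is what lets a restarted run continue without paying again for finished rows.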
Examples
Address Standardization
Input: address.csv
Address
"1600 Amphitheatre Parkway, USA"
"221B Baker Street, London, UK"
Prompt:
Extract city, state, and country from the given address and output JSON.
Rules:
- city: city/town/locality (preserve accents, proper case)
- state: ISO-standard name of the state/region/province or "" if none
- country: ISO 3166 English short name of the country; infer if obvious, else ""
- Ignore descriptors like "(EU)"
- Do not guess street-level info
Inputs:
Address: {{Address}}
Output:
Address | city | state | country |
---|---|---|---|
1600 Amphitheatre Parkway, USA | Mountain View | California | United States |
221B Baker Street, London, UK | London | | United Kingdom |
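For this address example, a JSON Schema for strict structured output might look like the sketch below. The field names come from the prompt above. The exact schema layout CSVAI expects is an assumption here, and `conforms` is a hypothetical helper that checks only this flat shape; it is not a full JSON Schema validator.

```python
import json

# Assumed schema for the address example: three required string fields.
ADDRESS_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "state": {"type": "string"},
        "country": {"type": "string"},
    },
    "required": ["city", "state", "country"],
    "additionalProperties": False,
}


def conforms(obj: dict, schema: dict) -> bool:
    """Check required keys, extra keys, and string types for a flat schema."""
    props = schema["properties"]
    if any(key not in obj for key in schema["required"]):
        return False
    if not schema.get("additionalProperties", True) and any(k not in props for k in obj):
        return False
    return all(
        isinstance(obj[k], str)
        for k in obj
        if k in props and props[k]["type"] == "string"
    )


if __name__ == "__main__":
    # Print the schema as JSON, ready to save to a file.
    print(json.dumps(ADDRESS_SCHEMA, indent=2))
```

Saving the printed JSON to a file gives you something to pass to the `--schema` option.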
E-commerce Product Tagging
Let's say you have a CSV file with product information and want to automatically generate descriptions and tags from the product images.
Input: products.csv
product_id,image_url
"SKU-001","https://example.com/images/blue-suede-shoes.jpg"
"SKU-002","https://example.com/images/leather-backpack.jpg"
Prompt (product.prompt.txt):
Analyze the product image provided. Generate a JSON object with a concise marketing description and 3-5 relevant product tags.
Rules:
- description: A short, appealing description (1-2 sentences).
- tags: A comma-separated list of relevant keywords.
- Do not invent features not visible in the image.
Output:
product_id | image_url | description | tags |
---|---|---|---|
SKU-001 | https://example.com/images/blue-suede-shoes.jpg | A stylish pair of classic men's dress shoes in a vibrant blue suede finish. | men's shoes, suede, blue, formal wear, lace-up |
SKU-002 | https://example.com/images/leather-backpack.jpg | A durable and fashionable vintage-style leather backpack with multiple pockets. | backpack, leather, travel, vintage, brown |
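Because an image column can hold either a URL or a local path, a tool has to treat the two differently before sending them to a vision model. The sketch below shows one common approach: pass URLs through unchanged and encode local files as base64 data URLs. Whether CSVAI does exactly this is an assumption, and `to_image_reference` is a hypothetical helper name.

```python
import base64
import mimetypes
from pathlib import Path


def to_image_reference(value: str) -> str:
    """Return a URL unchanged, or encode a local image file as a data URL."""
    value = value.strip()
    if value.lower().startswith(("http://", "https://")):
        return value
    path = Path(value)
    # Guess the MIME type from the file extension; fall back to a generic type.
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


if __name__ == "__main__":
    print(to_image_reference("https://example.com/images/blue-suede-shoes.jpg"))
```

Data URLs grow with file size, so large local images may be worth resizing before a bulk run.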
Getting Started
Installation
pip install csvai
# Include Streamlit UI dependencies
pip install "csvai[ui]"
# Add your OpenAI API key in .env
Upgrade
pip install csvai --upgrade
CLI Usage
# Text-only example
csvai input.csv --prompt text.prompt.txt
# Image analysis example
csvai products.csv --prompt image.prompt.txt --image-column "image_url" --model gpt-4o --process-image
--prompt - text file containing your prompt
--schema - JSON schema for strict structured output (recommended)
--image-column - name of the column containing image URLs or local paths (default: image)
--process-image - flag that enables image analysis mode
--limit - process only N rows for testing
--model - OpenAI model to use (e.g., `gpt-4o`, `gpt-4o-mini`)
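If you call the CLI from an automation script, a small helper can assemble the command from the options listed above. This is a sketch: `build_csvai_command` is a hypothetical helper, not part of CSVAI, and it assumes that `--limit` takes the row count as its value.

```python
import shlex


def build_csvai_command(input_file, prompt, schema=None, model=None,
                        limit=None, image_column=None):
    """Build an argument list for the csvai CLI from the documented options."""
    cmd = ["csvai", input_file, "--prompt", prompt]
    if schema:
        cmd += ["--schema", schema]
    if model:
        cmd += ["--model", model]
    if limit is not None:
        cmd += ["--limit", str(limit)]  # assumption: the value is the row count
    if image_column:
        # Image analysis needs both the mode flag and the column name.
        cmd += ["--process-image", "--image-column", image_column]
    return cmd


if __name__ == "__main__":
    command = build_csvai_command(
        "products.csv", "image.prompt.txt",
        model="gpt-4o", limit=5, image_column="image_url",
    )
    print(shlex.join(command))
```

Passing the returned list to `subprocess.run(command, check=True)` would execute it. Starting with a small `--limit` is a cheap way to check a prompt before processing a whole file.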
Web UI
CSVAI includes a browser-based UI built with Streamlit. Launch it with:
csvai-ui
Prompt Builder for CSVAI
To make prompt and schema creation easier, we’ve built the CSV AI Prompt Builder, a Custom GPT that generates tailored prompts and JSON Schemas for your data.
With it, you can create the detailed prompts and schema files that CSVAI needs from simple requests such as:
- I have products.csv with Product Title, Product Description, Category, and Sub Category. Help me enrich with SEO meta fields.
- I have reviews.csv with Title, Body, and Stars. Help me extract sentiment and generate a short summary.
- I have address.csv with an Address field. Help me extract City, State, and Country using ISO-standard names.
- I have tickets.csv with Subject and Description. Help me classify each ticket into predefined support categories.
- I have posts.csv with Title, Body, URL, Image URL, Brand, and Platform. Help me generate social media captions, hashtags, emojis, CTAs, and alt text.
- I have jobs.csv with Job Title and Description. Help me categorize jobs into sectors and identify the level of seniority.
- I have products.csv with an image_url column. Help me generate alt text and a marketing description for each image.
- I have real_estate.csv with a column of property photos. Help me identify key features like 'swimming pool', 'updated kitchen', and 'hardwood floors'.
- I have social_posts.csv with a column of user-submitted images. Help me classify the content of each image.
Why Free and Open Source?
CSVAI began as an internal Zyxware tool for our own enrichment workflows. Recognizing its potential to help others, we released it under GPL v2. This reflects our philosophy of contributing back to the community while continuing to offer custom AI solutions for businesses.
CSVAI is free, open-source, and production-ready. Whether you are a business user, developer, or researcher, we invite you to explore how it can simplify your data enrichment workflows.
Get Involved
- Download: www.github.com/zyxware/csvai
- Report issues: GitHub Issues
- Need customization? Contact us
About Zyxware Technologies
At Zyxware Technologies, we help organizations harness AI, automation, and digital platforms to solve real-world problems. Our mission is to empower businesses with tools that reduce manual effort and unlock data-driven insights.
CSVAI is one such tool, freely available to the community, but we also provide commercial support and custom solutions tailored to your needs. If you’re looking to automate a unique business process or build a similar system, we invite you to schedule a free discovery call.