Welcome to Pathway AI Pipelines Documentation!

Pathway AI Pipelines are ready-to-run templates for RAG and LLM pipelines and for enterprise search with live data.

Figure: Pathway product overview

If you are looking for the Pathway Live Data Framework, the Python data stream processing framework behind the AI Pipelines, you can find the associated docs here.

Pathway's AI Pipelines let you quickly put AI applications into production. These applications offer high-accuracy RAG and AI enterprise search at scale, using the most up-to-date knowledge available in your data sources. You can test them on your own machine and deploy them in the cloud (GCP, AWS, Azure, Render, ...) or on-premises.

The apps connect to and stay in sync with (all new data additions, deletions, and updates) data sources on your file system, Google Drive, SharePoint, S3, Kafka, PostgreSQL, and real-time data APIs. They have no infrastructure dependencies that would need a separate setup. They include built-in data indexing that enables vector search, hybrid search, and full-text search, all done in memory, with a cache.
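To make the idea of hybrid search concrete, here is a minimal, self-contained sketch of reciprocal rank fusion, one common way to merge a vector-search ranking with a full-text ranking into a single result list. It is an illustration only, not the code used inside the templates: the function name, the document ids, and the constant `k=60` are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank starts at 1); the contributions are summed across lists.
    `k` is a smoothing constant; 60 is a commonly used default (assumption).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id
    # so that the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector index and a full-text index.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists (`doc_a` and `doc_b` here) rise to the top, while documents found by only one retriever are kept but ranked lower.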

Figure: AI Pipelines overview

Application Templates

The application templates provided in this repo scale up to millions of pages of documents. Some are optimized for simplicity, others for high accuracy. Pick the one that suits you best. You can use it out of the box or change some steps of the pipeline. For example, adding a new data source or changing a Vector Index into a Hybrid Index is just a one-line change.

Run a template

Pathway AI Pipelines provide several ready-to-go templates for common use cases. Whether you need a real-time alerting system, document indexing, or context-based Q&A, you'll find templates for each.

Configure your app

Customize your app to meet your needs without modifying Python code by using YAML configuration files. Learn more about the capabilities of Pathway's custom YAML parser, designed to simplify template configuration.

Create your own app

Not the template you were looking for? Create your own customized LLM app, and let us know what you are building. We are always adding new templates!

REST API

Our RAG templates rely on a REST API to communicate.
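As a rough illustration of what talking to such an app over REST could look like, the sketch below builds a JSON POST request using only the Python standard library. The host, port, endpoint path (`/v2/answer`), and the `prompt` field are assumptions made for this example; check the REST API page of the template you run for the actual routes and payload.

```python
import json
import urllib.request


def build_query_request(base_url, prompt, path="/v2/answer"):
    """Build (but do not send) a JSON POST request for a running app.

    `path` and the `prompt` payload field are assumptions for this sketch.
    """
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local deployment listening on port 8000.
request = build_query_request("http://localhost:8000", "What is in my documents?")

# To send it against a running app, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before pointing the client at a real deployment.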

GitHub repository

The Pathway AI Pipelines source code is available on GitHub. Don't hesitate to clone the repo and contribute!

Based on Pathway Live Data Framework

The apps rely on the Pathway framework for data source synchronization and for serving API requests (Pathway is a standalone Python library with a Rust engine built into it). They give you simple, unified application logic for the back end, embedding, retrieval, and LLM tech stack. There is no need to integrate and maintain separate modules for your Gen AI app, such as a vector database (e.g. Pinecone, Weaviate, or Qdrant), a cache (e.g. Redis), and an API framework (e.g. FastAPI). Everything works out of the box.