All you need for your ETL and AI pipelines

Pathway is a scalable and robust data processing framework. Use it to build and power AI/ML applications with live data and real-time pipelines.

Community
up to 8 GB RAM - 4 cores
Free and open under BSL 1.1.
Packaged for pip, poetry, docker.
Deploy quickly on

What's included:
20+ App templates
  • basic RAG pipelines
  • document pipelines
  • log monitoring
  • Kafka ETL
  • social media sentiment analysis

Core features
  • High-performance input/output connectors (Kafka, S3, cloud file systems, databases)
  • Predefined API connectors to 300+ data sources
  • REST API endpoint for serving query/answer and real-time features with sub-millisecond latency
  • Python programming API (Table API)
  • SQL programming API
  • Incremental stream and table operations: join, filter, group-by, reduce
  • Advanced join types, temporal joins, windows, and ranges
  • Custom stateful reducers
  • Incremental "apply" and "map-reduce"
  • Support for pointer-based data structures: trees, graphs, event sequences
  • User Defined Functions: call external libraries
  • Make API calls from Pathway data flows
  • Async data processing: call APIs, libraries, LLM services
  • Data schema support (Python/mypy compatible typing)
  • True streaming data processing engine in Rust
  • Same engine for streaming and batch workloads
  • Same code logic for streaming and for backfilling
  • Parallelization with multi-processing
  • Data persistence with S3 storage
  • Fully interactive execution with streaming & live data sources
  • Develop and run in Jupyter notebooks (also in streaming)
  • Data table introspection & debugging in streaming
  • Out-of-the-box code completion supported by IDEs (VS Code, PyCharm, ...)
  • Dataflow and schema validation at compile time
  • Schema autodetection from data samples
  • Exception handling at runtime
AI Toolkits
  • Libraries for time-series operations, sampling, and filters
  • LLM extension pack
  • Unstructured data parsing toolkit (data source sync, parsing, extraction, indexing)
  • Advanced data indexing built-in (vector with HNSW, BM25, hybrid)
  • Support for custom indexing with knowledge graph and summarization techniques
  • Support for fully local and API-based ML models
  • Built-in support for temporal graph data
  • Library for classification and clustering
Input connectors
  • Kafka
  • BigQuery
  • PostgreSQL
  • Elasticsearch
  • HTTP
  • JSON Lines
  • Redpanda
  • Logstash
  • Slack
  • File System
  • Google PubSub
  • Custom Python connector (APIs and other destinations)
Support & Deployment
  • Self-hosted setup: run Pathway on your own machine or cloud
  • Community support
Scale
up to
Free or paid, with license key.
Packaged for pip, poetry, docker.
Deploy quickly on

Everything in Community, plus:
Extra App templates
  • high-accuracy RAG
  • SharePoint AI search
  • Delta Lake ETL
  • monitored instances

Advanced features
  • Enterprise data source connectors for SharePoint, Delta Table (more coming soon)
  • Monitoring and traces for Pathway Instances
  • OpenTelemetry compatible
  • Grafana integration
  • Business support with tickets: 1-business-day reply time for paid-tier customers
  • Free first consultation call on pipeline design for your RAG, streaming, and ETL use cases
  • #DeveloperAssist program
Enterprise
up to 24 TB RAM - 128 cores x 40 nodes
Pathway Enterprise Licensing.
Private package registry
Deploy on cloud of choice
AWS Azure GCP Intel

Everything in Scale, plus:
Industry solutions
  • real-time GPS data analytics
  • logistics
  • automotive
  • IoT analytics
  • fraud detection
  • RAG for sales and marketing
  • search in slide decks

Enterprise features
  • Horizontal Scalability
  • High availability with hot failover
  • Pathway Helm Chart models
  • Kubernetes deployment guides
  • Connectors for custom messaging formats (MQTT/IoT, ...)
  • Pathway Visual Explorer: Live Dashboards with Pathway, including geospatial data viz
  • Queryable historical data snapshots
  • Real-time geospatial & trajectory mining library (GPS traces)
  • Support for data schema & code schema versioning
  • Pathway managed services and hosting
  • Solutioning for industry use cases
  • Made-to-measure data pipelines
  • Access to professional services
  • Multi-zone SLA
  • 24/7 phone support