Pattern 1: Error rate spike detection
The most common support query. Customer reports errors — you need to quantify the problem, identify when it started, and determine the severity.
What you're looking for: the customer's error rate over time, compared to their baseline. The query should answer: "Is this actually happening? When did it start? How bad is it?"
Typical approach: Query the API logs table filtered by customer ID, group by time interval (5 min or 1 hour), calculate error rate as percentage of total requests. Compare the current window against the previous 24–48 hours to establish baseline.
Key fields: customer_id, timestamp, http_status_code (or error_type), endpoint, request_id.
What this tells you: Whether the customer's complaint is valid, when the problem started (which narrows the root cause), and whether it's getting worse or stabilizing.
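A minimal sketch of this pattern in ClickHouse SQL. The table name (`api_logs`) and columns follow the key fields above, but they're illustrative; adjust to your schema:

```sql
-- Error rate per 5-minute bucket for one customer over the last 48 hours.
-- The current window can then be eyeballed (or compared programmatically)
-- against the earlier buckets that serve as the baseline.
SELECT
    toStartOfFiveMinutes(timestamp) AS bucket,
    count() AS total_requests,
    countIf(http_status_code >= 500) AS errors,
    round(100.0 * errors / total_requests, 2) AS error_rate_pct
FROM api_logs
WHERE customer_id = {customer_id:String}
  AND timestamp >= now() - INTERVAL 48 HOUR
GROUP BY bucket
ORDER BY bucket;
```

The `{customer_id:String}` placeholder uses ClickHouse's query-parameter syntax, which keeps customer IDs out of string-concatenated SQL.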
Pattern 2: Latency percentile tracking
Customer reports "things are slow." You need to measure actual latency, identify the affected endpoints, and determine if this is a regression.
What you're looking for: p50, p95, and p99 latency for the customer's requests, broken down by endpoint. Compare current values against their historical baseline.
Key insight: Customers often report "everything is slow" when only one endpoint is affected. The query should break down latency by endpoint to isolate the problem. If p95 jumped but p50 is stable, you're looking at a tail-latency issue (likely a specific code path or external dependency).
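Sketched as a ClickHouse query, assuming a hypothetical `latency_ms` column on the same illustrative `api_logs` table:

```sql
-- p50/p95/p99 latency per endpoint for one customer, last 6 hours.
-- Sorting by p95 surfaces the tail-latency outliers first.
SELECT
    endpoint,
    quantile(0.5)(latency_ms)  AS p50_ms,
    quantile(0.95)(latency_ms) AS p95_ms,
    quantile(0.99)(latency_ms) AS p99_ms,
    count() AS requests
FROM api_logs
WHERE customer_id = {customer_id:String}
  AND timestamp >= now() - INTERVAL 6 HOUR
GROUP BY endpoint
ORDER BY p95_ms DESC;
```

Run the same query over a baseline window (say, the same 6 hours a week earlier) to tell a regression apart from normal behavior.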
Pattern 3: Customer activity timeline
For debugging configuration or authentication issues, you need a timeline of the customer's recent activity — what they called, when, and what happened.
What you're looking for: the last N requests from this customer, ordered by time, with status codes and relevant metadata. This is the "replay the tape" query.
Key insight: Often reveals the root cause immediately. A customer reporting "auth stopped working" may have rotated their API key 2 hours ago and the old key is cached somewhere. The timeline shows the exact moment behavior changed.
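The "replay the tape" query is the simplest of the five. A sketch, using the same illustrative schema:

```sql
-- Last 100 requests from a customer, newest first, with the metadata
-- needed to spot the exact moment behavior changed.
SELECT
    timestamp,
    endpoint,
    http_status_code,
    request_id
FROM api_logs
WHERE customer_id = {customer_id:String}
ORDER BY timestamp DESC
LIMIT 100;
```

For authentication issues specifically, adding a key-identifier or auth-method column (if your schema has one) makes a key rotation jump out immediately.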
Pattern 4: Comparative analysis across customers
Is this a single-customer issue or a platform-wide problem? This query determines blast radius.
What you're looking for: the same error/latency metric across multiple customers during the same time window. If only one customer is affected, it's likely a customer-specific issue (config, billing, usage). If multiple customers show the same pattern, it's likely a platform issue (deploy regression, infrastructure).
This query is critical for triage. A customer-specific issue means you check Stripe and their configuration. A platform-wide issue means you check GitHub deploys and StatusPage.
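A sketch of the blast-radius query, with the incident window passed as parameters (names again illustrative):

```sql
-- Same error metric across all customers in the incident window.
-- One affected customer => customer-specific; many => platform-wide.
SELECT
    customer_id,
    count() AS total_requests,
    countIf(http_status_code >= 500) AS errors,
    round(100.0 * errors / total_requests, 2) AS error_rate_pct
FROM api_logs
WHERE timestamp BETWEEN {window_start:DateTime} AND {window_end:DateTime}
GROUP BY customer_id
HAVING errors > 0
ORDER BY error_rate_pct DESC
LIMIT 50;
```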
Pattern 5: Usage and billing reconciliation
Customer disputes their invoice. You need to count their actual usage for the billing period and compare it against what they were charged.
What you're looking for: total request count (or whatever your billing unit is) for the customer during the billing period, broken down by billable vs. non-billable requests, by endpoint, and by day to identify any spikes.
Key insight: The most common billing disputes come from unexpected usage spikes — a customer's CI pipeline running in a loop, a misconfigured retry policy, or a legitimate traffic increase they didn't anticipate. The daily breakdown reveals these patterns instantly.
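A sketch of the daily breakdown, assuming a hypothetical `is_billable` flag; many schemas derive billability from endpoint or status code instead:

```sql
-- Daily billable vs. non-billable request counts for a billing period.
-- A CI loop or runaway retry policy shows up as an obvious daily spike.
SELECT
    toDate(timestamp) AS day,
    countIf(is_billable) AS billable_requests,
    countIf(NOT is_billable) AS non_billable_requests,
    count() AS total_requests
FROM api_logs
WHERE customer_id = {customer_id:String}
  AND timestamp >= {period_start:DateTime}
  AND timestamp <  {period_end:DateTime}
GROUP BY day
ORDER BY day;
```

Summing `billable_requests` over the period gives the number to compare against the invoice.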
The manual bottleneck: not the query, but the context
If your support team runs these queries manually, the bottleneck isn't ClickHouse's execution time — it's everything around it:
- Opening the ClickHouse console and connecting to the right cluster
- Remembering (or looking up) the table names, column names, and customer ID format
- Adjusting the time window based on when the customer reported the issue
- Interpreting results in the context of the customer's plan tier and normal usage
- Then switching to Linear, Stripe, and GitHub to continue the investigation
When these 5 query patterns are automated — running in parallel, with results correlated against Linear bugs, Stripe billing, and GitHub deploys — the entire investigation completes in under 2 minutes instead of 20–45 minutes.
"Our support engineers were writing the same ClickHouse queries 10 times a day. Same tables, same columns, different customer ID. Automating that alone saved hours per week."