
Available Models

Model: gpt-4 or gpt-4-turbo
Capabilities:
- Advanced reasoning
- Complex problem-solving
- Multi-step thinking
- Best for critical tasks

Cost: ~$0.03-0.06 per 1K tokens
Speed: 1-3 seconds response time
Best For: Customer service, technical support, complex queries
When to Use GPT-4:
  • Legal or compliance-critical interactions
  • Technical troubleshooting
  • Complex negotiations
  • High-value customer interactions

Model: gpt-3.5-turbo
Capabilities:
- Fast responses (< 1 second)
- Good reasoning
- Cost-effective
- Excellent for most use cases

Cost: ~$0.0005-0.002 per 1K tokens
Speed: < 1 second response time
Best For: General customer service, sales, lead generation
When to Use GPT-3.5 Turbo:
  • High-volume calling
  • Time-sensitive responses
  • Cost-conscious operations
  • Standard customer interactions

Setup

Step 1: Get Your API Key

  1. Go to platform.openai.com
  2. Sign in to your OpenAI account
  3. Click “API keys” in the left menu
  4. Click “Create new secret key”
  5. Copy the key (you won’t see it again)
Example Key Format:
sk-proj-XXXXXxxxxxxxxxxxxxxxxxxxx
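
If you want to confirm the key works before adding it to CallIntel, here is a minimal sketch using the official openai Python package (v1+). Listing models is a cheap request that fails immediately if the key is invalid:

from openai import OpenAI

# Paste the key you just created (shown here as a placeholder).
client = OpenAI(api_key="sk-proj-XXXXXxxxxxxxxxxxxxxxxxxxx")

# A successful list call confirms the key is valid and active.
models = client.models.list()
print("Key is valid. Example models:", [m.id for m in models.data[:5]])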

Step 2: Set up Billing

OpenAI charges per API usage:
  1. Go to “Billing” in OpenAI dashboard
  2. Set usage limits to prevent surprises
  3. Add payment method
  4. Review pricing page for current rates

Step 3: Add to CallIntel

For Super Admins:
  1. Go to Settings → Developer Settings
  2. Click “API Keys”
  3. Select “OpenAI” from provider list
  4. Paste API key
  5. Click “Test Connection”
  6. Save
For Organization Admins:
  1. Go to Settings → AI Models
  2. Click “Add OpenAI Model”
  3. Paste your API key (or use the key configured by a super admin)
  4. Select which models to enable
  5. Save

Step 4: Configure Agent

  1. Create or Edit an Agent
  2. Under “Language Model”, select:
    • gpt-4-turbo or
    • gpt-3.5-turbo
  3. Configure advanced settings:
    • Temperature: 0.7 (default)
    • Max Tokens: 150 (for responses)
    • Frequency Penalty: 0 (no penalty)
    • Presence Penalty: 0 (no penalty)
  4. Save agent
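
These settings correspond to standard OpenAI chat-completion parameters, which CallIntel applies to each request on your behalf. For reference, a minimal sketch of an equivalent request made with the openai Python package (v1+), using a hypothetical system and user message:

from openai import OpenAI

client = OpenAI(api_key="sk-proj-XXXXXxxxxxxxxxxxxxxxxxxxx")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",   # or "gpt-4-turbo"
    messages=[
        {"role": "system", "content": "You are a friendly customer service representative."},
        {"role": "user", "content": "Where is my order?"},
    ],
    temperature=0.7,       # Temperature: 0.7 (default)
    max_tokens=150,        # Max Tokens: 150
    frequency_penalty=0,   # Frequency Penalty: 0
    presence_penalty=0,    # Presence Penalty: 0
)
print(response.choices[0].message.content)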

Step 5: Test

Make a test call to verify:
  1. Use Web Call feature
  2. Speak to agent
  3. Verify proper responses
  4. Check call logs

Model Selection

Model Comparison

Feature   | GPT-4 | GPT-3.5
----------|-------|--------
Quality   | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐
Speed     | ⭐⭐⭐   | ⭐⭐⭐⭐⭐
Cost      | ⭐⭐    | ⭐⭐⭐⭐⭐
Reasoning | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐
Languages | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐

Cost Example (1000 calls/month)

GPT-4 Scenario:
1000 calls × 500 tokens average × $0.00003/token = $15/month
Cost per call: $0.015
Monthly cost: $15 (for 1000 calls)
GPT-3.5 Turbo Scenario:
1000 calls × 500 tokens average × $0.0000005/token = $0.25/month
Cost per call: $0.00025
Monthly cost: $0.25 (for 1000 calls)
60x cost difference for 1000 calls!
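
A quick Python sketch that reproduces the figures above:

def cost(calls, avg_tokens_per_call, price_per_1k_tokens):
    per_call = avg_tokens_per_call / 1000 * price_per_1k_tokens
    return per_call, per_call * calls

gpt4_call, gpt4_month = cost(1000, 500, 0.03)      # GPT-4 at ~$0.03 per 1K tokens
gpt35_call, gpt35_month = cost(1000, 500, 0.0005)  # GPT-3.5 Turbo at ~$0.0005 per 1K tokens

print(f"GPT-4:         ${gpt4_call:.3f}/call, ${gpt4_month:.2f}/month")
print(f"GPT-3.5 Turbo: ${gpt35_call:.5f}/call, ${gpt35_month:.2f}/month")
print(f"Difference: {gpt4_month / gpt35_month:.0f}x")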

Configuration Options

Temperature

Controls response creativity (0-2):
0.0 = Deterministic (same response every time)
0.7 = Balanced (default - good for most uses)
1.5 = Creative (varied responses)
2.0 = Very creative (unpredictable)
Recommended Settings:
Customer Service: 0.3-0.5 (consistent)
Sales: 0.7-0.9 (engaging but helpful)
Creative Tasks: 1.2-1.5 (varied responses)

Max Tokens

Maximum response length (1-4096):
Short Responses: 50-100 tokens
Medium: 150-200 tokens
Long Responses: 300-500 tokens
Tip: Lower max tokens saves costs!
Max Tokens Example (at GPT-4 rates, ~$0.03 per 1K tokens):
50 tokens: $0.0015 per call
150 tokens: $0.0045 per call
300 tokens: $0.009 per call
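
The same arithmetic as a quick sketch (assuming GPT-4 pricing of ~$0.03 per 1K tokens):

PRICE_PER_1K = 0.03  # GPT-4, ~$0.03 per 1K tokens

for max_tokens in (50, 150, 300):
    per_call = max_tokens / 1000 * PRICE_PER_1K
    print(f"{max_tokens} tokens: ${per_call:.4f} per call")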

System Prompt

Define agent behavior (see Agent Setup Guide):
Example:
"You are a friendly customer service representative. 
Keep responses under 100 words. Be helpful and professional."
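
In the underlying OpenAI API, a system prompt like this is simply the first message of every request. A minimal sketch of that pattern (the user message is a hypothetical example; in CallIntel you only enter the prompt in the agent settings):

messages = [
    {
        "role": "system",
        "content": "You are a friendly customer service representative. "
                   "Keep responses under 100 words. Be helpful and professional.",
    },
    {"role": "user", "content": "Hi, I have a question about my invoice."},
]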

Advanced Features

Function Calling

Enable agents to call external functions:
{
  "name": "get_order_status",
  "description": "Get status of customer order",
  "parameters": {
    "type": "object",
    "properties": {
      "order_id": {
        "type": "string",
        "description": "The order ID"
      }
    }
  }
}
Agents can:
  • Look up information
  • Process transactions
  • Update systems
  • Trigger actions
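
For reference, a sketch of how a schema like the one above plugs into the standard OpenAI tools interface (openai Python package, v1+). Here get_order_status is a hypothetical function in your own backend, and the example is illustrative only:

from openai import OpenAI

client = OpenAI(api_key="sk-proj-XXXXXxxxxxxxxxxxxxxxxxxxx")

tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Get status of customer order",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "The order ID"}
            }
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "What's the status of order 12345?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:  # the model decided to call the function
    call = message.tool_calls[0]
    print(call.function.name, call.function.arguments)  # name + JSON arguments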

Vision Capabilities

GPT-4V can analyze images:
Supported:
- Receipt analysis
- Document scanning
- Image description
- Quality inspection
Note: Requires image input in calls (enterprise feature).

Cost Optimization

1. Use GPT-3.5 Turbo for Most Tasks

GPT-4: Use sparingly for complex queries
GPT-3.5: Default for all other interactions
Savings: 10-20x cost reduction

2. Optimize Prompts

Bad (expensive):
"You are a helpful AI assistant in a contact center. 
You should be friendly, professional, and knowledgeable 
about our products. When customers call, listen carefully 
to their questions and provide helpful, accurate responses..."
Good (cheaper):
"You are a helpful customer service representative. 
Be friendly and professional."
Savings: 60% reduction in tokens!
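
To check the reduction on your own prompts, a small sketch using the tiktoken package (exact counts depend on the tokenizer, so run it against your actual text):

import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

long_prompt = ("You are a helpful AI assistant in a contact center. You should be "
               "friendly, professional, and knowledgeable about our products. When "
               "customers call, listen carefully to their questions and provide "
               "helpful, accurate responses...")
short_prompt = ("You are a helpful customer service representative. "
                "Be friendly and professional.")

long_n, short_n = len(enc.encode(long_prompt)), len(enc.encode(short_prompt))
print(long_n, "vs", short_n, f"tokens ({1 - short_n / long_n:.0%} fewer)")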

3. Reduce Token Usage

Technique               | Savings
------------------------|--------
Shorter knowledge base  | 30-50%
Concise prompts         | 20-40%
Lower max_tokens        | 10-30%
No conversation history | 10-20%

Combined: 50-70% possible

4. Batch Processing

For scheduled calls, use batch API:
Standard: $0.002 per 1K tokens
Batch: $0.0005 per 1K tokens
Savings: 75% cheaper
How to Use:
  1. Schedule calls for off-peak hours
  2. Use batch endpoint (CallIntel handles this)
  3. 24-hour processing window
  4. Save significantly on costs

Monitoring & Limits

Token Monitoring

Check usage in OpenAI dashboard:
  1. Go to Usage Dashboard
  2. View current usage
  3. Check spending
  4. View by model breakdown
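
Every chat-completion response also includes a usage object that you can log per call and reconcile against the dashboard. A minimal sketch (openai Python package, v1+):

from openai import OpenAI

client = OpenAI(api_key="sk-proj-XXXXXxxxxxxxxxxxxxxxxxxxx")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=50,
)

usage = response.usage  # token counts for this single call
print("prompt:", usage.prompt_tokens,
      "| completion:", usage.completion_tokens,
      "| total:", usage.total_tokens)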

Rate Limits

Default limits for standard accounts:
GPT-4: 200 requests/minute
GPT-3.5: 3,500 requests/minute

To raise these limits, review your account settings or contact OpenAI support.

Budget Controls

Set spending limits to prevent surprises:
  1. Go to Billing → Usage Limits
  2. Set hard limit (e.g., $100/month)
  3. Optional email alert at 50%
  4. API requests blocked when limit reached

Best Practices

1. Start with GPT-3.5 Turbo

Phase 1: Use GPT-3.5 for all agents
Phase 2: A/B test GPT-4 on specific agents
Phase 3: Use GPT-4 only where needed
Result: Optimized cost and quality balance

2. Monitor Performance

Track key metrics:
- Average response time
- User satisfaction
- Cost per call
- Error rate
- Token usage
Review monthly and optimize.

3. Test Before Production

1. Create test agents with both models
2. Make identical calls
3. Compare responses
4. Evaluate cost/quality tradeoff
5. Choose best option for each use case

4. Use Lower Temperature for Consistency

Customer Service: 0.3-0.5
- Consistent responses
- Predictable behavior
- Better for compliance-critical tasks

Troubleshooting

API key rejected: Verify the key starts with sk- and is not truncated. Copy it from the OpenAI dashboard again and re-test the connection.
Responses too slow: Switch to GPT-3.5 Turbo for faster responses, and reduce the max_tokens setting.
Costs higher than expected: Check token usage in the OpenAI dashboard. Reduce max_tokens, shorten the knowledge base, and simplify prompts.
Inconsistent responses: Lower the temperature to 0.3-0.5 for more deterministic behavior.
Poor response quality: Consider upgrading to GPT-4 for better reasoning, or improve the system prompt and knowledge base.
