LLM API
Unlock NLP Capabilities for Keyword Generation, Q&A, and Semantic Search.
This document describes the WebAI LLM API, which provides natural language processing capabilities: keyword generation, question answering, and NLP-powered search.
Table of Contents
API Endpoints
Authentication
Generate Keywords Endpoint
QnA Endpoint
NLP Search Endpoint
Data Models
Error Handling
API Endpoints
All endpoints are prefixed with /v1.
| Endpoint | Method | Description |
| --- | --- | --- |
| /generate-keywords | POST | Generate alternative keywords based on input keywords |
| /qna | POST | Answer questions based on content from a specified domain |
| /nlp_search | POST | Process natural language queries and convert them to structured search parameters |
Authentication
The API uses AWS Cognito for authentication. All requests must include a valid JWT token in the Authorization header.
Authorization: Bearer your-jwt-token
The token must contain valid claims, including:
cognito:username or username: The user's username
sub: The user's unique identifier
cognito:groups: The Cognito groups the user belongs to
Access to certain resources is restricted based on the user's identity and group membership.
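As a client-side illustration of the header and claims described above, the following sketch builds the Authorization header and reads the claims from a token's payload segment. It uses only the Python standard library and does not verify the token's signature, so it is suitable for debugging only. The helper names (auth_header, read_claims, username_from) are hypothetical and are not part of the API.

```python
import base64
import json


def auth_header(token: str) -> dict:
    """Return the Authorization header for a JWT (hypothetical helper)."""
    return {"Authorization": f"Bearer {token}"}


def read_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded with padding stripped; restore it.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def username_from(claims: dict):
    """Prefer cognito:username and fall back to username, as listed above."""
    return claims.get("cognito:username") or claims.get("username")
```

Because the signature is not checked here, the decoded claims should never be used for access decisions; that validation happens on the server.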
Generate Keywords Endpoint
Endpoint
POST /v1/generate-keywords
Description
This endpoint takes a list of keywords and generates semantically related alternative keywords. It uses Azure OpenAI's language models to analyze the input keywords and suggest alternatives that capture related themes and concepts.
Request Body
{
  "keywords": ["artificial intelligence", "machine learning", "neural networks"]
}
Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| keywords | array | Yes | List of input keywords for which related terms will be generated |
Response
{
  "original_keywords": ["artificial intelligence", "machine learning", "neural networks"],
  "alternative_keywords": [
    "deep learning",
    "AI algorithms",
    "computer vision",
    "natural language processing",
    "predictive analytics",
    "data science",
    "cognitive computing",
    "reinforcement learning",
    "pattern recognition",
    "automated reasoning"
  ]
}
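The request and response shown above could be handled from Python roughly as follows. This is a minimal sketch using only the standard library: the base URL is a placeholder, and the function names build_keywords_request and merged_keywords are hypothetical helpers rather than part of the API.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the host of your own deployment.
BASE_URL = "https://api.example.com"


def build_keywords_request(keywords, token, base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/generate-keywords."""
    body = json.dumps({"keywords": list(keywords)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/generate-keywords",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def merged_keywords(response: dict) -> list:
    """Combine original and alternative keywords, dropping duplicates in order."""
    combined = response["original_keywords"] + response["alternative_keywords"]
    return list(dict.fromkeys(combined))
```

To send the request, pass the returned object to urllib.request.urlopen and parse the response body with json.load.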
QnA Endpoint
Endpoint
POST /v1/qna
Description
This endpoint answers questions based on content scraped from a specified domain. It uses web scraping to extract text from the provided URL and then uses a language model to generate an answer to the user's question based on that content.
Request Body
{
  "question": "What services does this company offer?",
  "domain": "example.com",
  "max_pages": 3
}
Parameters
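| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| question | string | Yes | The user's question |
| domain | string | Yes | The URL to fetch text from |
| max_pages | integer | No | Maximum number of pages to scrape (default: 1) |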
Response
If the question can be answered based on the content:
{
  "success": true,
  "can_answer": true,
  "message": "Question answered successfully",
  "data": {
    "answer": "## Company Services\n\nBased on the website content, Example Company offers the following services:\n\n* Web Development\n* Mobile App Development\n* Cloud Solutions\n* AI Integration\n* Data Analytics\n\nTheir core focus appears to be enterprise-level software solutions with an emphasis on scalability and security.",
    "sources": ["Homepage", "Services page"],
    "confidence": 0.92,
    "follow_up_questions": [
      "What technologies do they use for web development?",
      "Do they offer maintenance services?"
    ]
  }
}
If the question cannot be answered based on the content:
{
  "success": true,
  "can_answer": false,
  "message": "I don't have enough information in the provided context to answer this question. Please try a different question or provide additional context.",
  "data": null
}
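Since both cases above return "success": true, a client has to check can_answer before reading data. The sketch below shows one way to branch on the two response shapes. The function names and the min_confidence threshold are hypothetical client-side choices, not API parameters.

```python
from typing import Optional


def extract_answer(response: dict) -> Optional[dict]:
    """Return the `data` object when the question was answered, else None."""
    if response.get("success") and response.get("can_answer"):
        return response.get("data")
    return None


def render(response: dict, min_confidence: float = 0.5) -> str:
    """Turn a QnA response into display text.

    `min_confidence` is a hypothetical client-side threshold.
    """
    data = extract_answer(response)
    if data is None:
        # Unanswerable questions carry their explanation in `message`.
        return response.get("message", "No answer available.")
    text = data["answer"]
    if data.get("confidence", 0.0) < min_confidence:
        text += "\n\n(Low confidence: verify against the listed sources.)"
    return text
```

The answer field in the example contains Markdown, so a client would typically pass the returned text through a Markdown renderer before display.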
NLP Search Endpoint
Endpoint
POST /v1/nlp_search
Description
This endpoint processes natural language queries and converts them into structured search parameters. It uses a workflow that analyzes the query to determine intent, extract locations, generate keywords, identify domains, and create semantic search inputs.
Request Body
{
  "user_query": "Find AI companies in Germany that specialize in computer vision",
  "search_id": "unique-search-id"
}
Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| user_query | string | Yes | The natural language query provided by the user |
| search_id | string | Yes | A unique identifier for the search request |
Response
{
  "payload": {
    "keywords": {
      "must_one": ["computer vision", "image recognition", "object detection"],
      "must_all": ["AI", "artificial intelligence"],
      "must_not": []
    },
    "locations": {
      "country": ["Germany"],
      "state": [],
      "region": [],
      "district": [],
      "municipality": []
    },
    "custom_filters": {},
    "domains": [],
    "excludes": [],
    "semantic_input": "computer vision artificial intelligence"
  },
  "metadata": {
    "search_id": "unique-search-id",
    "user_query": "Find AI companies in Germany that specialize in computer vision",
    "timestamp": "2023-01-01T12:00:00Z",
    "processing_time": 1.5,
    "workflow_steps": [
      "flag_detection",
      "location_extraction",
      "keyword_generation",
      "semantic_input_generation"
    ]
  }
}
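Many fields in the response above are empty lists, so a client may want to keep only the filters the workflow actually filled in. The sketch below builds a request body and collects the non-empty keyword groups and location levels. Using a UUID for search_id is an assumption (the documentation only requires that it be unique), and both function names are hypothetical.

```python
import uuid


def build_query(user_query: str) -> dict:
    """Build a request body; a UUID is one way to get a unique search_id."""
    return {"user_query": user_query, "search_id": str(uuid.uuid4())}


def active_filters(response: dict) -> dict:
    """Collect the non-empty keyword groups and location levels."""
    payload = response["payload"]
    filters = {}
    for section in ("keywords", "locations"):
        for name, values in payload.get(section, {}).items():
            if values:  # skip empty lists such as must_not or state
                filters[f"{section}.{name}"] = values
    return filters
```

Applied to the example response, active_filters keeps keywords.must_one, keywords.must_all, and locations.country, and drops every empty group.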
Data Models
Keyword Generation
KeywordRequest
{
  "keywords": ["string"]
}
BatchKeyword
{
  "original_keywords": ["string"],
  "alternative_keywords": ["string"]
}
QnA
QAResponse
{
  "answer": "string",
  "sources": ["string"],
  "confidence": 0.95,
  "follow_up_questions": ["string"],
  "can_answer": true
}
NLP Search
QueryRequest
{
  "user_query": "string",
  "search_id": "string"
}
QueryResponse
{
  "payload": {
    "keywords": {
      "must_one": ["string"],
      "must_all": ["string"],
      "must_not": ["string"]
    },
    "locations": {
      "country": ["string"],
      "state": ["string"],
      "region": ["string"],
      "district": ["string"],
      "municipality": ["string"]
    },
    "custom_filters": {},
    "domains": ["string"],
    "excludes": ["string"],
    "semantic_input": "string"
  },
  "metadata": {
    "search_id": "string",
    "user_query": "string",
    "timestamp": "string",
    "processing_time": 0,
    "workflow_steps": ["string"]
  }
}
Error Handling
The API returns standard HTTP status codes to indicate success or failure:
200 OK: Request was successful
400 Bad Request: Invalid request parameters
401 Unauthorized: Missing or invalid authentication token
403 Forbidden: Authentication successful but access denied
500 Internal Server Error: Server-side error
Error responses include a detail message explaining the issue:
{
  "detail": "Error message explaining the issue"
}
For the QnA and Generate Keywords endpoints, errors are also returned in a structured format:
{
  "success": false,
  "can_answer": false,
  "message": "Error message explaining the issue",
  "error": "Detailed error information"
}
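Because errors can arrive in either of the two shapes above, a client may find it convenient to normalize them into a single message. The following is a minimal sketch of that idea. The function names are hypothetical, and treating only 5xx responses as retryable is a client-side policy assumption rather than documented API behavior.

```python
def error_message(status: int, body: dict) -> str:
    """Extract a readable message from either documented error shape."""
    if "detail" in body:
        # Generic shape: {"detail": "..."}
        message = body["detail"]
    else:
        # Structured shape: {"success": false, "message": "...", "error": "..."}
        message = body.get("message", "Unknown error")
        if body.get("error"):
            message = f"{message} ({body['error']})"
    return f"HTTP {status}: {message}"


def is_retryable(status: int) -> bool:
    """Assumed policy: retry only server-side (5xx) errors."""
    return status >= 500
```

For example, a 401 response with a detail body and a 500 response with the structured body both yield a single string that can be logged or shown to the user.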
Implementation Details
Language Models
The API uses the following language models:
Azure OpenAI: Used for keyword generation and QnA functionality
OpenAI GPT-4o: Used for NLP search processing
Web Scraping
The QnA endpoint uses web scraping to extract content from the specified domain. It can scrape multiple pages from the same website if the max_pages parameter is greater than 1.
Token Limits
The QnA endpoint limits the document context to a maximum of 40,000 tokens to ensure compatibility with the language model's context window.
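To illustrate what a context limit of this kind implies, the sketch below concatenates page texts until a token budget is exhausted. It is not the API's implementation: tokens are approximated by whitespace-separated words because the tokenizer the service uses is not documented, and the function name truncate_context is hypothetical.

```python
MAX_CONTEXT_TOKENS = 40_000  # limit stated above


def truncate_context(pages, max_tokens=MAX_CONTEXT_TOKENS):
    """Concatenate page texts until the approximate token budget is used up.

    Tokens are approximated by whitespace-separated words (an assumption);
    counts from a real model tokenizer will differ.
    """
    kept, used = [], 0
    for text in pages:
        remaining = max_tokens - used
        if remaining <= 0:
            break
        words = text.split()
        if not words:
            continue  # skip pages with no extractable text
        kept.append(" ".join(words[:remaining]))
        used += min(len(words), remaining)
    return "\n\n".join(kept)
```

One practical consequence is that raising max_pages does not necessarily add usable context: once the budget is reached, text from later pages is dropped.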