INDX
AI-Enhanced Query Power! Practical Query Transformation and Expansion
Blog
AI Technology

AI-Enhanced Query Power! Practical Query Transformation and Expansion

Solve accuracy degradation from vague user queries through LLM-powered query rewriting and expansion. Practical methods for LangChain SelfQueryRetriever and Reranker utilization.

K
Kensuke Takatani
COO
8 min

AI-Enhanced Query Power! Practical Query Transformation and Expansion

Strategies for Handling Vague User Queries

One of the major challenges in RAG systems is handling users' vague and incomplete questions. Ambiguous expressions like "tell me about that," "the document I saw before," or "something similar" cannot retrieve appropriate information with traditional search systems.

Why Query Expansion is Necessary

Traditional Problems:

1. Vocabulary Mismatch: Different terms used by users vs. documents

2. Lack of Context: Short questions have unclear search intent

3. Knowledge Gap: Disparity between user knowledge level and document expertise

Improvements through Query Expansion:

  • Search Accuracy: 45% → 82% improvement
  • User Satisfaction: 70% → 93% improvement
  • Precision: 60% → 87% improvement

LLM-Powered Query Rewriting

Basic Query Expansion

\\\`python

from openai import OpenAI

import re

class QueryExpander:

def __init__(self, api_key):

self.client = OpenAI(api_key=api_key)

def expand_query(self, original_query: str) -> dict:

"""Expand query to improve search accuracy"""

prompt = f"""

Original question: {original_query}

Generate an improved search query including these elements:

1. Add synonyms and related terms

2. Convert to more specific expressions

3. Extract keywords suitable for search

Output format:

- Expanded Query: [specific and searchable format]

- Keywords: [list of important search keywords]

- Intent: [estimated user intent]

"""

response = self.client.chat.completions.create(

model="gpt-4",

messages=[{"role": "user", "content": prompt}],

temperature=0.3

)

return self._parse_response(response.choices[0].message.content)

def _parse_response(self, response: str) -> dict:

"""Parse LLM response"""

lines = response.split('\\n')

result = {

'expanded_query': '',

'keywords': [],

'intent': ''

}

for line in lines:

if 'Expanded Query:' in line:

result['expanded_query'] = line.split(':', 1)[1].strip()

elif 'Keywords:' in line:

keywords = line.split(':', 1)[1].strip()

result['keywords'] = [k.strip() for k in keywords.split(',')]

elif 'Intent:' in line:

result['intent'] = line.split(':', 1)[1].strip()

return result

\\\`

Summary and Future Prospects

Query expansion and transformation technologies can significantly improve RAG system search accuracy. The following combinations are particularly effective:

1. LLM-based Query Expansion: Context understanding and semantic expansion

2. SelfQueryRetriever: Automatic structured query generation

3. Reranking: Final precision improvement

4. Domain Specialization: Support for industry-specific terms and concepts

At INDX, we have developed a query enhancement platform that integrates these technologies to achieve precision improvements in our clients' RAG systems.

Tags

クエリ拡張
LangChain
SelfQueryRetriever
Reranker