Releases: lzjever/lexilux

Release v2.8.0

13 Feb 17:20

Added

  • Unified Reasoning Mode Support: Enable extended thinking across providers with a single API

    • reasoning=True parameter for Chat methods (chat, stream, acall, astream)
    • reasoning={"effort": "high"} for providers that support effort levels
    • reasoning={"max_tokens": 16000} for providers that support budget tokens
    • Supported providers: OpenAI, DeepSeek, Anthropic, Kimi, GLM, Minimax
  • New providers/ module: Provider-specific reasoning configurations

    • ReasoningConfig dataclass for provider settings
    • get_reasoning_config() to retrieve provider config
    • detect_provider_from_url() for automatic provider detection
  • New chat/reasoning.py module: Reasoning helper functions

    • normalize_reasoning(): Convert various input formats to normalized dict
    • build_reasoning_request(): Build provider-specific request params
    • extract_reasoning_content(): Extract reasoning from response
  • ChatResult enhancements:

    • New reasoning field containing reasoning content
    • New has_reasoning property for easy checking
  • ChatStreamChunk enhancements:

    • New reasoning property (alias for reasoning_content)
    • New has_reasoning property for easy checking
  • Data sync: make sync-models command to sync from models.dev
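
The reasoning parameter accepts either a boolean or a dict, and the helper modules normalize that input and detect the provider from the base URL. The sketch below shows one way such helpers might behave. The function bodies, the host table, and the `enabled` key are assumptions made for illustration, not the library's actual implementation.

```python
from typing import Any, Dict, Optional, Union
from urllib.parse import urlparse

# Hypothetical host table; the real library may detect providers differently.
_PROVIDER_HOSTS = {
    "api.openai.com": "openai",
    "api.deepseek.com": "deepseek",
    "api.anthropic.com": "anthropic",
}


def detect_provider(base_url: str) -> Optional[str]:
    """Return a provider name for a known API host, else None."""
    host = urlparse(base_url).hostname or ""
    return _PROVIDER_HOSTS.get(host)


def normalize(reasoning: Union[bool, Dict[str, Any], None]) -> Dict[str, Any]:
    """Convert True/False/None/dict into one dict shape (assumed layout)."""
    if reasoning is None or reasoning is False:
        return {"enabled": False}
    if reasoning is True:
        return {"enabled": True}
    if isinstance(reasoning, dict):
        # A dict such as {"effort": "high"} implies reasoning is wanted.
        return {"enabled": True, **reasoning}
    raise TypeError(f"unsupported reasoning value: {reasoning!r}")
```

Normalizing early means the request-building step only has to handle one input shape, whichever form the caller used.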

Changed

  • models.json: Synced to latest from models.dev (89 providers, 2561 models)

Test Coverage

  • New tests/test_reasoning.py with 34 tests
  • All 602 tests passing (568 existing + 34 new)

Installation

pip install lexilux

Or with tokenizer support:

pip install lexilux[tokenizer]

Release v2.7.3

13 Feb 08:44

Changed

  • Code quality review: Completed comprehensive production readiness review
    • Verified all resource management patterns (connection pooling, cleanup)
    • Confirmed thread safety in singleton patterns
    • Validated error handling paths

Test Coverage

  • All 568 tests passing
  • Coverage: 77.96% (target: 68%)

Release v2.7.2

13 Feb 04:37

Fixed

  • Python 3.9 compatibility: Added __init__.py to lexilux/data/ directory
    • Required for importlib.resources.read_text() to work with subpackages in Python 3.9
    • Fixed bundled data loading in ModelRegistry

Release v2.7.1

13 Feb 04:02

Added

  • StreamingResult.set_result(): New method for properly setting the complete result
    • Uses __slots__ attributes correctly (_text_parts, _text_cache)
    • Avoids dynamic attribute creation
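
To illustrate why a setter that respects `__slots__` matters, here is a minimal sketch of a slotted accumulator. Only the slot names `_text_parts` and `_text_cache` come from the notes above; the class name and everything else are assumptions.

```python
from typing import List, Optional


class StreamingText:
    """Toy accumulator; only the two slot names come from the release notes."""

    __slots__ = ("_text_parts", "_text_cache")

    def __init__(self) -> None:
        self._text_parts: List[str] = []
        self._text_cache: Optional[str] = None

    def append(self, piece: str) -> None:
        self._text_parts.append(piece)
        self._text_cache = None  # invalidate the joined-text cache

    def set_result(self, text: str) -> None:
        """Replace the accumulated parts with one complete text."""
        self._text_parts = [text]
        self._text_cache = text

    @property
    def text(self) -> str:
        if self._text_cache is None:
            self._text_cache = "".join(self._text_parts)
        return self._text_cache
```

Because the class declares `__slots__`, assigning any attribute outside the two declared names raises AttributeError, which is why the complete result has to be stored through the existing slots.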

Changed

API Improvements (UX)

  • Conversation class renamed: Conversation → _ResponseContinuer (internal API)
    • Old name was confusing (suggested conversation history, but was for response continuation)
    • Users should use chat.complete() instead of direct _ResponseContinuer access
    • Conversation and ChatContinue kept as deprecated aliases (will be removed in v3.0.0)
  • Updated documentation: Improved clarity on Chat vs ChatHistory distinction
    • New example file examples/05_chat_vs_conversation.py
    • Updated AGENTS.md with clear concept explanations

Internal Improvements

  • astream() rate limiting: Now applies rate limiting before streaming (consistent with acall())
  • StreamingResult: Fixed _merged_streaming_result() to use proper __slots__ attributes

Fixed

  • Python 3.9 compatibility: TypeAlias is now imported from typing_extensions
    • typing.TypeAlias is not available in Python 3.9 (it was added in Python 3.10)
  • pool_size validation: Added upper limit (max 100) to prevent resource exhaustion
    • Applies to BaseAPIClient, Embed, and Rerank
  • Thread safety: Added double-checked locking to ModelRegistry.get_instance()
    • Prevents race conditions in multi-threaded environments
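
Double-checked locking, as mentioned for ModelRegistry.get_instance(), can be sketched as follows. The `Registry` class is a toy stand-in; the real registry's fields and setup are not shown in these notes.

```python
import threading
from typing import Optional


class Registry:
    """Toy singleton standing in for a registry with expensive setup."""

    _instance: Optional["Registry"] = None
    _lock = threading.Lock()

    @classmethod
    def get_instance(cls) -> "Registry":
        # First check without the lock: the cheap path once initialized.
        if cls._instance is None:
            with cls._lock:
                # Second check under the lock: another thread may have
                # created the instance while this one was waiting.
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance
```

The second check is what prevents two threads that both saw `None` from each constructing their own instance.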

Deprecated

  • Conversation class: Use chat.complete() instead
  • ChatContinue alias: Use chat.complete() instead

Test Coverage

  • New test file tests/test_v271_fixes.py with 15 tests
  • Overall coverage: 77.95% (target: 68%)
  • All 568 tests passing

Release v2.7.0

12 Feb 17:59

Added

Exception Handling

  • ToolExecutionError: New exception class for tool execution failures
    • Includes tool_name attribute for debugging
    • Non-retryable error type

Type Definitions

  • chat/types.py: New module with type aliases for better type safety
    • JSONValue, JsonObject: Type aliases for JSON data
    • MessageDict, ToolCallDict, UsageDict: TypedDicts for API structures
    • ChatResponse, ChatResponseChoice: Full response types
    • ContinuePromptCallable, ProgressCallback, ErrorCallback: Callback types
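
As a rough picture of what such type aliases can look like, the sketch below defines a recursive JSON alias and two TypedDicts. The alias and class names follow the list above, but the field names are assumed from the OpenAI-style response format and may not match chat/types.py.

```python
from typing import Dict, List, Optional, TypedDict, Union

# Recursive JSON alias; forward references are written as strings.
JSONValue = Union[
    str, int, float, bool, None, List["JSONValue"], Dict[str, "JSONValue"]
]


class UsageDict(TypedDict, total=False):
    # Field names assumed from the OpenAI-style usage object.
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


class MessageDict(TypedDict, total=False):
    role: str
    content: Optional[str]


def total_tokens(usage: UsageDict) -> int:
    """Prefer the reported total; otherwise sum the parts that are present."""
    if "total_tokens" in usage:
        return usage["total_tokens"]
    return usage.get("prompt_tokens", 0) + usage.get("completion_tokens", 0)
```

TypedDicts are plain dicts at runtime, so they add static checking for API payloads without changing how the data is built or serialized.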

Test Coverage

  • test_chat_validation.py: New test file for validation functions (86% coverage)
  • test_chat_continuer.py: New test file for ConversationContinuer (59% coverage)

Changed

Performance Improvements

  • Embed class: Added connection pooling with requests.Session
    • New pool_size parameter (default: 10)
    • Reuses HTTP connections for sync requests
    • Added close() method for proper resource cleanup
  • Rerank class: Added connection pooling with requests.Session
    • New pool_size parameter (default: 10)
    • Shared session between Rerank and RerankModeHandler
    • Added close() method for proper resource cleanup
  • ChatHistory: Optimized factory methods to skip redundant deepcopy
    • from_messages() and from_chat_result() now use _from_trusted()
    • Avoids double copying when creating history from normalized messages
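
The ChatHistory optimization can be pictured as a private constructor path that adopts a list the factory has just built instead of copying it a second time. In this sketch only the names `from_messages()` and `_from_trusted()` come from the notes; the `History` class and its internals are assumptions.

```python
import copy
from typing import Any, Dict, List

Message = Dict[str, Any]


class History:
    def __init__(self, messages: List[Message]) -> None:
        # Public constructor: defensively copy caller-owned data.
        self._messages = copy.deepcopy(messages)

    @classmethod
    def _from_trusted(cls, messages: List[Message]) -> "History":
        """Adopt a list this module just built, skipping the deepcopy."""
        obj = cls.__new__(cls)  # bypass __init__ and its copy
        obj._messages = messages
        return obj

    @classmethod
    def from_messages(cls, raw: List[Message]) -> "History":
        # Normalization already copies each message once, so the trusted
        # path avoids copying the same data a second time.
        normalized = [copy.deepcopy(m) for m in raw]
        return cls._from_trusted(normalized)
```

The caller's input is still isolated from the history, but each message is copied once rather than twice.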

Code Deduplication

  • AsyncClientMixin: New mixin for async client management
    • Shared by Embed and Rerank classes
    • Provides _get_async_client(), aclose(), close() methods
    • Provides sync/async context manager support
    • Reduced duplicate code by ~36 lines

Exception Handling Improvements

  • Reduced broad exception catching: From 14 instances to 3
    • Remaining uses are intentional for user-provided callbacks
    • Added explanatory comments for all remaining except Exception blocks
  • More specific exception types:
    • validation.py: Now catches (TypeError, ValueError, AttributeError)
    • continuer.py: Now catches LexiluxError instead of Exception
    • conversation.py: Now catches LexiluxError for continuation methods
    • tokenizer.py: Now catches (OSError, ValueError) for filesystem errors

Documentation

  • AGENTS.md updates: Reflected new exception handling standards
    • Updated structure section with new modules
    • Updated exports section with new types
    • Removed outdated anti-pattern warnings
  • tests/AGENTS.md updates: Added new test files

Fixed

  • Test test_error_handling_return_partial now raises ServerError instead of generic Exception to match production behavior

Test Coverage

  • Overall coverage increased to 77.89% (target: 68%)
  • validation.py: 86%
  • continuer.py: 59%
  • All 555 tests passing

Release v2.5.0

26 Jan 22:12

Added

Streaming Tool Call Improvements

  • StreamingToolCall: New dataclass for representing incremental tool call data during streaming
    • index: Position of the tool call in the response
    • id: Tool call identifier
    • name: Function name
    • arguments_accumulated: Accumulated arguments string
    • arguments_delta: Latest chunk of arguments
    • is_complete: Whether arguments form valid JSON

Tool Call Accumulation

  • Enhanced SSEChatStreamParser: Now properly accumulates streaming tool calls across chunks
    • Maintains state for tool call IDs, names, and arguments during streaming
    • Parses tool call deltas incrementally from streaming responses
    • Validates accumulated arguments as complete JSON before emitting ToolCall objects
    • Supports multiple concurrent tool calls with proper index tracking
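
The accumulation described above can be sketched with a small dataclass and a merge loop. The field names follow the StreamingToolCall list in this release; treating `is_complete` as a computed property, the `accumulate` helper, and the OpenAI-style delta layout are assumptions for illustration.

```python
import json
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class StreamingToolCall:
    index: int
    id: Optional[str] = None
    name: Optional[str] = None
    arguments_accumulated: str = ""
    arguments_delta: str = ""

    @property
    def is_complete(self) -> bool:
        """True once the accumulated arguments parse as JSON."""
        try:
            json.loads(self.arguments_accumulated)
        except ValueError:
            return False
        return True


def accumulate(deltas: Iterable[dict]) -> Dict[int, StreamingToolCall]:
    """Merge OpenAI-style tool call deltas, keyed by their index."""
    calls: Dict[int, StreamingToolCall] = {}
    for delta in deltas:
        idx = delta["index"]
        call = calls.setdefault(idx, StreamingToolCall(index=idx))
        # id and name usually arrive only in the first delta for a call.
        call.id = delta.get("id") or call.id
        fn = delta.get("function", {})
        call.name = fn.get("name") or call.name
        call.arguments_delta = fn.get("arguments", "")
        call.arguments_accumulated += call.arguments_delta
    return calls
```

Keying by index is what lets several tool calls interleave in one stream without their argument fragments mixing.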

Changed

ChatStreamChunk

  • Now includes streaming_tool_calls field for incremental tool call data during streaming
  • Provides has_streaming_tool_calls property for checking if chunk contains tool call deltas

Fixed

Test Updates

  • Mock Path Alignment: Updated test mocks from requests.Session.post to requests.post
    • Aligns with the refactored BaseAPIClient that uses direct requests.post calls
    • Updated in test_chat_stream.py, test_chat_api_improvements.py, and test_chat_continue.py

Release v2.4.0

26 Jan 19:18

Changed

HTTP Client Simplification

  • Remove Connection Pooling: Simplified HTTP client by removing connection pooling
    • Each HTTP request now creates a new connection and closes it after completion
    • No more connection state management or pooling overhead
    • Removed pool_connections and pool_maxsize parameters from Chat.__init__
    • Removed connection_idle_timeout parameter and cleanup logic
    • Async client configured with max_connections=1, max_keepalive_connections=0
    • Removed connection cleanup scheduling from streaming iterators

Fixed

Bug Fixes

  • Assistant Messages with Tool Calls: Allow assistant messages with tool_calls to omit the content field
    • Previously required all messages to have a content field, even for tool-only responses
    • Now complies with OpenAI API specification (content can be null/omitted when tool_calls exist)
    • Content is automatically set to None for such messages
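
The relaxed rule can be expressed as a small validation step. This is a sketch of the behavior described above, with a hypothetical function name and error messages, not the library's validation code.

```python
from typing import Any, Dict


def normalize_message(message: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with content filled in where omission is allowed."""
    if "role" not in message:
        raise ValueError("message is missing 'role'")
    normalized = dict(message)
    if "content" not in normalized:
        # Only assistant messages that carry tool calls may omit content.
        if normalized["role"] == "assistant" and normalized.get("tool_calls"):
            normalized["content"] = None
        else:
            raise ValueError("message is missing 'content'")
    return normalized
```

A tool-only assistant message passes through with `content` set to None, while any other message without content is still rejected.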

Documentation

  • README Rewrite: Updated README with professional style, removed all emojis
  • Sphinx Documentation: Fixed all documentation build warnings and errors
    • Added async support documentation
    • Updated example references to numbered structure
    • Fixed API reference issues

Removed

Deprecated Parameters

  • pool_connections parameter from Chat.__init__ and ChatFactory.create()
  • pool_maxsize parameter from Chat.__init__ and ChatFactory.create()
  • connection_idle_timeout parameter from BaseAPIClient.__init__()

Release v2.2.0

15 Jan 06:11

🎯 Quality & Infrastructure Improvements

This release focuses on code quality, robustness, and developer experience improvements without breaking changes.

Added

Connection Pooling & Performance

  • Connection Pooling: All API clients now use HTTP connection pooling for better performance under high concurrency
    • Configurable via pool_connections and pool_maxsize parameters (default: 10 each)
    • Reduces connection overhead for repeated requests
    • Improves performance in high-throughput scenarios

Automatic Retry Logic

  • Retry with Exponential Backoff: Automatic retry for transient failures
    • Configurable via max_retries parameter (default: 0, disabled)
    • Retries on status codes: 429, 500, 502, 503, 504
    • Exponential backoff: 0.1s, 0.2s, 0.4s...
    • Helps recover from temporary network issues
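
The backoff schedule above doubles from a base of 0.1 seconds. A minimal, library-independent sketch of that policy follows; `HTTPStatusError`, `backoff_delays`, and `call_with_retry` are hypothetical names, and the injectable `sleep` argument exists only to make the example testable.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


class HTTPStatusError(Exception):
    """Hypothetical error carrying the response status code."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def backoff_delays(max_retries: int, base: float = 0.1) -> Iterator[float]:
    """Yield base * 2**attempt for each allowed retry."""
    for attempt in range(max_retries):
        yield base * (2 ** attempt)


def call_with_retry(
    fn: Callable[[], T],
    max_retries: int = 0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying retryable statuses up to max_retries times."""
    delays = backoff_delays(max_retries)
    while True:
        try:
            return fn()
        except HTTPStatusError as exc:
            delay = next(delays, None)
            # Give up on non-retryable statuses or when retries run out.
            if exc.status not in RETRYABLE_STATUS or delay is None:
                raise
            sleep(delay)
```

With the default `max_retries=0` the delay generator is empty, so the first failure is raised immediately, matching the "disabled by default" behavior.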

Enhanced Timeout Configuration

  • Separate Timeouts: Fine-grained timeout control for connection and read phases
    • connect_timeout_s: Connection establishment timeout (default: from timeout_s)
    • read_timeout_s: Data read timeout (default: from timeout_s)
    • Legacy timeout_s parameter still supported for backward compatibility
    • Allows different timeouts for connect vs read operations
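
The fallback rule (each specific timeout defaults to timeout_s) could be resolved into a (connect, read) pair, which is the tuple form that requests accepts for its timeout argument. The helper name and the 60-second default below are assumptions.

```python
from typing import Optional, Tuple


def resolve_timeouts(
    timeout_s: float = 60.0,  # assumed default, not taken from the library
    connect_timeout_s: Optional[float] = None,
    read_timeout_s: Optional[float] = None,
) -> Tuple[float, float]:
    """Return (connect, read), falling back to timeout_s for unset values."""
    connect = timeout_s if connect_timeout_s is None else connect_timeout_s
    read = timeout_s if read_timeout_s is None else read_timeout_s
    if connect <= 0 or read <= 0:
        raise ValueError("timeouts must be positive")
    return (connect, read)
```

Callers that only pass timeout_s get the same value for both phases, so legacy configurations keep their old behavior.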

Unified Exception Hierarchy

  • Custom Exception System: Complete exception hierarchy with error codes and retryable flags
    • LexiluxError - Base exception class for all Lexilux errors
    • AuthenticationError - Authentication/authorization failures (401, not retryable)
    • RateLimitError - Rate limit exceeded (429, retryable)
    • TimeoutError - Request timeouts (retryable)
    • ConnectionError - Connection failures (retryable)
    • ValidationError - Invalid input (400, not retryable)
    • NotFoundError - Resource not found (404, not retryable)
    • ServerError - Internal server errors (5xx, retryable)
    • InvalidRequestError - Alias for ValidationError
    • ConfigurationError - Client configuration issues (not retryable)
    • NetworkError - Base class for network issues
    • All exceptions have code, message, and retryable properties
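
To show how status codes, error codes, and retryable flags fit together, here is a stand-alone sketch of a reduced hierarchy with a status-to-exception mapping. The classes are local stand-ins that reuse the names listed above; the code strings marked as assumed and the `error_for_status` helper are not taken from the library.

```python
class LexiluxError(Exception):
    code = "lexilux_error"  # assumed code string
    retryable = False

    def __init__(self, message: str) -> None:
        super().__init__(message)
        self.message = message


class AuthenticationError(LexiluxError):
    code = "authentication_failed"


class RateLimitError(LexiluxError):
    code = "rate_limit_exceeded"
    retryable = True


class ValidationError(LexiluxError):
    code = "validation_error"  # assumed code string


class NotFoundError(LexiluxError):
    code = "not_found"  # assumed code string


class ServerError(LexiluxError):
    code = "server_error"  # assumed code string
    retryable = True


def error_for_status(status: int, message: str) -> LexiluxError:
    """Pick an exception class from an HTTP status code."""
    by_status = {
        400: ValidationError,
        401: AuthenticationError,
        404: NotFoundError,
        429: RateLimitError,
    }
    if status in by_status:
        return by_status[status](message)
    if 500 <= status <= 599:
        return ServerError(message)
    return LexiluxError(message)
```

Putting `retryable` on the class means retry logic can branch on one attribute instead of enumerating exception types.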

Logging & Monitoring

  • Request Logging: Comprehensive logging for debugging and monitoring
    • Logs request start, completion, timing, and errors
    • Uses appropriate log levels (DEBUG, INFO, WARNING, ERROR)
    • Enable with: import logging; logging.basicConfig(level=logging.INFO)
    • Helps with debugging and performance monitoring

BaseAPIClient Architecture

  • New Base Class: BaseAPIClient provides common HTTP functionality to all clients
    • Session management with connection pooling
    • Retry logic with exponential backoff
    • Configurable timeouts (connect/read)
    • Authentication handling
    • Error response parsing and exception mapping
    • Request logging and timing

Documentation

  • CONTRIBUTING.md: Comprehensive contribution guidelines
    • Code style guidelines (PEP 8, type hints, docstrings)
    • Commit message format (Conventional Commits)
    • PR workflow and checklist
    • Bug report and feature request templates
    • Coverage goals and test structure examples
  • docs/source/troubleshooting.rst: Troubleshooting guide for common issues
    • Installation issues (module not found, version conflicts)
    • Connection issues (timeout, connection refused)
    • Authentication issues (401, 403)
    • Rate limiting (429)
    • Streaming issues
    • Performance issues
    • Debugging techniques
    • Common errors reference table
  • TESTING.md: Testing documentation with coverage goals and guidelines
  • Updated Examples: error_handling_demo.py updated to use new exception hierarchy

CI/CD Improvements

  • Multi-Version Testing: CI now tests across Python 3.8-3.14 in separate jobs
  • Security Scanning: Automated vulnerability detection
    • pip-audit for dependency vulnerabilities
    • bandit for code security issues
    • Runs daily and on every push/PR
  • Pre-commit Hooks: Code quality checks before commits
    • ruff lint and format
    • trailing whitespace and file ending fixes
    • YAML syntax checking
  • Coverage Threshold: Minimum 60% code coverage enforced in CI
  • Separate Lint Job: Lint and format checks run in parallel with tests

Changed

Chat Client Improvements

  • BaseAPIClient Integration: Chat now inherits from BaseAPIClient for consistent HTTP behavior
    • All HTTP requests now use connection pooling
    • Network errors raise custom exceptions instead of raw requests exceptions
    • Consistent timeout and retry behavior across all clients

CI/CD Architecture

  • Enhanced Pipeline: Lint and format checks run in parallel with tests
    • Coverage uploaded only from Python 3.14 to save CI resources
    • Separate security workflow for dependency and code scanning

Fixed

Bug Fixes

  • ChatHistory Deep Copy Protection: Added deep copy to prevent external modifications to internal state
    • Messages list is deep copied during initialization
    • Prevents state pollution from external code modifying original input
    • Ensures history immutability
  • Error Message Extraction: API errors now extract and include detailed error messages from JSON response bodies
    • Supports OpenAI-style error format ({"error": {"message": "..."}})
    • Falls back to generic message if parsing fails
  • Timeout Handling: Network timeouts now raise TimeoutError instead of generic requests.exceptions.Timeout
  • Backward Compatibility: Added timeout_s property to Chat for backward compatibility with tuple timeout configuration
  • Test Mocks: Fixed test mocks to work with BaseAPIClient architecture
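
The error message extraction fix can be sketched as a defensive parse with a fallback. The function name and the default message are hypothetical; only the `{"error": {"message": "..."}}` shape comes from the notes above.

```python
import json


def extract_error_message(body: str, default: str = "API request failed") -> str:
    """Pull error.message out of an OpenAI-style JSON body, else default."""
    try:
        payload = json.loads(body)
    except ValueError:
        return default  # body was not JSON at all (for example an HTML page)
    if not isinstance(payload, dict):
        return default
    error = payload.get("error")
    if isinstance(error, dict) and isinstance(error.get("message"), str):
        return error["message"]
    return default
```

Every failure mode (invalid JSON, wrong top-level type, missing keys) falls through to the generic message rather than raising a second exception while handling the first.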

Code Quality

  • Fixed all ruff linting and formatting issues
  • Removed unused imports and variables
  • Corrected import order

Security

  • All dependency vulnerabilities now scanned in CI pipeline using pip-audit
  • Code security linted with bandit for common security issues
  • Security scan runs daily and on every push/PR
  • Findings from these scans were judged acceptable for a library context (non-critical use cases)

Migration Guide

Enabling Retry Logic

from lexilux import Chat

# Enable automatic retry with exponential backoff
chat = Chat(
    base_url="https://api.example.com/v1",
    api_key="your-key",
    max_retries=3,  # Automatically retry on transient failures
)

# Manual retry using retryable flag
from lexilux import LexiluxError
import time

max_retries = 3
for attempt in range(max_retries):
    try:
        result = chat("Hello, world!")
        break
    except LexiluxError as e:
        if e.retryable and attempt < max_retries - 1:
            time.sleep(2 ** attempt)  # Exponential backoff
        else:
            raise

Using New Exceptions

from lexilux import (
    Chat,
    AuthenticationError,
    RateLimitError,
    TimeoutError,
    LexiluxError,
)

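# Placeholder endpoint and key, as in the retry example above
chat = Chat(base_url="https://api.example.com/v1", api_key="your-key")

try: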
    result = chat("Hello, world!")
except AuthenticationError as e:
    print(f"Auth failed: {e.message}")
    print(f"Error code: {e.code}")  # "authentication_failed"
    print(f"Can retry: {e.retryable}")  # False
except RateLimitError as e:
    print(f"Rate limited: {e.message}")
    print(f"Error code: {e.code}")  # "rate_limit_exceeded"
    print(f"Can retry: {e.retryable}")  # True
except LexiluxError as e:
    print(f"Error: {e.code} - {e.message}")

Enabling Logging

import logging

# Enable INFO level logging
logging.basicConfig(level=logging.INFO)

from lexilux import Chat
chat = Chat(base_url="...", api_key="...")
result = chat("Hello")
# Logs: "Request completed in 0.52s with status 200: https://..."

Configuring Connection Pooling

from lexilux import Chat

chat = Chat(
    base_url="https://api.example.com/v1",
    api_key="your-key",
    pool_connections=20,  # Increase for high concurrency
    pool_maxsize=20,
)

Separate Timeouts

from lexilux import Chat

# New API: separate connect and read timeouts
chat = Chat(
    base_url="https://api.example.com/v1",
    api_key="your-key",
    connect_timeout_s=5,   # Connection timeout
    read_timeout_s=30,     # Read timeout
)

# Old API still works
chat = Chat(
    base_url="https://api.example.com/v1",
    api_key="your-key",
    timeout_s=30,  # Used for both connect and read
)

Developer Experience

  • Better error messages with error codes
  • Automatic retry reduces manual error handling
  • Logging helps with debugging
  • Comprehensive documentation for troubleshooting
  • Clear contribution guidelines
  • Automated quality checks (pre-commit, CI)

Performance

  • Connection pooling reduces overhead for repeated requests
  • Retry logic with exponential backoff improves reliability
  • Request timing via logging helps identify bottlenecks

Release v2.1.0

09 Jan 17:32

🎯 API Improvements: History Immutability & Customizable Continue Strategy

This minor version update introduces important API improvements focusing on immutability, clarity, and customization capabilities.

Changed

  • History Immutability: All methods that receive a history parameter now create a clone internally and never modify the original history object. This ensures:

    • No unexpected side effects
    • Thread-safe operations (multiple threads can use the same history)
    • Functional programming principles
    • Better predictability
  • chat.complete() and chat.complete_stream() history parameter: Now optional instead of required. If None, a new ChatHistory instance is created internally. This simplifies single-turn complete requests:

    # Before (v2.0.0)
    history = ChatHistory()
    result = chat.complete("Write JSON", history=history)
    
    # After (v2.1.0)
    result = chat.complete("Write JSON")  # No history needed for single-turn
  • API Clarity: Updated docstrings to clearly distinguish between:

    • chat() / chat.stream() → Single response (may be truncated)
    • chat.complete() / chat.complete_stream() → Complete response (guaranteed)

Added

  • Customizable Continue Strategy: Enhanced chat.complete() and ChatContinue.continue_request() with extensive customization options:

    • Custom continue prompt: Support for function-based prompts: continue_prompt: str | Callable
    • Progress tracking: on_progress callback for monitoring continuation progress
    • Request delay control: continue_delay parameter (fixed or random range)
    • Error handling strategies: on_error and on_error_callback for flexible error handling
    • Helper method: ChatContinue.needs_continue(result) to check if continuation is needed
  • Enhanced ChatContinue.continue_request() and continue_request_stream():

    • Support for all customization options (progress, delay, error handling)
    • History immutability (clones internally)
    • Better error handling and recovery
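
A continue_delay that may be either a fixed number or a (low, high) range can be resolved with a small helper. This is a sketch with a hypothetical function name; the optional `rng` argument is an assumption added so the random case can be seeded.

```python
import random
from typing import Optional, Tuple, Union

Delay = Union[float, Tuple[float, float]]


def resolve_delay(continue_delay: Delay, rng: Optional[random.Random] = None) -> float:
    """Return seconds to wait: the fixed value, or a draw from (low, high)."""
    if isinstance(continue_delay, tuple):
        low, high = continue_delay
        if low > high:
            raise ValueError("delay range must be (low, high) with low <= high")
        # Fall back to the module-level generator when no rng is supplied.
        return (rng or random).uniform(low, high)
    return float(continue_delay)
```

A random range spreads continuation requests out in time, which can help when several clients would otherwise retry in lockstep.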

Removed

  • Chat.continue_if_needed(): Removed in favor of chat.complete() which provides the same functionality with better API clarity.
  • Chat.continue_if_needed_stream(): Removed in favor of chat.complete_stream().

Migration Guide

Using continue_if_needed() → Use complete() instead

# Before (v2.0.0)
history = ChatHistory()
result = chat("Write JSON", history=history, max_tokens=100)
if result.finish_reason == "length":
    full_result = chat.continue_if_needed(result, history=history)

# After (v2.1.0)
result = chat.complete("Write JSON", max_tokens=100)  # Automatically handles truncation

History Immutability

# Before (v2.0.0) - history was modified
history = ChatHistory()
result = chat("Hello", history=history)
# history now contains: [user: "Hello", assistant: result.text]

# After (v2.1.0) - history is immutable, manual update needed for multi-turn
history = ChatHistory()
result = chat("Hello", history=history)
# history is unchanged, manually update if needed:
history.add_user("Hello")
history.append_result(result)

Custom Continue Strategy

# New in v2.1.0: Customizable continue behavior
def on_progress(count, max_count, current, all_results):
    print(f"🔄 Continuing {count}/{max_count}...")

def smart_prompt(count, max_count, current_text, original_prompt):
    return f"Please continue (attempt {count}/{max_count})"

result = chat.complete(
    "Write a long JSON",
    max_tokens=100,
    continue_prompt=smart_prompt,
    on_progress=on_progress,
    continue_delay=(1.0, 2.0),  # Random delay 1-2 seconds
    on_error="return_partial",  # Return partial on error
)

Release v2.0.0

06 Jan 19:22

🚀 Major Architecture Overhaul: Explicit History Management

This is a major version update with significant architectural changes. The core design philosophy has shifted from implicit to explicit history management, providing better control, predictability, and consistency.

Changed

Breaking Changes

  • Removed auto_history parameter: The auto_history parameter has been completely removed from Chat.__init__(). History management is now always explicit.

    • Migration: Create a ChatHistory instance and pass it explicitly to all methods:
      # Before (v0.5.x)
      chat = Chat(..., auto_history=True)
      result = chat("Hello")
      history = chat.get_history()
      
      # After (v2.0.0)
      history = ChatHistory()
      result = chat("Hello", history=history)
  • All Chat methods now take an explicit history parameter: Every method that interacts with history accepts a history argument. It is optional (ChatHistory | None) for __call__ and stream, and required for the others.

    • Chat.__call__(messages, *, history=None, **params)
    • Chat.stream(messages, *, history=None, **params)
    • Chat.complete(messages, *, history: ChatHistory, **params) (now required)
    • Chat.continue_if_needed(result, *, history: ChatHistory, **params) (now required)
    • ChatContinue.continue_request(chat, last_result, *, history: ChatHistory, **params) (now required)
    • ChatContinue.continue_request_stream(chat, last_result, *, history: ChatHistory, **params) (now required)
  • Removed history management methods from Chat class:

    • Chat.get_history() - Use explicit history parameter instead
    • Chat.clear_history() - Use history.clear() instead
    • Chat.clear_last_assistant_message() - Use history.remove_last() instead
  • Simplified ChatResult: ChatResult now only contains the result of a single LLM request, without any merged history information. This makes the API more predictable and easier to understand.

  • Unified Chat interface: All Chat methods now only accept a single turn's message, with history being managed explicitly. This ensures consistent behavior between streaming and non-streaming modes.

Added

  • Enhanced ChatHistory with MutableSequence protocol: ChatHistory now implements Python's collections.abc.MutableSequence protocol, enabling array-like operations:

    • Indexing: history[0] - Get message by index
    • Slicing: history[1:5] - Get slice as new ChatHistory instance
    • Iteration: for msg in history - Iterate over messages
    • Length: len(history) - Get number of messages
    • Membership: msg in history - Check if message exists
    • Assignment: history[0] = new_msg - Replace message at index
    • Deletion: del history[0] - Remove message at index
    • Insertion: history.insert(0, msg) - Insert message at index
  • New ChatHistory methods:

    • clone() - Create a deep copy of the history
    • __add__(other) - Merge two ChatHistory instances: history1 + history2
    • add_system(content) - Explicitly add or update system message
    • remove_last() - Remove the last message
    • remove_at(index) - Remove message at specific index
    • replace_at(index, message) - Replace message at specific index
    • get_user_messages() - Get all user message contents as a list
    • get_assistant_messages() - Get all assistant message contents as a list
    • get_last_message() - Get the last message dictionary
    • get_last_user_message() - Get the content of the last user message
  • Streaming complete functionality:

    • Chat.complete_stream(messages, *, history: ChatHistory, ...) - Streaming version of complete() that ensures complete responses with real-time chunk streaming
    • Supports progress callbacks (on_progress, on_continue_start, on_continue_end) for monitoring continuation progress
  • Streaming continue functionality:

    • ChatContinue.continue_request_stream(chat, last_result, *, history: ChatHistory, ...) - Stream continuation chunks in real-time
    • Automatically merges results from all continuation requests
    • Provides access to accumulated result via iterator.result.to_chat_result()
  • Convenience methods:

    • Chat.chat_with_history(history, message=None, **params) - Convenience method for chat with history
    • Chat.stream_with_history(history, message=None, **params) - Convenience method for streaming with history
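
Implementing collections.abc.MutableSequence only requires five methods; the ABC supplies the rest (append, extend, pop, iteration, membership). The toy class below sketches how the array-like operations listed under Added can fall out of that protocol. It is an illustration with assumed internals, not the ChatHistory source.

```python
from collections.abc import MutableSequence
from typing import Any, Dict, Iterable, List

Message = Dict[str, Any]


class History(MutableSequence):
    """Toy message container; append, extend, and `in` come from the ABC."""

    def __init__(self, messages: Iterable[Message] = ()) -> None:
        self._messages: List[Message] = list(messages)

    def __getitem__(self, index):
        if isinstance(index, slice):
            return History(self._messages[index])  # slices stay History objects
        return self._messages[index]

    def __setitem__(self, index, value) -> None:
        self._messages[index] = value

    def __delitem__(self, index) -> None:
        del self._messages[index]

    def __len__(self) -> int:
        return len(self._messages)

    def insert(self, index: int, value: Message) -> None:
        self._messages.insert(index, value)

    def __add__(self, other: "History") -> "History":
        # Merging returns a new object and leaves both operands untouched.
        return History([*self._messages, *other._messages])
```

For example, `History([...])[1:5]` returns another `History`, and `msg in history` works without any extra code because the ABC derives it from `__getitem__` and `__len__`.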

Improved

  • Explicit history management: All history operations are now explicit, making the API more predictable and easier to debug.
  • Consistency: Streaming and non-streaming modes now have identical behavior regarding history management.
  • Type safety: Better type hints and validation for history parameters.
  • Error handling: More precise error messages when history is required but not provided.
  • Documentation: Comprehensive documentation updates reflecting the new explicit history management approach.

Fixed

  • Fixed finish_reason propagation in streaming responses: Corrected issue where finish_reason could be incorrectly set to None in streaming responses, especially during continuation requests.
  • Fixed history update timing: User messages are now added to history before the API request, ensuring they are recorded even if the request fails.
  • Fixed empty result handling in continuation: Empty ChatResult objects are now properly filtered out during continuation merging.
  • Fixed docstring formatting: Resolved reStructuredText formatting issues in docstrings that caused documentation build warnings.

Removed

  • Removed auto_history parameter from Chat.__init__()
  • Removed Chat.get_history() method
  • Removed Chat.clear_history() method
  • Removed Chat.clear_last_assistant_message() method
  • Removed obsolete test files: test_chat_auto_history.py, test_chat_new_features.py (replaced with v2.0 tests)
  • Removed obsolete documentation: auto_history.rst and related examples

Documentation

  • Comprehensive documentation updates: All documentation has been updated to reflect the new explicit history management approach
  • Migration guide: Detailed examples showing how to migrate from v0.5.x to v2.0.0
  • New examples: Updated all examples to use the new explicit history API
  • API reference: Updated API reference documentation for all changed methods

Testing

  • New comprehensive test suite: Created new test files for v2.0.0 API:
    • test_chat_v2.py - Tests for Chat client's v2.0.0 API
    • test_chat_history_v2.py - Tests for ChatHistory's MutableSequence protocol and new methods
    • test_chat_continue_v2.py - Tests for ChatContinue's v2.0.0 API
    • test_chat_streaming_continue_v2.py - Tests for streaming continue edge cases
    • test_chat_integration_v2.py - Integration tests for v2.0.0 API
  • All tests passing: Comprehensive test coverage ensuring correctness of the new architecture
