Adjust concurrency limits to more LLM-friendly settings for newcomers

- Lowered max async LLM processes to 4
- Enabled LLM cache for entity extraction
- Reduced max parallel insert to 2
yangdx
2025-03-16 23:56:34 +08:00
parent 9d971e5889
commit c2ba7f33ff
5 changed files with 7 additions and 6 deletions

@@ -224,7 +224,7 @@ LightRAG supports binding to various LLM/Embedding backends:
Use the environment variable `LLM_BINDING` or the CLI argument `--llm-binding` to select the LLM backend type. Use the environment variable `EMBEDDING_BINDING` or the CLI argument `--embedding-binding` to select the embedding backend type.
### Entity Extraction Configuration
-* ENABLE_LLM_CACHE_FOR_EXTRACT: Enable LLM cache for entity extraction (default: false)
+* ENABLE_LLM_CACHE_FOR_EXTRACT: Enable LLM cache for entity extraction (default: true)
It is common to set `ENABLE_LLM_CACHE_FOR_EXTRACT` to true in test environments to reduce the cost of LLM calls.

@@ -364,7 +364,7 @@ def parse_args(is_uvicorn_mode: bool = False) -> argparse.Namespace:
# Inject LLM cache configuration
args.enable_llm_cache_for_extract = get_env_value(
-"ENABLE_LLM_CACHE_FOR_EXTRACT", False, bool
+"ENABLE_LLM_CACHE_FOR_EXTRACT", True, bool
)
# Select Document loading tool (DOCLING, DEFAULT)
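
The `get_env_value` call in this hunk takes a variable name, a default, and a target type. Its implementation is not part of the diff, so the following is only a minimal sketch of how such a helper might behave. The function name `read_env_value` and the set of strings treated as true are assumptions for illustration, not LightRAG's actual code.

```python
import os
from typing import Any, Callable

# Hypothetical stand-in for a helper with the same call shape as get_env_value.
# Assumption: these strings (case-insensitive) count as boolean true.
TRUTHY = ("true", "1", "yes", "t", "on")


def read_env_value(name: str, default: Any, value_type: Callable[[str], Any] = str) -> Any:
    """Return the environment variable converted to value_type, or default."""
    raw = os.environ.get(name)
    if raw is None:
        # Variable is unset: fall back to the default (True after this commit).
        return default
    if value_type is bool:
        # bool("false") would be True in Python, so compare strings explicitly.
        return raw.strip().lower() in TRUTHY
    try:
        return value_type(raw)
    except ValueError:
        # Unparseable value: fall back to the default rather than crash.
        return default


if __name__ == "__main__":
    # With the variable unset, the new default enables the extraction cache.
    os.environ.pop("ENABLE_LLM_CACHE_FOR_EXTRACT", None)
    print(read_env_value("ENABLE_LLM_CACHE_FOR_EXTRACT", True, bool))

    # Setting the variable explicitly overrides the default.
    os.environ["ENABLE_LLM_CACHE_FOR_EXTRACT"] = "false"
    print(read_env_value("ENABLE_LLM_CACHE_FOR_EXTRACT", True, bool))
```

Under these assumptions, a user who wants the previous behavior would set `ENABLE_LLM_CACHE_FOR_EXTRACT=false` in the environment instead of relying on the default.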