Add document scan API notes in API README.md

2025-01-19 12:24:46 +08:00
parent 3a227701b2
commit a7b37652cf
1 changed files with 15 additions and 4 deletions
--- a/lightrag/api/README.md
+++ b/lightrag/api/README.md
@@ -17,6 +17,7 @@ git clone https://github.com/HKUDS/lightrag.git
 # Change to the repository directory
 cd lightrag

+# Create a Python virtual environment if necessary
 # Install in editable mode with API support
 pip install -e ".[api]"
 ```
@@ -309,6 +310,16 @@ curl -X POST "http://localhost:9621/documents/batch" \
    -F "files=@/path/to/doc2.txt"
 ```

+#### POST /documents/scan
+
+Trigger a document scan for new files in the input directory.
+
+```bash
+curl -X POST "http://localhost:9621/documents/scan" --max-time 1800
+```
+
+> Adjust `--max-time` according to the estimated indexing time for all new files.
+
 ### Ollama Emulation Endpoints

 #### GET /api/version
@@ -391,15 +402,15 @@ You can test the API endpoints using the provided curl commands or through the S
 2. Start the RAG server
 3. Upload some documents using the document management endpoints
 4. Query the system using the query endpoints
+5. Trigger a document scan if new files are added to the input directory

 ### Important Features

 #### Automatic Document Vectorization
 When starting any of the servers with the `--input-dir` parameter, the system will automatically:
-1. Scan the specified directory for documents
-2. Check for existing vectorized content in the database
-3. Only vectorize new documents that aren't already in the database
-4. Make all content immediately available for RAG queries
+1. Check for existing vectorized content in the database
+2. Only vectorize new documents that aren't already in the database
+3. Make all content immediately available for RAG queries

 This intelligent caching mechanism:
 - Prevents unnecessary re-vectorization of existing documents