feat(lightrag): Add document status tracking and checkpoint support

- Add DocStatus enum and DocProcessingStatus class for document processing state management
- Implement JsonDocStatusStorage for persistent status storage
- Add document-level deduplication in batch processing
- Add checkpoint support in the ainsert method so that document processing can be resumed
- Add status query methods for monitoring processing progress
- Update LightRAG initialization to support document status tracking
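
The status tracking, deduplication, and checkpoint behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: only the names DocStatus and DocProcessingStatus come from the commit message, while the enum members, the dataclass fields, the `doc_id` helper, and the `JsonStatusStore` class are assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass
from enum import Enum
from pathlib import Path


class DocStatus(str, Enum):
    # Assumed states; the commit message does not list the actual members.
    PENDING = "pending"
    PROCESSED = "processed"
    FAILED = "failed"


@dataclass
class DocProcessingStatus:
    # Assumed fields, for illustration only.
    content_summary: str
    status: DocStatus = DocStatus.PENDING


def doc_id(content: str) -> str:
    """Hypothetical helper: derive a stable ID from the document text."""
    return "doc-" + hashlib.md5(content.strip().encode("utf-8")).hexdigest()


class JsonStatusStore:
    """Hypothetical stand-in for a JSON-backed status storage."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.data = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.data = {}

    def filter_new(self, docs):
        # Deduplicate inside the batch, then drop documents already processed.
        unique = {doc_id(d): d for d in docs}
        done = DocStatus.PROCESSED.value
        return {
            key: text
            for key, text in unique.items()
            if self.data.get(key, {}).get("status") != done
        }

    def mark(self, key, content, status):
        # Persist after every update so an interrupted run can resume.
        record = DocProcessingStatus(content[:100], status)
        self.data[key] = {
            "content_summary": record.content_summary,
            "status": record.status.value,
        }
        self.path.write_text(
            json.dumps(self.data, ensure_ascii=False, indent=2), encoding="utf-8"
        )
```

Because the status file is written after each document, a rerun over the same batch would skip everything already marked as processed and continue with the remainder.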
Magic_yuan
2024-12-28 00:11:25 +08:00
parent c022db4355
commit 650b8e38b7
4 changed files with 256 additions and 61 deletions

@@ -1,7 +1,8 @@
import asyncio
import inspect
import json
-import os, sys
+import os
+import sys
from contextlib import asynccontextmanager
from dataclasses import dataclass
from typing import Any, Dict, List, NamedTuple, Optional, Tuple, Union
@@ -22,8 +23,10 @@ from ..base import BaseGraphStorage
if sys.platform.startswith("win"):
    import asyncio.windows_events
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
class AGEQueryException(Exception):
    """Exception for the AGE queries."""