lightrag/README.md
2024-10-29 15:46:48 +08:00

Quick start

Currently, the test supports the following file types: pptx, pdf, csv, Word, and txt.

  • Install textract
pip install textract
  • Example
import textract
# Path of the file to extract text from
file_path = 'path/to/your/file.pdf'
# Extract the text from the file (textract.process returns bytes)
text_content = textract.process(file_path)
# Print the extracted text
print(text_content.decode('utf-8'))
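
The example above can be wrapped in a small helper that rejects unsupported file types before calling textract. This is only a minimal sketch: the names `SUPPORTED_EXTENSIONS`, `is_supported`, and `extract_text` are hypothetical and not part of textract or this repository, and mapping "Word" to `.docx` is an assumption.

```python
from pathlib import Path

# Extensions assumed to match the file types listed above.
# Treating "Word" as .docx is an assumption, not something the README states.
SUPPORTED_EXTENSIONS = {".pptx", ".pdf", ".csv", ".docx", ".txt"}


def is_supported(file_path: str) -> bool:
    """Return True if the file extension is in the assumed supported set."""
    return Path(file_path).suffix.lower() in SUPPORTED_EXTENSIONS


def extract_text(file_path: str, encoding: str = "utf-8") -> str:
    """Extract text from a supported file and return it as a str."""
    if not is_supported(file_path):
        raise ValueError(f"Unsupported file type: {file_path}")
    # Imported here so the extension check works even if textract is missing.
    import textract
    # textract.process returns bytes, so decode before returning.
    return textract.process(file_path).decode(encoding)
```

Calling `extract_text('path/to/your/file.pdf')` should then return the decoded text, while a path such as `image.png` raises `ValueError` without touching textract.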