Merge pull request #164 from MrGidea/main
Multiple file types support input
17
README.md
@@ -22,6 +22,7 @@ This repository hosts the code of LightRAG. The structure of this code is based
</div>

## 🎉 News
- [x] [2024.10.29]🎯🎯📢📢Multiple file types are now supported via `textract`.
- [x] [2024.10.20]🎯🎯📢📢We’ve added a new feature to LightRAG: Graph Visualization.
- [x] [2024.10.18]🎯🎯📢📢We’ve added a link to a [LightRAG Introduction Video](https://youtu.be/oageL-1I0GE). Thanks to the author!
- [x] [2024.10.17]🎯🎯📢📢We have created a [Discord channel](https://discord.gg/mvsfu2Tg)! Welcome to join for sharing and discussions! 🎉🎉
@@ -285,6 +286,19 @@ with open("./newText.txt") as f:
    rag.insert(f.read())
```

### Multi-file Type Support

`textract` supports reading file types such as TXT, DOCX, PPTX, CSV, and PDF.

```python
import textract

file_path = 'TEXT.pdf'
# textract.process returns the extracted text as bytes
text_content = textract.process(file_path)

# Decode to str before inserting into the existing LightRAG instance
rag.insert(text_content.decode('utf-8'))
```
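When several documents of different types need to be inserted, the same pattern can be wrapped in a small helper. The following is a minimal sketch, not part of LightRAG: `load_texts` and its `extract` parameter are hypothetical names introduced for illustration, and the only library call assumed is `textract.process`, which returns bytes. Files that yield no text are skipped.

```python
from typing import Callable, Iterable, List, Optional


def _textract_extract(path: str) -> bytes:
    # Imported lazily so the helper can be used with another extractor
    # when textract is not installed.
    import textract
    return textract.process(path)


def load_texts(
    paths: Iterable[str],
    extract: Optional[Callable[[str], bytes]] = None,
    encoding: str = "utf-8",
) -> List[str]:
    """Extract and decode text from each path, skipping empty results."""
    extract = extract or _textract_extract
    texts = []
    for path in paths:
        raw = extract(str(path))
        # Replace undecodable bytes instead of failing on a single bad file.
        text = raw.decode(encoding, errors="replace").strip()
        if text:
            texts.append(text)
    return texts


# Hypothetical usage with an existing LightRAG instance named `rag`:
# for text in load_texts(["TEXT.pdf", "notes.docx"]):
#     rag.insert(text)
```

Passing the extractor as a parameter keeps the helper easy to try out with a stub function, and leaves room to swap in a different extraction library later.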

### Graph Visualization

<details>
@@ -863,3 +877,6 @@ archivePrefix={arXiv},
primaryClass={cs.IR}
}
```