Pathogen / monolith
Multi-source OSINT data collection and parallel processing tool. Indexes 4chan, Discord and IRC, reorganizes the data into a common format, annotates language, sentiment and tokens in multiple threads, and outputs the results to Elasticsearch.
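The pipeline described above (reorganize per-source records into a common format, then annotate them in multiple threads) can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the field names in `FIELD_MAPS`, the helpers `normalize`, `annotate`, and `process`, and the sample records are all hypothetical, and the token step is a simple whitespace split standing in for the real language, sentiment, and token annotation. Writing the results to Elasticsearch is left out.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-source field names (assumption); the real schemas live in
# the repository's schemas/ directory and are not shown on this page.
FIELD_MAPS = {
    "4chan": {"com": "msg", "no": "id", "time": "ts"},
    "discord": {"content": "msg", "message_id": "id", "timestamp": "ts"},
    "irc": {"text": "msg", "line_id": "id", "time": "ts"},
}


def normalize(source, raw):
    """Rename source-specific keys to one shared set of keys."""
    mapping = FIELD_MAPS[source]
    doc = {common: raw[native] for native, common in mapping.items() if native in raw}
    doc["src"] = source
    return doc


def annotate(doc):
    """Attach tokens to a copy of the document.

    A lowercase whitespace split is used here as a placeholder (assumption)
    for the language, sentiment, and token annotation the description mentions.
    """
    annotated = dict(doc)
    annotated["tokens"] = doc.get("msg", "").lower().split()
    return annotated


def process(batch, workers=4):
    """Normalize and annotate a batch of (source, raw_record) pairs in threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input batch.
        return list(pool.map(lambda item: annotate(normalize(*item)), batch))


# Made-up sample records for illustration only.
batch = [
    ("4chan", {"com": "Hello world", "no": 1, "time": 1662358830}),
    ("irc", {"text": "Second message", "line_id": 7, "time": 1662358831}),
]
docs = process(batch)
```

Each resulting dictionary shares the keys `msg`, `id`, `ts`, `src`, and `tokens` regardless of source, which is what would let a single index mapping cover all three inputs.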
8 commits, 1 branch, 1 tag, 8 MiB
Languages: Python 99.1%, Dockerfile 0.4%, Makefile 0.3%, Shell 0.2%
Latest commit: b8d2ecc009 by Mark Veidemanis, 2022-09-05 07:20:30 +01:00: Make crawler more efficient and implement configurable parameters
| File | Last commit message | Date |
| --- | --- | --- |
| docker | Split thread list into chunks to save memory | 2022-09-05 07:20:30 +01:00 |
| schemas | Reformat code | 2022-09-04 21:40:04 +01:00 |
| sources | Make crawler more efficient and implement configurable parameters | 2022-09-05 07:20:30 +01:00 |
| .gitignore | Reinstate Redis cache | 2022-09-04 21:38:53 +01:00 |
| .pre-commit-config.yaml | Reinstate Redis cache | 2022-09-04 21:38:53 +01:00 |
| db.py | Make crawler more efficient and implement configurable parameters | 2022-09-05 07:20:30 +01:00 |
| docker-compose.yml | Run processing in thread | 2022-09-04 21:29:00 +01:00 |
| monolith.py | Reformat code | 2022-09-04 21:40:04 +01:00 |
| requirements.txt | Run processing in thread | 2022-09-04 21:29:00 +01:00 |
| util.py | Run processing in thread | 2022-09-04 21:29:00 +01:00 |
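
One of the commit messages in the listing reads "Split thread list into chunks to save memory". The general idea can be sketched as below. This is an assumption about the technique, not the repository's implementation: the helper name `chunked`, the chunk size, and the sample thread IDs are all hypothetical.

```python
def chunked(items, size):
    """Yield successive slices of `items`, each holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Made-up thread IDs for illustration only.
thread_ids = list(range(1, 11))

# Collected into a list here only so the result can be inspected.
chunks = list(chunked(thread_ids, 4))
```

In a crawler, iterating over `chunked(...)` directly and handling one chunk at a time means only that chunk's fetched data needs to be held in memory at once.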