monolith

Multi-source OSINT data collection and parallel processing tool. It indexes 4chan, Discord and IRC, reorganizes the data into a common format, annotates each message with language, sentiment and tokens across multiple threads, and outputs the results to Elasticsearch.
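The pipeline described above (collect from several sources, map onto a common format, annotate in multiple threads) can be illustrated with a minimal sketch. This is not the project's actual code: the `Message` class, the `normalise`, `annotate` and `process` functions, the short source tags and the raw field names are all hypothetical, and the annotation step is reduced to naive whitespace tokenisation instead of real language, sentiment or gensim processing. The Elasticsearch output step is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Message:
    """Hypothetical common format shared by all sources."""
    src: str
    channel: str
    text: str
    tokens: list = field(default_factory=list)


def normalise(src, raw):
    """Map a source-specific record onto the common Message format.

    The source tags and raw field names are illustrative assumptions.
    """
    if src == "4ch":
        return Message(src, raw["board"], raw["com"])
    if src == "dis":
        return Message(src, raw["channel"], raw["content"])
    if src == "irc":
        return Message(src, raw["target"], raw["msg"])
    raise ValueError(f"unknown source: {src}")


def annotate(message):
    """Stand-in annotation step: naive whitespace tokenisation only."""
    message.tokens = message.text.lower().split()
    return message


def process(batch, threads=4):
    """Normalise and annotate (source, raw) pairs in a thread pool."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map returns results in the same order as the input batch.
        return list(pool.map(lambda item: annotate(normalise(*item)), batch))


batch = [
    ("4ch", {"board": "g", "com": "Hello world"}),
    ("irc", {"target": "#chan", "msg": "Another Message here"}),
]
results = process(batch)
```

In a setup like this, the `threads` argument would play the role of a setting such as the `PROCESS_THREADS` option mentioned in the file listing, although how the project actually uses that setting is not shown here.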

Mark Veidemanis 06e80a9759 Time stuff and switch to gensim for tokenisation		2022-10-01 14:46:45 +01:00
docker	Time stuff and switch to gensim for tokenisation	2022-10-01 14:46:45 +01:00
legacy	Use only one Redis key for the queue to make chunk size more precise for thread allocation	2022-09-30 07:22:22 +01:00
processing	Time stuff and switch to gensim for tokenisation	2022-10-01 14:46:45 +01:00
schemas	Implement threshold writing to Redis and manticore ingesting from Redis	2022-09-07 07:20:30 +01:00
sources	Remove commented debug code	2022-09-30 07:22:22 +01:00
.gitignore	Add config directories to gitignore	2022-09-08 09:45:18 +01:00
.pre-commit-config.yaml	Reinstate Redis cache	2022-09-04 21:38:53 +01:00
db.py	Remove commented debug code	2022-09-30 07:22:22 +01:00
docker-compose.yml	Use only one Redis key for the queue to make chunk size more precise for thread allocation	2022-09-30 07:22:22 +01:00
env.example	Document new PROCESS_THREADS setting in example file	2022-09-20 22:43:04 +01:00
environment	Fix indexer options	2022-09-22 17:39:18 +01:00
event_log.txt	Implement sentiment/NLP annotation and optimise processing	2022-09-16 17:09:49 +01:00
monolith.py	Reformat	2022-09-30 15:23:00 +01:00
requirements.txt	Time stuff and switch to gensim for tokenisation	2022-10-01 14:46:45 +01:00
util.py	Implement sentiment/NLP annotation and optimise processing	2022-09-16 17:09:49 +01:00