monolith

Multi-source OSINT data collection and parallel processing tool. Indexes 4chan, Discord and IRC, reorganizes the data into a common format, annotates each message with language, sentiment and tokens across multiple threads, and writes the results to Elasticsearch.
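
The flow described above (collect, convert to a common format, annotate in threads, output) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the record fields, the `TEXT_FIELD` mapping, the word lists and the function names are all hypothetical, the sentiment scoring is a toy stand-in for a real NLP step, and the results are returned as a plain list instead of being written to Elasticsearch.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical raw records; field names are assumptions, not the project's schema.
RAW = [
    {"src": "4chan", "com": "Hello world", "no": 1},
    {"src": "irc", "msg": "good morning all", "nick": "alice"},
    {"src": "discord", "content": "this is bad", "author": "bob"},
]

# Which field holds the message text for each source (assumed for illustration).
TEXT_FIELD = {"4chan": "com", "irc": "msg", "discord": "content"}

# Toy word lists standing in for a real sentiment model.
POSITIVE = {"good", "great"}
NEGATIVE = {"bad", "awful"}


def normalise(raw):
    """Map a source-specific record onto one common shape."""
    return {"src": raw["src"], "text": raw[TEXT_FIELD[raw["src"]]]}


def annotate(doc):
    """Add whitespace tokens and a word-list sentiment score to a document."""
    tokens = doc["text"].lower().split()
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)
    return {**doc, "tokens": tokens, "sentiment": score}


def process(raw_records, threads=2):
    """Normalise every record, then annotate the batch in a thread pool."""
    docs = [normalise(r) for r in raw_records]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(annotate, docs))


results = process(RAW)
```

The `threads` argument plays the role that a setting such as `PROCESS_THREADS` might play in the real tool; how the project actually wires that setting in is not shown here.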

Mark Veidemanis 808ed18b74 Switch quickstart setting to nano		2022-10-04 20:37:02 +01:00
docker	Switch quickstart setting to nano	2022-10-04 20:37:02 +01:00
legacy	Use only one Redis key for the queue to make chunk size more precise for thread allocation	2022-09-30 07:22:22 +01:00
processing	Time stuff and switch to gensim for tokenisation	2022-10-01 14:46:45 +01:00
schemas	Implement threshold writing to Redis and manticore ingesting from Redis	2022-09-07 07:20:30 +01:00
sources	Remove commented debug code	2022-09-30 07:22:22 +01:00
.gitignore	Add config directories to gitignore	2022-09-08 09:45:18 +01:00
.pre-commit-config.yaml	Reinstate Redis cache	2022-09-04 21:38:53 +01:00
db.py	Remove commented debug code	2022-09-30 07:22:22 +01:00
docker-compose.yml	Add persistent Redis data store and copy over Druid config to production	2022-10-04 20:26:58 +01:00
env.example	Document new PROCESS_THREADS setting in example file	2022-09-20 22:43:04 +01:00
environment	Switch quickstart setting to nano	2022-10-04 20:37:02 +01:00
event_log.txt	Implement sentiment/NLP annotation and optimise processing	2022-09-16 17:09:49 +01:00
monolith.py	Use only one Redis key for the queue to make chunk size more precise for thread allocation	2022-09-30 07:22:22 +01:00
requirements.txt	Time stuff and switch to gensim for tokenisation	2022-10-01 14:46:45 +01:00
util.py	Implement sentiment/NLP annotation and optimise processing	2022-09-16 17:09:49 +01:00