Multi-source OSINT data collection and parallel processing tool. Indexes 4chan, Discord and IRC, reorganizes the data into a common format, annotates each message with language, sentiment and tokens across multiple threads, and outputs the results to Elasticsearch.
Latest commit: e5b5268f5c, "Add example Druid spec" (Mark Veidemanis, 2022-10-21 07:20:30 +01:00)
Name | Last commit | Date
docker | Remove unused ssdb_data volume | 2022-10-21 07:20:30 +01:00
legacy | Switch to SSDB for message queueing | 2022-10-21 11:53:29 +01:00
processing | Print the length of the flattened list in debug message | 2022-10-21 07:20:30 +01:00
schemas | Implement threshold writing to Redis and manticore ingesting from Redis | 2022-09-07 07:20:30 +01:00
sources | Switch to SSDB for message queueing | 2022-10-21 11:53:29 +01:00
.gitignore | Update gitignore | 2022-10-21 11:53:28 +01:00
.pre-commit-config.yaml | Reinstate Redis cache | 2022-09-04 21:38:53 +01:00
Makefile | Clean up docker environment | 2022-10-19 16:45:18 +01:00
db.py | Switch to SSDB for message queueing | 2022-10-21 11:53:29 +01:00
docker-compose.yml | Clean up docker environment | 2022-10-19 16:45:18 +01:00
druid-spec.json | Add example Druid spec | 2022-10-21 07:20:30 +01:00
env.example | Document new PROCESS_THREADS setting in example file | 2022-09-20 22:43:04 +01:00
environment | Clean up docker environment | 2022-10-19 16:45:18 +01:00
monolith.py | Reformat | 2022-09-30 15:23:00 +01:00
requirements.txt | Time stuff and switch to gensim for tokenisation | 2022-10-01 14:46:45 +01:00
util.py | Implement sentiment/NLP annotation and optimise processing | 2022-09-16 17:09:49 +01:00
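
The pipeline summarised in the description (collect from several sources, reorganise into a common format, annotate in multiple threads) could look roughly like the sketch below. This is not the repository's code. The field maps, the `normalise`, `annotate` and `process` functions, and the default of 4 workers are all hypothetical, and `PROCESS_THREADS` is only assumed to be an integer worker count. Whitespace tokenisation stands in for the real annotation step; language and sentiment detection are left out.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Assumption: PROCESS_THREADS holds an integer worker count; the default is made up.
PROCESS_THREADS = int(os.getenv("PROCESS_THREADS", "4"))

# Hypothetical per-source field names mapped onto a hypothetical common schema.
FIELD_MAPS = {
    "4chan": {"msg": "comment", "user": "name", "ts": "time"},
    "discord": {"msg": "content", "user": "author", "ts": "timestamp"},
    "irc": {"msg": "text", "user": "nick", "ts": "ts"},
}


def normalise(source, raw):
    """Copy source-specific fields into the common format and tag the source."""
    mapping = FIELD_MAPS[source]
    common = {field: raw.get(key) for field, key in mapping.items()}
    common["src"] = source
    return common


def annotate(message):
    """Stand-in annotation: lowercase whitespace tokens and their count."""
    text = message.get("msg") or ""
    tokens = text.lower().split()
    return {**message, "tokens": tokens, "num_tokens": len(tokens)}


def process(batch, threads=PROCESS_THREADS):
    """Normalise and annotate (source, raw_record) pairs in a thread pool."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves input order, so results line up with the batch.
        return list(pool.map(lambda item: annotate(normalise(*item)), batch))
```

A call such as `process([("irc", {"text": "Hello World", "nick": "a", "ts": 1})])` returns a list of dictionaries in the common format, ready to be handed to whatever writes to the output store.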