Commit Graph

36 Commits

Author SHA1 Message Date
Mark Veidemanis 508b00e471 Pre-create meta index 2022-11-23 19:02:31 +00:00
Mark Veidemanis 371bce1094 Remove print statements 2022-11-22 21:43:56 +00:00
Mark Veidemanis be0cf231b4 Fix mapping and make Threshold talk to SSDB 2022-11-22 21:42:35 +00:00
Mark Veidemanis c53438d07b Remove port variable 2022-11-22 20:17:51 +00:00
Mark Veidemanis 49f46c33ba Fully implement Elasticsearch indexing 2022-11-22 20:15:02 +00:00
Mark Veidemanis 2d7b6268dd Don't shadow previous iterator variable 2022-10-21 07:20:30 +01:00
Mark Veidemanis f774f4c2d2 Add some environment variables to control debug output 2022-10-21 07:20:30 +01:00
Mark Veidemanis e32b330ef4 Switch to SSDB for message queueing 2022-10-21 11:53:29 +01:00
Mark Veidemanis ab5e85c5c6 Begin switching away from Redis 2022-10-21 11:14:51 +01:00
Mark Veidemanis 7482064aee Clean up docker environment 2022-10-19 16:45:18 +01:00
Mark Veidemanis 5c91f1af87 Remove commented debug code 2022-09-30 07:22:22 +01:00
Mark Veidemanis 02ff44a6f5 Use only one Redis key for the queue to make chunk size more precise for thread allocation 2022-09-30 07:22:22 +01:00
Mark Veidemanis 09fc63d0ad Make debug output cleaner 2022-09-22 17:39:29 +01:00
Mark Veidemanis 5ebae02bf2 Remove commented code for debugging 2022-09-21 10:02:05 +01:00
Mark Veidemanis ced3a251b2 Normalise fields in processing and remove invalid characters 2022-09-21 10:01:12 +01:00
Mark Veidemanis 2763e52e6b Don't muddle up the topics when sending Kafka batches 2022-09-20 23:03:02 +01:00
Mark Veidemanis d4b8e11525 Reformat comment 2022-09-18 13:02:06 +01:00
Mark Veidemanis 38d00f2c21 Implement restricted sources 2022-09-18 13:01:19 +01:00
Mark Veidemanis a89b5a8b6f Implement sentiment/NLP annotation and optimise processing 2022-09-16 17:09:49 +01:00
Mark Veidemanis f432e9b29e Properly process Redis buffered messages and ingest into Kafka 2022-09-14 18:32:32 +01:00
Mark Veidemanis c5f01c3084 Ingest into Kafka and queue messages better 2022-09-13 22:17:46 +01:00
Mark Veidemanis fd90c233c2 Begin implementing Apache Druid 2022-09-08 07:20:30 +01:00
Mark Veidemanis 04b5dec843 Treat text fields as string and try beta Kibana image 2022-09-12 08:27:13 +01:00
Mark Veidemanis 92475ee9a9 Add 4chan update message type to main types 2022-09-07 07:20:30 +01:00
Mark Veidemanis 5c3b338017 Implement threshold writing to Redis and manticore ingesting from Redis 2022-09-07 07:20:30 +01:00
Mark Veidemanis e79de2b377 Add aioredis 2022-09-08 09:44:27 +01:00
Mark Veidemanis 79b1bee9e4 Implement ingesting to Redis from Threshold 2022-09-07 07:20:30 +01:00
Mark Veidemanis ddcfa614ad Remove some debugging code 2022-09-05 07:20:30 +01:00
Mark Veidemanis d1c6bd1fb5 Reformat and set the net and channel for 4chan 2022-09-05 07:20:30 +01:00
Mark Veidemanis b8d2ecc009 Make crawler more efficient and implement configurable parameters 2022-09-05 07:20:30 +01:00
Mark Veidemanis f8fc5e1a1b Split thread list into chunks to save memory 2022-09-05 07:20:30 +01:00
Mark Veidemanis 6e00f70184 Reformat code 2022-09-04 21:40:04 +01:00
Mark Veidemanis 60c43b4eb5 Run processing in thread 2022-09-04 21:29:00 +01:00
Mark Veidemanis db23b31f30 Implement aiohttp 2022-09-04 19:44:25 +01:00
Mark Veidemanis f7860bf08b Begin implementing aiohttp 2022-09-04 13:47:32 +01:00
Mark Veidemanis 734a2b7879 Implement running Discord and 4chan gathering simultaneously 2022-09-02 22:30:45 +01:00