Commit Graph

25 Commits

SHA1 Message Date
5c91f1af87 Remove commented debug code 2022-09-30 07:22:22 +01:00
02ff44a6f5 Use only one Redis key for the queue to make chunk size more precise for thread allocation 2022-09-30 07:22:22 +01:00
a2f88e29e6 Implement uvloop 2022-09-23 07:20:30 +01:00
f0df3e80fd Print Ingest settings on start 2022-09-23 08:32:29 +01:00
5ebae02bf2 Remove commented code for debugging 2022-09-21 10:02:05 +01:00
ced3a251b2 Normalise fields in processing and remove invalid characters 2022-09-21 10:01:12 +01:00
2763e52e6b Don't muddle up the topics when sending Kafka batches 2022-09-20 23:03:02 +01:00
40a0c2d22e Make performance settings configurable 2022-09-20 22:22:13 +01:00
a89b5a8b6f Implement sentiment/NLP annotation and optimise processing 2022-09-16 17:09:49 +01:00
f432e9b29e Properly process Redis buffered messages and ingest into Kafka 2022-09-14 18:32:32 +01:00
c5f01c3084 Ingest into Kafka and queue messages better 2022-09-13 22:17:46 +01:00
c2bdb3fd15 Reformat 2022-09-07 07:20:30 +01:00
5c3b338017 Implement threshold writing to Redis and Manticore ingesting from Redis 2022-09-07 07:20:30 +01:00
7bb2264d91 Increase thread delay time 2022-09-05 07:20:30 +01:00
1858e06c4b Alter schemas and 4chan performance settings 2022-09-05 07:20:30 +01:00
ddcfa614ad Remove some debugging code 2022-09-05 07:20:30 +01:00
d1c6bd1fb5 Reformat and set the net and channel for 4chan 2022-09-05 07:20:30 +01:00
b8d2ecc009 Make crawler more efficient and implement configurable parameters 2022-09-05 07:20:30 +01:00
f8fc5e1a1b Split thread list into chunks to save memory 2022-09-05 07:20:30 +01:00
6e00f70184 Reformat code 2022-09-04 21:40:04 +01:00
0f717b987d Reinstate Redis cache 2022-09-04 21:38:53 +01:00
60c43b4eb5 Run processing in thread 2022-09-04 21:29:00 +01:00
db23b31f30 Implement aiohttp 2022-09-04 19:44:25 +01:00
f7860bf08b Begin implementing aiohttp 2022-09-04 13:47:32 +01:00
734a2b7879 Implement running Discord and 4chan gathering simultaneously 2022-09-02 22:30:45 +01:00