Mark Veidemanis | 40cf0c6430 | Remove commented debug code | 2022-09-30 07:22:22 +01:00 |
Mark Veidemanis | 63081f68b7 | Use only one Redis key for the queue to make chunk size more precise for thread allocation | 2022-09-30 07:22:22 +01:00 |
Mark Veidemanis | fc7450c33a | Make debug output cleaner | 2022-09-22 17:39:29 +01:00 |
Mark Veidemanis | d6d19625f3 | Remove commented code for debugging | 2022-09-21 10:02:05 +01:00 |
Mark Veidemanis | cf4aa45663 | Normalise fields in processing and remove invalid characters | 2022-09-21 10:01:12 +01:00 |
Mark Veidemanis | 027c43b60a | Don't muddle up the topics when sending Kafka batches | 2022-09-20 23:03:02 +01:00 |
Mark Veidemanis | ebfa06e8d6 | Reformat comment | 2022-09-18 13:02:06 +01:00 |
Mark Veidemanis | 3ed382ec13 | Implement restricted sources | 2022-09-18 13:01:19 +01:00 |
Mark Veidemanis | 143f2a0bf0 | Implement sentiment/NLP annotation and optimise processing | 2022-09-16 17:09:49 +01:00 |
Mark Veidemanis | 4ea77ac543 | Properly process Redis buffered messages and ingest into Kafka | 2022-09-14 18:32:32 +01:00 |
Mark Veidemanis | fec0d379a6 | Ingest into Kafka and queue messages better | 2022-09-13 22:17:46 +01:00 |
Mark Veidemanis | 79a430be04 | Begin implementing Apache Druid | 2022-09-08 07:20:30 +01:00 |
Mark Veidemanis | 21182629b4 | Treat text fields as string and try beta Kibana image | 2022-09-12 08:27:13 +01:00 |
Mark Veidemanis | 32249a1d99 | Add 4chan update message type to main types | 2022-09-07 07:20:30 +01:00 |
Mark Veidemanis | cdd12cd082 | Implement threshold writing to Redis and manticore ingesting from Redis | 2022-09-07 07:20:30 +01:00 |
Mark Veidemanis | 2aedcf77a0 | Add aioredis | 2022-09-08 09:44:27 +01:00 |
Mark Veidemanis | 49784dfbe5 | Implement ingesting to Redis from Threshold | 2022-09-07 07:20:30 +01:00 |
Mark Veidemanis | ed7c439b56 | Remove some debugging code | 2022-09-05 07:20:30 +01:00 |
Mark Veidemanis | 9c9d49dcd2 | Reformat and set the net and channel for 4chan | 2022-09-05 07:20:30 +01:00 |
Mark Veidemanis | dcd648e1d2 | Make crawler more efficient and implement configurable parameters | 2022-09-05 07:20:30 +01:00 |
Mark Veidemanis | 318a8ddbd5 | Split thread list into chunks to save memory | 2022-09-05 07:20:30 +01:00 |
Mark Veidemanis | 20e22ae7ca | Reformat code | 2022-09-04 21:40:04 +01:00 |
Mark Veidemanis | db46fea550 | Run processing in thread | 2022-09-04 21:29:00 +01:00 |
Mark Veidemanis | 22cef33342 | Implement aiohttp | 2022-09-04 19:44:25 +01:00 |
Mark Veidemanis | 663a26778d | Begin implementing aiohttp | 2022-09-04 13:47:32 +01:00 |
Mark Veidemanis | 36de004ee5 | Implement running Discord and 4chan gathering simultaneously | 2022-09-02 22:30:45 +01:00 |