506 Commits

Author SHA1 Message Date
Mark Veidemanis 1c2ff41b56 Add ripsecrets to pre-commit hook 2022-11-03 07:20:30 +00:00
Mark Veidemanis 51a9b2af79 Improve memory usage and fix 4chan crawler 2022-10-21 07:20:30 +01:00
Mark Veidemanis 2d7b6268dd Don't shadow previous iterator variable 2022-10-21 07:20:30 +01:00
Mark Veidemanis e5b5268f5c Add example Druid spec 2022-10-21 07:20:30 +01:00
Mark Veidemanis dc1ed1fe10 Print the length of the flattened list in debug message 2022-10-21 07:20:30 +01:00
Mark Veidemanis eaf9a3c937 Remove unused ssdb_data volume 2022-10-21 07:20:30 +01:00
Mark Veidemanis 054a7a3ccf Don't mount the template directory 2022-10-21 07:20:30 +01:00
Mark Veidemanis f774f4c2d2 Add some environment variables to control debug output 2022-10-21 07:20:30 +01:00
Mark Veidemanis e32b330ef4 Switch to SSDB for message queueing 2022-10-21 11:53:29 +01:00
Mark Veidemanis 8c596ec516 Update gitignore 2022-10-21 11:53:28 +01:00
Mark Veidemanis ab5e85c5c6 Begin switching away from Redis 2022-10-21 11:14:51 +01:00
Mark Veidemanis 7482064aee Clean up docker environment 2022-10-19 16:45:18 +01:00
Mark Veidemanis dccbc6b158 Remove dependencies on infra stuff 2022-10-11 11:16:24 +01:00
Mark Veidemanis 8cc1a48a25 Separate out infra in production 2022-10-11 11:04:03 +01:00
Mark Veidemanis 83e8fb0e38 Remove event log file 2022-10-05 12:52:30 +01:00
Mark Veidemanis 64cf7d0d4a Set Superset directory relative to Portainer Git root 2022-10-04 21:43:16 +01:00
Mark Veidemanis ae12e37e9b Set Superset path properly 2022-10-04 21:41:22 +01:00
Mark Veidemanis 5bb9bd3998 Use local storage in production 2022-10-04 21:33:08 +01:00
Mark Veidemanis d96dc573c5 Update production compose 2022-10-04 21:32:14 +01:00
Mark Veidemanis aea1c7faf6 Use one image for all the Druid services 2022-10-04 21:30:17 +01:00
Mark Veidemanis 2d6b3bb090 Set Superset volume relative to docker folder 2022-10-04 20:54:38 +01:00
Mark Veidemanis 83ffd6517c Switch quickstart setting to nano 2022-10-04 20:37:02 +01:00
Mark Veidemanis 8465e8fb77 Set Superset env file relative to docker directory 2022-10-04 20:30:14 +01:00
Mark Veidemanis d7d9958e54 Add persistent Redis data store and copy over Druid config to production 2022-10-04 20:26:58 +01:00
Mark Veidemanis 464c831686 Add Apache Superset and fix Druid resource usage 2022-10-04 20:17:04 +01:00
Mark Veidemanis 5ad6cd0354 Add postgres config to Metabase 2022-10-02 14:29:40 +01:00
Mark Veidemanis 06e80a9759 Time stuff and switch to gensim for tokenisation 2022-10-01 14:46:45 +01:00
Mark Veidemanis 5c91f1af87 Remove commented debug code 2022-09-30 07:22:22 +01:00
Mark Veidemanis 02ff44a6f5 Use only one Redis key for the queue to make chunk size more precise for thread allocation 2022-09-30 07:22:22 +01:00
Mark Veidemanis a5d29606e9 Remove ujson 2022-09-30 15:30:34 +01:00
Mark Veidemanis 6b549dee6a Reformat 2022-09-30 15:23:00 +01:00
Mark Veidemanis 2dd2360b4f Add config file to Turnilo 2022-09-27 08:30:28 +01:00
Mark Veidemanis a2f88e29e6 Implement uvloop 2022-09-23 07:20:30 +01:00
Mark Veidemanis f0df3e80fd Print Ingest settings on start 2022-09-23 08:32:29 +01:00
Mark Veidemanis 09fc63d0ad Make debug output cleaner 2022-09-22 17:39:29 +01:00
Mark Veidemanis e9ae499ce8 Fix indexer options 2022-09-22 17:39:18 +01:00
Mark Veidemanis b6f8dabccd Fix Java variable in indexer parameters 2022-09-22 08:41:59 +01:00
Mark Veidemanis 395dfb1e7b Decrease memory requirements further and switch Kafka image 2022-09-21 21:11:13 +01:00
Mark Veidemanis ee79762c73 Set Kafka max heap size 2022-09-21 20:26:05 +01:00
Mark Veidemanis e58b9960b2 Set max memory for Metabase 2022-09-21 14:39:11 +01:00
Mark Veidemanis 4a60dec964 Remove debugging code and fix regex substitution 2022-09-21 12:48:54 +01:00
Mark Veidemanis 9ee55a720b Change dev container names 2022-09-21 12:09:18 +01:00
Mark Veidemanis 799286ca76 Change prod container names 2022-09-21 12:08:29 +01:00
Mark Veidemanis 0e62a5b4b8 Remove prod compose comment 2022-09-21 12:04:54 +01:00
Mark Veidemanis 5ebae02bf2 Remove commented code for debugging 2022-09-21 10:02:05 +01:00
Mark Veidemanis ced3a251b2 Normalise fields in processing and remove invalid characters 2022-09-21 10:01:12 +01:00
Mark Veidemanis 740f93208b Make production volumes point to external storage 2022-09-21 10:00:48 +01:00
Mark Veidemanis 2763e52e6b Don't muddle up the topics when sending Kafka batches 2022-09-20 23:03:02 +01:00
Mark Veidemanis 869af451e5 Document new PROCESS_THREADS setting in example file 2022-09-20 22:43:04 +01:00
Mark Veidemanis 31c58dd85b Make CPU threads configurable 2022-09-20 22:29:13 +01:00