Implement indexing into Apache Druid #1

Closed
m wants to merge 263 commits from druid into master

Author SHA1 Message Date
Mark Veidemanis 143f2a0bf0
Implement sentiment/NLP annotation and optimise processing 2 years ago
Mark Veidemanis 4ea77ac543
Properly process Redis buffered messages and ingest into Kafka 2 years ago
Mark Veidemanis fec0d379a6
Ingest into Kafka and queue messages better 2 years ago
Mark Veidemanis 3c2adfc16e
Implement Apache Druid/Kafka and Metabase 2 years ago
Mark Veidemanis 79a430be04
Begin implementing Apache Druid 2 years ago
Mark Veidemanis baea6aebeb
Use stable after all 2 years ago
Mark Veidemanis eaecc5cdbe
Switch production image back to dev 2 years ago
Mark Veidemanis 764e36ef14
Lower memory requirements to prevent crashes 2 years ago
Mark Veidemanis 50a873dbba
Set dev image back to the default 2 years ago
Mark Veidemanis 21182629b4
Treat text fields as string and try beta Kibana image 2 years ago
Mark Veidemanis dfd71b6c64
Add Mysql port to ports instead of expose 2 years ago
Mark Veidemanis 1b0817b047
Expose the Mysql port 2 years ago
Mark Veidemanis 0ba4929294
Use dev image of manticore 2 years ago
Mark Veidemanis caded433b7
Remove indexer block to attempt to prevent Manticore DB crash 2 years ago
Mark Veidemanis bf802d7fdf
Reformat 2 years ago
Mark Veidemanis 89328a827a
Raise open files limit for Redis 2 years ago
Mark Veidemanis 32249a1d99
Add 4chan update message type to main types 2 years ago
Mark Veidemanis cdd12cd082
Implement threshold writing to Redis and manticore ingesting from Redis 2 years ago
Mark Veidemanis 137299fe9e
Add config directories to gitignore 2 years ago
Mark Veidemanis 2aedcf77a0
Add aioredis 2 years ago
Mark Veidemanis 49784dfbe5
Implement ingesting to Redis from Threshold 2 years ago
Mark Veidemanis a6b5348224
Config relative to Git dir 2 years ago
Mark Veidemanis d0fe2baafe
Store persistent database elsewhere 2 years ago
Mark Veidemanis e092327932
Improve DB performance with caching 2 years ago
Mark Veidemanis 8b9ad05089
Reformat legacy project 2 years ago
Mark Veidemanis 6b082adeb2
Merge branch 'threshold' 2 years ago
Mark Veidemanis bd9f9378cf
Moved files to subdirectory 2 years ago
Mark Veidemanis 62fe03a6cb
Increase thread delay time 2 years ago
Mark Veidemanis 297bbbe035
Alter schemas and 4chan performance settings 2 years ago
Mark Veidemanis ed7c439b56
Remove some debugging code 2 years ago
Mark Veidemanis ecb8079b5b
Change Python to 3.10 2 years ago
Mark Veidemanis 6811ce4af5
Update production env file path 2 years ago
Mark Veidemanis e34d281774
Remove development dotenv loading 2 years ago
Mark Veidemanis 91e18c60e6
Add debug statement 2 years ago
Mark Veidemanis 9c9d49dcd2
Reformat and set the net and channel for 4chan 2 years ago
Mark Veidemanis dcd648e1d2
Make crawler more efficient and implement configurable parameters 2 years ago
Mark Veidemanis 318a8ddbd5
Split thread list into chunks to save memory 2 years ago
Mark Veidemanis 20e22ae7ca
Reformat code 2 years ago
Mark Veidemanis 8feccbbf00
Reinstate Redis cache 2 years ago
Mark Veidemanis db46fea550
Run processing in thread 2 years ago
Mark Veidemanis 22cef33342
Implement aiohttp 2 years ago
Mark Veidemanis 663a26778d
Begin implementing aiohttp 2 years ago
Mark Veidemanis 36de004ee5
Implement running Discord and 4chan gathering simultaneously 2 years ago
Mark Veidemanis 2c3d83fe9a
Fix error when no email can be found 2 years ago
Mark Veidemanis d7adffb47f
Fix getting first relay when they are not sequential 2 years ago
Mark Veidemanis 4f4820818a
Log authentication messages 2 years ago
Mark Veidemanis 5cc38da00e
Implement deduplicating channels 2 years ago
Mark Veidemanis a4dae2a583
Switch to siphash 2 years ago
Mark Veidemanis 5f1667869f
Re-add fake messages 2 years ago
Mark Veidemanis 09a5cd14ad
Detect queries if nick and channel are the same 2 years ago
Mark Veidemanis 96de70aaf2
Add sinst fetch and fix message send logic 2 years ago
Mark Veidemanis f8c1e952bb
Switch debugging statements to trace in ChanKeep 2 years ago
Mark Veidemanis 36628e157d
Fix query handling and don't send a fake message 2 years ago
Mark Veidemanis aeee745ac9
Only run pingmsg after negative has completed 2 years ago
Mark Veidemanis d795af164f
Fix debug statements and amend function names 2 years ago
Mark Veidemanis 4acadd3508
Properly format string 2 years ago
Mark Veidemanis 5c4904ba56
Improve regPing debugging 2 years ago
Mark Veidemanis 4e88b93856
Improve regPing negative handling logic 2 years ago
Mark Veidemanis af1dba5741
Fix double messages and regPing logic 2 years ago
Mark Veidemanis 553e2eb2b7
Set the channel limit on connected relays, not active 2 years ago
Mark Veidemanis 3dfc6d736a
Look before you leap to confirming registrations 2 years ago
Mark Veidemanis 7ef76d1424
Fix IRC config mutation 2 years ago
Mark Veidemanis d78600a2f1
Change authentication endpoint 2 years ago
Mark Veidemanis f004bd47af
Reorder API endpoints to prevent clashing 2 years ago
Mark Veidemanis fafcff1427
Add more debugging information 2 years ago
Mark Veidemanis e56bd61362
Figure out the channel parsing logic 2 years ago
Mark Veidemanis 2b7bd486f1
Pass a list instead of listinfo 2 years ago
Mark Veidemanis a9592a85d0
Fix variable placement 2 years ago
Mark Veidemanis e77c046965
Fix list parsing 2 years ago
Mark Veidemanis 7a8cee1431
Fix debugging code in keepChannels 2 years ago
Mark Veidemanis e6527b4f9f
Add debugging code in keepChannels 2 years ago
Mark Veidemanis 8979a03bbd
Subtract one from length of list for indices 2 years ago
Mark Veidemanis f7b84913f2
Lower max_chans to length of LIST if it's shorter 2 years ago
Mark Veidemanis d46c98a211
Reset negative pass status when requesting recheck 2 years ago
Mark Veidemanis d68f0589cb
Implement initial WHO loop delay 2 years ago
Mark Veidemanis d9ec68708b
Fix getting all unregistered relays 2 years ago
Mark Veidemanis 1b77c50552
Blacklist channels we are kicked from 2 years ago
Mark Veidemanis 1ce5a8228c
Use JSON for sending messages 2 years ago
Mark Veidemanis f6f515b308
Implement API call to register 2 years ago
Mark Veidemanis 9864b4e2b5
Convert num to number in registration confirmation 2 years ago
Mark Veidemanis 2fdd0cf6b8
Allow current nick substitution in IRC commands 2 years ago
Mark Veidemanis 8c809ad444
Fix variable shadowing 2 years ago
Mark Veidemanis 2022ab985b
Print identification message 2 years ago
Mark Veidemanis b5e78bc4de
Implement manual authentication mode 2 years ago
Mark Veidemanis eba2c387f0
Implement API for authentication management actions 2 years ago
Mark Veidemanis 5123941c79
More debugging for reg tests and getstr command 2 years ago
Mark Veidemanis 6cc07c9171
Add allRelaysActive output to network info 2 years ago
Mark Veidemanis ed1f3cdca7
Add debug statements and only check if network is connected when parting channels 2 years ago
Mark Veidemanis 128e005611
Use JSON for joining channels and don't shadow auth variable when getting network info 2 years ago
Mark Veidemanis 713e03b66e
Make channel deletion endpoint accept JSON 2 years ago
Mark Veidemanis a0761ff1ae
LBYL 2 years ago
Mark Veidemanis 15523bed96
Add more information to relay API return 2 years ago
Mark Veidemanis 653d9ea4f9
Add even more debugging 2 years ago
Mark Veidemanis f1229a76e1
Extra debugging for getting active relays 2 years ago
Mark Veidemanis d4bcbf99e5
Fix typo in module name 2 years ago
Mark Veidemanis e517d04095
Extra debugging for get_first_relay 2 years ago
Mark Veidemanis 65697ce8f0
Filter queries more carefully 2 years ago
Mark Veidemanis ab9b0a1c9f
Update CHANLIMIT on all instances when set via API 2 years ago
Mark Veidemanis 60f7a84383
Add helper to get all active relays 2 years ago
Mark Veidemanis 956d328fd3
Implement API endpoint to enable authentication 2 years ago
Mark Veidemanis dcd7fcc3c0
Filter AUTH channel (OFTC fix) 2 years ago
Mark Veidemanis 7415ca5556
Use ChanKeep system for joining channels with joinSingle 2 years ago
Mark Veidemanis 9780a2dfc8
Fully make use of ECA for multiple channels 2 years ago
Mark Veidemanis c7fa508a38
Return chanlimit for each relay 2 years ago
Mark Veidemanis b83062c34f
Check token before attempting to confirm 2 years ago
Mark Veidemanis 2e57e0930a
Implement API endpoint for provisioning relays 2 years ago
Mark Veidemanis 43c5625b3b
Implement configurable chanlimit and add more fields about LIST output to Redis 2 years ago
Mark Veidemanis 291968fbc7
Implement updating registration via API 2 years ago
Mark Veidemanis dd67e9cc8b
Implement ChanKeep without requiring persistent chanlimits on all networks 2 years ago
Mark Veidemanis c145e5cf18
Add some debug statements and statistics for chanlimits 2 years ago
Mark Veidemanis 5db0373731
Print message if relay is unauthenticated/disconnected 2 years ago
Mark Veidemanis 6c11bbe912
Return relay numbers with channel list 2 years ago
Mark Veidemanis 4d543f31ec
Add connected status to IRC info return and check when getting active relays 2 years ago
Mark Veidemanis 6c92e8e7d9
Reformat code 2 years ago
Mark Veidemanis 836e621063
Implement getting LIST information from API 2 years ago
Mark Veidemanis 852d62a9c9
Provision relay on creation 2 years ago
Mark Veidemanis ddc9af0ddf
Add docstrings to chankeep 2 years ago
Mark Veidemanis edfb3f15eb
Implement migrating networks 2 years ago
Mark Veidemanis 14967f662c
Subtract allocated channel slots from total 2 years ago
Mark Veidemanis 0b370fc155
Improve channel allocation and write basic tests for it 2 years ago
Mark Veidemanis 9804f30060
Make channel join notification a TRACE 2 years ago
Mark Veidemanis f7d6cec896
Fix email command 2 years ago
Mark Veidemanis b871fea039
Add endpoint to get the bot's nickname 2 years ago
Mark Veidemanis e69ce5090a
Properly implement querying with API 2 years ago
Mark Veidemanis 813c9baf30
Get our hostname from WHO when we create fake events 2 years ago
Mark Veidemanis 220ce976f2
Fire a fake event when we send a message 2 years ago
Mark Veidemanis 719f014265
Implement best effort allocation 2 years ago
Mark Veidemanis 1ef600a9df
Simplify variable names and reformat 2 years ago
Mark Veidemanis b72a0672a5
Use ceil instead of round for relay number rounding 2 years ago
Mark Veidemanis bb3b96e7f7
Expand ECA secondary allocation algorithm 2 years ago
Mark Veidemanis c4db8ec99d
Adding more debug statements in ECA system 2 years ago
Mark Veidemanis 73b0518a8f
Print information about received LIST 2 years ago
Mark Veidemanis 571a527f43
Return correct data type for provisioning relays 2 years ago
Mark Veidemanis 4c3bab6d96
Simplify is_first_relay 2 years ago
Mark Veidemanis 14eb05722c
Add even more debugging 2 years ago
Mark Veidemanis 11c226833d
Add more LIST handling debugging 2 years ago
Mark Veidemanis ea81fc80e3
Don't add 1 to current relays when iterating 2 years ago
Mark Veidemanis 8cd22888b7
Add extra debug call for allRelaysActive 2 years ago
Mark Veidemanis ba4b8c7501
Reformat helpers 2 years ago
Mark Veidemanis 0666c4a153
Enable debug mode with env vars 2 years ago
Mark Veidemanis 2a5e6766be
Update IRC template 2 years ago
Mark Veidemanis c983a8e3b6
Allow gaps in relay numbering 2 years ago
Mark Veidemanis a3fe92bea9
Implement deleting networks 2 years ago
Mark Veidemanis 9b03485b69
More error handling when joining channels with ChanKeep 2 years ago
Mark Veidemanis 98dcb99f90
Implement adding networks 2 years ago
Mark Veidemanis aa68bfd9be
Implement requesting channel list for network 2 years ago
Mark Veidemanis f3f717e693
Remove debugging code 2 years ago
Mark Veidemanis 864f0904f5
Implement automatic provisioning 2 years ago
Mark Veidemanis b72d3d67a1
Implement updating aliases 2 years ago
Mark Veidemanis 96d189290b
Implement API endpoint to add next relay 2 years ago
Mark Veidemanis c950bcbd43
Implement deleting relays and fix adding 2 years ago
Mark Veidemanis 4472352785
Reformat code 2 years ago
Mark Veidemanis 75f79cf072
Fix joining channels with inactive relays 2 years ago
Mark Veidemanis 1ca6d79868
Implement creating relays via the API 2 years ago
Mark Veidemanis 33466b90ba
Fix Redis config path 2 years ago
Mark Veidemanis 659d5b391b
Use proper port for SSL listener 2 years ago
Mark Veidemanis 6e1dfecc95
Disable RelayAPI by default in stack file 2 years ago
Mark Veidemanis 3354a94024
Add stack example to test production 2 years ago
Mark Veidemanis a5b25b2048
Use Git dir to make redis config absolute path 2 years ago
Mark Veidemanis 1f51bf2972
Use paths relative to root in production compose 2 years ago
Mark Veidemanis 6e41c8dfc0
Switch paths 2 years ago
Mark Veidemanis ce0b26577f
Use relative paths 2 years ago
Mark Veidemanis 335e602072
Fix redis.conf location in prod compose 2 years ago
Mark Veidemanis 1fcc9d6643
Don't pass template directory 2 years ago
Mark Veidemanis 1ab9824e95
Fix path issue 2 years ago
Mark Veidemanis 47312b04d4
Pass through configuration directories to compose 2 years ago
Mark Veidemanis 743c1d6be8
Fix environment variable path on production compose 2 years ago
Mark Veidemanis 1b60ec62f6
Properly configure production compose file 2 years ago
Mark Veidemanis 94303b1108
Create separate production configuration 2 years ago
Mark Veidemanis 219fc8ac35
Remove print statements 2 years ago
Mark Veidemanis c5604c0ca8
Add trailing slash to example directory 2 years ago
Mark Veidemanis f9482cac63
Add Portainer Git directory to env file 2 years ago
Mark Veidemanis a61ba7b9e1
Seamlessly handle nonexistent configurations 2 years ago
Mark Veidemanis b3dce50ce4
Add stack.env file 2 years ago
Mark Veidemanis 7eee2ec929
Move env file to example 2 years ago
Mark Veidemanis 2ad61e6afa
Properly pass environment variables to the process 2 years ago
Mark Veidemanis a598bbab4b
Make some addresses and hosts configurable with environment variables 2 years ago
Mark Veidemanis 422d3d4cdc
Lower compose version 2 years ago
Mark Veidemanis 2b4e037b51
Add docker definitions 2 years ago
Mark Veidemanis 15583bdaab
Implement relay, channel and alias management 2 years ago
Mark Veidemanis 8050484b6f
Implement editing networks via the API 2 years ago
Mark Veidemanis 4f141b976a
Implement network and channels view 2 years ago
Mark Veidemanis c302cd25da
Implement API endpoint for network listing 2 years ago
Mark Veidemanis 24a2f79e8e
Don't send to Logstash if it's disabled 2 years ago
Mark Veidemanis 8c9ec3ab9c
Implement getting number of channels and users 2 years ago
Mark Veidemanis a8d0a7d886
Implement more API functions 2 years ago
Mark Veidemanis e3e150c805
Update config 2 years ago
Mark Veidemanis 071d6f4579
Implement API 2 years ago
Mark Veidemanis 4a8605626a
Begin work on API endpoint 2 years ago
Mark Veidemanis 80c016761f
Reformat again 2 years ago
Mark Veidemanis 7a0e2be66c
Remove some legacy code 2 years ago
Mark Veidemanis 2fecd98978
Reformat project 2 years ago
Mark Veidemanis 4ecb37b179
Reformat and fix circular import 2 years ago
Mark Veidemanis 27cafa1def
Revert "Reformat project"
This reverts commit 64e3e1160aa76d191740342ab3edc68807f890fb.
2 years ago
Mark Veidemanis da678617d8
Reformat project 2 years ago
Mark Veidemanis 4669096fcb
Don't attempt secondary registration if it is disabled 2 years ago
Mark Veidemanis 404fdb000f
Don't attempt to register if it is disabled 2 years ago
Mark Veidemanis 2177766d90
Rename time to ts 2 years ago
Mark Veidemanis 4734a271a1
Extra error handling around emails 2 years ago
Mark Veidemanis ef3151f34c
Make Redis DBs configurable 2 years ago
Mark Veidemanis 8442c799be
Add Redis DB numbers to configuration 2 years ago
Mark Veidemanis e0f86ec853
Fix provisioning with emails 2 years ago
Mark Veidemanis f88e6dec5a
Fix some issues with the default config 2 years ago
Mark Veidemanis 4ff111a216
Improve email command 2 years ago
Mark Veidemanis 7c855e09c0
Reformat code with pre-commit 2 years ago
Mark Veidemanis 61f6715b20 Start implementing email command 3 years ago
Mark Veidemanis 0854c6d60d Add Logstash file 3 years ago
Mark Veidemanis 5179c43972 Implement modifying emails for aliases 3 years ago
Mark Veidemanis 7439d97c71 Finish Logstash implementation 3 years ago
Mark Veidemanis 391f917b38 Update requirements without versions 3 years ago
Mark Veidemanis 2686e4ab04 Merge branch 'master' into datarestructure 4 years ago
Mark Veidemanis 08b5dc06f0 Implement relay-independent join 4 years ago
Mark Veidemanis 5deb0649fb Don't discard server messages 4 years ago
Mark Veidemanis 9959231d50 Use substitutions in registration tests 4 years ago
Mark Veidemanis 73e596dac3 Additional error handling for command parsing 4 years ago
Mark Veidemanis be405160e4 Fix bug with reg command 4 years ago
Mark Veidemanis 7489512a82 Add example file for blacklist 4 years ago
Mark Veidemanis 1f178a20ed Implement channel blacklisting 4 years ago
Mark Veidemanis cb21ad8fca Fix bug with using muser attribute when absent 4 years ago
Mark Veidemanis c10274ccd6 Fix syntax error in reg command 4 years ago
Mark Veidemanis 9fd6688892 Implement setting modes in ZNC 4 years ago
Mark Veidemanis f54a448d54 Prepare command loader for reloading commands 4 years ago
Mark Veidemanis fe52561b71 Implement registration at net-level 4 years ago
Mark Veidemanis 09405f374e Clarify message output on confirm command 4 years ago
Mark Veidemanis 16ab37cc0c Log error when ZNC says a channel can't be joined 4 years ago
Mark Veidemanis fc3a349cb3 Fix registration cancellation bug in regproc 4 years ago
Mark Veidemanis fe86d30155 Fix various bugs and off by one with provisioning 4 years ago
Mark Veidemanis 7485bbefd1 Move WHO and NAMES logging to trace 4 years ago
Mark Veidemanis 82a98c9539 Don't deduplicate global messages (NICK/QUIT) 4 years ago
Mark Veidemanis 45f02c323b Improve authentication detection
Add a negative check in the event we are authenticated and registered,
but not confirmed, as this fools other checks.
4 years ago
Mark Veidemanis bdb3d059e3 Use zero-padded numbers to maximise usable ports 4 years ago
Mark Veidemanis e403852778 Error checking in testing for registration message 4 years ago
Mark Veidemanis f3dd102096 Deauth bot when disconnected and lowercase user 4 years ago
Mark Veidemanis 1fec14d759 Clarify error message to be more helpful 4 years ago
Mark Veidemanis b67eee42c1 Implement another level of logging for tracing 4 years ago
Mark Veidemanis 9e6dd5e03d Note that arguments to list are optional 4 years ago
Mark Veidemanis 77e8ef4c16 Implement authentication checking on connection 4 years ago
Mark Veidemanis c879caa9d7 Add checks in dedup for time-less messages 4 years ago
Mark Veidemanis db7e5677d3 Fix decoding issue with some Redis keys 4 years ago
Mark Veidemanis f848b5afd6 Provision users with lowercase names 4 years ago
Mark Veidemanis 3bc65f8456 Add the time field to some notifications 4 years ago
Mark Veidemanis 95ee63e399 Fix circular import in ChanKeep/provisioning modules 4 years ago
Mark Veidemanis a1e045793c
Start implementing prefixes 4 years ago
Mark Veidemanis f50a40d207
Fixes to auth detection and message parsing
* don't check authentication if the network doesn't need to
  register
* don't pass through muser for ZNC type messages
* avoid duplicate message for queries containing highlights
* make a copy of the cast for metadata analysis to avoid poisoning it
* set up callback for when the instance is authenticated, so we can
  request a LIST immediately if so desired
* separate out seeding functions to populate CHANLIMIT to ease future
  work involving other options, such as PREFIX
4 years ago
Mark Veidemanis 4c08225a50
Remove condition-based monitoring system 4 years ago
Mark Veidemanis 11f15ac960
Fix various bugs in the event system
Squash many bugs in the event notification system and simplify the
code.
4 years ago
Mark Veidemanis 8103c16253
Fix syntax error in redis query 4 years ago
Mark Veidemanis 45070b06e2
Implement authentication detection
* pending command to see which instances have never authenticated
* authcheck command to see which instances are not currently
  authenticated
4 years ago
Mark Veidemanis 12db2f349e
Add help for pending command 4 years ago
Mark Veidemanis 40e1f38508
Add additional error handling in user queries 4 years ago
Mark Veidemanis 63c97db12e
Function to select and merge IRC network defs 4 years ago
Mark Veidemanis 91885170f1
Check registration status before joining channels
Do not join channels if any relay for a network is unregistered.
4 years ago
Mark Veidemanis 7c23766763
Allow sending LIST to all networks at once 4 years ago
Mark Veidemanis 9e62ac62bc
Add confirm command
Confirm command to check which relays need manual
confirmation.
4 years ago
Mark Veidemanis 014de9f958
Remove leftover irc.json file 4 years ago
Mark Veidemanis f90f2fdef7
Implement registration and confirmation of nicks 4 years ago
Mark Veidemanis e0549cdd30
Restructure provisioning into fewer functions 4 years ago
Mark Veidemanis a78229a288
Add irc.json to gitignore 4 years ago
Mark Veidemanis 918d410927
Fix variable scope in LIST error handling 4 years ago
Mark Veidemanis bc4d5cba8e
Separate provisioning into user and auth info 4 years ago
Mark Veidemanis 376d1bd911
Add IRC network definitions 4 years ago
Mark Veidemanis 778690ae3a
Add more comments and remove obsolete code 5 years ago
Mark Veidemanis da3ba4ea8c
Add requirements 5 years ago