My recent focus has been implementing Vanquish, a real-time security analysis pipeline built on big-data processing and machine-learning algorithms. Vanquish provides fast, robust, actionable security detection and alerting for the Microsoft Office 365 substrate, and it handles O365-scale volume: millions of events per second.
Major components of the Vanquish pipeline:
1. Kafka (message queue)
2. Spark / Scala (streaming & batch analytics)
3. Cassandra, CosmosDB, Azure SQL, HDFS & Microsoft Kusto (storage)
4. C# / ASP.NET (authentication & access layer)
5. Angular / Kendo / Bootstrap (web-based analyst interface)
Analytics on Vanquish include anomaly detection, outlier detection, and supervised & unsupervised ML.