This document discusses using Apache Spark for the forensic linguistic analysis of tweets posted in Greece during the 2015 financial crisis. It introduces key concepts such as idiolect, sociolect, and intertextuality. It then uses Spark to explore a dataset covering 190k Greek Twitter users from June to August 2015, applying word2vec, LDA, and clustering to analyze writing styles, identify similar users, and detect communities. Future directions include deep learning approaches for tasks such as author profiling.
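The idea of identifying similar users by writing style can be illustrated with a minimal sketch. It is not the pipeline described in the document: it swaps in character trigram profiles compared by cosine similarity, a common stylometric baseline, in place of word2vec or LDA, and it runs in plain Python rather than Spark. The user names, the tweets, and the helper functions `char_ngrams`, `cosine`, and `most_similar` are all hypothetical and exist only for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Build a character n-gram frequency profile (a rough proxy for idiolect)."""
    text = " ".join(text.lower().split())  # lowercase and normalise whitespace
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_similar(user, profiles):
    """Return the other user whose profile is closest to `user`'s."""
    others = (u for u in profiles if u != user)
    return max(others, key=lambda u: cosine(profiles[user], profiles[u]))


# Toy data (invented for this example, not taken from the dataset).
tweets = {
    "user_a": ["omg the banks are closed again!!!", "omg no cash at the atm!!!"],
    "user_b": ["omg the atm is empty again!!!"],
    "user_c": ["Parliament will vote on the agreement tomorrow."],
}

# One profile per user, built from all of that user's tweets.
profiles = {user: char_ngrams(" ".join(texts)) for user, texts in tweets.items()}

print(most_similar("user_a", profiles))
```

On a dataset of the size described, the same per-user profile construction would presumably be distributed, for example as a Spark map over users, with the pairwise comparison replaced by clustering or approximate nearest-neighbour search.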